Message ID | 20210804153626.1549001-2-elder@linaro.org (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Netdev Maintainers |
Series | net: ipa: more work toward runtime PM |
Context | Check | Description |
---|---|---|
netdev/cover_letter | success | Link |
netdev/fixes_present | success | Link |
netdev/patch_count | success | Link |
netdev/tree_selection | success | Clearly marked for net-next |
netdev/subject_prefix | success | Link |
netdev/cc_maintainers | success | CCed 4 of 4 maintainers |
netdev/source_inline | success | Was 0 now: 0 |
netdev/verify_signedoff | success | Link |
netdev/module_param | success | Was 0 now: 0 |
netdev/build_32bit | success | Errors and warnings before: 0 this patch: 0 |
netdev/kdoc | success | Errors and warnings before: 0 this patch: 0 |
netdev/verify_fixes | success | Link |
netdev/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 31 lines checked |
netdev/build_allmodconfig_warn | success | Errors and warnings before: 0 this patch: 0 |
netdev/header_inline | success | Link |
On Wed, 4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
> The modem network device is set up by ipa_modem_start(). But its
> TX queue is not actually started and endpoints enabled until it is
> opened.
>
> So avoid stopping the modem network device TX queue and disabling
> endpoints on suspend or stop unless the netdev is marked UP. And
> skip attempting to resume unless it is UP.
>
> Signed-off-by: Alex Elder <elder@linaro.org>

You said in the cover letter that in practice this fix doesn't matter.
It seems trivial to test so perhaps it doesn't and we should leave the
code be? Looking at dev->flags without holding rtnl_lock() seems
suspicious, drivers commonly put the relevant portion of suspend/resume
routines under rtnl_lock()/rtnl_unlock() (although to be completely
frank IDK if it's actually possible for concurrent suspend +
open/close to happen).

Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()?

> diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
> index 4ea8287e9d237..663a610979e70 100644
> --- a/drivers/net/ipa/ipa_modem.c
> +++ b/drivers/net/ipa/ipa_modem.c
> @@ -178,6 +178,9 @@ void ipa_modem_suspend(struct net_device *netdev)
>  	struct ipa_priv *priv = netdev_priv(netdev);
>  	struct ipa *ipa = priv->ipa;
>  
> +	if (!(netdev->flags & IFF_UP))
> +		return;
> +
>  	netif_stop_queue(netdev);
>  
>  	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
> @@ -194,6 +197,9 @@ void ipa_modem_resume(struct net_device *netdev)
>  	struct ipa_priv *priv = netdev_priv(netdev);
>  	struct ipa *ipa = priv->ipa;
>  
> +	if (!(netdev->flags & IFF_UP))
> +		return;
> +
>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>  
> @@ -265,9 +271,11 @@ int ipa_modem_stop(struct ipa *ipa)
>  	/* Prevent the modem from triggering a call to ipa_setup() */
>  	ipa_smp2p_disable(ipa);
>  
> -	/* Stop the queue and disable the endpoints if it's open */
> +	/* Clean up the netdev and endpoints if it was started */
>  	if (netdev) {
> -		(void)ipa_stop(netdev);
> +		/* If it was opened, stop it first */
> +		if (netdev->flags & IFF_UP)
> +			(void)ipa_stop(netdev);
>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
>  		ipa->modem_netdev = NULL;
On 8/5/21 8:26 PM, Jakub Kicinski wrote:
> On Wed, 4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
>> The modem network device is set up by ipa_modem_start(). But its
>> TX queue is not actually started and endpoints enabled until it is
>> opened.
>>
>> So avoid stopping the modem network device TX queue and disabling
>> endpoints on suspend or stop unless the netdev is marked UP. And
>> skip attempting to resume unless it is UP.
>>
>> Signed-off-by: Alex Elder <elder@linaro.org>
>
> You said in the cover letter that in practice this fix doesn't matter.

I don't think we've seen this problem with system suspend, but
with runtime suspend we could get a forced suspend request at
any time (and frequently), so if there is a problem, it will be
much more likely to occur.

For suspend, I don't think it's actually a "problem". Disabling
the TX queue if it wasn't open is harmless--it just sets the
DRV_XOFF bit in the TX queue state field. And we have a
separate "enabled endpoints" mask that prevents stopping or
suspending the endpoint if it wasn't opened.

But for resume, waking the queue schedules it. I'm not sure
what exactly ensues in that case, but it's not correct if the
network device hasn't been opened. For endpoints, again, they
won't be resumed if they weren't enabled, so that part's OK.

> It seems trivial to test so perhaps it doesn't and we should leave the
> code be? Looking at dev->flags without holding rtnl_lock() seems
> suspicious, drivers commonly put the relevant portion of suspend/resume
> routines under rtnl_lock()/rtnl_unlock() (although to be completely

I don't use rtnl_lock()/rtnl_unlock() *anywhere* in the driver.
It has no netlink interface (yet), and therefore I didn't even
think about using rtnl_lock(). Do I need it?

> frank IDK if it's actually possible for concurrent suspend +
> open/close to happen).

I think it isn't possible, but I'm less than 100% sure. I've
been thinking a lot about exactly this sort of question lately...

> Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()?

None of them take that lock. It is called in the driver ->remove
callback, and is called during cleanup if the modem crashes.

I think this fix is good, but as I said in the cover letter I'm
not aware of ever having hit it to date.

Thank you very much for your review and comments.

					-Alex

>> diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
>> index 4ea8287e9d237..663a610979e70 100644
>> --- a/drivers/net/ipa/ipa_modem.c
>> +++ b/drivers/net/ipa/ipa_modem.c
>> @@ -178,6 +178,9 @@ void ipa_modem_suspend(struct net_device *netdev)
>>  	struct ipa_priv *priv = netdev_priv(netdev);
>>  	struct ipa *ipa = priv->ipa;
>>  
>> +	if (!(netdev->flags & IFF_UP))
>> +		return;
>> +
>>  	netif_stop_queue(netdev);
>>  
>>  	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>> @@ -194,6 +197,9 @@ void ipa_modem_resume(struct net_device *netdev)
>>  	struct ipa_priv *priv = netdev_priv(netdev);
>>  	struct ipa *ipa = priv->ipa;
>>  
>> +	if (!(netdev->flags & IFF_UP))
>> +		return;
>> +
>>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
>>  	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
>>  
>> @@ -265,9 +271,11 @@ int ipa_modem_stop(struct ipa *ipa)
>>  	/* Prevent the modem from triggering a call to ipa_setup() */
>>  	ipa_smp2p_disable(ipa);
>>  
>> -	/* Stop the queue and disable the endpoints if it's open */
>> +	/* Clean up the netdev and endpoints if it was started */
>>  	if (netdev) {
>> -		(void)ipa_stop(netdev);
>> +		/* If it was opened, stop it first */
>> +		if (netdev->flags & IFF_UP)
>> +			(void)ipa_stop(netdev);
>>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
>>  		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
>>  		ipa->modem_netdev = NULL;
>
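The asymmetry described in the reply above (suspending a never-opened netdev is harmless, resuming it is not) can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name and `MODEL_*` constant is hypothetical and stands in for the kernel's netdev flags, TX queue state bit, and the driver's "enabled endpoints" mask; none of it is the driver's actual code. It assumes, as the reply says, that waking a queue stopped by the driver clears the XOFF bit and schedules the queue.

```c
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not the kernel's definitions. */
#define MODEL_IFF_UP	0x1u	/* plays the role of IFF_UP */
#define MODEL_DRV_XOFF	0x1u	/* plays the role of the DRV_XOFF state bit */

struct model_netdev {
	unsigned int flags;		/* MODEL_IFF_UP once opened */
	unsigned int txq_state;		/* MODEL_DRV_XOFF when stopped */
	unsigned int sched_count;	/* times the TX queue was scheduled */
};

struct model_ipa {
	uint32_t enabled;	/* one bit per enabled (opened) endpoint */
	uint32_t suspended;	/* one bit per suspended endpoint */
};

/* Stopping the queue only sets the XOFF bit. */
static void model_stop_queue(struct model_netdev *nd)
{
	nd->txq_state |= MODEL_DRV_XOFF;
}

/* Waking clears XOFF and, if it had been set, schedules the queue. */
static void model_wake_queue(struct model_netdev *nd)
{
	if (nd->txq_state & MODEL_DRV_XOFF) {
		nd->txq_state &= ~MODEL_DRV_XOFF;
		nd->sched_count++;
	}
}

/* Endpoints that were never enabled are skipped by the mask check. */
static void model_endpoint_suspend(struct model_ipa *ipa, unsigned int id)
{
	if (!(ipa->enabled & (1u << id)))
		return;
	ipa->suspended |= 1u << id;
}

static void model_endpoint_resume(struct model_ipa *ipa, unsigned int id)
{
	if (!(ipa->enabled & (1u << id)))
		return;
	ipa->suspended &= ~(1u << id);
}

/* Suspend followed by resume; 'guarded' applies the patch's UP check. */
static void model_suspend_resume(struct model_netdev *nd,
				 struct model_ipa *ipa, int guarded)
{
	if (guarded && !(nd->flags & MODEL_IFF_UP))
		return;

	model_stop_queue(nd);
	model_endpoint_suspend(ipa, 0);

	model_endpoint_resume(ipa, 0);
	model_wake_queue(nd);
}
```

In this model, running `model_suspend_resume()` on a zero-initialized (never opened) device with `guarded` set to 0 leaves the endpoints untouched but schedules the TX queue once; with `guarded` set to 1 nothing happens at all, which is the behavior the patch aims for.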
On Fri, 6 Aug 2021 06:39:46 -0500 Alex Elder wrote:
> On 8/5/21 8:26 PM, Jakub Kicinski wrote:
> > On Wed, 4 Aug 2021 10:36:21 -0500 Alex Elder wrote:
> >> The modem network device is set up by ipa_modem_start(). But its
> >> TX queue is not actually started and endpoints enabled until it is
> >> opened.
> >>
> >> So avoid stopping the modem network device TX queue and disabling
> >> endpoints on suspend or stop unless the netdev is marked UP. And
> >> skip attempting to resume unless it is UP.
> >>
> >> Signed-off-by: Alex Elder <elder@linaro.org>
> >
> > You said in the cover letter that in practice this fix doesn't matter.
>
> I don't think we've seen this problem with system suspend, but
> with runtime suspend we could get a forced suspend request at
> any time (and frequently), so if there is a problem, it will be
> much more likely to occur.
>
> For suspend, I don't think it's actually a "problem". Disabling
> the TX queue if it wasn't open is harmless--it just sets the
> DRV_XOFF bit in the TX queue state field. And we have a
> separate "enabled endpoints" mask that prevents stopping or
> suspending the endpoint if it wasn't opened.
>
> But for resume, waking the queue schedules it. I'm not sure
> what exactly ensues in that case, but it's not correct if the
> network device hasn't been opened. For endpoints, again, they
> won't be resumed if they weren't enabled, so that part's OK.
>
> > It seems trivial to test so perhaps it doesn't and we should leave the
> > code be? Looking at dev->flags without holding rtnl_lock() seems
> > suspicious, drivers commonly put the relevant portion of suspend/resume
> > routines under rtnl_lock()/rtnl_unlock() (although to be completely
>
> I don't use rtnl_lock()/rtnl_unlock() *anywhere* in the driver.
> It has no netlink interface (yet), and therefore I didn't even
> think about using rtnl_lock(). Do I need it?

Runtime PM interactions with rtnl_lock get really tricky, if there are
callers which will wake the device up while holding rtnl then taking
rtnl in .resume will cause an obvious deadlock, right?

I'm starting to feel like driver's RPM-related code has to be under its
own lock, and interrogating higher layer's (e.g. network stack's) state
from RPM code should be avoided...

Long story short I don't think we have a good handle on this, I
certainly don't so maybe let's leave your code be, for now.

> > frank IDK if it's actually possible for concurrent suspend +
> > open/close to happen).
>
> I think it isn't possible, but I'm less than 100% sure. I've
> been thinking a lot about exactly this sort of question lately...
>
> > Are there any callers of ipa_modem_stop() which don't hold rtnl_lock()?
>
> None of them take that lock. It is called in the driver ->remove
> callback, and is called during cleanup if the modem crashes.
>
> I think this fix is good, but as I said in the cover letter I'm
> not aware of ever having hit it to date.
>
> Thank you very much for your review and comments.
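The deadlock concern raised in this reply can be made concrete with a userspace model. This is a sketch under assumptions, not kernel code: `model_lock` is a hypothetical non-recursive lock standing in for the RTNL mutex, and the `model_*` functions are invented for illustration. Whether any IPA code path actually wakes the device while holding rtnl is not established in the thread. Instead of blocking forever, the model returns `-EDEADLK` at the point where a real mutex would self-deadlock.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock standing in for the RTNL mutex. */
struct model_lock {
	bool held;
};

/* Returns 0 on success, or -EDEADLK where a real non-recursive mutex
 * would block forever because the single caller in this model already
 * holds it.
 */
static int model_lock_acquire(struct model_lock *lock)
{
	if (lock->held)
		return -EDEADLK;
	lock->held = true;
	return 0;
}

static void model_lock_release(struct model_lock *lock)
{
	lock->held = false;
}

/* Hypothetical resume handler that takes the lock to read netdev state. */
static int model_resume(struct model_lock *rtnl)
{
	int ret = model_lock_acquire(rtnl);

	if (ret)
		return ret;
	/* ... inspect netdev flags, wake the TX queue ... */
	model_lock_release(rtnl);
	return 0;
}

/* Hypothetical open path: it runs with the lock held and wakes the
 * device synchronously, which invokes the resume handler.
 */
static int model_open_with_lock_held(struct model_lock *rtnl)
{
	int ret = model_lock_acquire(rtnl);

	if (ret)
		return ret;
	ret = model_resume(rtnl);	/* resume tries to take rtnl again */
	model_lock_release(rtnl);
	return ret;
}
```

Calling `model_resume()` on its own succeeds, while `model_open_with_lock_held()` reports `-EDEADLK`. That is the case a dedicated driver-level lock for runtime PM state, as suggested in the reply, would avoid.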
The modem network device is set up by ipa_modem_start(). But its
TX queue is not actually started and endpoints enabled until it is
opened.

So avoid stopping the modem network device TX queue and disabling
endpoints on suspend or stop unless the netdev is marked UP. And
skip attempting to resume unless it is UP.

Signed-off-by: Alex Elder <elder@linaro.org>
---
 drivers/net/ipa/ipa_modem.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ipa/ipa_modem.c b/drivers/net/ipa/ipa_modem.c
index 4ea8287e9d237..663a610979e70 100644
--- a/drivers/net/ipa/ipa_modem.c
+++ b/drivers/net/ipa/ipa_modem.c
@@ -178,6 +178,9 @@ void ipa_modem_suspend(struct net_device *netdev)
 	struct ipa_priv *priv = netdev_priv(netdev);
 	struct ipa *ipa = priv->ipa;
 
+	if (!(netdev->flags & IFF_UP))
+		return;
+
 	netif_stop_queue(netdev);
 
 	ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
@@ -194,6 +197,9 @@ void ipa_modem_resume(struct net_device *netdev)
 	struct ipa_priv *priv = netdev_priv(netdev);
 	struct ipa *ipa = priv->ipa;
 
+	if (!(netdev->flags & IFF_UP))
+		return;
+
 	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]);
 	ipa_endpoint_resume_one(ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]);
 
@@ -265,9 +271,11 @@ int ipa_modem_stop(struct ipa *ipa)
 	/* Prevent the modem from triggering a call to ipa_setup() */
 	ipa_smp2p_disable(ipa);
 
-	/* Stop the queue and disable the endpoints if it's open */
+	/* Clean up the netdev and endpoints if it was started */
 	if (netdev) {
-		(void)ipa_stop(netdev);
+		/* If it was opened, stop it first */
+		if (netdev->flags & IFF_UP)
+			(void)ipa_stop(netdev);
 		ipa->name_map[IPA_ENDPOINT_AP_MODEM_RX]->netdev = NULL;
 		ipa->name_map[IPA_ENDPOINT_AP_MODEM_TX]->netdev = NULL;
 		ipa->modem_netdev = NULL;