Message ID | 20190301134837.88087-1-nbd@nbd.name (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Johannes Berg |
Headers | show |
Series | mac80211: do not call driver wake_tx_queue op during reconfig | expand |
On 3/1/19 5:48 AM, Felix Fietkau wrote: > There are several scenarios in which mac80211 can call drv_wake_tx_queue > after ieee80211_restart_hw has been called and has not yet completed. > Driver private structs are considered uninitialized until mac80211 has > uploaded the vifs, stations and keys again, so using private tx queue > data during that time is not safe. > > The driver can also not rely on drv_reconfig_complete to figure out when > it is safe to accept drv_wake_tx_queue calls again, because it is only > called after all tx queues are woken again. > > To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig > is set. This reminded me of a patch I posted back in 2016. The discussion just sort of ended on it, but curious if you have a new opinion on it after debugging the issue in this patch: https://patchwork.kernel.org/patch/9457709/ For what its worth, I've been using the patch above since I posted it, and it seems to work well for ath9k and ath10k. Thanks, Ben > > Cc: stable@vger.kernel.org > Signed-off-by: Felix Fietkau <nbd@nbd.name> > --- > net/mac80211/driver-ops.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h > index 28d022a3eee3..ae4f0be3b393 100644 > --- a/net/mac80211/driver-ops.h > +++ b/net/mac80211/driver-ops.h > @@ -1195,6 +1195,9 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local, > { > struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif); > > + if (local->in_reconfig) > + return; > + > if (!check_sdata_in_driver(sdata)) > return; > >
On 2019-03-01 16:29, Ben Greear wrote: > On 3/1/19 5:48 AM, Felix Fietkau wrote: >> There are several scenarios in which mac80211 can call drv_wake_tx_queue >> after ieee80211_restart_hw has been called and has not yet completed. >> Driver private structs are considered uninitialized until mac80211 has >> uploaded the vifs, stations and keys again, so using private tx queue >> data during that time is not safe. >> >> The driver can also not rely on drv_reconfig_complete to figure out when >> it is safe to accept drv_wake_tx_queue calls again, because it is only >> called after all tx queues are woken again. >> >> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig >> is set. > > This reminded me of a patch I posted back in 2016. The discussion just sort of > ended on it, but curious if you have a new opinion on it after debugging the > issue in this patch: > > https://patchwork.kernel.org/patch/9457709/ > > For what its worth, I've been using the patch above since I posted it, and > it seems to work well for ath9k and ath10k. I agree with what Johannes wrote about that patch. Fixing this could likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all interfaces before bringing any of them back up. That way the normal interface add logic applies without nasty special cases. The reconfig code checks for ieee80211_sdata_running (which is unaffected), so I don't think we need to save the previous value of that flag. - Felix
On 2019-03-01 17:06, Felix Fietkau wrote: > On 2019-03-01 16:29, Ben Greear wrote: >> On 3/1/19 5:48 AM, Felix Fietkau wrote: >>> There are several scenarios in which mac80211 can call drv_wake_tx_queue >>> after ieee80211_restart_hw has been called and has not yet completed. >>> Driver private structs are considered uninitialized until mac80211 has >>> uploaded the vifs, stations and keys again, so using private tx queue >>> data during that time is not safe. >>> >>> The driver can also not rely on drv_reconfig_complete to figure out when >>> it is safe to accept drv_wake_tx_queue calls again, because it is only >>> called after all tx queues are woken again. >>> >>> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig >>> is set. >> >> This reminded me of a patch I posted back in 2016. The discussion just sort of >> ended on it, but curious if you have a new opinion on it after debugging the >> issue in this patch: >> >> https://patchwork.kernel.org/patch/9457709/ >> >> For what its worth, I've been using the patch above since I posted it, and >> it seems to work well for ath9k and ath10k. > I agree with what Johannes wrote about that patch. Fixing this could > likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all > interfaces before bringing any of them back up. That way the normal > interface add logic applies without nasty special cases. > The reconfig code checks for ieee80211_sdata_running (which is > unaffected), so I don't think we need to save the previous value of that > flag. How about this? (untested) --- --- a/net/mac80211/util.c +++ b/net/mac80211/util.c @@ -1956,7 +1956,6 @@ static void ieee80211_flush_completed_scan(struct ieee80211_local *local, static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local) { - struct ieee80211_sub_if_data *sdata; struct ieee80211_chanctx *ctx; /* @@ -1980,9 +1979,6 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local) */ ieee80211_sched_scan_end(local); - list_for_each_entry(sdata, &local->interfaces, list) - sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER; - /* Mark channel contexts as not being in the driver any more to avoid * removing them from the driver during the shutdown process... */ @@ -2135,6 +2131,9 @@ int ieee80211_reconfig(struct ieee80211_local *local) local->started = false; + list_for_each_entry(sdata, &local->interfaces, list) + sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER; + /* * Upon resume hardware can sometimes be goofy due to * various platform / driver / bus issues, so restarting
On 3/1/19 8:13 AM, Felix Fietkau wrote: > On 2019-03-01 17:06, Felix Fietkau wrote: >> On 2019-03-01 16:29, Ben Greear wrote: >>> On 3/1/19 5:48 AM, Felix Fietkau wrote: >>>> There are several scenarios in which mac80211 can call drv_wake_tx_queue >>>> after ieee80211_restart_hw has been called and has not yet completed. >>>> Driver private structs are considered uninitialized until mac80211 has >>>> uploaded the vifs, stations and keys again, so using private tx queue >>>> data during that time is not safe. >>>> >>>> The driver can also not rely on drv_reconfig_complete to figure out when >>>> it is safe to accept drv_wake_tx_queue calls again, because it is only >>>> called after all tx queues are woken again. >>>> >>>> To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig >>>> is set. >>> >>> This reminded me of a patch I posted back in 2016. The discussion just sort of >>> ended on it, but curious if you have a new opinion on it after debugging the >>> issue in this patch: >>> >>> https://patchwork.kernel.org/patch/9457709/ >>> >>> For what its worth, I've been using the patch above since I posted it, and >>> it seems to work well for ath9k and ath10k. >> I agree with what Johannes wrote about that patch. Fixing this could >> likely be as simple as clearing IEEE80211_SDATA_IN_DRIVER on all >> interfaces before bringing any of them back up. That way the normal >> interface add logic applies without nasty special cases. >> The reconfig code checks for ieee80211_sdata_running (which is >> unaffected), so I don't think we need to save the previous value of that >> flag. > How about this? (untested) > --- > --- a/net/mac80211/util.c > +++ b/net/mac80211/util.c > @@ -1956,7 +1956,6 @@ static void ieee80211_flush_completed_scan(struct ieee80211_local *local, > > static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local) > { > - struct ieee80211_sub_if_data *sdata; > struct ieee80211_chanctx *ctx; > > /* > @@ -1980,9 +1979,6 @@ static void ieee80211_handle_reconfig_failure(struct ieee80211_local *local) > */ > ieee80211_sched_scan_end(local); > > - list_for_each_entry(sdata, &local->interfaces, list) > - sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER; > - > /* Mark channel contexts as not being in the driver any more to avoid > * removing them from the driver during the shutdown process... > */ > @@ -2135,6 +2131,9 @@ int ieee80211_reconfig(struct ieee80211_local *local) > > local->started = false; > > + list_for_each_entry(sdata, &local->interfaces, list) > + sdata->flags &= ~IEEE80211_SDATA_IN_DRIVER; > + > /* > * Upon resume hardware can sometimes be goofy due to > * various platform / driver / bus issues, so restarting > Let me see if anyone (Johannes?) has comments on this... I can test it out a bit later if no one suggests changes. Thanks, Ben
diff --git a/net/mac80211/driver-ops.h b/net/mac80211/driver-ops.h index 28d022a3eee3..ae4f0be3b393 100644 --- a/net/mac80211/driver-ops.h +++ b/net/mac80211/driver-ops.h @@ -1195,6 +1195,9 @@ static inline void drv_wake_tx_queue(struct ieee80211_local *local, { struct ieee80211_sub_if_data *sdata = vif_to_sdata(txq->txq.vif); + if (local->in_reconfig) + return; + if (!check_sdata_in_driver(sdata)) return;
There are several scenarios in which mac80211 can call drv_wake_tx_queue after ieee80211_restart_hw has been called and has not yet completed. Driver private structs are considered uninitialized until mac80211 has uploaded the vifs, stations and keys again, so using private tx queue data during that time is not safe. The driver can also not rely on drv_reconfig_complete to figure out when it is safe to accept drv_wake_tx_queue calls again, because it is only called after all tx queues are woken again. To fix this, bail out early in drv_wake_tx_queue if local->in_reconfig is set. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> --- net/mac80211/driver-ops.h | 3 +++ 1 file changed, 3 insertions(+)