Message ID | 20181003140745.7650-1-baijiaju1990@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Kalle Valo |
Headers | show |
Series | net: wireless: iwlegacy: Fix possible data races in il4965_send_rxon_assoc() | expand |
On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote: > These possible races are detected by a runtime testing. > To fix these races, the mutex lock is used in il4965_send_rxon_assoc() > to protect the data. Really ? I'm surprised by that, see below. > @@ -1297,6 +1297,7 @@ il4965_send_rxon_assoc(struct il_priv *il) > const struct il_rxon_cmd *rxon1 = &il->staging; > const struct il_rxon_cmd *rxon2 = &il->active; > > + mutex_lock(&il->mutex); > if (rxon1->flags == rxon2->flags && For 4965 driver il4965_send_rxon_assoc() is only called by il_mac_bss_info_changed() and il4965_commit_rxon(). il_mac_bss_info_changed() acquire il->mutex and callers of il4965_commit_rxon() acquire il->mutex (but I did not check all of them). So I wonder how this patch did not cause the deadlock ? Anyway what can be done is adding: lockdep_assert_held(&il->mutex); il4965_commit_rxon() to check if we hold the mutex. Thanks Stanislaw
Thanks for your reply :) On 2018/10/4 15:59, Stanislaw Gruszka wrote: > On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote: >> These possible races are detected by a runtime testing. >> To fix these races, the mutex lock is used in il4965_send_rxon_assoc() >> to protect the data. > Really ? I'm surprised by that, see below. My runtime testing shows that il4965_send_rxon_assoc() and il4965_configure_filter() are concurrently executed. But after seeing your reply, I need to carefully check whether my runtime testing is right, because I think you are right. In fact, I only monitored the iwl4965 driver, but did not monitor the iwlegacy driver, so I will do the testing again with monitoring the lwlegacy driver. > >> @@ -1297,6 +1297,7 @@ il4965_send_rxon_assoc(struct il_priv *il) >> const struct il_rxon_cmd *rxon1 = &il->staging; >> const struct il_rxon_cmd *rxon2 = &il->active; >> >> + mutex_lock(&il->mutex); >> if (rxon1->flags == rxon2->flags && > For 4965 driver il4965_send_rxon_assoc() is only called by > il_mac_bss_info_changed() and il4965_commit_rxon(). > > il_mac_bss_info_changed() acquire il->mutex and > callers of il4965_commit_rxon() acquire il->mutex > (but I did not check all of them). > > So I wonder how this patch did not cause the deadlock ? Oh, sorry, anyway, my patch will cause double locks... > > Anyway what can be done is adding: > > lockdep_assert_held(&il->mutex); > > il4965_commit_rxon() to check if we hold the mutex. I agree. Best wishes, Jia-Ju Bai
On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote: > On 2018/10/4 15:59, Stanislaw Gruszka wrote: > >On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote: > >>These possible races are detected by a runtime testing. > >>To fix these races, the mutex lock is used in il4965_send_rxon_assoc() > >>to protect the data. > >Really ? I'm surprised by that, see below. > > My runtime testing shows that il4965_send_rxon_assoc() and > il4965_configure_filter() are concurrently executed. > But after seeing your reply, I need to carefully check whether my > runtime testing is right, because I think you are right. > In fact, I only monitored the iwl4965 driver, but did not monitor > the iwlegacy driver, so I will do the testing again with monitoring > the lwlegacy driver. <snip> > >So I wonder how this patch did not cause the deadlock ? > > Oh, sorry, anyway, my patch will cause double locks... So how those runtime test were performend such you didn't notice this ? > >Anyway what can be done is adding: > > > >lockdep_assert_held(&il->mutex); > > > >il4965_commit_rxon() to check if we hold the mutex. > > I agree. Care to post a patch ? Thanks Stanislaw
On 2018/10/5 15:54, Stanislaw Gruszka wrote: > On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote: >> On 2018/10/4 15:59, Stanislaw Gruszka wrote: >>> On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote: >>>> These possible races are detected by a runtime testing. >>>> To fix these races, the mutex lock is used in il4965_send_rxon_assoc() >>>> to protect the data. >>> Really ? I'm surprised by that, see below. >> My runtime testing shows that il4965_send_rxon_assoc() and >> il4965_configure_filter() are concurrently executed. >> But after seeing your reply, I need to carefully check whether my >> runtime testing is right, because I think you are right. >> In fact, I only monitored the iwl4965 driver, but did not monitor >> the iwlegacy driver, so I will do the testing again with monitoring >> the lwlegacy driver. > <snip> >>> So I wonder how this patch did not cause the deadlock ? >> Oh, sorry, anyway, my patch will cause double locks... > So how those runtime test were performend such you didn't > notice this ? I write a tool to perform runtime testing. This tool records the lock status during driver execution. Some calls to mutex_lock() are in common.c that I did not handle, so the corresponding lock status was not recorded by my tool, causing this false positive. Now I have handled common.c, and this false positive is not reported any more. Actually, I get several new reports. I will send you these reports to you later, and hope you can have a look, thanks in advance :) > >>> Anyway what can be done is adding: >>> >>> lockdep_assert_held(&il->mutex); >>> >>> il4965_commit_rxon() to check if we hold the mutex. >> I agree. > Care to post a patch ? Sure :) Best wishes, Jia-Ju Bai
diff --git a/drivers/net/wireless/intel/iwlegacy/4965.c b/drivers/net/wireless/intel/iwlegacy/4965.c index c3c638ed0ed7..45342777a5f1 100644 --- a/drivers/net/wireless/intel/iwlegacy/4965.c +++ b/drivers/net/wireless/intel/iwlegacy/4965.c @@ -1297,6 +1297,7 @@ il4965_send_rxon_assoc(struct il_priv *il) const struct il_rxon_cmd *rxon1 = &il->staging; const struct il_rxon_cmd *rxon2 = &il->active; + mutex_lock(&il->mutex); if (rxon1->flags == rxon2->flags && rxon1->filter_flags == rxon2->filter_flags && rxon1->cck_basic_rates == rxon2->cck_basic_rates && @@ -1307,6 +1308,7 @@ il4965_send_rxon_assoc(struct il_priv *il) rxon1->rx_chain == rxon2->rx_chain && rxon1->ofdm_basic_rates == rxon2->ofdm_basic_rates) { D_INFO("Using current RXON_ASSOC. Not resending.\n"); + mutex_unlock(&il->mutex); return 0; } @@ -1321,6 +1323,8 @@ il4965_send_rxon_assoc(struct il_priv *il) il->staging.ofdm_ht_dual_stream_basic_rates; rxon_assoc.rx_chain_select_flags = il->staging.rx_chain; + mutex_unlock(&il->mutex); + ret = il_send_cmd_pdu_async(il, C_RXON_ASSOC, sizeof(rxon_assoc), &rxon_assoc, NULL);
CPU0: il4965_configure_filter mutex_lock() line 6183: il->staging.filter_flags &= ... [WRITE] line 6184: il->staging.filter_flags |= ... [WRITE] CPU1: il4965_send_rxon_assoc line 1301: rxon1->filter_flags, rxon1->filter_flags [READ] line 1314: il->staging.filter_flags [READ] The WRITE operations in CPU0 are performed with holding a mutex lock, but the READ operations in CPU1 are performed without holding this lock, so there may exist data races. These possible races are detected by a runtime testing. To fix these races, the mutex lock is used in il4965_send_rxon_assoc() to protect the data. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> --- drivers/net/wireless/intel/iwlegacy/4965.c | 4 ++++ 1 file changed, 4 insertions(+)