diff mbox series

net: wireless: iwlegacy: Fix possible data races in il4965_send_rxon_assoc()

Message ID 20181003140745.7650-1-baijiaju1990@gmail.com (mailing list archive)
State Not Applicable
Delegated to: Kalle Valo
Headers show
Series net: wireless: iwlegacy: Fix possible data races in il4965_send_rxon_assoc() | expand

Commit Message

Jia-Ju Bai Oct. 3, 2018, 2:07 p.m. UTC
CPU0:
il4965_configure_filter
	mutex_lock()
	line 6183: il->staging.filter_flags &= ... [WRITE]
	line 6184: il->staging.filter_flags |= ... [WRITE]

CPU1:
il4965_send_rxon_assoc
	line 1301: rxon1->filter_flags, rxon1->filter_flags [READ]
	line 1314: il->staging.filter_flags [READ]

The WRITE operations in CPU0 are performed with holding a mutex lock, 
but the READ operations in CPU1 are performed without holding this lock, 
so there may exist data races. 

These possible races are detected by a runtime testing. 
To fix these races, the mutex lock is used in il4965_send_rxon_assoc() 
to protect the data.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
---
 drivers/net/wireless/intel/iwlegacy/4965.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Stanislaw Gruszka Oct. 4, 2018, 7:59 a.m. UTC | #1
On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
> These possible races are detected by a runtime testing. 
> To fix these races, the mutex lock is used in il4965_send_rxon_assoc() 
> to protect the data.

Really ? I'm surprised by that, see below.

> @@ -1297,6 +1297,7 @@ il4965_send_rxon_assoc(struct il_priv *il)
>  	const struct il_rxon_cmd *rxon1 = &il->staging;
>  	const struct il_rxon_cmd *rxon2 = &il->active;
>  
> +	mutex_lock(&il->mutex);
>  	if (rxon1->flags == rxon2->flags &&

For 4965 driver il4965_send_rxon_assoc() is only called by
il_mac_bss_info_changed() and il4965_commit_rxon().

il_mac_bss_info_changed() acquire il->mutex and
callers of  il4965_commit_rxon() acquire il->mutex
(but I did not check all of them).

So I wonder how this patch did not cause the deadlock ?
  
Anyway what can be done is adding:

lockdep_assert_held(&il->mutex);

il4965_commit_rxon() to check if we hold the mutex.

Thanks
Stanislaw
Jia-Ju Bai Oct. 4, 2018, 8:52 a.m. UTC | #2
Thanks for your reply :)


On 2018/10/4 15:59, Stanislaw Gruszka wrote:
> On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
>> These possible races are detected by a runtime testing.
>> To fix these races, the mutex lock is used in il4965_send_rxon_assoc()
>> to protect the data.
> Really ? I'm surprised by that, see below.

My runtime testing shows that il4965_send_rxon_assoc() and 
il4965_configure_filter() are concurrently executed.
But after seeing your reply, I need to carefully check whether my 
runtime testing is right, because I think you are right.
In fact, I only monitored the iwl4965 driver, but did not monitor the 
iwlegacy driver, so I will do the testing again with monitoring the 
lwlegacy driver.

>
>> @@ -1297,6 +1297,7 @@ il4965_send_rxon_assoc(struct il_priv *il)
>>   	const struct il_rxon_cmd *rxon1 = &il->staging;
>>   	const struct il_rxon_cmd *rxon2 = &il->active;
>>   
>> +	mutex_lock(&il->mutex);
>>   	if (rxon1->flags == rxon2->flags &&
> For 4965 driver il4965_send_rxon_assoc() is only called by
> il_mac_bss_info_changed() and il4965_commit_rxon().
>
> il_mac_bss_info_changed() acquire il->mutex and
> callers of  il4965_commit_rxon() acquire il->mutex
> (but I did not check all of them).
>
> So I wonder how this patch did not cause the deadlock ?

Oh, sorry, anyway, my patch will cause double locks...

>    
> Anyway what can be done is adding:
>
> lockdep_assert_held(&il->mutex);
>
> il4965_commit_rxon() to check if we hold the mutex.

I agree.


Best wishes,
Jia-Ju Bai
Stanislaw Gruszka Oct. 5, 2018, 7:54 a.m. UTC | #3
On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote:
> On 2018/10/4 15:59, Stanislaw Gruszka wrote:
> >On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
> >>These possible races are detected by a runtime testing.
> >>To fix these races, the mutex lock is used in il4965_send_rxon_assoc()
> >>to protect the data.
> >Really ? I'm surprised by that, see below.
> 
> My runtime testing shows that il4965_send_rxon_assoc() and
> il4965_configure_filter() are concurrently executed.
> But after seeing your reply, I need to carefully check whether my
> runtime testing is right, because I think you are right.
> In fact, I only monitored the iwl4965 driver, but did not monitor
> the iwlegacy driver, so I will do the testing again with monitoring
> the lwlegacy driver.
<snip>
> >So I wonder how this patch did not cause the deadlock ?
> 
> Oh, sorry, anyway, my patch will cause double locks...

So how those runtime test were performend such you didn't
notice this ?

> >Anyway what can be done is adding:
> >
> >lockdep_assert_held(&il->mutex);
> >
> >il4965_commit_rxon() to check if we hold the mutex.
> 
> I agree.

Care to post a patch ?

Thanks
Stanislaw
Jia-Ju Bai Oct. 5, 2018, 1:42 p.m. UTC | #4
On 2018/10/5 15:54, Stanislaw Gruszka wrote:
> On Thu, Oct 04, 2018 at 04:52:19PM +0800, Jia-Ju Bai wrote:
>> On 2018/10/4 15:59, Stanislaw Gruszka wrote:
>>> On Wed, Oct 03, 2018 at 10:07:45PM +0800, Jia-Ju Bai wrote:
>>>> These possible races are detected by a runtime testing.
>>>> To fix these races, the mutex lock is used in il4965_send_rxon_assoc()
>>>> to protect the data.
>>> Really ? I'm surprised by that, see below.
>> My runtime testing shows that il4965_send_rxon_assoc() and
>> il4965_configure_filter() are concurrently executed.
>> But after seeing your reply, I need to carefully check whether my
>> runtime testing is right, because I think you are right.
>> In fact, I only monitored the iwl4965 driver, but did not monitor
>> the iwlegacy driver, so I will do the testing again with monitoring
>> the lwlegacy driver.
> <snip>
>>> So I wonder how this patch did not cause the deadlock ?
>> Oh, sorry, anyway, my patch will cause double locks...
> So how those runtime test were performend such you didn't
> notice this ?

I write a tool to perform runtime testing.
This tool records the lock status during driver execution.
Some calls to mutex_lock() are in common.c that I did not handle, so the 
corresponding lock status was not recorded by my tool, causing this 
false positive.

Now I have handled common.c, and this false positive is not reported any 
more.
Actually, I get several new reports.
I will send you these reports to you later, and hope you can have a 
look, thanks in advance :)

>
>>> Anyway what can be done is adding:
>>>
>>> lockdep_assert_held(&il->mutex);
>>>
>>> il4965_commit_rxon() to check if we hold the mutex.
>> I agree.
> Care to post a patch ?

Sure :)


Best wishes,
Jia-Ju Bai
diff mbox series

Patch

diff --git a/drivers/net/wireless/intel/iwlegacy/4965.c b/drivers/net/wireless/intel/iwlegacy/4965.c
index c3c638ed0ed7..45342777a5f1 100644
--- a/drivers/net/wireless/intel/iwlegacy/4965.c
+++ b/drivers/net/wireless/intel/iwlegacy/4965.c
@@ -1297,6 +1297,7 @@  il4965_send_rxon_assoc(struct il_priv *il)
 	const struct il_rxon_cmd *rxon1 = &il->staging;
 	const struct il_rxon_cmd *rxon2 = &il->active;
 
+	mutex_lock(&il->mutex);
 	if (rxon1->flags == rxon2->flags &&
 	    rxon1->filter_flags == rxon2->filter_flags &&
 	    rxon1->cck_basic_rates == rxon2->cck_basic_rates &&
@@ -1307,6 +1308,7 @@  il4965_send_rxon_assoc(struct il_priv *il)
 	    rxon1->rx_chain == rxon2->rx_chain &&
 	    rxon1->ofdm_basic_rates == rxon2->ofdm_basic_rates) {
 		D_INFO("Using current RXON_ASSOC.  Not resending.\n");
+		mutex_unlock(&il->mutex);
 		return 0;
 	}
 
@@ -1321,6 +1323,8 @@  il4965_send_rxon_assoc(struct il_priv *il)
 	    il->staging.ofdm_ht_dual_stream_basic_rates;
 	rxon_assoc.rx_chain_select_flags = il->staging.rx_chain;
 
+	mutex_unlock(&il->mutex);
+
 	ret =
 	    il_send_cmd_pdu_async(il, C_RXON_ASSOC, sizeof(rxon_assoc),
 				  &rxon_assoc, NULL);