diff mbox series

cfg80211: Save the regulatory domain with a lock

Message ID iwlwifi.20210105165657.613e9a876829.Ia38d27dbebea28bf9c56d70691d243186ede70e7@changeid (mailing list archive)
State Accepted
Delegated to: Johannes Berg
Headers show
Series cfg80211: Save the regulatory domain with a lock | expand

Commit Message

Luca Coelho Jan. 5, 2021, 2:56 p.m. UTC
From: Ilan Peer <ilan.peer@intel.com>

Saving the regulatory domain while setting custom regulatory domain
was done while accessing a RCU protected pointer but without any
protection.

Fix this by using RTNL while accessing the pointer.

Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Reported-by: syzbot+27771d4abcd9b7a1f5d3@syzkaller.appspotmail.com
Reported-by: syzbot+db4035751c56c0079282@syzkaller.appspotmail.com
Reported-by: Hans de Goede <hdegoede@redhat.com>
Fixes: beee24695157 ("cfg80211: Save the regulatory domain when setting custom regulatory")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
---
 net/wireless/reg.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Hans de Goede Jan. 23, 2021, 9:28 p.m. UTC | #1
Hi,

On 1/5/21 3:56 PM, Luca Coelho wrote:
> From: Ilan Peer <ilan.peer@intel.com>
> 
> Saving the regulatory domain while setting custom regulatory domain
> was done while accessing a RCU protected pointer but without any
> protection.
> 
> Fix this by using RTNL while accessing the pointer.
> 
> Signed-off-by: Ilan Peer <ilan.peer@intel.com>
> Reported-by: syzbot+27771d4abcd9b7a1f5d3@syzkaller.appspotmail.com
> Reported-by: syzbot+db4035751c56c0079282@syzkaller.appspotmail.com
> Reported-by: Hans de Goede <hdegoede@redhat.com>

So I'm afraid that I have some bad news about this patch, it fixes
the RCU warning which I reported:

https://lore.kernel.org/linux-wireless/20210104170713.66956-1-hdegoede@redhat.com/

But it introduces a deadlock. I hit this while testing 5.11-rc4 + this patch
on top on a device with a RTL8723BS wifi chip. Granted the driver for that
chipset comes from staging. But I seriously wonder if that is the only driver to
hit this deadlock:

[   63.231667] rtl8723bs: acquire FW from file:rtlwifi/rtl8723bs_nic.bin

[   65.206585] ============================================
[   65.206602] WARNING: possible recursive locking detected
[   65.206620] 5.11.0-rc4+ #210 Tainted: G         C  E    
[   65.206638] --------------------------------------------
[   65.206651] NetworkManager/988 is trying to acquire lock:
[   65.206668] ffffffff8dbc3188 (rtnl_mutex){+.+.}-{3:3}, at: wiphy_apply_custom_regulatory+0x9d/0x2b0 [cfg80211]
[   65.206867] 
               but task is already holding lock:
[   65.206880] ffffffff8dbc3188 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x142/0x480
[   65.206945] 
               other info that might help us debug this:
[   65.206960]  Possible unsafe locking scenario:

[   65.206974]        CPU0
[   65.206986]        ----
[   65.206997]   lock(rtnl_mutex);
[   65.207021]   lock(rtnl_mutex);
[   65.207044] 
                *** DEADLOCK ***

[   65.207055]  May be due to missing lock nesting notation

[   65.207067] 2 locks held by NetworkManager/988:
[   65.207085]  #0: ffffffff8dbc3188 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x142/0x480
[   65.207130]  #1: ffff8c27825161f0 (&pdvobj->hw_init_mutex){+.+.}-{3:3}, at: netdev_open+0x39/0x70 [r8723bs]
[   65.207230] 
               stack backtrace:
[   65.207237] CPU: 3 PID: 988 Comm: NetworkManager Tainted: G         C  E     5.11.0-rc4+ #210
[   65.207247] Hardware name: Jumper EZpad/EZpad, BIOS Jumper12x.WJ2012.bsBKRCP05 04/28/2018
[   65.207254] Call Trace:
[   65.207271]  dump_stack+0x8b/0xb0
[   65.207287]  __lock_acquire.cold+0x159/0x2ab
[   65.207303]  ? __lock_acquire+0x382/0x1e10
[   65.207319]  lock_acquire+0x116/0x370
[   65.207337]  ? wiphy_apply_custom_regulatory+0x9d/0x2b0 [cfg80211]
[   65.207409]  __mutex_lock+0x7e/0x7a0
[   65.207425]  ? wiphy_apply_custom_regulatory+0x9d/0x2b0 [cfg80211]
[   65.207489]  ? wiphy_apply_custom_regulatory+0x9d/0x2b0 [cfg80211]
[   65.207556]  ? rcu_read_lock_sched_held+0x3f/0x80
[   65.207571]  ? __kmalloc+0x2de/0x310
[   65.207592]  wiphy_apply_custom_regulatory+0x9d/0x2b0 [cfg80211]
[   65.207664]  rtw_regd_init+0x32/0x40 [r8723bs]
[   65.207755]  rtw_cfg80211_init_wiphy+0x79/0xb0 [r8723bs]
[   65.207848]  _netdev_open+0x8b/0x1b0 [r8723bs]
[   65.207939]  netdev_open+0x45/0x70 [r8723bs]
[   65.208030]  __dev_open+0xd4/0x1a0
[   65.208047]  __dev_change_flags+0x1cb/0x240
[   65.208066]  dev_change_flags+0x21/0x60
[   65.208081]  do_setlink+0x251/0x10b0
[   65.208096]  ? __snmp6_fill_stats64.constprop.0+0x53/0xe0
[   65.208113]  ? __lock_acquire+0x382/0x1e10
[   65.208126]  ? __nla_validate_parse+0x4f/0xb20
[   65.208150]  __rtnl_newlink+0x60c/0x980
[   65.208201]  ? lock_acquire+0x116/0x370
[   65.208226]  ? sock_def_readable+0x5/0x2a0
[   65.208256]  ? find_held_lock+0x2b/0x80
[   65.208283]  ? sock_def_readable+0xb0/0x2a0
[   65.208315]  ? netlink_unicast+0x1f7/0x230
[   65.208348]  ? trace_event_raw_event_lock_acquire+0x100/0x100
[   65.208371]  ? __bfs+0xe0/0x210
[   65.208445]  ? rcu_read_lock_sched_held+0x3f/0x80
[   65.208457]  ? kmem_cache_alloc_trace+0x292/0x2c0
[   65.208473]  rtnl_newlink+0x44/0x70
[   65.208488]  rtnetlink_rcv_msg+0x16e/0x480
[   65.208502]  ? netlink_deliver_tap+0x95/0x3e0
[   65.208519]  ? rtnetlink_put_metrics+0x1c0/0x1c0
[   65.208533]  netlink_rcv_skb+0x50/0xf0
[   65.208553]  netlink_unicast+0x16d/0x230
[   65.208569]  netlink_sendmsg+0x23f/0x460
[   65.208589]  sock_sendmsg+0x5e/0x60
[   65.208605]  ____sys_sendmsg+0x231/0x270
[   65.208620]  ? import_iovec+0x17/0x20
[   65.208633]  ? sendmsg_copy_msghdr+0x5c/0x80
[   65.208649]  ___sys_sendmsg+0x75/0xb0
[   65.208670]  ? find_held_lock+0x2b/0x80
[   65.208683]  ? __fget_files+0xd0/0x1a0
[   65.208700]  ? __fget_files+0xef/0x1a0
[   65.208715]  __sys_sendmsg+0x49/0x80
[   65.208744]  ? syscall_enter_from_user_mode+0x27/0x80
[   65.208776]  do_syscall_64+0x33/0x40
[   65.208808]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   65.208836] RIP: 0033:0x7f846e33b80d
[   65.208865] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ea ec ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 48 89 44 24 08 e8 1e ed ff ff 48
[   65.208890] RSP: 002b:00007ffd17e602d0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[   65.208920] RAX: ffffffffffffffda RBX: 000055fa8da219c0 RCX: 00007f846e33b80d
[   65.208939] RDX: 0000000000000000 RSI: 00007ffd17e60320 RDI: 000000000000000d
[   65.208957] RBP: 00007ffd17e60320 R08: 0000000000000000 R09: 0000000000000000
[   65.208974] R10: 0000000000000000 R11: 0000000000000293 R12: 000055fa8da219c0
[   65.208992] R13: 00007ffd17e604d8 R14: 00007ffd17e604cc R15: 0000000000000000

Upon a quick check I could not figure out the exact cause of this so
I'm going to drop the "cfg80211: Save the regulatory domain with a lock"
patch from my local kernels for now.

Note we really should fix this new deadlock before 5.11 is released. This is
worse then the RCU warning which this patch fixes.

Regards,

Hans





> Fixes: beee24695157 ("cfg80211: Save the regulatory domain when setting custom regulatory")
> Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
> ---
>  net/wireless/reg.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/net/wireless/reg.c b/net/wireless/reg.c
> index bb72447ad960..8114bba8556c 100644
> --- a/net/wireless/reg.c
> +++ b/net/wireless/reg.c
> @@ -5,7 +5,7 @@
>   * Copyright 2008-2011	Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
>   * Copyright 2013-2014  Intel Mobile Communications GmbH
>   * Copyright      2017  Intel Deutschland GmbH
> - * Copyright (C) 2018 - 2019 Intel Corporation
> + * Copyright (C) 2018 - 2021 Intel Corporation
>   *
>   * Permission to use, copy, modify, and/or distribute this software for any
>   * purpose with or without fee is hereby granted, provided that the above
> @@ -139,6 +139,11 @@ static const struct ieee80211_regdomain *get_cfg80211_regdom(void)
>  	return rcu_dereference_rtnl(cfg80211_regdomain);
>  }
>  
> +/*
> + * Returns the regulatory domain associated with the wiphy.
> + *
> + * Requires either RTNL or RCU protection
> + */
>  const struct ieee80211_regdomain *get_wiphy_regdom(struct wiphy *wiphy)
>  {
>  	return rcu_dereference_rtnl(wiphy->regd);
> @@ -2571,9 +2576,13 @@ void wiphy_apply_custom_regulatory(struct wiphy *wiphy,
>  	if (IS_ERR(new_regd))
>  		return;
>  
> +	rtnl_lock();
> +
>  	tmp = get_wiphy_regdom(wiphy);
>  	rcu_assign_pointer(wiphy->regd, new_regd);
>  	rcu_free_regdom(tmp);
> +
> +	rtnl_unlock();
>  }
>  EXPORT_SYMBOL(wiphy_apply_custom_regulatory);
>  
>
Johannes Berg Jan. 25, 2021, 9:43 a.m. UTC | #2
Hi Hans,

> So I'm afraid that I have some bad news about this patch, it fixes
> the RCU warning which I reported:

Uh. Just spoke with Ilan and we realized this was the staging driver,
doing things the wrong way.

Could you test this?

https://p.sipsolutions.net/235c352b8ae5db88.txt

johannes
diff mbox series

Patch

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index bb72447ad960..8114bba8556c 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -5,7 +5,7 @@ 
  * Copyright 2008-2011	Luis R. Rodriguez <mcgrof@qca.qualcomm.com>
  * Copyright 2013-2014  Intel Mobile Communications GmbH
  * Copyright      2017  Intel Deutschland GmbH
- * Copyright (C) 2018 - 2019 Intel Corporation
+ * Copyright (C) 2018 - 2021 Intel Corporation
  *
  * Permission to use, copy, modify, and/or distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -139,6 +139,11 @@  static const struct ieee80211_regdomain *get_cfg80211_regdom(void)
 	return rcu_dereference_rtnl(cfg80211_regdomain);
 }
 
+/*
+ * Returns the regulatory domain associated with the wiphy.
+ *
+ * Requires either RTNL or RCU protection
+ */
 const struct ieee80211_regdomain *get_wiphy_regdom(struct wiphy *wiphy)
 {
 	return rcu_dereference_rtnl(wiphy->regd);
@@ -2571,9 +2576,13 @@  void wiphy_apply_custom_regulatory(struct wiphy *wiphy,
 	if (IS_ERR(new_regd))
 		return;
 
+	rtnl_lock();
+
 	tmp = get_wiphy_regdom(wiphy);
 	rcu_assign_pointer(wiphy->regd, new_regd);
 	rcu_free_regdom(tmp);
+
+	rtnl_unlock();
 }
 EXPORT_SYMBOL(wiphy_apply_custom_regulatory);