diff mbox series

cfg80211: fix race in netlink owner interface destruction

Message ID 20220201130951.22093-1-johannes@sipsolutions.net (mailing list archive)
State Accepted
Delegated to: Johannes Berg
Headers show
Series cfg80211: fix race in netlink owner interface destruction | expand

Commit Message

Johannes Berg Feb. 1, 2022, 1:09 p.m. UTC
From: Johannes Berg <johannes.berg@intel.com>

My previous fix here to fix the deadlock left a race where
the exact same deadlock (see the original commit referenced
below) can still happen if cfg80211_destroy_ifaces() already
runs while nl80211_netlink_notify() is still marking some
interfaces as nl_owner_dead.

The race happens because we have two loops here - first we
dev_close() all the netdevs, and then we destroy them. If we
also have two netdevs (first one need only be a wdev though)
then we can find one during the first iteration, close it,
and go to the second iteration -- but then find two, and try
to destroy also the one we didn't close yet.

Fix this by only iterating once.

Change-Id: Ie56cd0ef3f0d2108bb8a25c8bb5efced15e6a909
Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Fixes: ea6b2098dd02 ("cfg80211: fix locking in netlink owner interface destruction")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
 net/wireless/core.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

Comments

Toke Høiland-Jørgensen Feb. 1, 2022, 2:03 p.m. UTC | #1
Johannes Berg <johannes@sipsolutions.net> writes:

> From: Johannes Berg <johannes.berg@intel.com>
>
> My previous fix here to fix the deadlock left a race where
> the exact same deadlock (see the original commit referenced
> below) can still happen if cfg80211_destroy_ifaces() already
> runs while nl80211_netlink_notify() is still marking some
> interfaces as nl_owner_dead.
>
> The race happens because we have two loops here - first we
> dev_close() all the netdevs, and then we destroy them. If we
> also have two netdevs (first one need only be a wdev though)
> then we can find one during the first iteration, close it,
> and go to the second iteration -- but then find two, and try
> to destroy also the one we didn't close yet.
>
> Fix this by only iterating once.
>
> Change-Id: Ie56cd0ef3f0d2108bb8a25c8bb5efced15e6a909
> Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Fixes: ea6b2098dd02 ("cfg80211: fix locking in netlink owner interface destruction")
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Looks like this fixed the crash I was seeing - thanks!

Tested-by: Toke Høiland-Jørgensen <toke@redhat.com>
Kalle Valo Feb. 1, 2022, 2:35 p.m. UTC | #2
Johannes Berg <johannes@sipsolutions.net> writes:

> From: Johannes Berg <johannes.berg@intel.com>
>
> My previous fix here to fix the deadlock left a race where
> the exact same deadlock (see the original commit referenced
> below) can still happen if cfg80211_destroy_ifaces() already
> runs while nl80211_netlink_notify() is still marking some
> interfaces as nl_owner_dead.
>
> The race happens because we have two loops here - first we
> dev_close() all the netdevs, and then we destroy them. If we
> also have two netdevs (first one need only be a wdev though)
> then we can find one during the first iteration, close it,
> and go to the second iteration -- but then find two, and try
> to destroy also the one we didn't close yet.
>
> Fix this by only iterating once.
>
> Change-Id: Ie56cd0ef3f0d2108bb8a25c8bb5efced15e6a909

You forgot the Change-Id.
Johannes Berg Feb. 1, 2022, 2:36 p.m. UTC | #3
On Tue, 2022-02-01 at 16:35 +0200, Kalle Valo wrote:
> Johannes Berg <johannes@sipsolutions.net> writes:
> 
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > My previous fix here to fix the deadlock left a race where
> > the exact same deadlock (see the original commit referenced
> > below) can still happen if cfg80211_destroy_ifaces() already
> > runs while nl80211_netlink_notify() is still marking some
> > interfaces as nl_owner_dead.
> > 
> > The race happens because we have two loops here - first we
> > dev_close() all the netdevs, and then we destroy them. If we
> > also have two netdevs (first one need only be a wdev though)
> > then we can find one during the first iteration, close it,
> > and go to the second iteration -- but then find two, and try
> > to destroy also the one we didn't close yet.
> > 
> > Fix this by only iterating once.
> > 
> > Change-Id: Ie56cd0ef3f0d2108bb8a25c8bb5efced15e6a909
> 
> You forgot the Change-Id.
> 
Wtf, sorry, guess I sent it from a place with the wrong script. I'll
strip it when I apply it, I guess

johannes
diff mbox series

Patch

diff --git a/net/wireless/core.c b/net/wireless/core.c
index ff74549b1022..d151a433388c 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -5,7 +5,7 @@ 
  * Copyright 2006-2010		Johannes Berg <johannes@sipsolutions.net>
  * Copyright 2013-2014  Intel Mobile Communications GmbH
  * Copyright 2015-2017	Intel Deutschland GmbH
- * Copyright (C) 2018-2021 Intel Corporation
+ * Copyright (C) 2018-2022 Intel Corporation
  */
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
@@ -332,29 +332,20 @@  static void cfg80211_event_work(struct work_struct *work)
 void cfg80211_destroy_ifaces(struct cfg80211_registered_device *rdev)
 {
 	struct wireless_dev *wdev, *tmp;
-	bool found = false;
 
 	ASSERT_RTNL();
 
-	list_for_each_entry(wdev, &rdev->wiphy.wdev_list, list) {
+	list_for_each_entry_safe(wdev, tmp, &rdev->wiphy.wdev_list, list) {
 		if (wdev->nl_owner_dead) {
 			if (wdev->netdev)
 				dev_close(wdev->netdev);
-			found = true;
-		}
-	}
-
-	if (!found)
-		return;
 
-	wiphy_lock(&rdev->wiphy);
-	list_for_each_entry_safe(wdev, tmp, &rdev->wiphy.wdev_list, list) {
-		if (wdev->nl_owner_dead) {
+			wiphy_lock(&rdev->wiphy);
 			cfg80211_leave(rdev, wdev);
 			rdev_del_virtual_intf(rdev, wdev);
+			wiphy_unlock(&rdev->wiphy);
 		}
 	}
-	wiphy_unlock(&rdev->wiphy);
 }
 
 static void cfg80211_destroy_iface_wk(struct work_struct *work)