diff mbox

cfg80211: fix deadlock during reg chan check

Message ID 1419847199-25493-1-git-send-email-arik@wizery.com (mailing list archive)
State Accepted
Delegated to: Johannes Berg
Headers show

Commit Message

Arik Nemtsov Dec. 29, 2014, 9:59 a.m. UTC
If a P2P GO is active, the cfg80211_reg_can_beacon function will take
the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
is already taken by the parent channel-checking function, causing a
deadlock.
Split the checking code into two parts. The first part will check if the
wdev is active and saves the channel under the wdev lock. The second part
will check actual channel validity according to type.

Signed-off-by: Arik Nemtsov <arikx.nemtsov@intel.com>
Reviewed-by: Ilan Peer <ilan.peer@intel.com>
Reviewed-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
---
Requires the patch "cfg80211: correctly check ad-hoc channels" to be applied
first.

 net/wireless/reg.c | 56 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 22 deletions(-)

Comments

Johannes Berg Jan. 6, 2015, 10:51 a.m. UTC | #1
On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
> is already taken by the parent channel-checking function, causing a
> deadlock.
> Split the checking code into two parts. The first part will check if the
> wdev is active and saves the channel under the wdev lock. The second part
> will check actual channel validity according to type.

I'm not convinced this is the right thing to do. When checking for the
current wdev that it can use a channel, then it seems that it's own
current BSS connection (if any) shouldn't actually be taken into account
- ergo the lock shouldn't have to be taken, that interface should be
excluded from the "can beacon due to concurrent check" anyway.

Also, the only reason this can happen anyway is when you call "can
beacon" for a station interface - which seems nonsensical. Given that
this is now really becoming far more complex than originally designed
("can beacon") with TDLS ("can IR") perhaps this should be
 * renamed to reg_can_ir()
 * passed an "exclude wdev"
 * (and use mutex_lock_nested with a big comment to explain to lockdep
what's
   going on)

or so.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arik Nemtsov Jan. 7, 2015, 1:34 p.m. UTC | #2
On Tue, Jan 6, 2015 at 12:51 PM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
>> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
>> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
>> is already taken by the parent channel-checking function, causing a
>> deadlock.
>> Split the checking code into two parts. The first part will check if the
>> wdev is active and saves the channel under the wdev lock. The second part
>> will check actual channel validity according to type.
>
> I'm not convinced this is the right thing to do. When checking for the
> current wdev that it can use a channel, then it seems that it's own
> current BSS connection (if any) shouldn't actually be taken into account
> - ergo the lock shouldn't have to be taken, that interface should be
> excluded from the "can beacon due to concurrent check" anyway.

We have a couple of checks we want to add in the pipeline that also
need "this" wdev in the concurrent check, so I'd prefer to avoid this.

Unless we treat the exclude_wdev as "already locked wdev", which I
think is unglier than what I do here.

>
> Also, the only reason this can happen anyway is when you call "can
> beacon" for a station interface - which seems nonsensical. Given that

This is not true. This happens with current code for a p2p-go
interface during channel validity checks in reg.c.

Arik
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Jan. 7, 2015, 1:37 p.m. UTC | #3
On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:

> > I'm not convinced this is the right thing to do. When checking for the
> > current wdev that it can use a channel, then it seems that it's own
> > current BSS connection (if any) shouldn't actually be taken into account
> > - ergo the lock shouldn't have to be taken, that interface should be
> > excluded from the "can beacon due to concurrent check" anyway.
> 
> We have a couple of checks we want to add in the pipeline that also
> need "this" wdev in the concurrent check, so I'd prefer to avoid this.

Why would you need to check "this" wdev when doing something for "this"
wdev? Seems odd? But I'm willing to learn :)

> Unless we treat the exclude_wdev as "already locked wdev", which I
> think is unglier than what I do here.

Yeah that doesn't seem right, agree.

> > Also, the only reason this can happen anyway is when you call "can
> > beacon" for a station interface - which seems nonsensical. Given that
> 
> This is not true. This happens with current code for a p2p-go
> interface during channel validity checks in reg.c.

Not sure I see this? The only thing doing wdev locking is
cfg80211_go_permissive_chan(), no? And that only for station interfaces.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arik Nemtsov Jan. 7, 2015, 1:42 p.m. UTC | #4
On Wed, Jan 7, 2015 at 3:37 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:
>
>> > I'm not convinced this is the right thing to do. When checking for the
>> > current wdev that it can use a channel, then it seems that it's own
>> > current BSS connection (if any) shouldn't actually be taken into account
>> > - ergo the lock shouldn't have to be taken, that interface should be
>> > excluded from the "can beacon due to concurrent check" anyway.
>>
>> We have a couple of checks we want to add in the pipeline that also
>> need "this" wdev in the concurrent check, so I'd prefer to avoid this.
>
> Why would you need to check "this" wdev when doing something for "this"
> wdev? Seems odd? But I'm willing to learn :)

There's some convoluted regulatory logic where if this GO (or any
other) are operating on this GO_CONCURRENT (and not indoor-only)
channel, then it may continue in its operation even after the STA that
operated concurrently has disconnected.

>
>> > Also, the only reason this can happen anyway is when you call "can
>> > beacon" for a station interface - which seems nonsensical. Given that
>>
>> This is not true. This happens with current code for a p2p-go
>> interface during channel validity checks in reg.c.
>
> Not sure I see this? The only thing doing wdev locking is
> cfg80211_go_permissive_chan(), no? And that only for station interfaces.

cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
currently only for GO interfaces, but for STA also in the future
(hopefully).
The latter is called during channel validity checks for GO.

Arik
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Jan. 7, 2015, 1:46 p.m. UTC | #5
On Wed, 2015-01-07 at 15:42 +0200, Arik Nemtsov wrote:
> On Wed, Jan 7, 2015 at 3:37 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> > On Wed, 2015-01-07 at 15:34 +0200, Arik Nemtsov wrote:
> >
> >> > I'm not convinced this is the right thing to do. When checking for the
> >> > current wdev that it can use a channel, then it seems that it's own
> >> > current BSS connection (if any) shouldn't actually be taken into account
> >> > - ergo the lock shouldn't have to be taken, that interface should be
> >> > excluded from the "can beacon due to concurrent check" anyway.
> >>
> >> We have a couple of checks we want to add in the pipeline that also
> >> need "this" wdev in the concurrent check, so I'd prefer to avoid this.
> >
> > Why would you need to check "this" wdev when doing something for "this"
> > wdev? Seems odd? But I'm willing to learn :)
> 
> There's some convoluted regulatory logic where if this GO (or any
> other) are operating on this GO_CONCURRENT (and not indoor-only)
> channel, then it may continue in its operation even after the STA that
> operated concurrently has disconnected.

Uh, ok, not sure I have that yet...

> >
> >> > Also, the only reason this can happen anyway is when you call "can
> >> > beacon" for a station interface - which seems nonsensical. Given that
> >>
> >> This is not true. This happens with current code for a p2p-go
> >> interface during channel validity checks in reg.c.
> >
> > Not sure I see this? The only thing doing wdev locking is
> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
> 
> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
> currently only for GO interfaces, but for STA also in the future
> (hopefully).
> The latter is called during channel validity checks for GO.

Ok.

Should I just apply the patch as it is then?

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arik Nemtsov Jan. 7, 2015, 1:48 p.m. UTC | #6
>
>> >
>> >> > Also, the only reason this can happen anyway is when you call "can
>> >> > beacon" for a station interface - which seems nonsensical. Given that
>> >>
>> >> This is not true. This happens with current code for a p2p-go
>> >> interface during channel validity checks in reg.c.
>> >
>> > Not sure I see this? The only thing doing wdev locking is
>> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
>>
>> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
>> currently only for GO interfaces, but for STA also in the future
>> (hopefully).
>> The latter is called during channel validity checks for GO.
>
> Ok.
>
> Should I just apply the patch as it is then?

It fixes a real existing deadlock, so I think so, yea.

Arik
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Jan. 7, 2015, 1:50 p.m. UTC | #7
On Wed, 2015-01-07 at 15:48 +0200, Arik Nemtsov wrote:
> >
> >> >
> >> >> > Also, the only reason this can happen anyway is when you call "can
> >> >> > beacon" for a station interface - which seems nonsensical. Given that
> >> >>
> >> >> This is not true. This happens with current code for a p2p-go
> >> >> interface during channel validity checks in reg.c.
> >> >
> >> > Not sure I see this? The only thing doing wdev locking is
> >> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
> >>
> >> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
> >> currently only for GO interfaces, but for STA also in the future
> >> (hopefully).
> >> The latter is called during channel validity checks for GO.
> >
> > Ok.
> >
> > Should I just apply the patch as it is then?
> 
> It fixes a real existing deadlock, so I think so, yea.

Is it needed on 3.19?

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arik Nemtsov Jan. 7, 2015, 1:52 p.m. UTC | #8
On Wed, Jan 7, 2015 at 3:50 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> On Wed, 2015-01-07 at 15:48 +0200, Arik Nemtsov wrote:
>> >
>> >> >
>> >> >> > Also, the only reason this can happen anyway is when you call "can
>> >> >> > beacon" for a station interface - which seems nonsensical. Given that
>> >> >>
>> >> >> This is not true. This happens with current code for a p2p-go
>> >> >> interface during channel validity checks in reg.c.
>> >> >
>> >> > Not sure I see this? The only thing doing wdev locking is
>> >> > cfg80211_go_permissive_chan(), no? And that only for station interfaces.
>> >>
>> >> cfg80211_go_permissive_chan is called from cfg80211_reg_can_beacon,
>> >> currently only for GO interfaces, but for STA also in the future
>> >> (hopefully).
>> >> The latter is called during channel validity checks for GO.
>> >
>> > Ok.
>> >
>> > Should I just apply the patch as it is then?
>>
>> It fixes a real existing deadlock, so I think so, yea.
>
> Is it needed on 3.19?

Yes, since the channel validity checking is already there. Basically
everyone that sets up a GO and has some regulatory change afterwards
might deadlock..

Arik
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Jan. 7, 2015, 1:54 p.m. UTC | #9
On Mon, 2014-12-29 at 11:59 +0200, Arik Nemtsov wrote:
> If a P2P GO is active, the cfg80211_reg_can_beacon function will take
> the wdev lock, in its call to cfg80211_go_permissive_chan. But the wdev lock
> is already taken by the parent channel-checking function, causing a
> deadlock.
> Split the checking code into two parts. The first part will check if the
> wdev is active and saves the channel under the wdev lock. The second part
> will check actual channel validity according to type.

Applied to mac80211.git.

johannes

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index 978a5fd..fde4e17 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -1533,45 +1533,40 @@  static void reg_call_notifier(struct wiphy *wiphy,
 
 static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
 {
-	struct ieee80211_channel *ch;
 	struct cfg80211_chan_def chandef;
 	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wiphy);
-	bool ret = true;
+	enum nl80211_iftype iftype;
 
 	wdev_lock(wdev);
+	iftype = wdev->iftype;
 
+	/* make sure the interface is active */
 	if (!wdev->netdev || !netif_running(wdev->netdev))
-		goto out;
+		goto wdev_inactive_unlock;
 
-	switch (wdev->iftype) {
+	switch (iftype) {
 	case NL80211_IFTYPE_AP:
 	case NL80211_IFTYPE_P2P_GO:
 		if (!wdev->beacon_interval)
-			goto out;
-
-		ret = cfg80211_reg_can_beacon(wiphy,
-					      &wdev->chandef, wdev->iftype);
+			goto wdev_inactive_unlock;
+		chandef = wdev->chandef;
 		break;
 	case NL80211_IFTYPE_ADHOC:
 		if (!wdev->ssid_len)
-			goto out;
-
-		ret = cfg80211_reg_can_beacon(wiphy,
-					      &wdev->chandef, wdev->iftype);
+			goto wdev_inactive_unlock;
+		chandef = wdev->chandef;
 		break;
 	case NL80211_IFTYPE_STATION:
 	case NL80211_IFTYPE_P2P_CLIENT:
 		if (!wdev->current_bss ||
 		    !wdev->current_bss->pub.channel)
-			goto out;
+			goto wdev_inactive_unlock;
 
-		ch = wdev->current_bss->pub.channel;
-		if (rdev->ops->get_channel &&
-		    !rdev_get_channel(rdev, wdev, &chandef))
-			ret = cfg80211_chandef_usable(wiphy, &chandef,
-						      IEEE80211_CHAN_DISABLED);
-		else
-			ret = !(ch->flags & IEEE80211_CHAN_DISABLED);
+		if (!rdev->ops->get_channel ||
+		    rdev_get_channel(rdev, wdev, &chandef))
+			cfg80211_chandef_create(&chandef,
+						wdev->current_bss->pub.channel,
+						NL80211_CHAN_NO_HT);
 		break;
 	case NL80211_IFTYPE_MONITOR:
 	case NL80211_IFTYPE_AP_VLAN:
@@ -1584,9 +1579,26 @@  static bool reg_wdev_chan_valid(struct wiphy *wiphy, struct wireless_dev *wdev)
 		break;
 	}
 
-out:
 	wdev_unlock(wdev);
-	return ret;
+
+	switch (iftype) {
+	case NL80211_IFTYPE_AP:
+	case NL80211_IFTYPE_P2P_GO:
+	case NL80211_IFTYPE_ADHOC:
+		return cfg80211_reg_can_beacon(wiphy, &chandef, iftype);
+	case NL80211_IFTYPE_STATION:
+	case NL80211_IFTYPE_P2P_CLIENT:
+		return cfg80211_chandef_usable(wiphy, &chandef,
+					       IEEE80211_CHAN_DISABLED);
+	default:
+		break;
+	}
+
+	return true;
+
+wdev_inactive_unlock:
+	wdev_unlock(wdev);
+	return true;
 }
 
 static void reg_leave_invalid_chans(struct wiphy *wiphy)