Message ID | 20230602225751.164525-2-kglund@google.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Kalle Valo |
Series | [1/2] wifi: cfg80211: Reject (re-)association to the same BSSID |
On Fri, Jun 02, 2023 at 04:57:51PM -0600, Kevin Lund wrote:
> Currently, the Marvell WiFi driver rejects any connection attempt while
> we are currently connected. This is poor logic, since there are several
> legitimate use-cases for initiating a connection attempt while
> connected, including re-association and BSS Transitioning. This logic
> means that it's impossible for userspace to request the driver to
> connect to a different BSS on the same ESS without explicitly requesting
> a disconnect first.
>
> Remove the check from the driver so that we can complete BSS transitions
> on the first attempt.
>
> Testing on Chrome OS has shown that this change resolves some issues
> with failed BSS transitions.
>
> Signed-off-by: Kevin Lund <kglund@google.com>
> ---
>  drivers/net/wireless/marvell/mwifiex/cfg80211.c | 6 ------
>  1 file changed, 6 deletions(-)

I've been told this one might need an extra look, but first off, it's
marked Rejected, likely due to feedback on patch 1, so it probably needs a
resubmit if it needs consideration:

https://patchwork.kernel.org/project/linux-wireless/patch/20230602225751.164525-2-kglund@google.com/

But, did you attempt any background work on this, to determine why it
exists, or what other mitigations are in place? For example, I see that
sme.c's cfg80211_connect() makes a similar check, but only rejects
things if the SSID is different. So with that understanding, it's a
reasonable guess to say that mwifiex would be OK with just relying on
the existing cfg80211 checks instead.

In other words, I think this patch may be OK, but probably could use a
bit more explanation.

Also, how does "BSS Transitioning" (in your description) fit in here?
IIUC, cfg80211_connect() doesn't support that, as it only allows
reassociation to the same BSSID.
(Or, I might be confused here.)

Brian
Hey Brian, thanks for the response. I'm revising the patch now to
better illustrate the context and purpose for the patch, but I'll also
respond to your comments here so there is a clear chain.

> But, did you attempt any background work on this, to determine why it
> exists, or what other mitigations are in place?

Yes, I'll share some of my background checks here:

The check was initially added in the following patch:

https://github.com/torvalds/linux/commit/71954f24c93fd569314985e9a7319b68e0b918e6

Before this patch, the driver would explicitly deauthenticate
from its current AP whenever a new connection was requested.
Apparently this was racy, so the deauthentication was removed.
The commit message states that they want to "Avoid
re-association while the device is already associated to a
network." My assertion is that this is invalid, since
re-associating while connected is a valid action, and it happens
during BSS transitions.

I cross-checked this behavior with the userspace SME side
and looked into a driver with userspace SME, and I could
not find any indication that the driver rejects association
attempts while connected.

> For example, I see that
> sme.c's cfg80211_connect() makes a similar check, but only rejects
> things if the SSID is different. So with that understanding, it's a
> reasonable guess to say that mwifiex would be OK with just relying on
> the existing cfg80211 checks instead.

> In other words, I think this patch may be OK, but probably could use a
> bit more explanation.

Yes, sme.c's cfg80211_connect() is a very useful point of
reference. I do believe that it makes sufficient checks
such that the Marvell driver can rely on them, and it is
part of what drove me to make this patch in the first place.
The function makes two checks that basically carve out the
exact re-association functionality that I'm enabling in the
Marvell driver.

These checks are commented as such:

  /*
   * If we have an ssid_len, we're trying to connect or are
   * already connected, so reject a new SSID unless it's the
   * same (which is the case for re-association.)
   */

and

  /*
   * If connected, reject (re-)association unless prev_bssid
   * matches the current BSSID.
   */

The first condition asserts that if we're connected, then
the only network we should be trying to connect to is the
network which we're currently connected to. Basically,
re-association is only valid within the same ESS.

The second condition says that if we are currently
connected, then

1. We must have a prev_bssid value. This is the BSSID
that we were connected to previously, and its presence
is the indication that we are requesting a re-association
rather than a normal association. Basically, if we're
connected, then we must be re-associating, not performing
a normal association.

2. prev_bssid must be the same as our current BSSID.
All this is saying is that the previous BSSID which was
supplied by the caller must match the BSSID which the
driver thinks it is connected to. Basically, "the caller is
saying we should move from A to B, let's make sure
we're actually connected to A".

So, based on these checks it is abundantly clear that
cfg80211 absolutely intends it to be a normal flow to
request a connection while currently connected, and
it makes deliberate checks to ensure we're in a good
state when that happens.

> Also, how does "BSS Transitioning" (in your description) fit in here?
> IIUC, cfg80211_connect() doesn't support that, as it only allows
> reassociation to the same BSSID.

This is covered above, but cfg80211_connect() doesn't
actually assert that we're re-associating to the same
BSSID. It only asserts that the BSSID the caller thinks
we're transitioning from is the same BSSID that the
driver thinks we are currently connected to.

Thanks,
Kevin
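
For readers following the thread, the two cfg80211_connect() checks described in the message above can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions, not the kernel source: `struct sta_state`, `struct connect_req`, `check_connect()` and the specific error codes are hypothetical names and values chosen for illustration. It shows the point being argued: the target BSSID is never compared against the current one, only the SSID and prev_bssid are.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN     6
#define MAX_SSID_LEN 32

/* Hypothetical stand-in for the per-interface state that cfg80211 tracks. */
struct sta_state {
	bool connected;
	uint8_t ssid[MAX_SSID_LEN];
	size_t ssid_len;            /* non-zero while connecting or connected */
	uint8_t bssid[ETH_ALEN];    /* current BSSID, valid when connected */
};

/* Hypothetical stand-in for a connect request coming from userspace. */
struct connect_req {
	const uint8_t *ssid;
	size_t ssid_len;
	const uint8_t *bssid;       /* target BSSID; deliberately not checked */
	const uint8_t *prev_bssid;  /* NULL for a plain (non-re-)association */
};

/*
 * Model of the two checks. Returns 0 if the request may proceed to the
 * driver, or a negative errno (values are illustrative assumptions).
 */
int check_connect(const struct sta_state *st, const struct connect_req *req)
{
	/* Check 1: while connecting/connected, only the same SSID is allowed. */
	if (st->ssid_len &&
	    (st->ssid_len != req->ssid_len ||
	     memcmp(st->ssid, req->ssid, st->ssid_len) != 0))
		return -EALREADY;

	/* Check 2: while connected, prev_bssid must name the current BSSID. */
	if (st->connected) {
		if (!req->prev_bssid)
			return -EALREADY;
		if (memcmp(req->prev_bssid, st->bssid, ETH_ALEN) != 0)
			return -ENOTCONN;
	}

	return 0;
}
```

In this model a roam from BSS A to BSS B on the same SSID passes as long as prev_bssid is A, whereas a plain association while connected, a stale prev_bssid, or a different SSID are all rejected before the driver's connect handler would run.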
On Mon, Aug 07, 2023 at 04:35:49PM -0600, Kevin Lund wrote:
> Hey Brian, thanks for the response. I'm revising the patch now to
> better illustrate the context and purpose for the patch, but I'll also
> respond to your comments here so there is a clear chain.

Yes, good idea :)

> > But, did you attempt any background work on this, to determine why it
> > exists, or what other mitigations are in place?
>
> Yes, I'll share some of my background checks here:
>
> The check was initially added in the following patch:
>
> https://github.com/torvalds/linux/commit/71954f24c93fd569314985e9a7319b68e0b918e6

Ah, good context. I didn't notice that part, and instead assumed that
the quirk (like so many quirks in this driver) existed since its
introduction.

Definitely note this in your followup patch, and while you're at it,
maybe avoid linking github and instead use the preferred commit format.
I think checkpatch.pl would tell you that's something like:

  commit 71954f24c93f ("mwifiex: do not re-associate when already connected")

And you'll want to explain how you account for the original problem
being solved, or else explain that the problem doesn't apply.

> Before this patch, the driver would explicitly deauthenticate
> from its current AP whenever a new connection was requested.
> Apparently this was racy, so the deauthentication was removed.
> The commit message states that they want to "Avoid
> re-association while the device is already associated to a
> network." My assertion is that this is invalid, since
> re-associating while connect is a valid action, and it happens
> during BSS-transitions.

Actually, I believe that commit was primarily addressing a particular
WARN()/WARN_ON() issue seen when doing this kind of reassociation,
because of how mwifiex was doing an internal deassociation, and losing
the context of the SSID that it was going to report back to cfg80211
upon (re)association.

So you'd need to show that you don't hit that warning again. (I doubt
you will, because the aforementioned commit also dropped the internal
deauthentication call.)

But I think you'd also need to explain why (or even better,
explain+test) the internal deauthentication was present in the first
place. The previous commit removed the internal deauthentication, and
instead just rejected the request. The former is OK to do, but only(?)
because we did the latter.

So, you'd have to help me remove that question mark: why is it OK for
mwifiex to run its connect (HostCmd_CMD_802_11_ASSOCIATE) flow without
an intervening disconnect? It's not clear to me that the firmware
protocol actually supports this, or that it's been vetted very much,
given how the original mwifiex used to behave (with a full disconnect
and reconnect).

Maybe David Lin @ NXP (CC'd) would be able to help here, as this starts
to ask questions we can only answer by either inspecting the firmware
(i.e., ask NXP) or by testing.

Regarding testing: what kind of testing has been done? On your multi-BSS
setup, have you ensured we're really sending appropriate 802.11 level
frames with this patch? e.g., do mwifiex clients end up sending a proper
REASSOCIATION REQUEST frame, or just ASSOCIATION REQUEST? It doesn't
look like mwifiex actually pushes the required "Current AP Address" down
into the association command, but it's always possible the firmware just
memorizes this and rewrites it...

...or alternatively, maybe mwifiex doesn't actually do Reassociation at
all here, and it just kinda happens to work OK when sending a regular
Association. I'm not sure if that makes this a good patch, or if we'll
end up with interop problems where cfg80211 thinks we're reassociating,
but the AP thinks we're associating, and eventually things break down.

Sorry if that's just a bunch of "unknowns" here, but that's life when
trying to retrofit things into an old full-MAC driver with no support
from the owners of the proprietary firmware. Maybe 802.11 protocol dumps
would make us happy enough though.

> I cross checked this behavior with the userspace SME side
> and looked into a driver with userspace SME, and I could
> not find any indication that the driver rejects association
> attempts while connected.

I don't think comparison with other drivers gives much evidence here.
It's a question of whether the firmware is properly tracking
"reassociation" (to the same or different BSS), or whether it really
needs a DEAUTH in between.

> > For example, I see that
> > sme.c's cfg80211_connect() makes a similar check, but only rejects
> > things if the SSID is different. So with that understanding, it's a
> > reasonable guess to say that mwifiex would be OK with just relying on
> > the existing cfg80211 checks instead.
>
> > In other words, I think this patch may be OK, but probably could use a
> > bit more explanation.
>
[...]
> The second condition says that if we are currently
> connected, then
>
> 1. We must have a prev_bssid value. This is the BSSID
> that we were connected to previously, and it's presence
> is the indication that we are requesting a re-association
> rather than a normal association. Basically, if we're
> connected, then we must be re-associating, not
> normal associating.
>
> 2. prev_bssid must be the same as our current bssid.
> All this is saying is that the previous BSSID which was
> supplied by the caller must match the BSSID which the
> driver thinks it is connected to. Basically, "the caller is
> saying we should move from A to B, let's make sure
> we're actually connected to A".

I misread what cfg80211_connect() was looking for -- thanks for the
additional explanation.

> So, based on these checks it is abundantly clear that
> cfg80211 absolutely intends it to be a normal flow to
> request a connection while currently connected, and
> it makes deliberate checks to ensure we're in a good
> state when that happens.

Right, thanks, I misunderstood cfg80211 for a bit. So I agree on the
cfg80211 expectation, but that still doesn't tell us how the mwifiex
firmware really behaves. I guess I retain some confusion (per above) of
why mwifiex would have forcibly DEAUTH'd on each reassociation request
in the past, if it wasn't necessary.

Brian
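
On the question of what the protocol dumps should show: in 802.11, an Association Request is management subtype 0 and a Reassociation Request is management subtype 2, and the Reassociation Request body carries a 6-byte Current AP Address right after the Capability and Listen Interval fields. The following is a rough sketch of how captured frames could be told apart. The function names are hypothetical, and it assumes the buffer starts at the 802.11 MAC header (any radiotap header already stripped) with a plain 24-byte management header and no HT Control field.

```c
#include <stddef.h>
#include <stdint.h>

#define MGMT_HDR_LEN      24  /* FC + duration + 3 addresses + seq ctrl */
#define FIXED_BEFORE_AP    4  /* Capability (2) + Listen Interval (2) */
#define ETH_ALEN           6

enum assoc_kind { NOT_ASSOC = 0, ASSOC_REQ, REASSOC_REQ };

/* Classify a captured frame by its Frame Control type/subtype bits. */
enum assoc_kind classify_assoc_frame(const uint8_t *frame, size_t len)
{
	uint8_t version, type, subtype;

	if (len < MGMT_HDR_LEN)
		return NOT_ASSOC;

	version = frame[0] & 0x03;         /* bits 0-1: protocol version */
	type    = (frame[0] >> 2) & 0x03;  /* bits 2-3: 0 = management */
	subtype = (frame[0] >> 4) & 0x0f;  /* bits 4-7: subtype */

	if (version != 0 || type != 0)
		return NOT_ASSOC;
	if (subtype == 0)
		return ASSOC_REQ;
	if (subtype == 2)
		return REASSOC_REQ;
	return NOT_ASSOC;
}

/*
 * For a Reassociation Request, return a pointer to the 6-byte Current AP
 * Address field inside the frame; otherwise (or if truncated) return NULL.
 */
const uint8_t *reassoc_current_ap(const uint8_t *frame, size_t len)
{
	if (classify_assoc_frame(frame, len) != REASSOC_REQ)
		return NULL;
	if (len < MGMT_HDR_LEN + FIXED_BEFORE_AP + ETH_ALEN)
		return NULL;
	return frame + MGMT_HDR_LEN + FIXED_BEFORE_AP;
}
```

If a capture taken during a roam shows only subtype 0 frames from the mwifiex client, that would suggest the firmware is sending a regular Association; a subtype 2 frame whose Current AP Address matches the old BSSID would suggest it is doing a genuine Reassociation.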
diff --git a/drivers/net/wireless/marvell/mwifiex/cfg80211.c b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
index bcd564dc3554a..84d650c9dceb0 100644
--- a/drivers/net/wireless/marvell/mwifiex/cfg80211.c
+++ b/drivers/net/wireless/marvell/mwifiex/cfg80211.c
@@ -2414,12 +2414,6 @@ mwifiex_cfg80211_connect(struct wiphy *wiphy, struct net_device *dev,
 		return -EINVAL;
 	}
 
-	if (priv->wdev.connected) {
-		mwifiex_dbg(adapter, ERROR,
-			    "%s: already connected\n", dev->name);
-		return -EALREADY;
-	}
-
 	if (priv->scan_block)
 		priv->scan_block = false;
 
Currently, the Marvell WiFi driver rejects any connection attempt while
we are currently connected. This is poor logic, since there are several
legitimate use-cases for initiating a connection attempt while
connected, including re-association and BSS Transitioning. This logic
means that it's impossible for userspace to request the driver to
connect to a different BSS on the same ESS without explicitly requesting
a disconnect first.

Remove the check from the driver so that we can complete BSS transitions
on the first attempt.

Testing on Chrome OS has shown that this change resolves some issues
with failed BSS transitions.

Signed-off-by: Kevin Lund <kglund@google.com>
---
 drivers/net/wireless/marvell/mwifiex/cfg80211.c | 6 ------
 1 file changed, 6 deletions(-)