Message ID | 20230822100409.1242-1-quic_wgong@quicinc.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Johannes Berg |
Headers | show |
Series | [v2] wifi: mac80211: fix cfg80211_bss always hold when assoc response fail for MLO connection | expand |
On Tue, 2023-08-22 at 06:04 -0400, Wen Gong wrote: > > It is 100% ratio to reproduce the issue with this change. > Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp(). Don't think that adds any value, but I guess I can always remove it. > + /* use the addr of assoc_data link which is set in ieee80211_mgd_assoc() */ not sure that's so useful? but anyway The reason I'm even writing this message is that you didn't think this change through: > - const u8 *addr; > + u8 addr[ETH_ALEN] __aligned(2); That has some other consequences you need to adapt to, in particular /* need to have local link addresses for MLO connections */ WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); makes no sense anymore. Not sure if that's the only one. johannes
On 8/22/2023 7:16 PM, Johannes Berg wrote: > On Tue, 2023-08-22 at 06:04 -0400, Wen Gong wrote: >> It is 100% ratio to reproduce the issue with this change. >> Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp(). > Don't think that adds any value, but I guess I can always remove it. change the "mgmt->u.assoc_resp.status_code = 1" is only a way to reproduce the bss hold issue. you can remove it when upstream this patch.
On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote: > > > > /* need to have local link addresses for MLO connections */ > > > > WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); > > > > makes no sense anymore. Not sure if that's the only one. > > After this patch, the cr.links[link_id].addr will be a valid local link > address from > > struct cfg80211_rx_assoc_resp, so I think it is not needed remove now. You don't understand. The issue is that it's set the line above. > cr.links[link_id].addr = data->links[link_id].addr; > /* need to have local link addresses for MLO connections */ > WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); But look at that! What values can cr.links[link_id].addr get? Note how it's a pointer - assigned from an array. johannes
On 8/23/2023 2:34 PM, Johannes Berg wrote: > On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote: >>> /* need to have local link addresses for MLO connections */ >>> >>> WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); >>> >>> makes no sense anymore. Not sure if that's the only one. >> After this patch, the cr.links[link_id].addr will be a valid local link >> address from >> >> struct cfg80211_rx_assoc_resp, so I think it is not needed remove now. > You don't understand. > > The issue is that it's set the line above. > >> cr.links[link_id].addr = data->links[link_id].addr; >> /* need to have local link addresses for MLO connections */ >> WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); > But look at that! What values can cr.links[link_id].addr get? Note how > it's a pointer - assigned from an array. Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because it is pointer to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp. So maybe we can choose one of the 2 things to do: 1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);" 2. change like this: WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr)); What about your advise? > > johannes
On Thu, 2023-08-24 at 10:26 +0800, Wen Gong wrote: > On 8/23/2023 2:34 PM, Johannes Berg wrote: > > On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote: > > > > /* need to have local link addresses for MLO connections */ > > > > > > > > WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); > > > > > > > > makes no sense anymore. Not sure if that's the only one. > > > After this patch, the cr.links[link_id].addr will be a valid local link > > > address from > > > > > > struct cfg80211_rx_assoc_resp, so I think it is not needed remove now. > > You don't understand. > > > > The issue is that it's set the line above. > > > > > cr.links[link_id].addr = data->links[link_id].addr; > > > /* need to have local link addresses for MLO connections */ > > > WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); > > But look at that! What values can cr.links[link_id].addr get? Note how > > it's a pointer - assigned from an array. > > Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because > it is pointer > > to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp. Yep. > So maybe we can choose one of the 2 things to do: > > 1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);" > > 2. change like this: > > WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr)); > I think we should check that it's valid, because if you don't fill it, it'll (hopefully) point to zeroed data somewhere. johannes
On 8/24/2023 3:03 PM, Johannes Berg wrote: > On Thu, 2023-08-24 at 10:26 +0800, Wen Gong wrote: >> On 8/23/2023 2:34 PM, Johannes Berg wrote: >>> On Wed, 2023-08-23 at 10:28 +0800, Wen Gong wrote: >>>>> /* need to have local link addresses for MLO connections */ >>>>> >>>>> WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); >>>>> >>>>> makes no sense anymore. Not sure if that's the only one. >>>> After this patch, the cr.links[link_id].addr will be a valid local link >>>> address from >>>> >>>> struct cfg80211_rx_assoc_resp, so I think it is not needed remove now. >>> You don't understand. >>> >>> The issue is that it's set the line above. >>> >>>> cr.links[link_id].addr = data->links[link_id].addr; >>>> /* need to have local link addresses for MLO connections */ >>>> WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr); >>> But look at that! What values can cr.links[link_id].addr get? Note how >>> it's a pointer - assigned from an array. >> Oh, I know it now. the cr.links[link_id].addr will always NOT 0 because >> it is pointer >> >> to an array "u8 addr[ETH_ALEN]" of struct cfg80211_rx_assoc_resp. > Yep. > >> So maybe we can choose one of the 2 things to do: >> >> 1. remove the "WARN_ON(cr.ap_mld_addr && !cr.links[link_id].addr);" >> >> 2. change like this: >> >> WARN_ON(cr.ap_mld_addr && !is_valid_ether_addr(cr.links[link_id].addr)); >> > I think we should check that it's valid, because if you don't fill it, > it'll (hopefully) point to zeroed data somewhere. > > johannes OK. So I will change it in next version.
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 7c7d03aa9d06..5a53cc0edb30 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -7229,7 +7229,7 @@ struct cfg80211_rx_assoc_resp { int uapsd_queues; const u8 *ap_mld_addr; struct { - const u8 *addr; + u8 addr[ETH_ALEN] __aligned(2); struct cfg80211_bss *bss; u16 status; } links[IEEE80211_MLD_MAX_NUM_LINKS]; diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c index f93eb38ae0b8..f36359269377 100644 --- a/net/mac80211/mlme.c +++ b/net/mac80211/mlme.c @@ -5429,17 +5429,19 @@ static void ieee80211_rx_mgmt_assoc_resp(struct ieee80211_sub_if_data *sdata, for (link_id = 0; link_id < IEEE80211_MLD_MAX_NUM_LINKS; link_id++) { struct ieee80211_link_data *link; - link = sdata_dereference(sdata->link[link_id], sdata); - if (!link) - continue; - if (!assoc_data->link[link_id].bss) continue; resp.links[link_id].bss = assoc_data->link[link_id].bss; - resp.links[link_id].addr = link->conf->addr; + /* use the addr of assoc_data link which is set in ieee80211_mgd_assoc() */ + ether_addr_copy(resp.links[link_id].addr, + assoc_data->link[link_id].addr); resp.links[link_id].status = assoc_data->link[link_id].status; + link = sdata_dereference(sdata->link[link_id], sdata); + if (!link) + continue; + /* get uapsd queues configuration - same for all links */ resp.uapsd_queues = 0; for (ac = 0; ac < IEEE80211_NUM_ACS; ac++)
When connect to MLO AP with more than one link, and the assoc response of AP is not success, then cfg80211_unhold_bss() is not called for all the links' cfg80211_bss except the primary link which means the link used by the latest successful association request. Thus the hold value of the cfg80211_bss is not reset to 0 after the assoc fail, and then the __cfg80211_unlink_bss() will not be called for the cfg80211_bss by __cfg80211_bss_expire(). Then the AP always looks exist even the AP is shutdown or reconfigured to another type, then it will lead error while connecting it again. The detail info are as below. When connect with muti-links AP, cfg80211_hold_bss() is called by cfg80211_mlme_assoc() for each cfg80211_bss of all the links. When assoc response from AP is not success(such as status_code==1), the ieee80211_link_data of non-primary link(sdata->link[link_id]) is NULL because ieee80211_assoc_success()->ieee80211_vif_update_links() is not called for the links. Then struct cfg80211_rx_assoc_resp resp in cfg80211_rx_assoc_resp() and struct cfg80211_connect_resp_params cr in __cfg80211_connect_result() will only have the data of the primary link, and finally function cfg80211_connect_result_release_bsses() only call cfg80211_unhold_bss() for the primary link. Then cfg80211_bss of the other links will never free because its hold is always > 0 now. Hence assign value for the bss and status from assoc_data since it is valid for this case. Also assign value of addr from assoc_data when the link is NULL because the addrs of assoc_data and link both represent the local link addr and they are same value for success connection. It is 100% ratio to reproduce the issue with this change. Add "mgmt->u.assoc_resp.status_code = 1" in ieee80211_rx_mgmt_assoc_resp(). Fixes: 81151ce462e5 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Wen Gong <quic_wgong@quicinc.com> --- v2: change logic per johannes include/net/cfg80211.h | 2 +- net/mac80211/mlme.c | 12 +++++++----- 2 files changed, 8 insertions(+), 6 deletions(-) base-commit: 3f257461ab0ab19806bae2bfde4c3cd88dbf050e