Message ID | 20241030025515.16004-1-quic_lingbok@quicinc.com (mailing list archive) |
---|---|
State | Under Review |
Delegated to: | Johannes Berg |
Headers | show |
Series | wifi: cfg80211: Correctly handle Medium Synchronization Delay | expand |
Hi, On Wed, 2024-10-30 at 10:55 +0800, Lingbo Kong wrote: > Currently, when the driver attempts to connect to an AP MLD with multiple > APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium > Synchronization Delay values from different APs of the same AP MLD to be > equal, which may result in connection failures. > > This is because when the driver receives a multi-link probe response from > an AP MLD with multiple APs, cfg80211 updates the Elements for each AP > based on the multi-link probe response. If the Medium Synchronization Delay > is set in the multi-link probe response, the Elements for each AP belonging > to the same AP MLD will have the Medium Synchronization Delay set > simultaneously. If non-multi-link probe responses are received from > different APs of the same MLD AP, cfg80211 will still update the Elements > based on the non-multi-link probe response. Since the non-multi-link probe > response does not set the Medium Synchronization Delay > (IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe > response overwrite those from a multi-link probe response that has set the > Medium Synchronization Delay, the Medium Synchronization Delay values for > APs belonging to the same AP MLD will not be equal. This discrepancy causes > the cfg80211_mlme_check_mlo_compat() function to fail, leading to > connection failures. Commit ccb964b4ab16 > ("wifi: cfg80211: validate MLO connections better") did not take this into > account. The specification confuses me a bit and I am probably missing something. > To address this issue, add a u16 type member to the struct cfg80211_bss to > store the Medium Synchronization Delay. When the driver receives a probe > response with the Medium Synchronization Delay set, update this member’s > value; otherwise, do not update it. Additionally, modify the parameter list > of cfg80211_mlme_check_mlo_compat() so that this member variable can be > directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq > a pointer to the struct cfg80211_bss. This will allow checking whether the > Medium Synchronization Delay values from different APs of the same AP MLD > are equal. This feels like a recipe for failures. Whether or not an ML-Probe Response is seen should not generally affect the validity of a BSS. Which means that the validity check itself may be incorrect. Maybe we should only compare the values if the subfield has been included? That said, do you understand when exactly the subfield should be included? It seems that it may be carried in the "(Re)Association Response", in which case it would be consistent anyway. The ML probe response part seems a bit specific. Could it be that this is intended to be used by NSTR APs only, in which case there would not be a beacon on the secondary AP. If that is the case, then there wouldn't be much of an inconsistency (though we would still need to be a bit careful). Benjamin > > Fixes: ccb964b4ab16 ("wifi: cfg80211: validate MLO connections better") > Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> > --- > include/net/cfg80211.h | 4 ++++ > net/wireless/mlme.c | 11 +++++++---- > net/wireless/scan.c | 15 +++++++++++++++ > 3 files changed, 26 insertions(+), 4 deletions(-) > > diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h > index 27acf1292a5c..ebeb305c1d0c 100644 > --- a/include/net/cfg80211.h > +++ b/include/net/cfg80211.h > @@ -2956,6 +2956,8 @@ struct cfg80211_bss_ies { > * @cannot_use_reasons: the reasons (bitmap) for not being able to connect, > * if @restrict_use is set and @use_for is zero (empty); may be 0 for > * unspecified reasons; see &enum nl80211_bss_cannot_use_reasons > + * @med_sync_delay: Medium Synchronization delay as described in > + * IEEE 802.11be-2024 Figure 9-1074q. > * @priv: private area for driver use, has at least wiphy->bss_priv_size bytes > */ > struct cfg80211_bss { > @@ -2986,6 +2988,8 @@ struct cfg80211_bss { > u8 use_for; > u8 cannot_use_reasons; > > + u16 med_sync_delay; > + > u8 priv[] __aligned(sizeof(void *)); > }; > > diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c > index 4dac81854721..ae00c36779d2 100644 > --- a/net/wireless/mlme.c > +++ b/net/wireless/mlme.c > @@ -326,7 +326,9 @@ void cfg80211_oper_and_vht_capa(struct ieee80211_vht_cap *vht_capa, > } > > static int > -cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a, > +cfg80211_mlme_check_mlo_compat(const struct cfg80211_bss *bss_a, > + const struct cfg80211_bss *bss_b, > + const struct ieee80211_multi_link_elem *mle_a, > const struct ieee80211_multi_link_elem *mle_b, > struct netlink_ext_ack *extack) > { > @@ -340,8 +342,7 @@ cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a, > return -EINVAL; > } > > - if (ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_a) != > - ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_b)) { > + if (bss_a->med_sync_delay != bss_b->med_sync_delay) { > NL_SET_ERR_MSG(extack, "link EML medium sync delay mismatch"); > return -EINVAL; > } > @@ -426,7 +427,9 @@ static int cfg80211_mlme_check_mlo(struct net_device *dev, > if (WARN_ON(!mles[i])) > goto error; > > - if (cfg80211_mlme_check_mlo_compat(mles[req->link_id], mles[i], > + if (cfg80211_mlme_check_mlo_compat(req->links[req->link_id].bss, > + req->links[i].bss, > + mles[req->link_id], mles[i], > extack)) { > req->links[i].error = -EINVAL; > goto error; > diff --git a/net/wireless/scan.c b/net/wireless/scan.c > index 1c6fd45aa809..6ad935475484 100644 > --- a/net/wireless/scan.c > +++ b/net/wireless/scan.c > @@ -1899,6 +1899,13 @@ cfg80211_update_known_bss(struct cfg80211_registered_device *rdev, > */ > if (signal_valid) > known->pub.signal = new->pub.signal; > + > + /* update medium synchronization delay when its value is not > + * the default. > + */ > + if (new->pub.med_sync_delay != IEEE80211_MED_SYNC_DELAY_DEFAULT) > + known->pub.med_sync_delay = new->pub.med_sync_delay; > + > known->pub.capability = new->pub.capability; > known->ts = new->ts; > known->ts_boottime = new->ts_boottime; > @@ -2224,6 +2231,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, > int bss_type; > bool signal_valid; > unsigned long ts; > + const struct element *ml; > > if (WARN_ON(!wiphy)) > return NULL; > @@ -2267,6 +2275,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, > tmp.pub.use_for = data->use_for; > tmp.pub.cannot_use_reasons = data->cannot_use_reasons; > tmp.bss_source = data->bss_source; > + tmp.pub.med_sync_delay = IEEE80211_MED_SYNC_DELAY_DEFAULT; > > switch (data->bss_source) { > case BSS_SOURCE_MBSSID: > @@ -2321,6 +2330,12 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, > break; > case CFG80211_BSS_FTYPE_PRESP: > rcu_assign_pointer(tmp.pub.proberesp_ies, ies); > + ml = cfg80211_find_ext_elem(WLAN_EID_EXT_EHT_MULTI_LINK, > + ies->data, ies->len); > + if (ml) > + tmp.pub.med_sync_delay = > + ieee80211_mle_get_eml_med_sync_delay(ml->data + 1); > + > break; > } > rcu_assign_pointer(tmp.pub.ies, ies); > > base-commit: e7e2957f403ba4655199f2ba9920c1a015a7be44
On 2024/10/30 18:23, Benjamin Berg wrote: > Hi, > > On Wed, 2024-10-30 at 10:55 +0800, Lingbo Kong wrote: >> Currently, when the driver attempts to connect to an AP MLD with multiple >> APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium >> Synchronization Delay values from different APs of the same AP MLD to be >> equal, which may result in connection failures. >> >> This is because when the driver receives a multi-link probe response from >> an AP MLD with multiple APs, cfg80211 updates the Elements for each AP >> based on the multi-link probe response. If the Medium Synchronization Delay >> is set in the multi-link probe response, the Elements for each AP belonging >> to the same AP MLD will have the Medium Synchronization Delay set >> simultaneously. If non-multi-link probe responses are received from >> different APs of the same MLD AP, cfg80211 will still update the Elements >> based on the non-multi-link probe response. Since the non-multi-link probe >> response does not set the Medium Synchronization Delay >> (IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe >> response overwrite those from a multi-link probe response that has set the >> Medium Synchronization Delay, the Medium Synchronization Delay values for >> APs belonging to the same AP MLD will not be equal. This discrepancy causes >> the cfg80211_mlme_check_mlo_compat() function to fail, leading to >> connection failures. Commit ccb964b4ab16 >> ("wifi: cfg80211: validate MLO connections better") did not take this into >> account. > > The specification confuses me a bit and I am probably missing > something. > >> To address this issue, add a u16 type member to the struct cfg80211_bss to >> store the Medium Synchronization Delay. When the driver receives a probe >> response with the Medium Synchronization Delay set, update this member’s >> value; otherwise, do not update it. Additionally, modify the parameter list >> of cfg80211_mlme_check_mlo_compat() so that this member variable can be >> directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq >> a pointer to the struct cfg80211_bss. This will allow checking whether the >> Medium Synchronization Delay values from different APs of the same AP MLD >> are equal. > > This feels like a recipe for failures. Whether or not an ML-Probe > Response is seen should not generally affect the validity of a BSS. > Which means that the validity check itself may be incorrect. > > Maybe we should only compare the values if the subfield has been > included? > > > That said, do you understand when exactly the subfield should be > included? It seems that it may be carried in the "(Re)Association > Response", in which case it would be consistent anyway. > > The ML probe response part seems a bit specific. Could it be that this > is intended to be used by NSTR APs only, in which case there would not > be a beacon on the secondary AP. If that is the case, then there > wouldn't be much of an inconsistency (though we would still need to be > a bit careful). > > Benjamin > hi,benjamin, regarding your comment, I believe removing this validity check might be the easiest solution. As for the scenarios where this subfield would be included, I referred to the IEEE 802.11be specification. According to section 35.3.16.8, it states: A STA shall not start a MediumSyncDelay timer unless the STA is one of the following: — a non-AP STA affiliated with a non-AP MLD operating on an NSTR link pair or — a non-AP STA affiliated with a non-AP MLD operating on an EMLSR link or — a non-AP STA affiliated with a non-AP MLD operating on an EMLMR link or — an AP affiliated with an NSTR mobile AP MLD operating on the nonprimary link of an NSTR link pair. Regarding the statement “Could it be that this is intended to be used by NSTR APs only,” I did not find any explicit indication in the spec that the multi-link probe response carrying the medium sync delay subfield is solely applicable to NSTR APs. Additionally, do you have any suggestions on how to address this issue?
Hi, On Thu, 2024-10-31 at 13:23 +0800, Lingbo Kong wrote: > On 2024/10/30 18:23, Benjamin Berg wrote: > > Hi, > > > > On Wed, 2024-10-30 at 10:55 +0800, Lingbo Kong wrote: > > > Currently, when the driver attempts to connect to an AP MLD with multiple > > > APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium > > > Synchronization Delay values from different APs of the same AP MLD to be > > > equal, which may result in connection failures. > > > > > > This is because when the driver receives a multi-link probe response from > > > an AP MLD with multiple APs, cfg80211 updates the Elements for each AP > > > based on the multi-link probe response. If the Medium Synchronization Delay > > > is set in the multi-link probe response, the Elements for each AP belonging > > > to the same AP MLD will have the Medium Synchronization Delay set > > > simultaneously. If non-multi-link probe responses are received from > > > different APs of the same MLD AP, cfg80211 will still update the Elements > > > based on the non-multi-link probe response. Since the non-multi-link probe > > > response does not set the Medium Synchronization Delay > > > (IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe > > > response overwrite those from a multi-link probe response that has set the > > > Medium Synchronization Delay, the Medium Synchronization Delay values for > > > APs belonging to the same AP MLD will not be equal. This discrepancy causes > > > the cfg80211_mlme_check_mlo_compat() function to fail, leading to > > > connection failures. Commit ccb964b4ab16 > > > ("wifi: cfg80211: validate MLO connections better") did not take this into > > > account. > > > > The specification confuses me a bit and I am probably missing > > something. > > > > > To address this issue, add a u16 type member to the struct cfg80211_bss to > > > store the Medium Synchronization Delay. When the driver receives a probe > > > response with the Medium Synchronization Delay set, update this member’s > > > value; otherwise, do not update it. Additionally, modify the parameter list > > > of cfg80211_mlme_check_mlo_compat() so that this member variable can be > > > directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq > > > a pointer to the struct cfg80211_bss. This will allow checking whether the > > > Medium Synchronization Delay values from different APs of the same AP MLD > > > are equal. > > > > This feels like a recipe for failures. Whether or not an ML-Probe > > Response is seen should not generally affect the validity of a BSS. > > Which means that the validity check itself may be incorrect. > > > > Maybe we should only compare the values if the subfield has been > > included? > > > > > > That said, do you understand when exactly the subfield should be > > included? It seems that it may be carried in the "(Re)Association > > Response", in which case it would be consistent anyway. > > > > The ML probe response part seems a bit specific. Could it be that this > > is intended to be used by NSTR APs only, in which case there would not > > be a beacon on the secondary AP. If that is the case, then there > > wouldn't be much of an inconsistency (though we would still need to be > > a bit careful). > > > > Benjamin > > > > hi,benjamin, > > regarding your comment, I believe removing this validity check might be > the easiest solution. As for the scenarios where this subfield would be > included, I referred to the IEEE 802.11be specification. According to > section 35.3.16.8, it states: > > A STA shall not start a MediumSyncDelay timer unless the STA is one of > the following: > — a non-AP STA affiliated with a non-AP MLD operating on an NSTR link > pair or > — a non-AP STA affiliated with a non-AP MLD operating on an EMLSR link or > — a non-AP STA affiliated with a non-AP MLD operating on an EMLMR link or > — an AP affiliated with an NSTR mobile AP MLD operating on the > nonprimary link of an NSTR link pair. Right, which does seem to imply that a non-NSTR AP can very well set a Medium Delay Synchronization. > Regarding the statement “Could it be that this is intended to be used by > NSTR APs only,” I did not find any explicit indication in the spec that > the multi-link probe response carrying the medium sync delay subfield is > solely applicable to NSTR APs. > > Additionally, do you have any suggestions on how to address this issue?
On Thu, 2024-10-31 at 10:25 +0100, Benjamin Berg wrote: > > Right, which does seem to imply that a non-NSTR AP can very well set a > Medium Delay Synchronization. Yes, we also pass it to the firmware in iwlwifi, so definitely. > > Regarding the statement “Could it be that this is intended to be used by > > NSTR APs only,” I did not find any explicit indication in the spec that > > the multi-link probe response carrying the medium sync delay subfield is > > solely applicable to NSTR APs. > > > > Additionally, do you have any suggestions on how to address this issue?
On 2024/10/31 18:47, Johannes Berg wrote: > On Thu, 2024-10-31 at 10:25 +0100, Benjamin Berg wrote: >> >> Right, which does seem to imply that a non-NSTR AP can very well set a >> Medium Delay Synchronization. > > Yes, we also pass it to the firmware in iwlwifi, so definitely. > >>> Regarding the statement “Could it be that this is intended to be used by >>> NSTR APs only,” I did not find any explicit indication in the spec that >>> the multi-link probe response carrying the medium sync delay subfield is >>> solely applicable to NSTR APs. >>> >>> Additionally, do you have any suggestions on how to address this issue?
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h index 27acf1292a5c..ebeb305c1d0c 100644 --- a/include/net/cfg80211.h +++ b/include/net/cfg80211.h @@ -2956,6 +2956,8 @@ struct cfg80211_bss_ies { * @cannot_use_reasons: the reasons (bitmap) for not being able to connect, * if @restrict_use is set and @use_for is zero (empty); may be 0 for * unspecified reasons; see &enum nl80211_bss_cannot_use_reasons + * @med_sync_delay: Medium Synchronization delay as described in + * IEEE 802.11be-2024 Figure 9-1074q. * @priv: private area for driver use, has at least wiphy->bss_priv_size bytes */ struct cfg80211_bss { @@ -2986,6 +2988,8 @@ struct cfg80211_bss { u8 use_for; u8 cannot_use_reasons; + u16 med_sync_delay; + u8 priv[] __aligned(sizeof(void *)); }; diff --git a/net/wireless/mlme.c b/net/wireless/mlme.c index 4dac81854721..ae00c36779d2 100644 --- a/net/wireless/mlme.c +++ b/net/wireless/mlme.c @@ -326,7 +326,9 @@ void cfg80211_oper_and_vht_capa(struct ieee80211_vht_cap *vht_capa, } static int -cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a, +cfg80211_mlme_check_mlo_compat(const struct cfg80211_bss *bss_a, + const struct cfg80211_bss *bss_b, + const struct ieee80211_multi_link_elem *mle_a, const struct ieee80211_multi_link_elem *mle_b, struct netlink_ext_ack *extack) { @@ -340,8 +342,7 @@ cfg80211_mlme_check_mlo_compat(const struct ieee80211_multi_link_elem *mle_a, return -EINVAL; } - if (ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_a) != - ieee80211_mle_get_eml_med_sync_delay((const u8 *)mle_b)) { + if (bss_a->med_sync_delay != bss_b->med_sync_delay) { NL_SET_ERR_MSG(extack, "link EML medium sync delay mismatch"); return -EINVAL; } @@ -426,7 +427,9 @@ static int cfg80211_mlme_check_mlo(struct net_device *dev, if (WARN_ON(!mles[i])) goto error; - if (cfg80211_mlme_check_mlo_compat(mles[req->link_id], mles[i], + if (cfg80211_mlme_check_mlo_compat(req->links[req->link_id].bss, + req->links[i].bss, + mles[req->link_id], mles[i], extack)) { req->links[i].error = -EINVAL; goto error; diff --git a/net/wireless/scan.c b/net/wireless/scan.c index 1c6fd45aa809..6ad935475484 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -1899,6 +1899,13 @@ cfg80211_update_known_bss(struct cfg80211_registered_device *rdev, */ if (signal_valid) known->pub.signal = new->pub.signal; + + /* update medium synchronization delay when its value is not + * the default. + */ + if (new->pub.med_sync_delay != IEEE80211_MED_SYNC_DELAY_DEFAULT) + known->pub.med_sync_delay = new->pub.med_sync_delay; + known->pub.capability = new->pub.capability; known->ts = new->ts; known->ts_boottime = new->ts_boottime; @@ -2224,6 +2231,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, int bss_type; bool signal_valid; unsigned long ts; + const struct element *ml; if (WARN_ON(!wiphy)) return NULL; @@ -2267,6 +2275,7 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, tmp.pub.use_for = data->use_for; tmp.pub.cannot_use_reasons = data->cannot_use_reasons; tmp.bss_source = data->bss_source; + tmp.pub.med_sync_delay = IEEE80211_MED_SYNC_DELAY_DEFAULT; switch (data->bss_source) { case BSS_SOURCE_MBSSID: @@ -2321,6 +2330,12 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, break; case CFG80211_BSS_FTYPE_PRESP: rcu_assign_pointer(tmp.pub.proberesp_ies, ies); + ml = cfg80211_find_ext_elem(WLAN_EID_EXT_EHT_MULTI_LINK, + ies->data, ies->len); + if (ml) + tmp.pub.med_sync_delay = + ieee80211_mle_get_eml_med_sync_delay(ml->data + 1); + break; } rcu_assign_pointer(tmp.pub.ies, ies);
Currently, when the driver attempts to connect to an AP MLD with multiple APs, the cfg80211_mlme_check_mlo_compat() function requires the Medium Synchronization Delay values from different APs of the same AP MLD to be equal, which may result in connection failures. This is because when the driver receives a multi-link probe response from an AP MLD with multiple APs, cfg80211 updates the Elements for each AP based on the multi-link probe response. If the Medium Synchronization Delay is set in the multi-link probe response, the Elements for each AP belonging to the same AP MLD will have the Medium Synchronization Delay set simultaneously. If non-multi-link probe responses are received from different APs of the same MLD AP, cfg80211 will still update the Elements based on the non-multi-link probe response. Since the non-multi-link probe response does not set the Medium Synchronization Delay (IEEE 802.11be-2024-35.3.4.4), if the Elements from a non-multi-link probe response overwrite those from a multi-link probe response that has set the Medium Synchronization Delay, the Medium Synchronization Delay values for APs belonging to the same AP MLD will not be equal. This discrepancy causes the cfg80211_mlme_check_mlo_compat() function to fail, leading to connection failures. Commit ccb964b4ab16 ("wifi: cfg80211: validate MLO connections better") did not take this into account. To address this issue, add a u16 type member to the struct cfg80211_bss to store the Medium Synchronization Delay. When the driver receives a probe response with the Medium Synchronization Delay set, update this member’s value; otherwise, do not update it. Additionally, modify the parameter list of cfg80211_mlme_check_mlo_compat() so that this member variable can be directly accessed within the cfg80211_mlme_check_mlo_compat() function viaq a pointer to the struct cfg80211_bss. This will allow checking whether the Medium Synchronization Delay values from different APs of the same AP MLD are equal. Fixes: ccb964b4ab16 ("wifi: cfg80211: validate MLO connections better") Signed-off-by: Lingbo Kong <quic_lingbok@quicinc.com> --- include/net/cfg80211.h | 4 ++++ net/wireless/mlme.c | 11 +++++++---- net/wireless/scan.c | 15 +++++++++++++++ 3 files changed, 26 insertions(+), 4 deletions(-) base-commit: e7e2957f403ba4655199f2ba9920c1a015a7be44