Message ID | 20240523082200.15544-1-kevin_yang@realtek.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Johannes Berg |
Headers | show |
Series | wifi: mac80211: fix NULL dereference at band check in starting tx ba session | expand |
<kevin_yang@realtek.com> writes: > From: Zong-Zhe Yang <kevin_yang@realtek.com> > > In MLD connection, link_data/link_conf are dynamically allocated. They > don't point to vif->bss_conf. So, there will be no chanreq assigned to > vif->bss_conf and then the chan will be NULL. To get deflink's chanreq, > we change the code to access vif->link_conf with deflink's link id. > > This way should also work in non-MLD connection, because non-MLD deflink > link id should be 0 and non-MLD link_conf[0] will point to vif->bss_conf. > > Crash log (with rtw89 version under MLO development): > [ 9890.526087] BUG: kernel NULL pointer dereference, address: 0000000000000000 > [ 9890.526102] #PF: supervisor read access in kernel mode > [ 9890.526105] #PF: error_code(0x0000) - not-present page > [ 9890.526109] PGD 0 P4D 0 > [ 9890.526114] Oops: 0000 [#1] PREEMPT SMP PTI > [ 9890.526119] CPU: 2 PID: 6367 Comm: kworker/u16:2 Kdump: loaded Tainted: G OE 6.9.0 #1 > [ 9890.526123] Hardware name: LENOVO 2356AD1/2356AD1, BIOS G7ETB3WW (2.73 ) 11/28/2018 > [ 9890.526126] Workqueue: phy2 rtw89_core_ba_work [rtw89_core] > [ 9890.526203] RIP: 0010:ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 > [ 9890.526279] Code: f7 e8 d5 93 3e ea 48 83 c4 28 89 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc 49 8b 84 24 e0 f1 ff ff 48 8b 80 90 1b 00 00 <83> 38 03 0f 84 37 fe ff ff bb ea ff ff ff eb cc 49 8b 84 24 10 f3 > All code > ======== > 0: f7 e8 imul %eax > 2: d5 (bad) > 3: 93 xchg %eax,%ebx > 4: 3e ea ds (bad) > 6: 48 83 c4 28 add $0x28,%rsp > a: 89 d8 mov %ebx,%eax > c: 5b pop %rbx > d: 41 5c pop %r12 > f: 41 5d pop %r13 > 11: 41 5e pop %r14 > 13: 41 5f pop %r15 > 15: 5d pop %rbp > 16: c3 retq > 17: cc int3 > 18: cc int3 > 19: cc int3 > 1a: cc int3 > 1b: 49 8b 84 24 e0 f1 ff mov -0xe20(%r12),%rax > 22: ff > 23: 48 8b 80 90 1b 00 00 mov 0x1b90(%rax),%rax > 2a:* 83 38 03 cmpl $0x3,(%rax) <-- trapping instruction > 2d: 0f 84 37 fe ff ff je 0xfffffffffffffe6a > 33: bb ea ff ff ff mov $0xffffffea,%ebx > 38: eb cc jmp 0x6 > 3a: 49 rex.WB > 3b: 8b .byte 0x8b > 3c: 84 24 10 test %ah,(%rax,%rdx,1) > 3f: f3 repz > > Code starting with the faulting instruction > =========================================== > 0: 83 38 03 cmpl $0x3,(%rax) > 3: 0f 84 37 fe ff ff je 0xfffffffffffffe40 > 9: bb ea ff ff ff mov $0xffffffea,%ebx > e: eb cc jmp 0xffffffffffffffdc > 10: 49 rex.WB > 11: 8b .byte 0x8b > 12: 84 24 10 test %ah,(%rax,%rdx,1) > 15: f3 repz > [ 9890.526285] RSP: 0018:ffffb8db09013d68 EFLAGS: 00010246 > [ 9890.526291] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9308e0d656c8 > [ 9890.526295] RDX: 0000000000000000 RSI: ffffffffab99460b RDI: ffffffffab9a7685 > [ 9890.526300] RBP: ffffb8db09013db8 R08: 0000000000000000 R09: 0000000000000873 > [ 9890.526304] R10: ffff9308e0d64800 R11: 0000000000000002 R12: ffff9308e5ff6e70 > [ 9890.526308] R13: ffff930952500e20 R14: ffff9309192a8c00 R15: 0000000000000000 > [ 9890.526313] FS: 0000000000000000(0000) GS:ffff930b4e700000(0000) knlGS:0000000000000000 > [ 9890.526316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 9890.526318] CR2: 0000000000000000 CR3: 0000000391c58005 CR4: 00000000001706f0 > [ 9890.526321] Call Trace: > [ 9890.526324] <TASK> > [ 9890.526327] ? show_regs (arch/x86/kernel/dumpstack.c:479) > [ 9890.526335] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434) > [ 9890.526340] ? page_fault_oops (arch/x86/mm/fault.c:713) > [ 9890.526347] ? search_module_extables (kernel/module/main.c:3256 (discriminator 3)) > [ 9890.526353] ? ieee80211_start_tx_ba_session (net/mac80211/agg-tx.c:618 (discriminator 1)) mac80211 > > Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> Kevin's s-o-b missing.
Kalle Valo <kvalo@kernel.org> wrote: > > <kevin_yang@realtek.com> writes: > > > From: Zong-Zhe Yang <kevin_yang@realtek.com> > > > > [...] > > > > Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> > > Kevin's s-o-b missing. > Sorry, I didn't configure sendemail.from properly. (I will double-check my configuration before sending patch next time.) But, my s-o-b has been there.
Zong-Zhe Yang <kevin_yang@realtek.com> writes: > Kalle Valo <kvalo@kernel.org> wrote: >> >> <kevin_yang@realtek.com> writes: >> >> > From: Zong-Zhe Yang <kevin_yang@realtek.com> >> > >> > [...] >> > >> > Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com> >> >> Kevin's s-o-b missing. >> > > Sorry, I didn't configure sendemail.from properly. > (I will double-check my configuration before sending patch next time.) > But, my s-o-b has been there. Yeah, I got confused. Sorry for the noise.
On Thu, 2024-05-23 at 16:22 +0800, kevin_yang@realtek.com wrote: > > - if (!pubsta->deflink.ht_cap.ht_supported && > - sta->sdata->vif.bss_conf.chanreq.oper.chan->band != NL80211_BAND_6GHZ) > - return -EINVAL; I can see how this fixes the crash, and I can also see why we didn't notice (TX agg sessions offloaded to FW), but ... > + if (!pubsta->deflink.ht_cap.ht_supported) { > + struct ieee80211_vif *vif = &sta->sdata->vif; > + struct ieee80211_bss_conf *bss_conf; > + > + rcu_read_lock(); > + > + bss_conf = rcu_dereference(vif->link_conf[pubsta->deflink.link_id]); > + if (unlikely(!bss_conf) || > + bss_conf->chanreq.oper.chan->band != NL80211_BAND_6GHZ) { > + rcu_read_unlock(); > + return -EINVAL; > is this really right? This checks that the *first* link the STA used isn't 6 GHz, but maybe it should be *any* link? But then again, we don't really need this check for an MLO STA since it will have HT supported unless it associated on 6 GHz. Maybe we should just not do the check this way, but check if it has HT or VHT or HE or something like that? johannes
Johannes Berg <johannes@sipsolutions.net> wrote: > > On Thu, 2024-05-23 at 16:22 +0800, kevin_yang@realtek.com wrote: > > [...] > > This checks that the *first* link the STA used isn't 6 GHz, but maybe it should be *any* link? > > But then again, we don't really need this check for an MLO STA since it will have HT supported > unless it associated on 6 GHz. Maybe we should just not do the check this way, but check if it > has HT or VHT or HE or something like that? > I think there are two points here. 1. the way to avoid this NULL dereference (Current patch just followed original logic and made it runnable on both MLD and non-MLD.) According to comments, I will change to check ht_supported/vht_supported/has_he/has_eht. Then, it doesn't need to reference chanreq.oper.chan here. So, there won't be NULL dereference. 2. the check logic when MLD (Current patch didn't consider this properly.) According to spec., BA agreement does once per TID and apply to all corresponding links. So, I am thinking maybe I check the conditions on all valid_links when MLD. And, only check deflink when non-MLD. How about it?
On Thu, 2024-05-30 at 13:49 +0000, Zong-Zhe Yang wrote: > Johannes Berg <johannes@sipsolutions.net> wrote: > > > > On Thu, 2024-05-23 at 16:22 +0800, kevin_yang@realtek.com wrote: > > > > [...] > > > > This checks that the *first* link the STA used isn't 6 GHz, but maybe it should be *any* link? > > > > But then again, we don't really need this check for an MLO STA since it will have HT supported > > unless it associated on 6 GHz. Maybe we should just not do the check this way, but check if it > > has HT or VHT or HE or something like that? > > > > I think there are two points here. > > 1. the way to avoid this NULL dereference > (Current patch just followed original logic and made it runnable on both MLD and non-MLD.) > > According to comments, I will change to check ht_supported/vht_supported/has_he/has_eht. > Then, it doesn't need to reference chanreq.oper.chan here. So, there won't be NULL dereference. > > 2. the check logic when MLD > (Current patch didn't consider this properly.) > > According to spec., BA agreement does once per TID and apply to all corresponding links. > So, I am thinking maybe I check the conditions on all valid_links when MLD. > And, only check deflink when non-MLD. Well, spec also requires that you have EHT (on all links) to be able to do MLO in the first place, so you shouldn't be connected. IOW, checking one link should be sufficient? And that can even be deflink, because for a STA that's always used as the assoc link (unlike for vif) johannes
Johannes Berg <johannes@sipsolutions.net> wrote: > On Thu, 2024-05-30 at 13:49 +0000, Zong-Zhe Yang wrote: > > > > I think there are two points here. > > > > 1. the way to avoid this NULL dereference (Current patch just followed > > original logic and made it runnable on both MLD and non-MLD.) > > > > According to comments, I will change to check > ht_supported/vht_supported/has_he/has_eht. > > Then, it doesn't need to reference chanreq.oper.chan here. So, there won't be NULL > dereference. > > > > 2. the check logic when MLD > > (Current patch didn't consider this properly.) > > > > According to spec., BA agreement does once per TID and apply to all corresponding links. > > So, I am thinking maybe I check the conditions on all valid_links when MLD. > > And, only check deflink when non-MLD. > > Well, spec also requires that you have EHT (on all links) to be able to do MLO in the first place, > so you shouldn't be connected. IOW, checking one link should be sufficient? And that can even > be deflink, because for a STA that's always used as the assoc link (unlike for vif) > Then I am thinking to just check ht_supported/vht_supported/has_he/has_eht on sta deflink, whether non-MLD connection or MLD connection. Any further suggestions?
Zong-Zhe Yang <kevin_yang@realtek.com> wrote: > > [...] > > Then I am thinking to just check ht_supported/vht_supported/has_he/has_eht on sta deflink, > whether non-MLD connection or MLD connection. > Any further suggestions? I sent v2 in [1]. If there are further suggestions, perhaps we can discuss them there. [1]: https://patchwork.kernel.org/project/linux-wireless/patch/20240617115217.22344-1-kevin_yang@realtek.com/
diff --git a/net/mac80211/agg-tx.c b/net/mac80211/agg-tx.c index 21d55dc539f6..41f02d0baed7 100644 --- a/net/mac80211/agg-tx.c +++ b/net/mac80211/agg-tx.c @@ -615,9 +615,21 @@ int ieee80211_start_tx_ba_session(struct ieee80211_sta *pubsta, u16 tid, "Requested to start BA session on reserved tid=%d", tid)) return -EINVAL; - if (!pubsta->deflink.ht_cap.ht_supported && - sta->sdata->vif.bss_conf.chanreq.oper.chan->band != NL80211_BAND_6GHZ) - return -EINVAL; + if (!pubsta->deflink.ht_cap.ht_supported) { + struct ieee80211_vif *vif = &sta->sdata->vif; + struct ieee80211_bss_conf *bss_conf; + + rcu_read_lock(); + + bss_conf = rcu_dereference(vif->link_conf[pubsta->deflink.link_id]); + if (unlikely(!bss_conf) || + bss_conf->chanreq.oper.chan->band != NL80211_BAND_6GHZ) { + rcu_read_unlock(); + return -EINVAL; + } + + rcu_read_unlock(); + } if (WARN_ON_ONCE(!local->ops->ampdu_action)) return -EINVAL;