Message ID | 20240517145420.8891-1-concord@gentoo.org (mailing list archive) |
---|---|
State | Awaiting Upstream |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2] wifi: mac80211: Avoid address calculations via out of bounds array indexing | expand |
On Fri, May 17, 2024 at 10:54:20AM -0400, Kenton Groombridge wrote: > req->n_channels must be set before req->channels[] can be used. > > This patch fixes one of the issues encountered in [1]. > > [ 83.964252] ------------[ cut here ]------------ > [ 83.964255] UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:364:4 > [ 83.964258] index 0 is out of range for type 'struct ieee80211_channel *[]' > [ 83.964260] CPU: 0 PID: 1695 Comm: iwd Tainted: G O T 6.8.9-gentoo-hardened1 #1 > [ 83.964262] Hardware name: System76 Pangolin/Pangolin, BIOS ARB928_V00.01_T0025ASY1_ms 04/20/2023 > [ 83.964264] Call Trace: > [ 83.964267] <TASK> > [ 83.964269] dump_stack_lvl+0x3f/0xc0 > [ 83.964274] __ubsan_handle_out_of_bounds+0xec/0x110 > [ 83.964278] ieee80211_prep_hw_scan+0x2db/0x4b0 > [ 83.964281] __ieee80211_start_scan+0x601/0x990 > [ 83.964284] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964287] ? cfg80211_scan+0x149/0x250 > [ 83.964291] nl80211_trigger_scan+0x874/0x980 > [ 83.964295] genl_family_rcv_msg_doit+0xe8/0x160 > [ 83.964298] genl_rcv_msg+0x240/0x270 > [ 83.964301] ? __cfi_nl80211_trigger_scan+0x10/0x10 > [ 83.964302] ? __cfi_nl80211_post_doit+0x10/0x10 > [ 83.964304] ? __cfi_nl80211_pre_doit+0x10/0x10 > [ 83.964307] ? __cfi_genl_rcv_msg+0x10/0x10 > [ 83.964309] netlink_rcv_skb+0x102/0x130 > [ 83.964312] genl_rcv+0x23/0x40 > [ 83.964314] netlink_unicast+0x23b/0x340 > [ 83.964316] netlink_sendmsg+0x3a9/0x450 > [ 83.964319] __sys_sendto+0x3ae/0x3c0 > [ 83.964324] __x64_sys_sendto+0x21/0x40 > [ 83.964326] do_syscall_64+0x90/0x150 > [ 83.964329] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964331] ? syscall_exit_work+0xc2/0xf0 > [ 83.964333] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964335] ? syscall_exit_to_user_mode+0x74/0xa0 > [ 83.964337] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964339] ? do_syscall_64+0x9c/0x150 > [ 83.964340] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964342] ? syscall_exit_to_user_mode+0x74/0xa0 > [ 83.964344] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964346] ? do_syscall_64+0x9c/0x150 > [ 83.964347] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964349] ? do_syscall_64+0x9c/0x150 > [ 83.964351] ? srso_alias_return_thunk+0x5/0xfbef5 u> [ 83.964353] ? syscall_exit_work+0xc2/0xf0 > [ 83.964354] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964356] ? syscall_exit_to_user_mode+0x74/0xa0 > [ 83.964358] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964359] ? do_syscall_64+0x9c/0x150 > [ 83.964361] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964362] ? do_user_addr_fault+0x488/0x620 > [ 83.964366] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964367] ? srso_alias_return_thunk+0x5/0xfbef5 > [ 83.964369] entry_SYSCALL_64_after_hwframe+0x55/0x5d > [ 83.964372] RIP: 0033:0x6200808578d7 > [ 83.964374] Code: 00 00 90 f3 0f 1e fa 41 56 55 41 89 ce 48 83 ec 28 80 3d 7b f7 0d 00 00 74 29 45 31 c9 45 31 c0 41 89 ca b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 48 83 c4 28 5d 41 5e c3 66 0f 1f 84 00 00 > [ 83.964375] RSP: 002b:0000730c4e821530 EFLAGS: 00000246 ORIG_RAX: 000000000000002c > [ 83.964378] RAX: ffffffffffffffda RBX: 000006dbc456c570 RCX: 00006200808578d7 > [ 83.964379] RDX: 000000000000005c RSI: 000006dbc45884f0 RDI: 0000000000000004 > [ 83.964381] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 > [ 83.964382] R10: 0000000000000000 R11: 0000000000000246 R12: 000006dbc456c480 > [ 83.964383] R13: 000006dbc456c450 R14: 0000000000000000 R15: 000006dbc456c610 > [ 83.964386] </TASK> > [ 83.964386] ---[ end trace ]--- > > [1] https://bugzilla.kernel.org/show_bug.cgi?id=218810 > > v1->v2: > - Drop changes in cfg80211 as requested by Johannes > > Co-authored-by: Kees Cook <keescook@chromium.org> > Signed-off-by: Kenton Groombridge <concord@gentoo.org> Thanks Kenton, FWWIW, it seems unfortunate to me that the __counted_by field (n_channels) is set some distance away from the allocation of the flex-array (channels) whose bounds it checks. It seems it would be pretty easy for a bug in the code being updated here to result in an overrun. But in any case, I think this is an improvement and seems correct to me. Reviewed-by: Simon Horman <horms@kernel.org>
On Fri, 2024-05-17 at 21:45 +0100, Simon Horman wrote: > > FWWIW, it seems unfortunate to me that the __counted_by field (n_channels) > is set some distance away from the allocation of the flex-array (channels) > whose bounds it checks. It seems it would be pretty easy for a bug in the > code being updated here to result in an overrun. > In a way, this is a more general problem, this allocates the max we know we might need, but then filter it down. It'd have to iterate twice to actually allocate the "correct" size, but then you could still have bugs by having different filter conditions in the two loops ... Don't see any good solutions to this kind of code? johannes
On Fri, 2024-05-17 at 10:54 -0400, Kenton Groombridge wrote: > req->n_channels must be set before req->channels[] can be used. > I don't know why, but this patch breaks a number of hwsim test cases. https://w1.fi/cgit/hostap/tree/tests/hwsim/ johannes
On Thu, May 23, 2024 at 11:35:37AM +0200, Johannes Berg wrote: > On Fri, 2024-05-17 at 21:45 +0100, Simon Horman wrote: > > > > FWWIW, it seems unfortunate to me that the __counted_by field (n_channels) > > is set some distance away from the allocation of the flex-array (channels) > > whose bounds it checks. It seems it would be pretty easy for a bug in the > > code being updated here to result in an overrun. > > > > In a way, this is a more general problem, this allocates the max we know > we might need, but then filter it down. It'd have to iterate twice to > actually allocate the "correct" size, but then you could still have bugs > by having different filter conditions in the two loops ... Yes, I agree this problem is more general than this patch or the code it updates. > Don't see any good solutions to this kind of code? I was hoping you might :)
Hello, kernel test robot noticed "hwsim.autogo_force_diff_channel.fail" on: commit: f68b86b282be59dcb687700c3b9ede5cd16e36cf ("[PATCH v2] wifi: mac80211: Avoid address calculations via out of bounds array indexing") url: https://github.com/intel-lab-lkp/linux/commits/Kenton-Groombridge/wifi-mac80211-Avoid-address-calculations-via-out-of-bounds-array-indexing/20240517-230100 base: https://git.kernel.org/cgit/linux/kernel/git/wireless/wireless-next.git main patch link: https://lore.kernel.org/all/20240517145420.8891-1-concord@gentoo.org/ patch subject: [PATCH v2] wifi: mac80211: Avoid address calculations via out of bounds array indexing in testcase: hwsim version: hwsim-x86_64-07c9f183e-1_20240402 with following parameters: test: group-14 compiler: gcc-13 test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory (please refer to attached dmesg/kmsg for entire log/backtrace) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@intel.com> | Closes: https://lore.kernel.org/oe-lkp/202406031527.5530f205-oliver.sang@intel.com 2024-06-01 05:33:00 ./run-tests.py autogo_force_diff_channel DEV: wlan0: 02:00:00:00:00:00 DEV: wlan1: 02:00:00:00:01:00 DEV: wlan2: 02:00:00:00:02:00 APDEV: wlan3 APDEV: wlan4 START autogo_force_diff_channel 1/1 Test: P2P autonomous GO and station interface operate on different channels Starting AP wlan3 Connect STA wlan7 to AP Connection timed out Traceback (most recent call last): File "/lkp/benchmarks/hwsim/tests/hwsim/./run-tests.py", line 591, in main t(dev, apdev) File "/lkp/benchmarks/hwsim/tests/hwsim/test_p2p_channel.py", line 444, in test_autogo_force_diff_channel wpas.connect("ap-test", key_mgmt="NONE", scan_freq="2412") File "/lkp/benchmarks/hwsim/tests/hwsim/wpasupplicant.py", line 1153, in connect self.connect_network(id, timeout=timeout) File "/lkp/benchmarks/hwsim/tests/hwsim/wpasupplicant.py", line 506, in connect_network self.wait_connected(timeout=timeout) File "/lkp/benchmarks/hwsim/tests/hwsim/wpasupplicant.py", line 1437, in wait_connected raise Exception(error) Exception: Connection timed out FAIL autogo_force_diff_channel 29.021569 2024-06-01 05:33:42.767850 passed 0 test case(s) skipped 0 test case(s) failed tests: autogo_force_diff_channel The kernel config and materials to reproduce are available at: https://download.01.org/0day-ci/archive/20240603/202406031527.5530f205-oliver.sang@intel.com
On 24/05/29 04:54PM, Johannes Berg wrote: > On Fri, 2024-05-17 at 10:54 -0400, Kenton Groombridge wrote: > > req->n_channels must be set before req->channels[] can be used. > > > > I don't know why, but this patch breaks a number of hwsim test cases. > > https://w1.fi/cgit/hostap/tree/tests/hwsim/ > > johannes Pardon my absence. I'm also not sure why these tests are failing. Unless I'm missing something, the runtime behavior of these code paths shouldn't have changed significantly. FWIW, I have been running this patch on real hardware and connecting to Wi-Fi networks with it for the past couple weeks without problems. Though, I see that these tests are probably testing Wi-Fi modes that I am not using.
On Tue, 2024-06-04 at 14:53 -0400, Kenton Groombridge wrote: > On 24/05/29 04:54PM, Johannes Berg wrote: > > On Fri, 2024-05-17 at 10:54 -0400, Kenton Groombridge wrote: > > > req->n_channels must be set before req->channels[] can be used. > > > > > > > I don't know why, but this patch breaks a number of hwsim test cases. > > > > https://w1.fi/cgit/hostap/tree/tests/hwsim/ > > > > johannes > > Pardon my absence. > > I'm also not sure why these tests are failing. Unless I'm missing > something, the runtime behavior of these code paths shouldn't have > changed significantly. > Looking at your patch again, this seems wrong? > + local->hw_scan_req->req.channels[*n_chans++] = > req->channels[i]; > This will increment n_chans rather than *n_chans, no? johannes
On 24/06/04 09:29PM, Johannes Berg wrote: > Looking at your patch again, this seems wrong? > > > + local->hw_scan_req->req.channels[*n_chans++] = > > req->channels[i]; > > > > This will increment n_chans rather than *n_chans, no? > Ah ha! A silly mistake that I missed. V3 to follow soon.
diff --git a/net/mac80211/scan.c b/net/mac80211/scan.c index 73850312580f..b88e99c211ff 100644 --- a/net/mac80211/scan.c +++ b/net/mac80211/scan.c @@ -358,7 +358,8 @@ static bool ieee80211_prep_hw_scan(struct ieee80211_sub_if_data *sdata) struct cfg80211_scan_request *req; struct cfg80211_chan_def chandef; u8 bands_used = 0; - int i, ielen, n_chans; + int i, ielen; + u32 *n_chans; u32 flags = 0; req = rcu_dereference_protected(local->scan_req, @@ -368,34 +369,34 @@ static bool ieee80211_prep_hw_scan(struct ieee80211_sub_if_data *sdata) return false; if (ieee80211_hw_check(&local->hw, SINGLE_SCAN_ON_ALL_BANDS)) { + local->hw_scan_req->req.n_channels = req->n_channels; + for (i = 0; i < req->n_channels; i++) { local->hw_scan_req->req.channels[i] = req->channels[i]; bands_used |= BIT(req->channels[i]->band); } - - n_chans = req->n_channels; } else { do { if (local->hw_scan_band == NUM_NL80211_BANDS) return false; - n_chans = 0; + n_chans = &local->hw_scan_req->req.n_channels; + *n_chans = 0; for (i = 0; i < req->n_channels; i++) { if (req->channels[i]->band != local->hw_scan_band) continue; - local->hw_scan_req->req.channels[n_chans] = + local->hw_scan_req->req.channels[*n_chans++] = req->channels[i]; - n_chans++; + bands_used |= BIT(req->channels[i]->band); } local->hw_scan_band++; - } while (!n_chans); + } while (!*n_chans); } - local->hw_scan_req->req.n_channels = n_chans; ieee80211_prepare_scan_chandef(&chandef); if (req->flags & NL80211_SCAN_FLAG_MIN_PREQ_CONTENT)
req->n_channels must be set before req->channels[] can be used. This patch fixes one of the issues encountered in [1]. [ 83.964252] ------------[ cut here ]------------ [ 83.964255] UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:364:4 [ 83.964258] index 0 is out of range for type 'struct ieee80211_channel *[]' [ 83.964260] CPU: 0 PID: 1695 Comm: iwd Tainted: G O T 6.8.9-gentoo-hardened1 #1 [ 83.964262] Hardware name: System76 Pangolin/Pangolin, BIOS ARB928_V00.01_T0025ASY1_ms 04/20/2023 [ 83.964264] Call Trace: [ 83.964267] <TASK> [ 83.964269] dump_stack_lvl+0x3f/0xc0 [ 83.964274] __ubsan_handle_out_of_bounds+0xec/0x110 [ 83.964278] ieee80211_prep_hw_scan+0x2db/0x4b0 [ 83.964281] __ieee80211_start_scan+0x601/0x990 [ 83.964284] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964287] ? cfg80211_scan+0x149/0x250 [ 83.964291] nl80211_trigger_scan+0x874/0x980 [ 83.964295] genl_family_rcv_msg_doit+0xe8/0x160 [ 83.964298] genl_rcv_msg+0x240/0x270 [ 83.964301] ? __cfi_nl80211_trigger_scan+0x10/0x10 [ 83.964302] ? __cfi_nl80211_post_doit+0x10/0x10 [ 83.964304] ? __cfi_nl80211_pre_doit+0x10/0x10 [ 83.964307] ? __cfi_genl_rcv_msg+0x10/0x10 [ 83.964309] netlink_rcv_skb+0x102/0x130 [ 83.964312] genl_rcv+0x23/0x40 [ 83.964314] netlink_unicast+0x23b/0x340 [ 83.964316] netlink_sendmsg+0x3a9/0x450 [ 83.964319] __sys_sendto+0x3ae/0x3c0 [ 83.964324] __x64_sys_sendto+0x21/0x40 [ 83.964326] do_syscall_64+0x90/0x150 [ 83.964329] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964331] ? syscall_exit_work+0xc2/0xf0 [ 83.964333] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964335] ? syscall_exit_to_user_mode+0x74/0xa0 [ 83.964337] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964339] ? do_syscall_64+0x9c/0x150 [ 83.964340] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964342] ? syscall_exit_to_user_mode+0x74/0xa0 [ 83.964344] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964346] ? do_syscall_64+0x9c/0x150 [ 83.964347] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964349] ? do_syscall_64+0x9c/0x150 [ 83.964351] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964353] ? syscall_exit_work+0xc2/0xf0 [ 83.964354] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964356] ? syscall_exit_to_user_mode+0x74/0xa0 [ 83.964358] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964359] ? do_syscall_64+0x9c/0x150 [ 83.964361] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964362] ? do_user_addr_fault+0x488/0x620 [ 83.964366] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964367] ? srso_alias_return_thunk+0x5/0xfbef5 [ 83.964369] entry_SYSCALL_64_after_hwframe+0x55/0x5d [ 83.964372] RIP: 0033:0x6200808578d7 [ 83.964374] Code: 00 00 90 f3 0f 1e fa 41 56 55 41 89 ce 48 83 ec 28 80 3d 7b f7 0d 00 00 74 29 45 31 c9 45 31 c0 41 89 ca b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 71 48 83 c4 28 5d 41 5e c3 66 0f 1f 84 00 00 [ 83.964375] RSP: 002b:0000730c4e821530 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 83.964378] RAX: ffffffffffffffda RBX: 000006dbc456c570 RCX: 00006200808578d7 [ 83.964379] RDX: 000000000000005c RSI: 000006dbc45884f0 RDI: 0000000000000004 [ 83.964381] RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 [ 83.964382] R10: 0000000000000000 R11: 0000000000000246 R12: 000006dbc456c480 [ 83.964383] R13: 000006dbc456c450 R14: 0000000000000000 R15: 000006dbc456c610 [ 83.964386] </TASK> [ 83.964386] ---[ end trace ]--- [1] https://bugzilla.kernel.org/show_bug.cgi?id=218810 v1->v2: - Drop changes in cfg80211 as requested by Johannes Co-authored-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kenton Groombridge <concord@gentoo.org> --- net/mac80211/scan.c | 17 +++++++++-------- 1 file changed, 9 insertions(+), 8 deletions(-)