Message ID | 20210922175055.568763-1-bvanassche@acm.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | null_blk: Fix a NULL pointer dereference | expand |
On 9/22/21 11:50 AM, Bart Van Assche wrote: > Skip queue mapping for shared tag sets. This patch fixes the following bug: > > ================================================================== > BUG: KASAN: null-ptr-deref in null_map_queues+0x131/0x1a0 [null_blk] > Read of size 8 at addr 0000000000000000 by task modprobe/4320 > > CPU: 9 PID: 4320 Comm: modprobe Tainted: G E 5.15.0-rc2-dbg+ #2 > Call Trace: > show_stack+0x52/0x58 > dump_stack_lvl+0x49/0x5e > kasan_report.cold+0x64/0xdb > __asan_load8+0x69/0x90 > null_map_queues+0x131/0x1a0 [null_blk] > blk_mq_update_queue_map+0x122/0x1a0 > blk_mq_alloc_tag_set+0x1e8/0x570 > null_init_tag_set+0x197/0x220 [null_blk] > null_init+0x1dc/0x1000 [null_blk] > do_one_initcall+0xc7/0x440 > do_init_module+0x10a/0x3d0 > load_module+0x115c/0x1220 > __do_sys_finit_module+0x124/0x1a0 > __x64_sys_finit_module+0x42/0x50 > do_syscall_64+0x35/0xb0 > entry_SYSCALL_64_after_hwframe+0x44/0xae Thanks Bart, do you mind if I fold this one in? I can add a Fixes-by tag as well.
On 9/22/21 10:54 AM, Jens Axboe wrote: > On 9/22/21 11:50 AM, Bart Van Assche wrote: >> Skip queue mapping for shared tag sets. This patch fixes the following bug: >> >> ================================================================== >> BUG: KASAN: null-ptr-deref in null_map_queues+0x131/0x1a0 [null_blk] >> Read of size 8 at addr 0000000000000000 by task modprobe/4320 >> >> CPU: 9 PID: 4320 Comm: modprobe Tainted: G E 5.15.0-rc2-dbg+ #2 >> Call Trace: >> show_stack+0x52/0x58 >> dump_stack_lvl+0x49/0x5e >> kasan_report.cold+0x64/0xdb >> __asan_load8+0x69/0x90 >> null_map_queues+0x131/0x1a0 [null_blk] >> blk_mq_update_queue_map+0x122/0x1a0 >> blk_mq_alloc_tag_set+0x1e8/0x570 >> null_init_tag_set+0x197/0x220 [null_blk] >> null_init+0x1dc/0x1000 [null_blk] >> do_one_initcall+0xc7/0x440 >> do_init_module+0x10a/0x3d0 >> load_module+0x115c/0x1220 >> __do_sys_finit_module+0x124/0x1a0 >> __x64_sys_finit_module+0x42/0x50 >> do_syscall_64+0x35/0xb0 >> entry_SYSCALL_64_after_hwframe+0x44/0xae > > Thanks Bart, do you mind if I fold this one in? I can add a Fixes-by tag > as well. Hi Jens, That sounds good to me. In case this patch would be retained: the word "Skip" in the description should be changed into "Fix". Thanks, Bart.
On 9/22/21 11:03 AM, Bart Van Assche wrote: > On 9/22/21 10:54 AM, Jens Axboe wrote: >> On 9/22/21 11:50 AM, Bart Van Assche wrote: >>> Skip queue mapping for shared tag sets. This patch fixes the following bug: >>> >>> ================================================================== >>> BUG: KASAN: null-ptr-deref in null_map_queues+0x131/0x1a0 [null_blk] >>> Read of size 8 at addr 0000000000000000 by task modprobe/4320 >>> >>> CPU: 9 PID: 4320 Comm: modprobe Tainted: G E 5.15.0-rc2-dbg+ #2 >>> Call Trace: >>> show_stack+0x52/0x58 >>> dump_stack_lvl+0x49/0x5e >>> kasan_report.cold+0x64/0xdb >>> __asan_load8+0x69/0x90 >>> null_map_queues+0x131/0x1a0 [null_blk] >>> blk_mq_update_queue_map+0x122/0x1a0 >>> blk_mq_alloc_tag_set+0x1e8/0x570 >>> null_init_tag_set+0x197/0x220 [null_blk] >>> null_init+0x1dc/0x1000 [null_blk] >>> do_one_initcall+0xc7/0x440 >>> do_init_module+0x10a/0x3d0 >>> load_module+0x115c/0x1220 >>> __do_sys_finit_module+0x124/0x1a0 >>> __x64_sys_finit_module+0x42/0x50 >>> do_syscall_64+0x35/0xb0 >>> entry_SYSCALL_64_after_hwframe+0x44/0xae >> >> Thanks Bart, do you mind if I fold this one in? I can add a Fixes-by tag >> as well. > > That sounds good to me. In case this patch would be retained: the word "Skip" > in the description should be changed into "Fix". Unfortunately my patch is not good enough. I run into other crashes with this patch applied since with this patch some hwctx pointers may be NULL: BUG: KASAN: null-ptr-deref in blk_mq_free_rqs+0x1f4/0x380 Read of size 8 at addr 0000000000000090 by task modprobe/5085 CPU: 19 PID: 5085 Comm: modprobe Tainted: G E 5.15.0-rc1-dbg+ #7 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace: show_stack+0x52/0x58 dump_stack_lvl+0x38/0x49 kasan_report.cold+0x64/0xdb __asan_load8+0x69/0x90 blk_mq_free_rqs+0x1f4/0x380 blk_mq_sched_free_requests+0x98/0xc0 blk_cleanup_queue+0xe6/0x110 blk_cleanup_disk+0x1f/0x40 null_del_dev.part.0+0xdf/0x2b0 [null_blk] null_exit+0x65/0xb4 [null_blk] __do_sys_delete_module.constprop.0+0x1dd/0x2e0 __x64_sys_delete_module+0x1f/0x30 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Bart.
On 9/22/21 12:26 PM, Bart Van Assche wrote: > On 9/22/21 11:03 AM, Bart Van Assche wrote: >> On 9/22/21 10:54 AM, Jens Axboe wrote: >>> On 9/22/21 11:50 AM, Bart Van Assche wrote: >>>> Skip queue mapping for shared tag sets. This patch fixes the following bug: >>>> >>>> ================================================================== >>>> BUG: KASAN: null-ptr-deref in null_map_queues+0x131/0x1a0 [null_blk] >>>> Read of size 8 at addr 0000000000000000 by task modprobe/4320 >>>> >>>> CPU: 9 PID: 4320 Comm: modprobe Tainted: G E 5.15.0-rc2-dbg+ #2 >>>> Call Trace: >>>> show_stack+0x52/0x58 >>>> dump_stack_lvl+0x49/0x5e >>>> kasan_report.cold+0x64/0xdb >>>> __asan_load8+0x69/0x90 >>>> null_map_queues+0x131/0x1a0 [null_blk] >>>> blk_mq_update_queue_map+0x122/0x1a0 >>>> blk_mq_alloc_tag_set+0x1e8/0x570 >>>> null_init_tag_set+0x197/0x220 [null_blk] >>>> null_init+0x1dc/0x1000 [null_blk] >>>> do_one_initcall+0xc7/0x440 >>>> do_init_module+0x10a/0x3d0 >>>> load_module+0x115c/0x1220 >>>> __do_sys_finit_module+0x124/0x1a0 >>>> __x64_sys_finit_module+0x42/0x50 >>>> do_syscall_64+0x35/0xb0 >>>> entry_SYSCALL_64_after_hwframe+0x44/0xae >>> >>> Thanks Bart, do you mind if I fold this one in? I can add a Fixes-by tag >>> as well. >> >> That sounds good to me. In case this patch would be retained: the word "Skip" >> in the description should be changed into "Fix". > > Unfortunately my patch is not good enough. I run into other crashes with this > patch applied since with this patch some hwctx pointers may be NULL: > > BUG: KASAN: null-ptr-deref in blk_mq_free_rqs+0x1f4/0x380 > Read of size 8 at addr 0000000000000090 by task modprobe/5085 > CPU: 19 PID: 5085 Comm: modprobe Tainted: G E 5.15.0-rc1-dbg+ #7 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 > Call Trace: > show_stack+0x52/0x58 > dump_stack_lvl+0x38/0x49 > kasan_report.cold+0x64/0xdb > __asan_load8+0x69/0x90 > blk_mq_free_rqs+0x1f4/0x380 > blk_mq_sched_free_requests+0x98/0xc0 > blk_cleanup_queue+0xe6/0x110 > blk_cleanup_disk+0x1f/0x40 > null_del_dev.part.0+0xdf/0x2b0 [null_blk] > null_exit+0x65/0xb4 [null_blk] > __do_sys_delete_module.constprop.0+0x1dd/0x2e0 > __x64_sys_delete_module+0x1f/0x30 > do_syscall_64+0x35/0xb0 > entry_SYSCALL_64_after_hwframe+0x44/0xae What options are you loading null_blk with?
On 9/23/21 9:04 AM, Jens Axboe wrote:
> What options are you loading null_blk with?
The issues I reported are the result of running test blk/010 from the
blktests suite. That test loads the null_blk kernel module twice:
_init_null_blk queue_mode=2 submit_queues=16 nr_devices=32
[ ... ]
_exit_null_blk
_init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 shared_tags=1
[ ... ]
_exit_null_blk
Please let me know if you need more information.
Bart.
On 9/23/21 10:22 AM, Bart Van Assche wrote: > On 9/23/21 9:04 AM, Jens Axboe wrote: >> What options are you loading null_blk with? > > The issues I reported are the result of running test blk/010 from the > blktests suite. That test loads the null_blk kernel module twice: > > _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 > [ ... ] > _exit_null_blk > _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 shared_tags=1 > [ ... ] > _exit_null_blk > > Please let me know if you need more information. Tried both that and running block/010, didn't trigger anything for me. Odd...
On 9/23/21 9:39 AM, Jens Axboe wrote: > On 9/23/21 10:22 AM, Bart Van Assche wrote: >> On 9/23/21 9:04 AM, Jens Axboe wrote: >>> What options are you loading null_blk with? >> >> The issues I reported are the result of running test blk/010 from the >> blktests suite. That test loads the null_blk kernel module twice: >> >> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 >> [ ... ] >> _exit_null_blk >> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 shared_tags=1 >> [ ... ] >> _exit_null_blk >> >> Please let me know if you need more information. > > Tried both that and running block/010, didn't trigger anything for me. > Odd... Hi Jens, I took another look at the kernel logs from yesterday of the VM that I use for testing. In that kernel log I found the following: * Without any changes on top of the for-next branch of git://git.kernel.dk/linux-block (commit 4129031563d0 ("Merge branch 'for-5.16/io_uring' into for-next"), test block/010 triggers the oops reported at the start of this email thread. * With the patch at the start of this email thread applied, the first test that triggers a kernel oops is block/020 (blk_mq_free_rqs+0x1f4). This morning I rebuilt the block for-next kernel with and without my null_blk patch applied. I was able to reproduce what I observed yesterday. Test block/020 passes with if commit 5f7acddf706c ("null_blk: poll queue support") is reverted. This is why I wrote that my patch does not seem to be sufficient to fix commit 5f7acddf706c. Thanks, Bart.
On 9/23/21 11:51 AM, Bart Van Assche wrote: > On 9/23/21 9:39 AM, Jens Axboe wrote: >> On 9/23/21 10:22 AM, Bart Van Assche wrote: >>> On 9/23/21 9:04 AM, Jens Axboe wrote: >>>> What options are you loading null_blk with? >>> >>> The issues I reported are the result of running test blk/010 from the >>> blktests suite. That test loads the null_blk kernel module twice: >>> >>> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 >>> [ ... ] >>> _exit_null_blk >>> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 shared_tags=1 >>> [ ... ] >>> _exit_null_blk >>> >>> Please let me know if you need more information. >> >> Tried both that and running block/010, didn't trigger anything for me. >> Odd... > > Hi Jens, > > I took another look at the kernel logs from yesterday of the VM that I use > for testing. In that kernel log I found the following: > * Without any changes on top of the for-next branch of > git://git.kernel.dk/linux-block (commit 4129031563d0 ("Merge branch > 'for-5.16/io_uring' into for-next"), test block/010 triggers the oops > reported at the start of this email thread. > * With the patch at the start of this email thread applied, the first test > that triggers a kernel oops is block/020 (blk_mq_free_rqs+0x1f4). > > This morning I rebuilt the block for-next kernel with and without my > null_blk patch applied. I was able to reproduce what I observed yesterday. > Test block/020 passes with if commit 5f7acddf706c ("null_blk: poll queue > support") is reverted. This is why I wrote that my patch does not seem to > be sufficient to fix commit 5f7acddf706c. Ah ok, so it's block/020, not block/010 for the later one. I'll take a look.
On 9/23/21 11:55 AM, Jens Axboe wrote: > On 9/23/21 11:51 AM, Bart Van Assche wrote: >> On 9/23/21 9:39 AM, Jens Axboe wrote: >>> On 9/23/21 10:22 AM, Bart Van Assche wrote: >>>> On 9/23/21 9:04 AM, Jens Axboe wrote: >>>>> What options are you loading null_blk with? >>>> >>>> The issues I reported are the result of running test blk/010 from the >>>> blktests suite. That test loads the null_blk kernel module twice: >>>> >>>> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 >>>> [ ... ] >>>> _exit_null_blk >>>> _init_null_blk queue_mode=2 submit_queues=16 nr_devices=32 shared_tags=1 >>>> [ ... ] >>>> _exit_null_blk >>>> >>>> Please let me know if you need more information. >>> >>> Tried both that and running block/010, didn't trigger anything for me. >>> Odd... >> >> Hi Jens, >> >> I took another look at the kernel logs from yesterday of the VM that I use >> for testing. In that kernel log I found the following: >> * Without any changes on top of the for-next branch of >> git://git.kernel.dk/linux-block (commit 4129031563d0 ("Merge branch >> 'for-5.16/io_uring' into for-next"), test block/010 triggers the oops >> reported at the start of this email thread. >> * With the patch at the start of this email thread applied, the first test >> that triggers a kernel oops is block/020 (blk_mq_free_rqs+0x1f4). >> >> This morning I rebuilt the block for-next kernel with and without my >> null_blk patch applied. I was able to reproduce what I observed yesterday. >> Test block/020 passes with if commit 5f7acddf706c ("null_blk: poll queue >> support") is reverted. This is why I wrote that my patch does not seem to >> be sufficient to fix commit 5f7acddf706c. > > Ah ok, so it's block/020, not block/010 for the later one. I'll take a > look. I think it's the assumption that poll_queueus == submit_queues. Does this work for you? diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index eb5cfe189e90..dac88c5daff9 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1748,7 +1757,7 @@ static int setup_queues(struct nullb *nullb) int nqueues = nr_cpu_ids; if (g_poll_queues) - nqueues *= 2; + nqueues += g_poll_queues; nullb->queues = kcalloc(nqueues, sizeof(struct nullb_queue), GFP_KERNEL); @@ -1814,7 +1823,7 @@ static int null_init_tag_set(struct nullb *nullb, struct blk_mq_tag_set *set) g_submit_queues; poll_queues = nullb ? nullb->dev->poll_queues : g_poll_queues; if (poll_queues) - set->nr_hw_queues *= 2; + set->nr_hw_queues += poll_queues; set->queue_depth = nullb ? nullb->dev->hw_queue_depth : g_hw_queue_depth; set->numa_node = nullb ? nullb->dev->home_node : g_home_node;
On 9/23/21 11:53 AM, Jens Axboe wrote: > I think it's the assumption that poll_queueus == submit_queues. Does this > work for you? > > > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c > index eb5cfe189e90..dac88c5daff9 100644 > --- a/drivers/block/null_blk/main.c > +++ b/drivers/block/null_blk/main.c > @@ -1748,7 +1757,7 @@ static int setup_queues(struct nullb *nullb) > int nqueues = nr_cpu_ids; > > if (g_poll_queues) > - nqueues *= 2; > + nqueues += g_poll_queues; > > nullb->queues = kcalloc(nqueues, sizeof(struct nullb_queue), > GFP_KERNEL); > @@ -1814,7 +1823,7 @@ static int null_init_tag_set(struct nullb *nullb, struct blk_mq_tag_set *set) > g_submit_queues; > poll_queues = nullb ? nullb->dev->poll_queues : g_poll_queues; > if (poll_queues) > - set->nr_hw_queues *= 2; > + set->nr_hw_queues += poll_queues; > set->queue_depth = nullb ? nullb->dev->hw_queue_depth : > g_hw_queue_depth; > set->numa_node = nullb ? nullb->dev->home_node : g_home_node; Hi Jens, That patch makes test block/020 pass but unfortunately is not sufficient to make test block/029 pass. This is what is triggered by test block/029: null_blk: module loaded ================================================================== BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x2d2/0x820 Read of size 8 at addr 0000000000000128 by task check/6065 CPU: 11 PID: 6065 Comm: check Tainted: G E 5.15.0-rc1-dbg+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace: show_stack+0x52/0x58 dump_stack_lvl+0x49/0x5e kasan_report.cold+0x64/0xdb kasan_check_range+0x14f/0x1a0 __kasan_check_read+0x11/0x20 blk_mq_map_swqueue+0x2d2/0x820 __blk_mq_update_nr_hw_queues+0x358/0x670 blk_mq_update_nr_hw_queues+0x31/0x50 nullb_device_submit_queues_store+0xc1/0x130 [null_blk] configfs_write_iter+0x18b/0x220 new_sync_write+0x287/0x390 vfs_write+0x442/0x590 ksys_write+0xbd/0x150 __x64_sys_write+0x42/0x50 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7ff43deb0367 Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 RSP: 002b:00007ffcabaa5df8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007ff43deb0367 RDX: 0000000000000002 RSI: 000055c2e105a510 RDI: 0000000000000001 RBP: 000055c2e105a510 R08: 000000000000000a R09: 00007ff43df464e0 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002 R13: 00007ff43df83520 R14: 0000000000000002 R15: 00007ff43df83700 ==================================================================
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index eb5cfe189e90..62b7036f5e8d 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1472,13 +1472,15 @@ static int null_map_queues(struct blk_mq_tag_set *set) switch (i) { case HCTX_TYPE_DEFAULT: - map->nr_queues = nullb->dev->submit_queues; + map->nr_queues = nullb ? nullb->dev->submit_queues : + g_submit_queues; break; case HCTX_TYPE_READ: map->nr_queues = 0; continue; case HCTX_TYPE_POLL: - map->nr_queues = nullb->dev->poll_queues; + map->nr_queues = + nullb ? nullb->dev->poll_queues : g_poll_queues; break; } map->queue_offset = qoff;
Skip queue mapping for shared tag sets. This patch fixes the following bug: ================================================================== BUG: KASAN: null-ptr-deref in null_map_queues+0x131/0x1a0 [null_blk] Read of size 8 at addr 0000000000000000 by task modprobe/4320 CPU: 9 PID: 4320 Comm: modprobe Tainted: G E 5.15.0-rc2-dbg+ #2 Call Trace: show_stack+0x52/0x58 dump_stack_lvl+0x49/0x5e kasan_report.cold+0x64/0xdb __asan_load8+0x69/0x90 null_map_queues+0x131/0x1a0 [null_blk] blk_mq_update_queue_map+0x122/0x1a0 blk_mq_alloc_tag_set+0x1e8/0x570 null_init_tag_set+0x197/0x220 [null_blk] null_init+0x1dc/0x1000 [null_blk] do_one_initcall+0xc7/0x440 do_init_module+0x10a/0x3d0 load_module+0x115c/0x1220 __do_sys_finit_module+0x124/0x1a0 __x64_sys_finit_module+0x42/0x50 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: Christoph Hellwig <hch@lst.de> Cc: Damien Le Moal <damien.lemoal@wdc.com> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Fixes: 5f7acddf706c ("null_blk: poll queue support") Signed-off-by: Bart Van Assche <bvanassche@acm.org> --- drivers/block/null_blk/main.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)