Message ID | cover.1588856361.git.zhangweiping@didiglobal.com (mailing list archive) |
---|---|
Series | Fix potential kernel panic when increase hardware queue |
The whole series looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On 5/7/20 7:03 AM, Weiping Zhang wrote:
> Hi,
>
> This series mainly fix the kernel panic when increase hardware queue,
> and also fix some other misc issue.
>
> Memleak 1:
>
> __blk_mq_alloc_rq_maps
>   __blk_mq_alloc_rq_map
>
> if fail
>   blk_mq_free_rq_map
>
> Actually, __blk_mq_alloc_rq_map alloc both map and request, here
> also need free request.

Applied for 5.8, thanks.
On 2020-05-07 06:03, Weiping Zhang wrote:
> This series mainly fix the kernel panic when increase hardware queue,
> and also fix some other misc issue.

Does this patch series survive blktests? I'm asking this because
blktests triggers the crash shown below for Jens' block-for-next branch.
I think this report is the result of a recent change.

run blktests block/030

null_blk: module loaded
Increasing nr_hw_queues to 8 fails, fallback to 1
==================================================================
BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x2f2/0x830
Read of size 8 at addr 0000000000000128 by task nproc/8541

CPU: 5 PID: 8541 Comm: nproc Not tainted 5.7.0-rc4-dbg+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
Call Trace:
 dump_stack+0xa5/0xe6
 __kasan_report.cold+0x65/0xbb
 kasan_report+0x45/0x60
 check_memory_region+0x15e/0x1c0
 __kasan_check_read+0x15/0x20
 blk_mq_map_swqueue+0x2f2/0x830
 __blk_mq_update_nr_hw_queues+0x3df/0x690
 blk_mq_update_nr_hw_queues+0x32/0x50
 nullb_device_submit_queues_store+0xde/0x160 [null_blk]
 configfs_write_file+0x1c4/0x250 [configfs]
 __vfs_write+0x4c/0x90
 vfs_write+0x14b/0x2d0
 ksys_write+0xdd/0x180
 __x64_sys_write+0x47/0x50
 do_syscall_64+0x6f/0x310
 entry_SYSCALL_64_after_hwframe+0x49/0xb3

Thanks,

Bart.
On Tue, May 12, 2020 at 9:31 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2020-05-07 06:03, Weiping Zhang wrote:
> > This series mainly fix the kernel panic when increase hardware queue,
> > and also fix some other misc issue.
>
> Does this patch series survive blktests? I'm asking this because
> blktests triggers the crash shown below for Jens' block-for-next branch.
> I think this report is the result of a recent change.
>
> run blktests block/030
>
> null_blk: module loaded
> Increasing nr_hw_queues to 8 fails, fallback to 1
> ==================================================================
> BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x2f2/0x830
> Read of size 8 at addr 0000000000000128 by task nproc/8541
>
> CPU: 5 PID: 8541 Comm: nproc Not tainted 5.7.0-rc4-dbg+ #3
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> Call Trace:
>  dump_stack+0xa5/0xe6
>  __kasan_report.cold+0x65/0xbb
>  kasan_report+0x45/0x60
>  check_memory_region+0x15e/0x1c0
>  __kasan_check_read+0x15/0x20
>  blk_mq_map_swqueue+0x2f2/0x830
>  __blk_mq_update_nr_hw_queues+0x3df/0x690
>  blk_mq_update_nr_hw_queues+0x32/0x50
>  nullb_device_submit_queues_store+0xde/0x160 [null_blk]
>  configfs_write_file+0x1c4/0x250 [configfs]
>  __vfs_write+0x4c/0x90
>  vfs_write+0x14b/0x2d0
>  ksys_write+0xdd/0x180
>  __x64_sys_write+0x47/0x50
>  do_syscall_64+0x6f/0x310
>  entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
> Thanks,
>

Hi Bart,

I didn't test block/030, since I don't pull blktests very often.

It's a different problem: the mapping can't be reset when falling
back, so cpu[>=1] will still point to a hctx(!=0).

It should be fixed by:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index bc34d6b572b6..d82cefb0474f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3365,8 +3365,8 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
                goto reregister;

        set->nr_hw_queues = nr_hw_queues;
-       blk_mq_update_queue_map(set);
 fallback:
+       blk_mq_update_queue_map(set);
        list_for_each_entry(q, &set->tag_list, tag_set_list) {
                blk_mq_realloc_hw_ctxs(set, q);
                if (q->nr_hw_queues != set->nr_hw_queues) {

> Bart.
On Tue, May 12, 2020 at 8:09 PM Weiping Zhang <zwp10758@gmail.com> wrote:
>
> On Tue, May 12, 2020 at 9:31 AM Bart Van Assche <bvanassche@acm.org> wrote:
> >
> > On 2020-05-07 06:03, Weiping Zhang wrote:
> > > This series mainly fix the kernel panic when increase hardware queue,
> > > and also fix some other misc issue.
> >
> > Does this patch series survive blktests? I'm asking this because
> > blktests triggers the crash shown below for Jens' block-for-next branch.
> > I think this report is the result of a recent change.
> >
> > run blktests block/030
> >
> > null_blk: module loaded
> > Increasing nr_hw_queues to 8 fails, fallback to 1
> > ==================================================================
> > BUG: KASAN: null-ptr-deref in blk_mq_map_swqueue+0x2f2/0x830
> > Read of size 8 at addr 0000000000000128 by task nproc/8541
> >
> > CPU: 5 PID: 8541 Comm: nproc Not tainted 5.7.0-rc4-dbg+ #3
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
> > Call Trace:
> >  dump_stack+0xa5/0xe6
> >  __kasan_report.cold+0x65/0xbb
> >  kasan_report+0x45/0x60
> >  check_memory_region+0x15e/0x1c0
> >  __kasan_check_read+0x15/0x20
> >  blk_mq_map_swqueue+0x2f2/0x830
> >  __blk_mq_update_nr_hw_queues+0x3df/0x690
> >  blk_mq_update_nr_hw_queues+0x32/0x50
> >  nullb_device_submit_queues_store+0xde/0x160 [null_blk]
> >  configfs_write_file+0x1c4/0x250 [configfs]
> >  __vfs_write+0x4c/0x90
> >  vfs_write+0x14b/0x2d0
> >  ksys_write+0xdd/0x180
> >  __x64_sys_write+0x47/0x50
> >  do_syscall_64+0x6f/0x310
> >  entry_SYSCALL_64_after_hwframe+0x49/0xb3
> >
> > Thanks,
> >
>
> Hi Bart,
>
> I don't test block/030, since I don't pull blktest very often.
>
> It's a different problem,
> because the mapping can't be reset when do fallback, so the
> cpu[>=1] will point to a hctx(!=0).
>
> it should be fixed by:
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index bc34d6b572b6..d82cefb0474f 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -3365,8 +3365,8 @@ static void __blk_mq_update_nr_hw_queues(struct
> blk_mq_tag_set *set,
>                 goto reregister;
>
>         set->nr_hw_queues = nr_hw_queues;
> -       blk_mq_update_queue_map(set);
>  fallback:
> +       blk_mq_update_queue_map(set);
>         list_for_each_entry(q, &set->tag_list, tag_set_list) {
>                 blk_mq_realloc_hw_ctxs(set, q);
>                 if (q->nr_hw_queues != set->nr_hw_queues) {

Should block/030 also be improved?

    # Since older null_blk versions do not allow "submit_queues" to be
    # modified, check first whether that configs attribute is writeable.
    # Each iteration of the loop below triggers $(nproc) + 1
    # null_init_hctx() calls. Since <interval>=$(nproc), all possible
    # blk_mq_realloc_hw_ctxs() error paths will be triggered. Whether or
    # not this test succeeds depends on whether or not _check_dmesg()
    # detects a kernel warning.
    if { echo "$(<"$sq")" >$sq; } 2>/dev/null; then
            for ((i = 0; i < 100; i++)); do
                    echo 1 > $sq
                    nproc > $sq # this line outputs lots of "nproc: write error: Cannot allocate memory"
            done
    else
            SKIP_REASON="Skipping test because $sq cannot be modified"
    fi

The test result shows this test case as [failed], but it actually
[passed]: with the above patch applied, no warning is detected in the
kernel log.

block/030 (trigger the blk_mq_realloc_hw_ctxs() error path) [failed]
    runtime 1.999s ... 2.115s
    --- tests/block/030.out 2020-05-12 10:42:26.345782849 +0800
    +++ /data1/zwp/src/blktests/results/nodev/block/030.out.bad 2020-05-12 20:14:59.878915218 +0800
    @@ -1 +1,51 @@
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    +nproc: write error: Cannot allocate memory
    ...
    (Run 'diff -u tests/block/030.out /data1/zwp/src/blktests/results/nodev/block/030.out.bad' to see the entire diff)

Thanks
Weiping
On 2020-05-12 05:20, Weiping Zhang wrote:
> On Tue, May 12, 2020 at 8:09 PM Weiping Zhang <zwp10758@gmail.com> wrote:
>> I don't test block/030, since I don't pull blktest very often.

That's unfortunate ...

>> It's a different problem,
>> because the mapping can't be reset when do fallback, so the
>> cpu[>=1] will point to a hctx(!=0).
>>
>> it should be fixed by:
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index bc34d6b572b6..d82cefb0474f 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -3365,8 +3365,8 @@ static void __blk_mq_update_nr_hw_queues(struct
>> blk_mq_tag_set *set,
>>                 goto reregister;
>>
>>         set->nr_hw_queues = nr_hw_queues;
>> -       blk_mq_update_queue_map(set);
>>  fallback:
>> +       blk_mq_update_queue_map(set);
>>         list_for_each_entry(q, &set->tag_list, tag_set_list) {
>>                 blk_mq_realloc_hw_ctxs(set, q);
>>                 if (q->nr_hw_queues != set->nr_hw_queues) {

If this is posted as a patch, feel free to add:

Tested-by: Bart van Assche <bvanassche@acm.org>

> And block/030 should also be improved ?
>
>     # Since older null_blk versions do not allow "submit_queues" to be
>     # modified, check first whether that configs attribute is writeable.
>     # Each iteration of the loop below triggers $(nproc) + 1
>     # null_init_hctx() calls. Since <interval>=$(nproc), all possible
>     # blk_mq_realloc_hw_ctxs() error paths will be triggered. Whether or
>     # not this test succeeds depends on whether or not _check_dmesg()
>     # detects a kernel warning.
>     if { echo "$(<"$sq")" >$sq; } 2>/dev/null; then
>             for ((i = 0; i < 100; i++)); do
>                     echo 1 > $sq
>                     nproc > $sq # this line output lots
> "nproc: write error: Cannot allocate memory"
>             done
>     else
>             SKIP_REASON="Skipping test because $sq cannot be modified"
>     fi
>
>
> The test result show this test case [failed], actually it [pass],
> there is no warning detect
> in kernel log, if apply above patch.
>
> block/030 (trigger the blk_mq_realloc_hw_ctxs() error path) [failed]
>     runtime 1.999s ... 2.115s
>     --- tests/block/030.out 2020-05-12 10:42:26.345782849 +0800
>     +++ /data1/zwp/src/blktests/results/nodev/block/030.out.bad
> 2020-05-12 20:14:59.878915218 +0800
>     @@ -1 +1,51 @@
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     +nproc: write error: Cannot allocate memory
>     ...
>     (Run 'diff -u tests/block/030.out
> /data1/zwp/src/blktests/results/nodev/block/030.out.bad' to see the
> entire diff)

That's weird. I have not yet encountered this. Test block/030 passes on
my setup.

Thanks,

Bart.
On Wed, May 13, 2020 at 7:08 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On 2020-05-12 05:20, Weiping Zhang wrote:
> > On Tue, May 12, 2020 at 8:09 PM Weiping Zhang <zwp10758@gmail.com> wrote:
> >> I don't test block/030, since I don't pull blktest very often.
>
> That's unfortunate ...
>
> >> It's a different problem,
> >> because the mapping can't be reset when do fallback, so the
> >> cpu[>=1] will point to a hctx(!=0).
> >>
> >> it should be fixed by:
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index bc34d6b572b6..d82cefb0474f 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -3365,8 +3365,8 @@ static void __blk_mq_update_nr_hw_queues(struct
> >> blk_mq_tag_set *set,
> >>                 goto reregister;
> >>
> >>         set->nr_hw_queues = nr_hw_queues;
> >> -       blk_mq_update_queue_map(set);
> >>  fallback:
> >> +       blk_mq_update_queue_map(set);
> >>         list_for_each_entry(q, &set->tag_list, tag_set_list) {
> >>                 blk_mq_realloc_hw_ctxs(set, q);
> >>                 if (q->nr_hw_queues != set->nr_hw_queues) {
>
> If this is posted as a patch, feel free to add:
>
> Tested-by: Bart van Assche <bvanassche@acm.org>
>

I'll post it later, thank you.

> > And block/030 should also be improved ?
> >
> >     # Since older null_blk versions do not allow "submit_queues" to be
> >     # modified, check first whether that configs attribute is writeable.
> >     # Each iteration of the loop below triggers $(nproc) + 1
> >     # null_init_hctx() calls. Since <interval>=$(nproc), all possible
> >     # blk_mq_realloc_hw_ctxs() error paths will be triggered. Whether or
> >     # not this test succeeds depends on whether or not _check_dmesg()
> >     # detects a kernel warning.
> >     if { echo "$(<"$sq")" >$sq; } 2>/dev/null; then
> >             for ((i = 0; i < 100; i++)); do
> >                     echo 1 > $sq
> >                     nproc > $sq # this line output lots
> > "nproc: write error: Cannot allocate memory"
> >             done
> >     else
> >             SKIP_REASON="Skipping test because $sq cannot be modified"
> >     fi
> >
> >
> > The test result show this test case [failed], actually it [pass],
> > there is no warning detect
> > in kernel log, if apply above patch.
> >
> > block/030 (trigger the blk_mq_realloc_hw_ctxs() error path) [failed]
> >     runtime 1.999s ... 2.115s
> >     --- tests/block/030.out 2020-05-12 10:42:26.345782849 +0800
> >     +++ /data1/zwp/src/blktests/results/nodev/block/030.out.bad
> > 2020-05-12 20:14:59.878915218 +0800
> >     @@ -1 +1,51 @@
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     ...
> >     (Run 'diff -u tests/block/030.out
> > /data1/zwp/src/blktests/results/nodev/block/030.out.bad' to see the
> > entire diff)
>
> That's weird. I have not yet encountered this. Test block/030 passes on
> my setup.
>
> Thanks,
>
> Bart.