Message ID | 04b653322f5d8b32c22083a4df9baf6d4c9097ea.1689744163.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: scrub: make sctx->stripes[] a ring buffer | expand |
Hello, kernel test robot noticed "BUG:KASAN:use-after-free_in__list_del_entry_valid" on: commit: 70c50070fcdecf82cb3ca4dc718eab1098f3c8ad ("[PATCH RFC 4/4] btrfs: scrub: make sctx->stripes[] array work as a ring buffer") url: https://github.com/intel-lab-lkp/linux/commits/Qu-Wenruo/btrfs-scrub-move-write-back-of-repaired-sectors-into-scrub_stripe_read_repair_worker/20230719-133412 base: https://git.kernel.org/cgit/linux/kernel/git/kdave/linux.git for-next patch link: https://lore.kernel.org/all/04b653322f5d8b32c22083a4df9baf6d4c9097ea.1689744163.git.wqu@suse.com/ patch subject: [PATCH RFC 4/4] btrfs: scrub: make sctx->stripes[] array work as a ring buffer in testcase: xfstests version: xfstests-x86_64-bb8af9c-1_20230724 with following parameters: disk: 6HDD fs: btrfs test: btrfs-group-07 compiler: gcc-12 test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz (Haswell) with 8G memory (please refer to attached dmesg/kmsg for entire log/backtrace) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <oliver.sang@intel.com> | Closes: https://lore.kernel.org/oe-lkp/202307281035.31af197-oliver.sang@intel.com [ 72.139677][ T65] BUG: KASAN: use-after-free in __list_del_entry_valid (lib/list_debug.c:62) [ 72.147444][ T65] Read of size 8 at addr ffff888205850178 by task kworker/u16:3/65 [ 72.155213][ T65] [ 72.157407][ T65] CPU: 1 PID: 65 Comm: kworker/u16:3 Not tainted 6.5.0-rc2-00048-g70c50070fcde #1 [ 72.166478][ T65] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 12/05/2013 [ 72.174417][ T65] Workqueue: 0x0 (btrfs-btrfs-scrub) [ 72.179666][ T65] Call Trace: [ 72.182819][ T65] <TASK> [ 72.185620][ T65] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) [ 72.189996][ T65] print_address_description+0x2c/0x3a0 [ 72.196462][ T65] ? __list_del_entry_valid (lib/list_debug.c:62) [ 72.201880][ T65] print_report (mm/kasan/report.c:476) [ 72.206170][ T65] ? kasan_addr_to_slab (mm/kasan/common.c:35) [ 72.210979][ T65] ? __list_del_entry_valid (lib/list_debug.c:62) [ 72.216394][ T65] kasan_report (mm/kasan/report.c:590) [ 72.220680][ T65] ? __list_del_entry_valid (lib/list_debug.c:62) [ 72.226098][ T65] __list_del_entry_valid (lib/list_debug.c:62) [ 72.231342][ T65] move_linked_works (include/linux/list.h:134 include/linux/list.h:229 kernel/workqueue.c:1284) [ 72.236064][ T65] ? process_one_work (kernel/workqueue.c:938 kernel/workqueue.c:2633) [ 72.241050][ T65] pwq_activate_inactive_work (include/linux/instrumented.h:40 include/asm-generic/bitops/instrumented-non-atomic.h:44 kernel/workqueue.c:1364) [ 72.246554][ T65] pwq_dec_nr_in_flight (kernel/workqueue.c:1374 kernel/workqueue.c:1396) [ 72.251623][ T65] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2749) [ 72.256172][ T65] ? rescuer_thread (kernel/workqueue.c:2691) [ 72.260894][ T65] kthread (kernel/kthread.c:389) [ 72.264833][ T65] ? kthread_complete_and_exit (kernel/kthread.c:342) [ 72.270336][ T65] ret_from_fork (arch/x86/kernel/process.c:151) [ 72.274623][ T65] ? kthread_complete_and_exit (kernel/kthread.c:342) [ 72.280128][ T65] ret_from_fork_asm (arch/x86/entry/entry_64.S:298) [ 72.284763][ T65] RIP: 0000:0x0 [ 72.288091][ T65] Code: Unable to access opcode bytes at 0xffffffffffffffd6. Code starting with the faulting instruction =========================================== [ 72.295335][ T65] RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 [ 72.303626][ T65] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 72.311477][ T65] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 72.319330][ T65] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 72.327183][ T65] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 72.335038][ T65] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 72.342895][ T65] </TASK> [ 72.345784][ T65] [ 72.347976][ T65] The buggy address belongs to the physical page: [ 72.354262][ T65] page:00000000cb5c2985 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x205850 [ 72.364640][ T65] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ 72.371887][ T65] page_type: 0xffffff7f(buddy) [ 72.376520][ T65] raw: 0017ffffc0000000 ffffea0008153008 ffff88821edd6020 0000000000000000 [ 72.384980][ T65] raw: 0000000000000000 0000000000000004 00000000ffffff7f 0000000000000000 [ 72.393440][ T65] page dumped because: kasan: bad access detected [ 72.399725][ T65] [ 72.401915][ T65] Memory state around the buggy address: [ 72.407417][ T65] ffff888205850000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 72.415354][ T65] ffff888205850080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 72.423291][ T65] >ffff888205850100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 72.431228][ T65] ^ [ 72.439080][ T65] ffff888205850180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 72.447018][ T65] ffff888205850200: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 72.454953][ T65] ================================================================== [ 72.462889][ T65] Disabling lock debugging due to kernel taint [ 72.469218][ T64] general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN PTI [ 72.481165][ T64] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] [ 72.489451][ T64] CPU: 3 PID: 64 Comm: kworker/u16:2 Tainted: G B 6.5.0-rc2-00048-g70c50070fcde #1 [ 72.499997][ T64] Hardware name: Dell Inc. OptiPlex 9020/0DNKMN, BIOS A05 12/05/2013 [ 72.507935][ T64] Workqueue: btrfs-btrfs-scrub btrfs_work_helper [btrfs] [ 72.514927][ T64] RIP: 0010:scrub_verify_one_sector (fs/btrfs/scrub.c:712) btrfs [ 72.521821][ T64] Code: 82 98 00 00 00 4c 8b 74 24 08 48 89 d8 4c 8b 54 24 10 48 c1 e0 04 49 01 c6 48 b8 00 00 00 00 00 fc ff df 4c 89 f1 48 c1 e9 03 <0f> b6 04 01 84 c0 74 06 0f 8e 1a 03 00 00 41 80 3e 00 0f 84 a2 00 All code ======== 0: 82 (bad) 1: 98 cwtl 2: 00 00 add %al,(%rax) 4: 00 4c 8b 74 add %cl,0x74(%rbx,%rcx,4) 8: 24 08 and $0x8,%al a: 48 89 d8 mov %rbx,%rax d: 4c 8b 54 24 10 mov 0x10(%rsp),%r10 12: 48 c1 e0 04 shl $0x4,%rax 16: 49 01 c6 add %rax,%r14 19: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 20: fc ff df 23: 4c 89 f1 mov %r14,%rcx 26: 48 c1 e9 03 shr $0x3,%rcx 2a:* 0f b6 04 01 movzbl (%rcx,%rax,1),%eax <-- trapping instruction 2e: 84 c0 test %al,%al 30: 74 06 je 0x38 32: 0f 8e 1a 03 00 00 jle 0x352 38: 41 80 3e 00 cmpb $0x0,(%r14) 3c: 0f .byte 0xf 3d: 84 .byte 0x84 3e: a2 .byte 0xa2 ... Code starting with the faulting instruction =========================================== 0: 0f b6 04 01 movzbl (%rcx,%rax,1),%eax 4: 84 c0 test %al,%al 6: 74 06 je 0xe 8: 0f 8e 1a 03 00 00 jle 0x328 e: 41 80 3e 00 cmpb $0x0,(%r14) 12: 0f .byte 0xf 13: 84 .byte 0x84 14: a2 .byte 0xa2 ... [ 72.530498][ T2521] BTRFS info (device sda2): scrub: finished on devid 2 with status: 0 [ 72.541321][ T64] RSP: 0018:ffffc900004cfab8 EFLAGS: 00010246 [ 72.541325][ T64] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 72.541328][ T64] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffc900004cfa60 [ 72.570983][ T64] RBP: 0000000000000000 R08: 0000000000000000 R09: ffffed1040b0a0c3 [ 72.578832][ T64] R10: ffff88813b476000 R11: 0000000000000001 R12: 000000000000000c [ 72.586683][ T64] R13: ffffc900004cfb10 R14: 0000000000000000 R15: ffff8882058504f8 [ 72.594534][ T64] FS: 0000000000000000(0000) GS:ffff8881b5980000(0000) knlGS:0000000000000000 [ 72.603348][ T64] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 72.609810][ T64] CR2: 00007f5a3bba8610 CR3: 000000011981e003 CR4: 00000000001706e0 [ 72.617665][ T64] Call Trace: [ 72.620819][ T64] <TASK> [ 72.623622][ T64] ? die_addr (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:460) [ 72.627649][ T64] ? exc_general_protection (arch/x86/kernel/traps.c:786 arch/x86/kernel/traps.c:728) [ 72.633072][ T64] ? asm_exc_general_protection (arch/x86/include/asm/idtentry.h:564) [ 72.638672][ T64] ? scrub_verify_one_sector (fs/btrfs/scrub.c:712) btrfs [ 72.644974][ T64] ? scrub_verify_one_sector (arch/x86/include/asm/bitops.h:228 (discriminator 8) arch/x86/include/asm/bitops.h:240 (discriminator 8) include/asm-generic/bitops/instrumented-non-atomic.h:142 (discriminator 8) fs/btrfs/scrub.c:708 (discriminator 8)) btrfs [ 72.651266][ T64] ? scrub_verify_one_metadata (fs/btrfs/scrub.c:692) btrfs [ 72.657906][ T64] scrub_verify_one_stripe (fs/btrfs/scrub.c:761) btrfs [ 72.664021][ T64] ? scrub_verify_one_sector (fs/btrfs/scrub.c:754) btrfs [ 72.670305][ T64] ? scrub_write_sectors (fs/btrfs/scrub.c:1656) btrfs [ 72.676236][ T64] scrub_stripe_read_repair_worker (fs/btrfs/scrub.c:1125) btrfs [ 72.683040][ T64] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161) [ 72.688720][ T64] ? kasan_report (mm/kasan/report.c:572) [ 72.693178][ T64] ? scrub_stripe_report_errors (fs/btrfs/scrub.c:1111) btrfs [ 72.699726][ T64] ? scrub_write_sectors (fs/btrfs/scrub.c:1656) btrfs [ 72.705660][ T64] btrfs_work_helper (fs/btrfs/async-thread.c:315) btrfs [ 72.711245][ T64] process_one_work (kernel/workqueue.c:2602) [ 72.716054][ T64] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2749) [ 72.720600][ T64] ? rescuer_thread (kernel/workqueue.c:2691) [ 72.725318][ T64] kthread (kernel/kthread.c:389) [ 72.729254][ T64] ? kthread_complete_and_exit (kernel/kthread.c:342) [ 72.734755][ T64] ret_from_fork (arch/x86/kernel/process.c:151) [ 72.739039][ T64] ? kthread_complete_and_exit (kernel/kthread.c:342) [ 72.744541][ T64] ret_from_fork_asm (arch/x86/entry/entry_64.S:298) [ 72.749176][ T64] RIP: 0000:0x0 [ 72.752505][ T64] Code: Unable to access opcode bytes at 0xffffffffffffffd6. Code starting with the faulting instruction =========================================== To reproduce: git clone https://github.com/intel/lkp-tests.git cd lkp-tests sudo bin/lkp install job.yaml # job file is attached in this email bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run sudo bin/lkp run generated-yaml-file # if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state.
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 784fbe2341d4..e71ee8d2fbf7 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -1770,7 +1770,7 @@ static int flush_scrub_stripes(struct scrub_ctx *sctx) blk_finish_plug(&plug); /* Wait for all stripes to finish. */ - for (int i = 0; i < nr_stripes; i++) { + for (int i = 0; i < SCRUB_STRIPES_PER_SCTX; i++) { stripe = &sctx->stripes[i]; wait_event(stripe->repair_wait, stripe->state == 0); @@ -1794,40 +1794,46 @@ static int queue_scrub_stripe(struct scrub_ctx *sctx, struct btrfs_block_group * int ret; /* - * We should always have at least one slot, when full the last one who - * queued a slot should handle the flush. + * We should always have at least one slot, as the stripes[] array + * is used as a ring buffer. */ ASSERT(sctx->cur_stripe < SCRUB_STRIPES_PER_SCTX); stripe = &sctx->stripes[sctx->cur_stripe]; - scrub_reset_stripe(stripe); + + /* Wait if the stripe is still under usage. */ + wait_event(stripe->repair_wait, stripe->state == 0); + ret = scrub_find_fill_first_stripe(bg, dev, physical, mirror_num, logical, length, stripe); /* Either >0 as no more extents or <0 for error. */ if (ret) return ret; + *found_stripe_start_ret = stripe->logical; - sctx->cur_stripe++; - - /* Last slot used, flush them all. */ - if (sctx->cur_stripe == SCRUB_STRIPES_PER_SCTX) - return flush_scrub_stripes(sctx); - - /* We have filled one group, submit them now. */ - if (sctx->cur_stripe % SCRUB_STRIPES_PER_GROUP == 0) { + /* Last slot in the group, submit the group.*/ + if ((sctx->cur_stripe + 1) % SCRUB_STRIPES_PER_GROUP == 0) { + const int first_stripe = sctx->cur_stripe + 1 - + SCRUB_STRIPES_PER_GROUP; struct blk_plug plug; - scrub_throttle_dev_io(sctx, sctx->stripes[0].dev, - btrfs_stripe_nr_to_offset(SCRUB_STRIPES_PER_GROUP)); + scrub_throttle_dev_io(sctx, dev, + btrfs_stripe_nr_to_offset(SCRUB_STRIPES_PER_GROUP)); blk_start_plug(&plug); - for (int i = sctx->cur_stripe - SCRUB_STRIPES_PER_GROUP; - i < sctx->cur_stripe; i++) { + for (int i = first_stripe; + i < first_stripe + SCRUB_STRIPES_PER_GROUP; i++) { stripe = &sctx->stripes[i]; + + /* All stripes in the group should haven been queued. */ + ASSERT(test_bit(SCRUB_STRIPE_FLAG_INITIALIZED, + &stripe->state)); scrub_submit_initial_read(sctx, stripe); } blk_finish_plug(&plug); } + + sctx->cur_stripe = (sctx->cur_stripe + 1) % SCRUB_STRIPES_PER_SCTX; return 0; }
Since all scrub works (repair using other mirrors, writeback to the original mirror, writeback to the dev-replace target) all happens inside a btrfs workqueue, there is no need to manually flush all the stripes inside queue_scrub_stripe(). We only need to wait for the current slot to finish its work, then queue the stripe, and finally increase the cur_stripe number (and mod it). This should allow scrub to increase its queue depth when submitting read requests. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/scrub.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-)