Message ID | 20240607114628.5471-1-jth@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: zoned: allocate dummy checksums for zoned NODATASUM writes | expand |
On Jun 07, 2024 / 13:46, Johannes Thumshirn wrote: > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Shin'ichiro reported that when he's running fstests' test-case > btrfs/167 on emulated zoned devices, he's seeing the following NULL > pointer dereference in 'btrfs_zone_finish_endio()': > > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000011: 0000 [#1] PREEMPT SMP KASAN NOPTI > KASAN: null-ptr-deref in range [0x0000000000000088-0x000000000000008f] > CPU: 4 PID: 2332440 Comm: kworker/u80:15 Tainted: G W 6.10.0-rc2-kts+ #4 > Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 > Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] > RIP: 0010:btrfs_zone_finish_endio.part.0+0x34/0x160 [btrfs] [...] > As zoned emulation mode simulates conventional zones on regular > devices, we cannot use zone-append for writing. But we're only > attaching dummy checksums if we're doing a zone-append write. > > So for NOCOW zoned data writes on conventional zones, also attach a > dummy checksum. > > Fixes: cbfce4c7fbde ("btrfs: optimize the logical to physical mapping for zoned writes") > Cc: Naohiro Aota <Naohiro.Aota@wdc.com> > Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Johaness, thank you very much for the fix. I confirmed that this patch avoids the KASAN null-ptr-deref as well as the hang at btrfs/167. Also, I ran my zoned btrfs test set and observed no regression: Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
On Fri, Jun 07, 2024 at 01:46:28PM GMT, Johannes Thumshirn wrote: > From: Johannes Thumshirn <johannes.thumshirn@wdc.com> > > Shin'ichiro reported that when he's running fstests' test-case > btrfs/167 on emulated zoned devices, he's seeing the following NULL > pointer dereference in 'btrfs_zone_finish_endio()': > > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000011: 0000 [#1] PREEMPT SMP KASAN NOPTI > KASAN: null-ptr-deref in range [0x0000000000000088-0x000000000000008f] > CPU: 4 PID: 2332440 Comm: kworker/u80:15 Tainted: G W 6.10.0-rc2-kts+ #4 > Hardware name: Supermicro Super Server/X11SPi-TF, BIOS 3.3 02/21/2020 > Workqueue: btrfs-endio-write btrfs_work_helper [btrfs] > RIP: 0010:btrfs_zone_finish_endio.part.0+0x34/0x160 [btrfs] > > RSP: 0018:ffff88867f107a90 EFLAGS: 00010206 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffffff893e5534 > RDX: 0000000000000011 RSI: 0000000000000004 RDI: 0000000000000088 > RBP: 0000000000000002 R08: 0000000000000001 R09: ffffed1081696028 > R10: ffff88840b4b0143 R11: ffff88834dfff600 R12: ffff88840b4b0000 > R13: 0000000000020000 R14: 0000000000000000 R15: ffff888530ad5210 > FS: 0000000000000000(0000) GS:ffff888e3f800000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007f87223fff38 CR3: 00000007a7c6a002 CR4: 00000000007706f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > PKRU: 55555554 > Call Trace: > <TASK> > ? __die_body.cold+0x19/0x27 > ? die_addr+0x46/0x70 > ? exc_general_protection+0x14f/0x250 > ? asm_exc_general_protection+0x26/0x30 > ? do_raw_read_unlock+0x44/0x70 > ? btrfs_zone_finish_endio.part.0+0x34/0x160 [btrfs] > btrfs_finish_one_ordered+0x5d9/0x19a0 [btrfs] > ? __pfx_lock_release+0x10/0x10 > ? do_raw_write_lock+0x90/0x260 > ? __pfx_do_raw_write_lock+0x10/0x10 > ? __pfx_btrfs_finish_one_ordered+0x10/0x10 [btrfs] > ? _raw_write_unlock+0x23/0x40 > ? btrfs_finish_ordered_zoned+0x5a9/0x850 [btrfs] > ? lock_acquire+0x435/0x500 > btrfs_work_helper+0x1b1/0xa70 [btrfs] > ? __schedule+0x10a8/0x60b0 > ? __pfx___might_resched+0x10/0x10 > process_one_work+0x862/0x1410 > ? __pfx_lock_acquire+0x10/0x10 > ? __pfx_process_one_work+0x10/0x10 > ? assign_work+0x16c/0x240 > worker_thread+0x5e6/0x1010 > ? __pfx_worker_thread+0x10/0x10 > kthread+0x2c3/0x3a0 > ? trace_irq_enable.constprop.0+0xce/0x110 > ? __pfx_kthread+0x10/0x10 > ret_from_fork+0x31/0x70 > ? __pfx_kthread+0x10/0x10 > ret_from_fork_asm+0x1a/0x30 > </TASK> > > ---[ end trace 0000000000000000 ]--- > > Enabling CONFIG_BTRFS_ASSERT revealed the following assertion to > trigger: > > assertion failed: !list_empty(&ordered->list), in fs/btrfs/zoned.c:1815 > > This indicates, that we're missing the checksums list on the > ordered_extent. As btrfs/167 is doing a NOCOW write this is to be > expected. > > Further analysis with drgn confirmed the assumption: > > >>> inode = prog.crashed_thread().stack_trace()[11]['ordered'].inode > >>> btrfs_inode = drgn.container_of(inode, "struct btrfs_inode", \ > "vfs_inode") > >>> print(btrfs_inode.flags) > (u32)1 > > As zoned emulation mode simulates conventional zones on regular > devices, we cannot use zone-append for writing. But we're only > attaching dummy checksums if we're doing a zone-append write. > > So for NOCOW zoned data writes on conventional zones, also attach a > dummy checksum. > > Fixes: cbfce4c7fbde ("btrfs: optimize the logical to physical mapping for zoned writes") > Cc: Naohiro Aota <Naohiro.Aota@wdc.com> > Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com> > Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Looks good to me. Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com> Thanks,
diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c index 477f350a8bd0..e3a57196b0ee 100644 --- a/fs/btrfs/bio.c +++ b/fs/btrfs/bio.c @@ -741,7 +741,9 @@ static bool btrfs_submit_chunk(struct btrfs_bio *bbio, int mirror_num) ret = btrfs_bio_csum(bbio); if (ret) goto fail_put_bio; - } else if (use_append) { + } else if (use_append || + (btrfs_is_zoned(fs_info) && inode && + inode->flags & BTRFS_INODE_NODATASUM)) { ret = btrfs_alloc_dummy_sum(bbio); if (ret) goto fail_put_bio;