| Message ID | 20241118215028.1066662-5-paul@darkrain42.org (mailing list archive) |
|---|---|
| State | New, archived |
| Series | SMB cached directory fixes around reconnection/unmounting |
Hi Paul,

Thanks for looking into this! Really appreciate it.

Paul Aurich <paul@darkrain42.org> writes:

> The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can
> race with various cached directory operations, which ultimately results
> in dentries not being dropped and these kernel BUGs:
>
> BUG: Dentry ffff88814f37e358{i=1000000000080,n=/} still in use (2) [unmount of cifs cifs]
> VFS: Busy inodes after unmount of cifs (cifs)
> ------------[ cut here ]------------
> kernel BUG at fs/super.c:661!
>
> This happens when a cfid is in the process of being cleaned up and has
> been removed from the cfids->entries list, which occurs when:
>
> - Receiving a lease break from the server
> - Server reconnection triggers invalidate_all_cached_dirs(), which
>   removes all the cfids from the list
> - The laundromat thread decides to expire an old cfid.
>
> To solve these problems, dropping the dentry is done in queued work done
> in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
> flushes that workqueue after it drops all the dentries of which it's
> aware. This is a global workqueue (rather than scoped to a mount), but
> the queued work is minimal.

Why does it need to be a global workqueue? Can't you make it per tcon?

> The final cleanup work for cleaning up a cfid is performed via work
> queued in the serverclose_wq workqueue; this is done separately from
> dropping the dentries so that close_all_cached_dirs() doesn't block on
> any server operations.
>
> Both of these queued works expect to be invoked with a cfid reference
> and a tcon reference, to prevent those objects from being freed while
> the work is ongoing.

Why do you need to take a tcon reference? Can't you drop the dentries
when tearing down the tcon in cifs_put_tcon()? No concurrent mounts would
be able to access or free it.

After running xfstests I've seen a leaked tcon in /proc/fs/cifs/DebugData
with no CIFS superblocks, which might be related to this.

Could you please check if there is any leaked connection in
/proc/fs/cifs/DebugData after running your tests?
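[For readers following the thread, this is the design being questioned above, in outline: cleanup paths queue the dput() on a dedicated workqueue, and the unmount path flushes it. A minimal sketch, assuming a simplified struct layout and helper names rather than the actual fs/smb/client code.]

```c
#include <linux/dcache.h>
#include <linux/workqueue.h>

/* Simplified stand-in for the real struct cached_fid. */
struct cached_fid {
	struct dentry *dentry;
	struct work_struct put_work;	/* INIT_WORK(..., cfid_dentry_put) at alloc time */
};

static struct workqueue_struct *cfid_put_wq;	/* alloc_workqueue() once at module init */

static void cfid_dentry_put(struct work_struct *work)
{
	struct cached_fid *cfid = container_of(work, struct cached_fid, put_work);

	dput(cfid->dentry);	/* runs in process context, free to sleep */
}

/* Lease break, reconnect, and laundromat paths defer the dput() ... */
static void cfid_drop_dentry(struct cached_fid *cfid)
{
	queue_work(cfid_put_wq, &cfid->put_work);
}

/* ... and the unmount path, after dropping the dentries it knows about,
 * waits for any deferred drops so kill_anon_super() sees none in use. */
static void wait_for_deferred_dputs(void)
{
	flush_workqueue(cfid_put_wq);
}
```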
On 2024-11-21 23:05:51 -0300, Paulo Alcantara wrote:
>> To solve these problems, dropping the dentry is done in queued work done
>> in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
>> flushes that workqueue after it drops all the dentries of which it's
>> aware. This is a global workqueue (rather than scoped to a mount), but
>> the queued work is minimal.
>
> Why does it need to be a global workqueue? Can't you make it per tcon?

The problem with a per-tcon workqueue is I didn't see a clean way to deal
with multiuser mounts and flushing the workqueue in
close_all_cached_dirs() -- when dealing with each individual tcon, we're
still holding tlink_tree_lock, so an arbitrary sleep seems problematic.

There could be a per-sb workqueue (stored in cifs_sb or the master tcon),
but is there a way to get back to the superblock / master tcon with just
a tcon (e.g. in cached_dir_lease_break, when processing a lease break)?

>> Both of these queued works expect to be invoked with a cfid reference
>> and a tcon reference, to prevent those objects from being freed while
>> the work is ongoing.
>
> Why do you need to take a tcon reference?

In the existing code (and my patch, without the refs), I was seeing an
intermittent use-after-free of the tcon or cached_fids struct by queued
work processing a lease break -- the cfid isn't linked from cached_fids,
but smb2_close_cached_fid invoking SMB2_close can race with the unmount
and cifs_put_tcon.

Something like:

    t1                          t2
    cached_dir_lease_break
    smb2_cached_lease_break
    smb2_close_cached_fid
    SMB2_close starts
                                cifs_kill_sb
                                cifs_umount
                                cifs_put_tlink
                                cifs_put_tcon
    SMB2_close continues

I had a version of the patch that kept the 'in flight' lease breaks on a
second list in cached_fids so that they could be cancelled synchronously
from free_cached_fids(), but I struggled with it (I can't remember
exactly, but I think I was struggling to keep the linked-list membership
/ removal handling and the num_entries handling consistent).

> Can't you drop the dentries when tearing down the tcon in
> cifs_put_tcon()? No concurrent mounts would be able to access or free
> it.

The dentries must be dropped before kill_anon_super(), as that's where
the 'Dentry still in use' check is. All the tcons are put in
cifs_umount(), which occurs after:

    kill_anon_super(sb);
    cifs_umount(cifs_sb);

The other thing is that cifs_umount_begin() has this comment, which made
me think a tcon can actually be tied to two distinct mount points:

    if ((tcon->tc_count > 1) || (tcon->status == TID_EXITING)) {
        /* we have other mounts to same share or we have
           already tried to umount this and woken up
           all waiting network requests, nothing to do */

Although, as I'm thinking about it again, I think I've misunderstood (and
that comment is wrong?).

It did cross my mind to pull some of the work out of cifs_umount() into
cifs_kill_sb() (specifically, I wanted to cancel prune_tlinks earlier) --
with no prune_tlinks it would be more feasible to drop tlink_tree_lock in
close_all_cached_dirs(), at which point a per-tcon workqueue is more
practical.

> After running xfstests I've seen a leaked tcon in
> /proc/fs/cifs/DebugData with no CIFS superblocks, which might be
> related to this.
>
> Could you please check if there is any leaked connection in
> /proc/fs/cifs/DebugData after running your tests?

After I finish with my tests (I'm not using xfstests, although perhaps I
should be) and unmount the share, DebugData doesn't show any connections
for me.

~Paul
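[A minimal sketch of the reference pinning described above, against the t1/t2 race diagram. Only smb2_cached_lease_break(), smb2_close_cached_fid(), cifs_put_tcon() and tc_count are names from the thread; the struct fields and other helpers are assumptions, and the real cifs_put_tcon() may take additional arguments.]

```c
#include <linux/kref.h>
#include <linux/workqueue.h>

struct cached_fid {
	struct cifs_tcon *tcon;
	struct kref refcount;
	struct work_struct lease_break;
};

static void free_cached_fid(struct kref *ref);	/* hypothetical release fn */

/* t1, queue time: pin the tcon and the cfid so a concurrent unmount (t2)
 * running cifs_put_tcon() cannot free them out from under SMB2_close. */
static void queue_cached_lease_break(struct cached_fid *cfid)
{
	cfid->tcon->tc_count++;		/* assumed taken under the appropriate lock */
	kref_get(&cfid->refcount);
	queue_work(cfid_put_wq, &cfid->lease_break);	/* wq from sketch above */
}

static void smb2_cached_lease_break(struct work_struct *work)
{
	struct cached_fid *cfid =
		container_of(work, struct cached_fid, lease_break);
	struct cifs_tcon *tcon = cfid->tcon;

	smb2_close_cached_fid(cfid);	/* SMB2_close can no longer outlive tcon */

	kref_put(&cfid->refcount, free_cached_fid);
	cifs_put_tcon(tcon);		/* drop the pin taken at queue time */
}
```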
On 2024-11-22 19:28:34 -0800, Paul Aurich wrote:
> [snip]

I was able to reproduce this leak. I believe the attached patch addresses it.

I'm able to intermittently see a 'Dentry still in use' bug with xfstests
generic/241 (what Steve saw); the attached patch doesn't help with that.
I'm still unsure what's going on there.

~Paul
I did see the generic/241 failure again with current for-next (unrelated
to this patch though). Will try to repro it again - but any ideas how to
narrow it down or fix it would be helpful.

SECTION       -- smb3
FSTYP         -- cifs
PLATFORM      -- Linux/x86_64 fedora29 6.12.0 #1 SMP PREEMPT_DYNAMIC Wed Nov 27 01:02:07 UTC 2024
MKFS_OPTIONS  -- //win16.vm.test/Scratch

generic/241   73s
Ran: generic/241
Passed all 1 tests

SECTION -- smb3
=========================
Ran: generic/241
Passed all 1 tests

Number of reconnects: 0
Test completed smb3 generic/241 at Wed Nov 27 06:38:47 AM UTC 2024

dmesg output during the test:

[Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Share
[Wed Nov 27 00:37:32 2024] CIFS: VFS: generate_smb3signingkey: dumping generated AES session keys
[Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Id 45 00 00 08 00 c8 00 00
[Wed Nov 27 00:37:32 2024] CIFS: VFS: Cipher type 2
[Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Key 00 bf ed c7 f1 95 0e 29 06 e8 82 87 b5 c8 72 06
[Wed Nov 27 00:37:32 2024] CIFS: VFS: Signing Key a4 0f 15 64 d2 69 02 2f 4e 78 60 7a fe 3e 31 4e
[Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerIn Key a6 fd 04 f6 04 ea 0e 6e 60 c0 1b b1 ee 63 38 e9
[Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerOut Key a6 e3 e3 22 8c c2 b0 6e b1 9d 40 ea d0 89 6d d8
[Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch
[Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch
[Wed Nov 27 00:37:32 2024] run fstests generic/241 at 2024-11-27 00:37:33
[Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------
[Wed Nov 27 00:38:46 2024] BUG: Dentry 00000000318d67d4{i=11000000033f68,n=~dmtmp} still in use (1) [unmount of cifs cifs]
[Wed Nov 27 00:38:46 2024] WARNING: CPU: 2 PID: 316177 at fs/dcache.c:1546 umount_check+0xc3/0xf0
[Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg virtio_console [last unloaded: cifs]
[Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount Not tainted 6.12.0 #1
[Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
[Wed Nov 27 00:38:46 2024] RIP: 0010:umount_check+0xc3/0xf0
[Wed Nov 27 00:38:46 2024] Code: db 74 0d 48 8d 7b 40 e8 db df f5 ff 48 8b 53 40 41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 99 ba ad e8 2d 27 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc 41 83 fc 01
[Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fd20 EFLAGS: 00010282
[Wed Nov 27 00:38:46 2024] RAX: dffffc0000000000 RBX: ff1100010c574ce0 RCX: 0000000000000027
[Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb131a48
[Wed Nov 27 00:38:46 2024] RBP: ff1100012c76bd60 R08: ffffffffac3fd2fe R09: ffe21c0099626349
[Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 R12: 0000000000000001
[Wed Nov 27 00:38:46 2024] R13: ff110001238b6668 R14: ffffffffc1d6e6c0 R15: ff1100012c76be18
[Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) GS:ff110004cb100000(0000) knlGS:0000000000000000
[Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 CR4: 0000000000373ef0
[Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Wed Nov 27 00:38:46 2024] Call Trace:
[Wed Nov 27 00:38:46 2024]  <TASK>
[Wed Nov 27 00:38:46 2024]  ? __warn+0xa9/0x220
[Wed Nov 27 00:38:46 2024]  ? umount_check+0xc3/0xf0
[Wed Nov 27 00:38:46 2024]  ? report_bug+0x1d4/0x1e0
[Wed Nov 27 00:38:46 2024]  ? handle_bug+0x5b/0xa0
[Wed Nov 27 00:38:46 2024]  ? exc_invalid_op+0x18/0x50
[Wed Nov 27 00:38:46 2024]  ? asm_exc_invalid_op+0x1a/0x20
[Wed Nov 27 00:38:46 2024]  ? irq_work_claim+0x1e/0x40
[Wed Nov 27 00:38:46 2024]  ? umount_check+0xc3/0xf0
[Wed Nov 27 00:38:46 2024]  ? __pfx_umount_check+0x10/0x10
[Wed Nov 27 00:38:46 2024]  d_walk+0xf3/0x4e0
[Wed Nov 27 00:38:46 2024]  ? d_walk+0x4b/0x4e0
[Wed Nov 27 00:38:46 2024]  shrink_dcache_for_umount+0x6d/0x220
[Wed Nov 27 00:38:46 2024]  generic_shutdown_super+0x4a/0x1c0
[Wed Nov 27 00:38:46 2024]  kill_anon_super+0x22/0x40
[Wed Nov 27 00:38:46 2024]  cifs_kill_sb+0x78/0x90 [cifs]
[Wed Nov 27 00:38:46 2024]  deactivate_locked_super+0x69/0xf0
[Wed Nov 27 00:38:46 2024]  cleanup_mnt+0x195/0x200
[Wed Nov 27 00:38:46 2024]  task_work_run+0xec/0x150
[Wed Nov 27 00:38:46 2024]  ? __pfx_task_work_run+0x10/0x10
[Wed Nov 27 00:38:46 2024]  ? mark_held_locks+0x24/0x90
[Wed Nov 27 00:38:46 2024]  syscall_exit_to_user_mode+0x269/0x2a0
[Wed Nov 27 00:38:46 2024]  do_syscall_64+0x81/0x180
[Wed Nov 27 00:38:46 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb
[Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 a9 0c 00 f7 d8
[Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 RCX: 00007fddc1ff43eb
[Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005632106d9410
[Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 R09: 0000000000000007
[Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 R12: 00005632106d4d28
[Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 R15: 00005632106d5030
[Wed Nov 27 00:38:46 2024]  </TASK>
[Wed Nov 27 00:38:46 2024] irq event stamp: 8317
[Wed Nov 27 00:38:46 2024] hardirqs last enabled at (8323): [<ffffffffac230dce>] __up_console_sem+0x5e/0x70
[Wed Nov 27 00:38:46 2024] hardirqs last disabled at (8328): [<ffffffffac230db3>] __up_console_sem+0x43/0x70
[Wed Nov 27 00:38:46 2024] softirqs last enabled at (6628): [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160
[Wed Nov 27 00:38:46 2024] softirqs last disabled at (6539): [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160
[Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]---
[Wed Nov 27 00:38:46 2024] VFS: Busy inodes after unmount of cifs (cifs)
[Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------
[Wed Nov 27 00:38:46 2024] kernel BUG at fs/super.c:650!
[Wed Nov 27 00:38:46 2024] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount Tainted: G W 6.12.0 #1
[Wed Nov 27 00:38:46 2024] Tainted: [W]=WARN
[Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS 1.16.3-2.el9 04/01/2014
[Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90
[Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282
[Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 RCX: 0000000000000027
[Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb131a48
[Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe R09: ffe21c0099626349
[Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 R12: ff110001238b69c0
[Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 R15: 0000000000000000
[Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) GS:ff110004cb100000(0000) knlGS:0000000000000000
[Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 CR4: 0000000000373ef0
[Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Wed Nov 27 00:38:46 2024] Call Trace:
[Wed Nov 27 00:38:46 2024]  <TASK>
[Wed Nov 27 00:38:46 2024]  ? die+0x37/0x90
[Wed Nov 27 00:38:46 2024]  ? do_trap+0x133/0x230
[Wed Nov 27 00:38:46 2024]  ? generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024]  ? do_error_trap+0x94/0x130
[Wed Nov 27 00:38:46 2024]  ? generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024]  ? generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024]  ? handle_invalid_op+0x2c/0x40
[Wed Nov 27 00:38:46 2024]  ? generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024]  ? exc_invalid_op+0x2f/0x50
[Wed Nov 27 00:38:46 2024]  ? asm_exc_invalid_op+0x1a/0x20
[Wed Nov 27 00:38:46 2024]  ? irq_work_claim+0x1e/0x40
[Wed Nov 27 00:38:46 2024]  ? generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024]  kill_anon_super+0x22/0x40
[Wed Nov 27 00:38:46 2024]  cifs_kill_sb+0x78/0x90 [cifs]
[Wed Nov 27 00:38:46 2024]  deactivate_locked_super+0x69/0xf0
[Wed Nov 27 00:38:46 2024]  cleanup_mnt+0x195/0x200
[Wed Nov 27 00:38:46 2024]  task_work_run+0xec/0x150
[Wed Nov 27 00:38:46 2024]  ? __pfx_task_work_run+0x10/0x10
[Wed Nov 27 00:38:46 2024]  ? mark_held_locks+0x24/0x90
[Wed Nov 27 00:38:46 2024]  syscall_exit_to_user_mode+0x269/0x2a0
[Wed Nov 27 00:38:46 2024]  do_syscall_64+0x81/0x180
[Wed Nov 27 00:38:46 2024]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
[Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb
[Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 a9 0c 00 f7 d8
[Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 RCX: 00007fddc1ff43eb
[Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00005632106d9410
[Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 R09: 0000000000000007
[Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 R12: 00005632106d4d28
[Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 R15: 00005632106d5030
[Wed Nov 27 00:38:46 2024]  </TASK>
[Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw iptable_security ip_set ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg virtio_console [last unloaded: cifs]
[Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]---
[Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0
[Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90
[Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282
[Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 RCX: 0000000000000027
[Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ff110004cb131a48
[Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe R09: ffe21c0099626349
[Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 R12: ff110001238b69c0
[Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 R15: 0000000000000000
[Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) GS:ff110004cb100000(0000) knlGS:0000000000000000
[Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 CR4: 0000000000373ef0
[Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

On Tue, Nov 26, 2024 at 3:50 PM Paul Aurich <paul@darkrain42.org> wrote:
> I was able to reproduce this leak. I believe the attached patch addresses it.
>
> I'm able to intermittently see a 'Dentry still in use' bug with xfstests
> generic/241 (what Steve saw); the attached patch doesn't help with that.
> I'm still unsure what's going on there.
>
> [snip]
On 11/27, Steve French wrote:
> I did see the generic/241 failure again with current for-next
> (unrelated to this patch though). Will try to repro it again - but
> any ideas how to narrow it down or fix it would be helpful.

We're seeing this too when backporting that patch series to SLE15-SP6,
by only running generic/072, so I don't think it's unrelated.

We also hit, also with generic/072, but only once, the WARN() in
cached_dir_offload_close() (introduced in this same patch):

[  526.946722] WARNING: CPU: 2 PID: 23778 at fs/smb/client/cached_dir.c:555 cached_dir_offload_close+0x90/0xa0 [cifs]
[  526.948561] Modules linked in: cifs cifs_arc4 cifs_md4
[  526.949385] CPU: 2 PID: 23778 Comm: kworker/2:1 Kdump: loaded Not tainted 6.4.0-lku #91 SLE15-SP6 (unreleased)
[  526.949394] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
[  526.949398] Workqueue: serverclose cached_dir_offload_close [cifs]
[  526.951938] RIP: 0010:cached_dir_offload_close+0x90/0xa0 [cifs]
[  526.953827] Code: e8 a5 fb ff ff 4c 89 e7 5b 5d 41 5c e9 99 57 fc ff 48 89 ef be 03 00 00 00 e8 5c d4 c5 c9 4c 89 e7 5b 5d 41 5c e9 80 57 fc ff <0f> 0b eb 99 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90
[  526.953836] RSP: 0018:ffff888108b7fdd8 EFLAGS: 00010202
[  526.953844] RAX: 0000000000000000 RBX: ffff888100de68d0 RCX: ffffffffc042e7d4
[  526.953849] RDX: 1ffff110201bcd04 RSI: 0000000000000008 RDI: ffff888100de6820
[  526.953854] RBP: ffff8881063d8a00 R08: 0000000000000001 R09: ffffed10201bcd1a
[  526.953858] R10: ffff888100de68d7 R11: 0000000000000000 R12: ffff888100a2b000
[  526.953862] R13: 0000000000000080 R14: ffff88814f336ea8 R15: ffff888100de68d8
[  526.953872] FS: 0000000000000000(0000) GS:ffff88814f300000(0000) knlGS:0000000000000000
[  526.965888] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  526.965897] CR2: 00007f841e6ea698 CR3: 0000000106c68000 CR4: 0000000000750ee0
[  526.965904] PKRU: 55555554
[  526.965908] Call Trace:
[  526.965912]  <TASK>
[  526.965917]  ? __warn+0x92/0xf0
[  526.965932]  ? cached_dir_offload_close+0x90/0xa0 [cifs]
[  526.972634]  ? report_bug+0x163/0x190
[  526.972652]  ? handle_bug+0x3a/0x70
[  526.972665]  ? exc_invalid_op+0x17/0x40
[  526.972674]  ? asm_exc_invalid_op+0x1a/0x20
[  526.972688]  ? cached_dir_offload_close+0x24/0xa0 [cifs]
[  526.977239]  ? cached_dir_offload_close+0x90/0xa0 [cifs]
[  526.978831]  ? cached_dir_offload_close+0x24/0xa0 [cifs]
[  526.979133]  process_one_work+0x42c/0x730
[  526.979176]  worker_thread+0x8e/0x700
[  526.979190]  ? __pfx_worker_thread+0x10/0x10
[  526.979200]  kthread+0x197/0x1d0
[  526.979208]  ? kthread+0xeb/0x1d0
[  526.979216]  ? __pfx_kthread+0x10/0x10
[  526.979225]  ret_from_fork+0x29/0x50
[  526.979237]  </TASK>
[  526.979241] ---[ end trace 0000000000000000 ]---

Will update here if we find a fix/root cause.
Cheers, Enzo

> [snip]
On Wed, Nov 27, 2024 at 11:07 AM Enzo Matsumiya <ematsumiya@suse.de> wrote:
> We're seeing this too when backporting that patch series to SLE15-SP6,
> by only running generic/072, so I don't think it's unrelated.
>
> We also hit, also with generic/072, but only once, the WARN() in
> cached_dir_offload_close() (introduced in this same patch):
> [snip]
> Will update here if we find a fix/root cause.

Presumably it is related to the dir lease series, but to one of the
earlier patches rather than the most recent one, since we saw it in runs
before that. But even without the patch series we saw the unmount crash
when racing with freeing cached dentries, so it would be helpful to
narrow the bug down so we can fix the original problem, which has been
around for quite a while now - especially important now that more servers
will be enabling directory leases (e.g. Samba can now be tested against).
Paul Aurich <paul@darkrain42.org> writes: > On 2024-11-21 23:05:51 -0300, Paulo Alcantara wrote: >> >>Why does it need to be a global workqueue? Can't you make it per tcon? > > The problem with a per-tcon workqueue is I didn't see clean way to deal with > multiuser mounts and flushing the workqueue in close_all_cached_dirs() -- when > dealing with each individual tcon, we're still holding tlink_tree_lock, so an > arbitrary sleep seems problematic. OK. > There could be a per-sb workqueue (stored in cifs_sb or the master tcon) but > is there a way to get back to the superblock / master tcon with just a tcon > (e.g. cached_dir_lease_break, when processing a lease break)? Yes - cifs_get_dfs_tcon_super() does that. >>> The final cleanup work for cleaning up a cfid is performed via work >>> queued in the serverclose_wq workqueue; this is done separate from >>> dropping the dentries so that close_all_cached_dirs() doesn't block on >>> any server operations. >>> >>> Both of these queued works expect to invoked with a cfid reference and >>> a tcon reference to avoid those objects from being freed while the work >>> is ongoing. >> >>Why do you need to take a tcon reference? > > In the existing code (and my patch, without the refs), I was seeing an > intermittent use-after-free of the tcon or cached_fids struct by queued work > processing a lease break -- the cfid isn't linked from cached_fids, but > smb2_close_cached_fid invoking SMB2_close can race with the unmount and > cifs_put_tcon > > Something like: > > t1 t2 > cached_dir_lease_break > smb2_cached_lease_break > smb2_close_cached_fid > SMB2_close starts > cifs_kill_sb > cifs_umount > cifs_put_link > cifs_put_tcon > SMB2_close continues Makes sense. > I had a version of the patch that kept the 'in flight lease breaks' on > a second list in cached_fids so that they could be cancelled synchronously > from free_cached_fids(), but I struggled with it (I can't remember exactly, > but I think I was struggling to get the linked list membership / removal > handling and num_entries handling consistent). No worries. The damn thing isn't trivial to follow. >> Can't you drop the dentries >>when tearing down tcon in cifs_put_tcon()? No concurrent mounts would >>be able to access or free it. > > The dentries being dropped must occur before kill_anon_super(), as that's > where the 'Dentry still in use' check is. All the tcons are put in > cifs_umount(), which occurs after: > > kill_anon_super(sb); > cifs_umount(cifs_sb); Right. Can't we call cancel_work_sync() to make sure that any leases breakes are processed on the cached directory handle before calling the above? > The other thing is that cifs_umount_begin() has this comment, which made me > think a tcon can actually be tied to two distinct mount points: > > if ((tcon->tc_count > 1) || (tcon->status == TID_EXITING)) { > /* we have other mounts to same share or we have > already tried to umount this and woken up > all waiting network requests, nothing to do */ > > Although, as I'm thinking about it again, I think I've misunderstood (and that > comment is wrong?). Comment is correct as a single tcon may be shared among different mounts. Consider the following where a single tcon is shared: mount.cifs //srv/share /mnt/1 -o $opts mount.cifs //srv/share/dir /mnt/2 -o $opts There will be two different superblocks that end up using same tcon. 
> It did cross my mind to pull some of the work out of cifs_umount() into
> cifs_kill_sb() (specifically, I wanted to cancel prune_tlinks earlier) --
> with prune_tlinks gone, it would be more feasible to drop tlink_tree_lock
> in close_all_cached_dirs(), at which point a per-tcon workqueue is more
> practical.

Yeah, multiuser tcons just make it even more complicated. (A sketch of the
unmount ordering being discussed follows below.)

Sorry for the delay as I've been quite busy with other stuff.

Great work, BTW.
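[A rough sketch of that unmount ordering -- simplified, since the real
cifs_kill_sb() also handles the cached root dentry and mount options, and
with the flush being the proposed addition rather than settled code:]

	static void cifs_kill_sb(struct super_block *sb)
	{
		struct cifs_sb_info *cifs_sb = CIFS_SB(sb);

		/*
		 * Drop all cached-dir dentries and wait for any queued
		 * lease-break/put work, so no dentry reference survives
		 * into the 'Dentry still in use' check.
		 */
		close_all_cached_dirs(cifs_sb);
		flush_workqueue(cfid_put_wq);

		kill_anon_super(sb);	/* 'Dentry still in use' check lives here */
		cifs_umount(cifs_sb);	/* tcons are put here */
	}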
I see this error at the end of generic/241 (in dmesg) even without the more recent dir lease patch: "smb: Initialize cfid->tcon before performing network ops" so it is likely unrelated (earlier bug) Nov 27 18:07:46 fedora29 kernel: Call Trace: Nov 27 18:07:46 fedora29 kernel: <TASK> Nov 27 18:07:46 fedora29 kernel: ? __warn+0xa9/0x220 Nov 27 18:07:46 fedora29 kernel: ? umount_check+0xc3/0xf0 Nov 27 18:07:46 fedora29 kernel: ? report_bug+0x1d4/0x1e0 Nov 27 18:07:46 fedora29 kernel: ? handle_bug+0x5b/0xa0 Nov 27 18:07:46 fedora29 kernel: ? exc_invalid_op+0x18/0x50 Nov 27 18:07:46 fedora29 kernel: ? asm_exc_invalid_op+0x1a/0x20 Nov 27 18:07:46 fedora29 kernel: ? irq_work_claim+0x1e/0x40 Nov 27 18:07:46 fedora29 kernel: ? umount_check+0xc3/0xf0 Nov 27 18:07:46 fedora29 kernel: ? __pfx_umount_check+0x10/0x10 Nov 27 18:07:46 fedora29 kernel: d_walk+0xf3/0x4e0 Nov 27 18:07:46 fedora29 kernel: ? d_walk+0x4b/0x4e0 Nov 27 18:07:46 fedora29 kernel: shrink_dcache_for_umount+0x6d/0x220 Nov 27 18:07:46 fedora29 kernel: generic_shutdown_super+0x4a/0x1c0 Nov 27 18:07:46 fedora29 kernel: kill_anon_super+0x22/0x40 Nov 27 18:07:46 fedora29 kernel: cifs_kill_sb+0x78/0x90 [cifs] On Wed, Nov 27, 2024 at 10:38 AM Steve French <smfrench@gmail.com> wrote: > > I did see the generic/241 failure again with current for-next > (unrelated to this patch though). Will try to repro it again - but > any ideas how to narrow it down or fix it would be helpful. > > SECTION -- smb3 > FSTYP -- cifs > PLATFORM -- Linux/x86_64 fedora29 6.12.0 #1 SMP PREEMPT_DYNAMIC Wed > Nov 27 01:02:07 UTC 2024 > MKFS_OPTIONS -- //win16.vm.test/Scratch > generic/241 73s > Ran: generic/241 > Passed all 1 tests > SECTION -- smb3 > ========================= > Ran: generic/241 > Passed all 1 tests > Number of reconnects: 0 > Test completed smb3 generic/241 at Wed Nov 27 06:38:47 AM UTC 2024 > dmesg output during the test: > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Share > [Wed Nov 27 00:37:32 2024] CIFS: VFS: generate_smb3signingkey: dumping > generated AES session keys > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Id 45 00 00 08 00 c8 00 00 > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Cipher type 2 > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Key 00 bf ed c7 f1 95 0e > 29 06 e8 82 87 b5 c8 72 06 > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Signing Key a4 0f 15 64 d2 69 02 > 2f 4e 78 60 7a fe 3e 31 4e > [Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerIn Key a6 fd 04 f6 04 ea > 0e 6e 60 c0 1b b1 ee 63 38 e9 > [Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerOut Key a6 e3 e3 22 8c c2 > b0 6e b1 9d 40 ea d0 89 6d d8 > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch > [Wed Nov 27 00:37:32 2024] run fstests generic/241 at 2024-11-27 00:37:33 > [Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------ > [Wed Nov 27 00:38:46 2024] BUG: Dentry > 00000000318d67d4{i=11000000033f68,n=~dmtmp} still in use (1) [unmount > of cifs cifs] > [Wed Nov 27 00:38:46 2024] WARNING: CPU: 2 PID: 316177 at > fs/dcache.c:1546 umount_check+0xc3/0xf0 > [Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 > cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 > dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns > nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib > nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct > nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat > 
ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw > iptable_security ip_set ebtable_filter ebtables ip6table_filter > ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net > net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink > zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper > crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm > sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg > virtio_console [last unloaded: cifs] > [Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount Not > tainted 6.12.0 #1 > [Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS > 1.16.3-2.el9 04/01/2014 > [Wed Nov 27 00:38:46 2024] RIP: 0010:umount_check+0xc3/0xf0 > [Wed Nov 27 00:38:46 2024] Code: db 74 0d 48 8d 7b 40 e8 db df f5 ff > 48 8b 53 40 41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 99 > ba ad e8 2d 27 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc > cc cc 41 83 fc 01 > [Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fd20 EFLAGS: 00010282 > [Wed Nov 27 00:38:46 2024] RAX: dffffc0000000000 RBX: ff1100010c574ce0 > RCX: 0000000000000027 > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > RDI: ff110004cb131a48 > [Wed Nov 27 00:38:46 2024] RBP: ff1100012c76bd60 R08: ffffffffac3fd2fe > R09: ffe21c0099626349 > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > R12: 0000000000000001 > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6668 R14: ffffffffc1d6e6c0 > R15: ff1100012c76be18 > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > GS:ff110004cb100000(0000) knlGS:0000000000000000 > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > CR4: 0000000000373ef0 > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: 0000000000000000 > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > DR7: 0000000000000400 > [Wed Nov 27 00:38:46 2024] Call Trace: > [Wed Nov 27 00:38:46 2024] <TASK> > [Wed Nov 27 00:38:46 2024] ? __warn+0xa9/0x220 > [Wed Nov 27 00:38:46 2024] ? umount_check+0xc3/0xf0 > [Wed Nov 27 00:38:46 2024] ? report_bug+0x1d4/0x1e0 > [Wed Nov 27 00:38:46 2024] ? handle_bug+0x5b/0xa0 > [Wed Nov 27 00:38:46 2024] ? exc_invalid_op+0x18/0x50 > [Wed Nov 27 00:38:46 2024] ? asm_exc_invalid_op+0x1a/0x20 > [Wed Nov 27 00:38:46 2024] ? irq_work_claim+0x1e/0x40 > [Wed Nov 27 00:38:46 2024] ? umount_check+0xc3/0xf0 > [Wed Nov 27 00:38:46 2024] ? __pfx_umount_check+0x10/0x10 > [Wed Nov 27 00:38:46 2024] d_walk+0xf3/0x4e0 > [Wed Nov 27 00:38:46 2024] ? d_walk+0x4b/0x4e0 > [Wed Nov 27 00:38:46 2024] shrink_dcache_for_umount+0x6d/0x220 > [Wed Nov 27 00:38:46 2024] generic_shutdown_super+0x4a/0x1c0 > [Wed Nov 27 00:38:46 2024] kill_anon_super+0x22/0x40 > [Wed Nov 27 00:38:46 2024] cifs_kill_sb+0x78/0x90 [cifs] > [Wed Nov 27 00:38:46 2024] deactivate_locked_super+0x69/0xf0 > [Wed Nov 27 00:38:46 2024] cleanup_mnt+0x195/0x200 > [Wed Nov 27 00:38:46 2024] task_work_run+0xec/0x150 > [Wed Nov 27 00:38:46 2024] ? __pfx_task_work_run+0x10/0x10 > [Wed Nov 27 00:38:46 2024] ? 
mark_held_locks+0x24/0x90 > [Wed Nov 27 00:38:46 2024] syscall_exit_to_user_mode+0x269/0x2a0 > [Wed Nov 27 00:38:46 2024] do_syscall_64+0x81/0x180 > [Wed Nov 27 00:38:46 2024] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb > [Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f > 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa > b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 > f9 a9 0c 00 f7 d8 > [Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 > ORIG_RAX: 00000000000000a6 > [Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 > RCX: 00007fddc1ff43eb > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 > RDI: 00005632106d9410 > [Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 > R09: 0000000000000007 > [Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 > R12: 00005632106d4d28 > [Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 > R15: 00005632106d5030 > [Wed Nov 27 00:38:46 2024] </TASK> > [Wed Nov 27 00:38:46 2024] irq event stamp: 8317 > [Wed Nov 27 00:38:46 2024] hardirqs last enabled at (8323): > [<ffffffffac230dce>] __up_console_sem+0x5e/0x70 > [Wed Nov 27 00:38:46 2024] hardirqs last disabled at (8328): > [<ffffffffac230db3>] __up_console_sem+0x43/0x70 > [Wed Nov 27 00:38:46 2024] softirqs last enabled at (6628): > [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160 > [Wed Nov 27 00:38:46 2024] softirqs last disabled at (6539): > [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160 > [Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]--- > [Wed Nov 27 00:38:46 2024] VFS: Busy inodes after unmount of cifs (cifs) > [Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------ > [Wed Nov 27 00:38:46 2024] kernel BUG at fs/super.c:650! 
> [Wed Nov 27 00:38:46 2024] Oops: invalid opcode: 0000 [#1] PREEMPT SMP > KASAN NOPTI > [Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount > Tainted: G W 6.12.0 #1 > [Wed Nov 27 00:38:46 2024] Tainted: [W]=WARN > [Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS > 1.16.3-2.el9 04/01/2014 > [Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 > 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 > ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 > 90 90 90 90 90 90 > [Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282 > [Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 > RCX: 0000000000000027 > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > RDI: ff110004cb131a48 > [Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe > R09: ffe21c0099626349 > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > R12: ff110001238b69c0 > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 > R15: 0000000000000000 > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > GS:ff110004cb100000(0000) knlGS:0000000000000000 > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > CR4: 0000000000373ef0 > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: 0000000000000000 > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > DR7: 0000000000000400 > [Wed Nov 27 00:38:46 2024] Call Trace: > [Wed Nov 27 00:38:46 2024] <TASK> > [Wed Nov 27 00:38:46 2024] ? die+0x37/0x90 > [Wed Nov 27 00:38:46 2024] ? do_trap+0x133/0x230 > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] ? do_error_trap+0x94/0x130 > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] ? handle_invalid_op+0x2c/0x40 > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] ? exc_invalid_op+0x2f/0x50 > [Wed Nov 27 00:38:46 2024] ? asm_exc_invalid_op+0x1a/0x20 > [Wed Nov 27 00:38:46 2024] ? irq_work_claim+0x1e/0x40 > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] kill_anon_super+0x22/0x40 > [Wed Nov 27 00:38:46 2024] cifs_kill_sb+0x78/0x90 [cifs] > [Wed Nov 27 00:38:46 2024] deactivate_locked_super+0x69/0xf0 > [Wed Nov 27 00:38:46 2024] cleanup_mnt+0x195/0x200 > [Wed Nov 27 00:38:46 2024] task_work_run+0xec/0x150 > [Wed Nov 27 00:38:46 2024] ? __pfx_task_work_run+0x10/0x10 > [Wed Nov 27 00:38:46 2024] ? 
mark_held_locks+0x24/0x90 > [Wed Nov 27 00:38:46 2024] syscall_exit_to_user_mode+0x269/0x2a0 > [Wed Nov 27 00:38:46 2024] do_syscall_64+0x81/0x180 > [Wed Nov 27 00:38:46 2024] entry_SYSCALL_64_after_hwframe+0x76/0x7e > [Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb > [Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f > 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa > b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 > f9 a9 0c 00 f7 d8 > [Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 > ORIG_RAX: 00000000000000a6 > [Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 > RCX: 00007fddc1ff43eb > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 > RDI: 00005632106d9410 > [Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 > R09: 0000000000000007 > [Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 > R12: 00005632106d4d28 > [Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 > R15: 00005632106d5030 > [Wed Nov 27 00:38:46 2024] </TASK> > [Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 > cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 > dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns > nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib > nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct > nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw > iptable_security ip_set ebtable_filter ebtables ip6table_filter > ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net > net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink > zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper > crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm > sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg > virtio_console [last unloaded: cifs] > [Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]--- > [Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0 > [Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 > 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 > ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 > 90 90 90 90 90 90 > [Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282 > [Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 > RCX: 0000000000000027 > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > RDI: ff110004cb131a48 > [Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe > R09: ffe21c0099626349 > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > R12: ff110001238b69c0 > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 > R15: 0000000000000000 > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > GS:ff110004cb100000(0000) knlGS:0000000000000000 > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > CR4: 0000000000373ef0 > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > DR2: 0000000000000000 > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > DR7: 0000000000000400 > > On Tue, Nov 26, 
2024 at 3:50 PM Paul Aurich <paul@darkrain42.org> wrote:
> >
> > On 2024-11-22 19:28:34 -0800, Paul Aurich wrote:
> > [snip -- re-quote of the Paul/Paulo exchange already shown above]
> >
> > >>Could you please check if there is any leaked connection in
> > >>/proc/fs/cifs/DebugData after running your tests?
> >
> > >After I finish with my tests (I'm not using xfstests, although perhaps
> > >I should be) and unmount the share, DebugData doesn't show any
> > >connections for me.
> >
> > I was able to reproduce this leak. I believe the attached patch
> > addresses it.
> >
> > I'm able to intermittently see a 'Dentry still in use' bug with xfstests
> > generic/241 (what Steve saw) (the attached patch doesn't help with
> > that). I'm still unsure what's going on there.
> >
> > >~Paul
>
> --
> Thanks,
>
> Steve
And also see it without patch "smb: During unmount, ensure all cached
dir instances drop their dentry"

On Wed, Nov 27, 2024 at 7:10 PM Steve French <smfrench@gmail.com> wrote:
>
> I see this error at the end of generic/241 (in dmesg) even without the
> more recent dir lease patch:
>
> "smb: Initialize cfid->tcon before performing network ops"
>
> so it is likely unrelated (earlier bug)
>
> [snip -- same dmesg trace and quoted thread as in the message above]
>
> --
> Thanks,
>
> Steve
The unmount crash also happens with mainline, so it is not related to the
patches in for-next

http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/6/builds/128/steps/94/logs/stdio

On Wed, Nov 27, 2024 at 11:00 PM Steve French <smfrench@gmail.com> wrote:
>
> And also see it without patch "smb: During unmount, ensure all cached
> dir instances drop their dentry"
>
> [snip -- same dmesg trace and quoted thread as in the messages above]
>
> --
> Thanks,
>
> Steve
I see the attached umount check warning with current Samba (presumably
due to enabling directory leases), but it is rare - I saw it once in
test generic/337 and once in generic/339

On Thu, Nov 28, 2024 at 8:16 AM Steve French <smfrench@gmail.com> wrote:
>
> The unmount crash also happens with mainline so not related to patches
> in for-next
>
> http://smb311-linux-testing.southcentralus.cloudapp.azure.com/#/builders/6/builds/128/steps/94/logs/stdio
>
> On Wed, Nov 27, 2024 at 11:00 PM Steve French <smfrench@gmail.com> wrote:
> >
> > And also see it without patch "smb: During unmount, ensure all cached
> > dir instances drop their dentry"
> >
> > On Wed, Nov 27, 2024 at 7:10 PM Steve French <smfrench@gmail.com> wrote:
> > >
> > > I see this error at the end of generic/241 (in dmesg) even without the
> > > more recent dir lease patch:
> > >
> > > "smb: Initialize cfid->tcon before performing network ops"
> > >
> > > so it is likely unrelated (earlier bug)
> > >
> > >
> > >
> > > Nov 27 18:07:46 fedora29 kernel: Call Trace:
> > > Nov 27 18:07:46 fedora29 kernel: <TASK>
> > > Nov 27 18:07:46 fedora29 kernel: ? __warn+0xa9/0x220
> > > Nov 27 18:07:46 fedora29 kernel: ? umount_check+0xc3/0xf0
> > > Nov 27 18:07:46 fedora29 kernel: ? report_bug+0x1d4/0x1e0
> > > Nov 27 18:07:46 fedora29 kernel: ? handle_bug+0x5b/0xa0
> > > Nov 27 18:07:46 fedora29 kernel: ? exc_invalid_op+0x18/0x50
> > > Nov 27 18:07:46 fedora29 kernel: ? asm_exc_invalid_op+0x1a/0x20
> > > Nov 27 18:07:46 fedora29 kernel: ? irq_work_claim+0x1e/0x40
> > > Nov 27 18:07:46 fedora29 kernel: ? umount_check+0xc3/0xf0
> > > Nov 27 18:07:46 fedora29 kernel: ? __pfx_umount_check+0x10/0x10
> > > Nov 27 18:07:46 fedora29 kernel: d_walk+0xf3/0x4e0
> > > Nov 27 18:07:46 fedora29 kernel: ? d_walk+0x4b/0x4e0
> > > Nov 27 18:07:46 fedora29 kernel: shrink_dcache_for_umount+0x6d/0x220
> > > Nov 27 18:07:46 fedora29 kernel: generic_shutdown_super+0x4a/0x1c0
> > > Nov 27 18:07:46 fedora29 kernel: kill_anon_super+0x22/0x40
> > > Nov 27 18:07:46 fedora29 kernel: cifs_kill_sb+0x78/0x90 [cifs]
> > >
> > > On Wed, Nov 27, 2024 at 10:38 AM Steve French <smfrench@gmail.com> wrote:
> > > >
> > > > I did see the generic/241 failure again with current for-next
> > > > (unrelated to this patch though). Will try to repro it again - but
> > > > any ideas how to narrow it down or fix it would be helpful.
> > > > > > > > SECTION -- smb3 > > > > FSTYP -- cifs > > > > PLATFORM -- Linux/x86_64 fedora29 6.12.0 #1 SMP PREEMPT_DYNAMIC Wed > > > > Nov 27 01:02:07 UTC 2024 > > > > MKFS_OPTIONS -- //win16.vm.test/Scratch > > > > generic/241 73s > > > > Ran: generic/241 > > > > Passed all 1 tests > > > > SECTION -- smb3 > > > > ========================= > > > > Ran: generic/241 > > > > Passed all 1 tests > > > > Number of reconnects: 0 > > > > Test completed smb3 generic/241 at Wed Nov 27 06:38:47 AM UTC 2024 > > > > dmesg output during the test: > > > > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Share > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: generate_smb3signingkey: dumping > > > > generated AES session keys > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Id 45 00 00 08 00 c8 00 00 > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Cipher type 2 > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Session Key 00 bf ed c7 f1 95 0e > > > > 29 06 e8 82 87 b5 c8 72 06 > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: Signing Key a4 0f 15 64 d2 69 02 > > > > 2f 4e 78 60 7a fe 3e 31 4e > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerIn Key a6 fd 04 f6 04 ea > > > > 0e 6e 60 c0 1b b1 ee 63 38 e9 > > > > [Wed Nov 27 00:37:32 2024] CIFS: VFS: ServerOut Key a6 e3 e3 22 8c c2 > > > > b0 6e b1 9d 40 ea d0 89 6d d8 > > > > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch > > > > [Wed Nov 27 00:37:32 2024] CIFS: Attempting to mount //win16.vm.test/Scratch > > > > [Wed Nov 27 00:37:32 2024] run fstests generic/241 at 2024-11-27 00:37:33 > > > > [Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------ > > > > [Wed Nov 27 00:38:46 2024] BUG: Dentry > > > > 00000000318d67d4{i=11000000033f68,n=~dmtmp} still in use (1) [unmount > > > > of cifs cifs] > > > > [Wed Nov 27 00:38:46 2024] WARNING: CPU: 2 PID: 316177 at > > > > fs/dcache.c:1546 umount_check+0xc3/0xf0 > > > > [Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 > > > > cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 > > > > dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns > > > > nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib > > > > nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct > > > > nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat > > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat > > > > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw > > > > iptable_security ip_set ebtable_filter ebtables ip6table_filter > > > > ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net > > > > net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink > > > > zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper > > > > crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm > > > > sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg > > > > virtio_console [last unloaded: cifs] > > > > [Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount Not > > > > tainted 6.12.0 #1 > > > > [Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS > > > > 1.16.3-2.el9 04/01/2014 > > > > [Wed Nov 27 00:38:46 2024] RIP: 0010:umount_check+0xc3/0xf0 > > > > [Wed Nov 27 00:38:46 2024] Code: db 74 0d 48 8d 7b 40 e8 db df f5 ff > > > > 48 8b 53 40 41 55 4d 89 f1 45 89 e0 48 89 e9 48 89 ee 48 c7 c7 80 99 > > > > ba ad e8 2d 27 a2 ff <0f> 0b 58 31 c0 5b 5d 41 5c 41 5d 41 5e c3 cc cc > > > > cc cc 41 83 fc 01 > > > > [Wed 
Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fd20 EFLAGS: 00010282 > > > > [Wed Nov 27 00:38:46 2024] RAX: dffffc0000000000 RBX: ff1100010c574ce0 > > > > RCX: 0000000000000027 > > > > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > > > > RDI: ff110004cb131a48 > > > > [Wed Nov 27 00:38:46 2024] RBP: ff1100012c76bd60 R08: ffffffffac3fd2fe > > > > R09: ffe21c0099626349 > > > > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > > > > R12: 0000000000000001 > > > > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6668 R14: ffffffffc1d6e6c0 > > > > R15: ff1100012c76be18 > > > > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > > > > GS:ff110004cb100000(0000) knlGS:0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > > > > CR4: 0000000000373ef0 > > > > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > > > > DR2: 0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > > > > DR7: 0000000000000400 > > > > [Wed Nov 27 00:38:46 2024] Call Trace: > > > > [Wed Nov 27 00:38:46 2024] <TASK> > > > > [Wed Nov 27 00:38:46 2024] ? __warn+0xa9/0x220 > > > > [Wed Nov 27 00:38:46 2024] ? umount_check+0xc3/0xf0 > > > > [Wed Nov 27 00:38:46 2024] ? report_bug+0x1d4/0x1e0 > > > > [Wed Nov 27 00:38:46 2024] ? handle_bug+0x5b/0xa0 > > > > [Wed Nov 27 00:38:46 2024] ? exc_invalid_op+0x18/0x50 > > > > [Wed Nov 27 00:38:46 2024] ? asm_exc_invalid_op+0x1a/0x20 > > > > [Wed Nov 27 00:38:46 2024] ? irq_work_claim+0x1e/0x40 > > > > [Wed Nov 27 00:38:46 2024] ? umount_check+0xc3/0xf0 > > > > [Wed Nov 27 00:38:46 2024] ? __pfx_umount_check+0x10/0x10 > > > > [Wed Nov 27 00:38:46 2024] d_walk+0xf3/0x4e0 > > > > [Wed Nov 27 00:38:46 2024] ? d_walk+0x4b/0x4e0 > > > > [Wed Nov 27 00:38:46 2024] shrink_dcache_for_umount+0x6d/0x220 > > > > [Wed Nov 27 00:38:46 2024] generic_shutdown_super+0x4a/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] kill_anon_super+0x22/0x40 > > > > [Wed Nov 27 00:38:46 2024] cifs_kill_sb+0x78/0x90 [cifs] > > > > [Wed Nov 27 00:38:46 2024] deactivate_locked_super+0x69/0xf0 > > > > [Wed Nov 27 00:38:46 2024] cleanup_mnt+0x195/0x200 > > > > [Wed Nov 27 00:38:46 2024] task_work_run+0xec/0x150 > > > > [Wed Nov 27 00:38:46 2024] ? __pfx_task_work_run+0x10/0x10 > > > > [Wed Nov 27 00:38:46 2024] ? 
mark_held_locks+0x24/0x90 > > > > [Wed Nov 27 00:38:46 2024] syscall_exit_to_user_mode+0x269/0x2a0 > > > > [Wed Nov 27 00:38:46 2024] do_syscall_64+0x81/0x180 > > > > [Wed Nov 27 00:38:46 2024] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > [Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb > > > > [Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f > > > > 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa > > > > b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 > > > > f9 a9 0c 00 f7 d8 > > > > [Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 > > > > ORIG_RAX: 00000000000000a6 > > > > [Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 > > > > RCX: 00007fddc1ff43eb > > > > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 > > > > RDI: 00005632106d9410 > > > > [Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 > > > > R09: 0000000000000007 > > > > [Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 > > > > R12: 00005632106d4d28 > > > > [Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 > > > > R15: 00005632106d5030 > > > > [Wed Nov 27 00:38:46 2024] </TASK> > > > > [Wed Nov 27 00:38:46 2024] irq event stamp: 8317 > > > > [Wed Nov 27 00:38:46 2024] hardirqs last enabled at (8323): > > > > [<ffffffffac230dce>] __up_console_sem+0x5e/0x70 > > > > [Wed Nov 27 00:38:46 2024] hardirqs last disabled at (8328): > > > > [<ffffffffac230db3>] __up_console_sem+0x43/0x70 > > > > [Wed Nov 27 00:38:46 2024] softirqs last enabled at (6628): > > > > [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160 > > > > [Wed Nov 27 00:38:46 2024] softirqs last disabled at (6539): > > > > [<ffffffffac135745>] __irq_exit_rcu+0x135/0x160 > > > > [Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]--- > > > > [Wed Nov 27 00:38:46 2024] VFS: Busy inodes after unmount of cifs (cifs) > > > > [Wed Nov 27 00:38:46 2024] ------------[ cut here ]------------ > > > > [Wed Nov 27 00:38:46 2024] kernel BUG at fs/super.c:650! 
> > > > [Wed Nov 27 00:38:46 2024] Oops: invalid opcode: 0000 [#1] PREEMPT SMP > > > > KASAN NOPTI > > > > [Wed Nov 27 00:38:46 2024] CPU: 2 UID: 0 PID: 316177 Comm: umount > > > > Tainted: G W 6.12.0 #1 > > > > [Wed Nov 27 00:38:46 2024] Tainted: [W]=WARN > > > > [Wed Nov 27 00:38:46 2024] Hardware name: Red Hat KVM, BIOS > > > > 1.16.3-2.el9 04/01/2014 > > > > [Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 > > > > 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 > > > > ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 > > > > 90 90 90 90 90 90 > > > > [Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282 > > > > [Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 > > > > RCX: 0000000000000027 > > > > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > > > > RDI: ff110004cb131a48 > > > > [Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe > > > > R09: ffe21c0099626349 > > > > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > > > > R12: ff110001238b69c0 > > > > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 > > > > R15: 0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > > > > GS:ff110004cb100000(0000) knlGS:0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > > > > CR4: 0000000000373ef0 > > > > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > > > > DR2: 0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > > > > DR7: 0000000000000400 > > > > [Wed Nov 27 00:38:46 2024] Call Trace: > > > > [Wed Nov 27 00:38:46 2024] <TASK> > > > > [Wed Nov 27 00:38:46 2024] ? die+0x37/0x90 > > > > [Wed Nov 27 00:38:46 2024] ? do_trap+0x133/0x230 > > > > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] ? do_error_trap+0x94/0x130 > > > > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] ? handle_invalid_op+0x2c/0x40 > > > > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] ? exc_invalid_op+0x2f/0x50 > > > > [Wed Nov 27 00:38:46 2024] ? asm_exc_invalid_op+0x1a/0x20 > > > > [Wed Nov 27 00:38:46 2024] ? irq_work_claim+0x1e/0x40 > > > > [Wed Nov 27 00:38:46 2024] ? generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] kill_anon_super+0x22/0x40 > > > > [Wed Nov 27 00:38:46 2024] cifs_kill_sb+0x78/0x90 [cifs] > > > > [Wed Nov 27 00:38:46 2024] deactivate_locked_super+0x69/0xf0 > > > > [Wed Nov 27 00:38:46 2024] cleanup_mnt+0x195/0x200 > > > > [Wed Nov 27 00:38:46 2024] task_work_run+0xec/0x150 > > > > [Wed Nov 27 00:38:46 2024] ? __pfx_task_work_run+0x10/0x10 > > > > [Wed Nov 27 00:38:46 2024] ? 
mark_held_locks+0x24/0x90 > > > > [Wed Nov 27 00:38:46 2024] syscall_exit_to_user_mode+0x269/0x2a0 > > > > [Wed Nov 27 00:38:46 2024] do_syscall_64+0x81/0x180 > > > > [Wed Nov 27 00:38:46 2024] entry_SYSCALL_64_after_hwframe+0x76/0x7e > > > > [Wed Nov 27 00:38:46 2024] RIP: 0033:0x7fddc1ff43eb > > > > [Wed Nov 27 00:38:46 2024] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 0f > > > > 1f 40 00 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa > > > > b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 > > > > f9 a9 0c 00 f7 d8 > > > > [Wed Nov 27 00:38:46 2024] RSP: 002b:00007ffe64be88d8 EFLAGS: 00000246 > > > > ORIG_RAX: 00000000000000a6 > > > > [Wed Nov 27 00:38:46 2024] RAX: 0000000000000000 RBX: 00005632106d4c20 > > > > RCX: 00007fddc1ff43eb > > > > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000000 RSI: 0000000000000000 > > > > RDI: 00005632106d9410 > > > > [Wed Nov 27 00:38:46 2024] RBP: 00007ffe64be89b0 R08: 00005632106d4010 > > > > R09: 0000000000000007 > > > > [Wed Nov 27 00:38:46 2024] R10: 0000000000000000 R11: 0000000000000246 > > > > R12: 00005632106d4d28 > > > > [Wed Nov 27 00:38:46 2024] R13: 0000000000000000 R14: 00005632106d9410 > > > > R15: 00005632106d5030 > > > > [Wed Nov 27 00:38:46 2024] </TASK> > > > > [Wed Nov 27 00:38:46 2024] Modules linked in: cifs ccm cmac nls_utf8 > > > > cifs_arc4 nls_ucs2_utils cifs_md4 rpcsec_gss_krb5 auth_rpcgss nfsv4 > > > > dns_resolver nfs lockd grace netfs nf_conntrack_netbios_ns > > > > nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib > > > > nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct > > > > nft_chain_nat nf_tables ebtable_nat ebtable_broute ip6table_nat > > > > ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_nat > > > > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle iptable_raw > > > > iptable_security ip_set ebtable_filter ebtables ip6table_filter > > > > ip6_tables iptable_filter ip_tables sunrpc kvm_intel kvm virtio_net > > > > net_failover failover virtio_balloon loop fuse dm_multipath nfnetlink > > > > zram xfs bochs drm_client_lib drm_shmem_helper drm_kms_helper > > > > crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm > > > > sha512_ssse3 sha256_ssse3 sha1_ssse3 floppy virtio_blk qemu_fw_cfg > > > > virtio_console [last unloaded: cifs] > > > > [Wed Nov 27 00:38:46 2024] ---[ end trace 0000000000000000 ]--- > > > > [Wed Nov 27 00:38:46 2024] RIP: 0010:generic_shutdown_super+0x1b7/0x1c0 > > > > [Wed Nov 27 00:38:46 2024] Code: 7b 28 e8 5c ca f8 ff 48 8b 6b 28 48 > > > > 89 ef e8 50 ca f8 ff 48 8b 55 00 48 8d b3 68 06 00 00 48 c7 c7 e0 38 > > > > ba ad e8 d9 c1 b5 ff <0f> 0b 0f 1f 80 00 00 00 00 90 90 90 90 90 90 90 > > > > 90 90 90 90 90 90 > > > > [Wed Nov 27 00:38:46 2024] RSP: 0018:ff1100012197fdf0 EFLAGS: 00010282 > > > > [Wed Nov 27 00:38:46 2024] RAX: 000000000000002d RBX: ff110001238b6000 > > > > RCX: 0000000000000027 > > > > [Wed Nov 27 00:38:46 2024] RDX: 0000000000000027 RSI: 0000000000000004 > > > > RDI: ff110004cb131a48 > > > > [Wed Nov 27 00:38:46 2024] RBP: ffffffffc1c6ac00 R08: ffffffffac3fd2fe > > > > R09: ffe21c0099626349 > > > > [Wed Nov 27 00:38:46 2024] R10: ff110004cb131a4b R11: 0000000000000001 > > > > R12: ff110001238b69c0 > > > > [Wed Nov 27 00:38:46 2024] R13: ff110001238b6780 R14: 1fe220002432ffd4 > > > > R15: 0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] FS: 00007fddc1dcc800(0000) > > > > GS:ff110004cb100000(0000) knlGS:0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] CS: 0010 DS: 0000 ES: 
0000 CR0: 0000000080050033 > > > > [Wed Nov 27 00:38:46 2024] CR2: 00007fc6440095d0 CR3: 0000000142146005 > > > > CR4: 0000000000373ef0 > > > > [Wed Nov 27 00:38:46 2024] DR0: 0000000000000000 DR1: 0000000000000000 > > > > DR2: 0000000000000000 > > > > [Wed Nov 27 00:38:46 2024] DR3: 0000000000000000 DR6: 00000000fffe0ff0 > > > > DR7: 0000000000000400 > > > > > > > > On Tue, Nov 26, 2024 at 3:50 PM Paul Aurich <paul@darkrain42.org> wrote: > > > > > > > > > > On 2024-11-22 19:28:34 -0800, Paul Aurich wrote: > > > > > >On 2024-11-21 23:05:51 -0300, Paulo Alcantara wrote: > > > > > >>Hi Paul, > > > > > >> > > > > > >>Thanks for looking into this! Really appreciate it. > > > > > >> > > > > > >>Paul Aurich <paul@darkrain42.org> writes: > > > > > >> > > > > > >>>The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can > > > > > >>>race with various cached directory operations, which ultimately results > > > > > >>>in dentries not being dropped and these kernel BUGs: > > > > > >>> > > > > > >>>BUG: Dentry ffff88814f37e358{i=1000000000080,n=/} still in use (2) [unmount of cifs cifs] > > > > > >>>VFS: Busy inodes after unmount of cifs (cifs) > > > > > >>>------------[ cut here ]------------ > > > > > >>>kernel BUG at fs/super.c:661! > > > > > >>> > > > > > >>>This happens when a cfid is in the process of being cleaned up when, and > > > > > >>>has been removed from the cfids->entries list, including: > > > > > >>> > > > > > >>>- Receiving a lease break from the server > > > > > >>>- Server reconnection triggers invalidate_all_cached_dirs(), which > > > > > >>> removes all the cfids from the list > > > > > >>>- The laundromat thread decides to expire an old cfid. > > > > > >>> > > > > > >>>To solve these problems, dropping the dentry is done in queued work done > > > > > >>>in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs() > > > > > >>>flushes that workqueue after it drops all the dentries of which it's > > > > > >>>aware. This is a global workqueue (rather than scoped to a mount), but > > > > > >>>the queued work is minimal. > > > > > >> > > > > > >>Why does it need to be a global workqueue? Can't you make it per tcon? > > > > > > > > > > > >The problem with a per-tcon workqueue is I didn't see clean way to > > > > > >deal with multiuser mounts and flushing the workqueue in > > > > > >close_all_cached_dirs() -- when dealing with each individual tcon, > > > > > >we're still holding tlink_tree_lock, so an arbitrary sleep seems > > > > > >problematic. > > > > > > > > > > > >There could be a per-sb workqueue (stored in cifs_sb or the master > > > > > >tcon) but is there a way to get back to the superblock / master tcon > > > > > >with just a tcon (e.g. cached_dir_lease_break, when processing a lease > > > > > >break)? > > > > > > > > > > > >>>The final cleanup work for cleaning up a cfid is performed via work > > > > > >>>queued in the serverclose_wq workqueue; this is done separate from > > > > > >>>dropping the dentries so that close_all_cached_dirs() doesn't block on > > > > > >>>any server operations. > > > > > >>> > > > > > >>>Both of these queued works expect to invoked with a cfid reference and > > > > > >>>a tcon reference to avoid those objects from being freed while the work > > > > > >>>is ongoing. > > > > > >> > > > > > >>Why do you need to take a tcon reference? 
> > > > > > > > > > > >In the existing code (and my patch, without the refs), I was seeing an > > > > > >intermittent use-after-free of the tcon or cached_fids struct by > > > > > >queued work processing a lease break -- the cfid isn't linked from > > > > > >cached_fids, but smb2_close_cached_fid invoking SMB2_close can race > > > > > >with the unmount and cifs_put_tcon > > > > > > > > > > > >Something like: > > > > > > > > > > > > t1 t2 > > > > > >cached_dir_lease_break > > > > > >smb2_cached_lease_break > > > > > >smb2_close_cached_fid > > > > > >SMB2_close starts > > > > > > cifs_kill_sb > > > > > > cifs_umount > > > > > > cifs_put_link > > > > > > cifs_put_tcon > > > > > >SMB2_close continues > > > > > > > > > > > >I had a version of the patch that kept the 'in flight lease breaks' on > > > > > >a second list in cached_fids so that they could be cancelled > > > > > >synchronously from free_cached_fids(), but I struggled with it (I > > > > > >can't remember exactly, but I think I was struggling to get the linked > > > > > >list membership / removal handling and num_entries handling > > > > > >consistent). > > > > > > > > > > > >>Can't you drop the dentries > > > > > >>when tearing down tcon in cifs_put_tcon()? No concurrent mounts would > > > > > >>be able to access or free it. > > > > > > > > > > > >The dentries being dropped must occur before kill_anon_super(), as > > > > > >that's where the 'Dentry still in use' check is. All the tcons are put > > > > > >in cifs_umount(), which occurs after: > > > > > > > > > > > > kill_anon_super(sb); > > > > > > cifs_umount(cifs_sb); > > > > > > > > > > > >The other thing is that cifs_umount_begin() has this comment, which > > > > > >made me think a tcon can actually be tied to two distinct mount > > > > > >points: > > > > > > > > > > > > if ((tcon->tc_count > 1) || (tcon->status == TID_EXITING)) { > > > > > > /* we have other mounts to same share or we have > > > > > > already tried to umount this and woken up > > > > > > all waiting network requests, nothing to do */ > > > > > > > > > > > >Although, as I'm thinking about it again, I think I've misunderstood > > > > > >(and that comment is wrong?). > > > > > > > > > > > >It did cross my mind to pull some of the work out of cifs_umount into > > > > > >cifs_kill_sb (specifically, I wanted to cancel prune_tlinks earlier) > > > > > >-- no prune_tlinks would make it more feasible to drop tlink_tree_lock > > > > > >in close_all_cached_dirs(), at which point a per-tcon workqueue is > > > > > >more practical. > > > > > > > > > > > >>After running xfstests I've seen a leaked tcon in > > > > > >>/proc/fs/cifs/DebugData with no CIFS superblocks, which might be related > > > > > >>to this. > > > > > >> > > > > > >>Could you please check if there is any leaked connection in > > > > > >>/proc/fs/cifs/DebugData after running your tests? > > > > > > > > > > > >After I finish with my tests (I'm not using xfstests, although perhaps > > > > > >I should be) and unmount the share, DebugData doesn't show any > > > > > >connections for me. > > > > > > > > > > I was able to reproduce this leak. I believe the attached patch addresses it. > > > > > > > > > > I'm able to intermittently see a 'Dentry still in use' bug with xfstests > > > > > generic/241 (what Steve saw) (the attached patch doesn't help with that). I'm > > > > > still unsure what's going on there. 
> > > > > > > > > > >~Paul > > > > > > > > > > > > > > > > -- > > > > Thanks, > > > > > > > > Steve > > > > > > > > > > > > -- > > > Thanks, > > > > > > Steve > > > > > > > > -- > > Thanks, > > > > Steve > > > > -- > Thanks, > > Steve
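The locking constraint discussed in the thread above -- close_all_cached_dirs() cannot call dput(), which may sleep, while tlink_tree_lock or cfid_list_lock is held -- is what the patch below addresses by detaching each dentry pointer under the locks and dropping it only after unlocking. A minimal userspace sketch of that two-phase pattern (illustrative only, not the kernel code itself; all names here are hypothetical, with a pthread mutex standing in for the spinlocks and free() standing in for dput()):

#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct cached_entry {
	struct cached_entry *next;
	char *dentry;			/* stands in for cfid->dentry */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cached_entry *cache_head;	/* stands in for cfids->entries */

static void release_all_cached(void)
{
	struct cached_entry *e;
	char **snap = NULL;
	size_t i, n = 0;

	/* Phase 1: under the lock, only detach pointers; never sleep here. */
	pthread_mutex_lock(&list_lock);
	for (e = cache_head; e; e = e->next)
		n++;
	snap = calloc(n, sizeof(*snap));
	if (snap) {
		i = 0;
		for (e = cache_head; e; e = e->next) {
			snap[i++] = e->dentry;	/* steal the reference */
			e->dentry = NULL;
		}
	}
	pthread_mutex_unlock(&list_lock);

	/* Phase 2: the lock is dropped, so a release that may sleep
	 * (like dput()) is now safe. */
	for (i = 0; snap && i < n; i++) {
		if (snap[i]) {
			printf("releasing %s\n", snap[i]);
			free(snap[i]);
		}
	}
	free(snap);
}

int main(void)
{
	struct cached_entry a = { .next = NULL, .dentry = strdup("dir2") };
	struct cached_entry b = { .next = &a,   .dentry = strdup("dir1") };

	cache_head = &b;
	release_all_cached();
	return 0;
}

Note the snapshot allocation happens while the lock is held, so it must not sleep either; the patch below uses GFP_ATOMIC for the same reason and simply stops early if the allocation fails.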
diff --git a/fs/smb/client/cached_dir.c b/fs/smb/client/cached_dir.c
index 64c67cbb2aa5..8fb95f4347df 100644
--- a/fs/smb/client/cached_dir.c
+++ b/fs/smb/client/cached_dir.c
@@ -15,10 +15,15 @@
 static struct cached_fid *init_cached_dir(const char *path);
 static void free_cached_dir(struct cached_fid *cfid);
 static void smb2_close_cached_fid(struct kref *ref);
 static void cfids_laundromat_worker(struct work_struct *work);
 
+struct cached_dir_dentry {
+	struct list_head entry;
+	struct dentry *dentry;
+};
+
 static struct cached_fid *find_or_create_cached_dir(struct cached_fids *cfids,
 						    const char *path,
 						    bool lookup_only,
 						    __u32 max_cached_dirs)
 {
@@ -469,42 +474,68 @@ void close_all_cached_dirs(struct cifs_sb_info *cifs_sb)
 	struct rb_node *node;
 	struct cached_fid *cfid;
 	struct cifs_tcon *tcon;
 	struct tcon_link *tlink;
 	struct cached_fids *cfids;
+	struct cached_dir_dentry *tmp_list, *q;
+	LIST_HEAD(entry);
 
+	spin_lock(&cifs_sb->tlink_tree_lock);
 	for (node = rb_first(root); node; node = rb_next(node)) {
 		tlink = rb_entry(node, struct tcon_link, tl_rbnode);
 		tcon = tlink_tcon(tlink);
 		if (IS_ERR(tcon))
 			continue;
 		cfids = tcon->cfids;
 		if (cfids == NULL)
 			continue;
+		spin_lock(&cfids->cfid_list_lock);
 		list_for_each_entry(cfid, &cfids->entries, entry) {
-			dput(cfid->dentry);
+			tmp_list = kmalloc(sizeof(*tmp_list), GFP_ATOMIC);
+			if (tmp_list == NULL)
+				break;
+			spin_lock(&cfid->fid_lock);
+			tmp_list->dentry = cfid->dentry;
 			cfid->dentry = NULL;
+			spin_unlock(&cfid->fid_lock);
+
+			list_add_tail(&tmp_list->entry, &entry);
 		}
+		spin_unlock(&cfids->cfid_list_lock);
 	}
+	spin_unlock(&cifs_sb->tlink_tree_lock);
+
+	list_for_each_entry_safe(tmp_list, q, &entry, entry) {
+		list_del(&tmp_list->entry);
+		dput(tmp_list->dentry);
+		kfree(tmp_list);
+	}
+
+	/* Flush any pending work that will drop dentries */
+	flush_workqueue(cfid_put_wq);
 }
 
 /*
  * Invalidate all cached dirs when a TCON has been reset
  * due to a session loss.
  */
 void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 {
 	struct cached_fids *cfids = tcon->cfids;
 	struct cached_fid *cfid, *q;
-	LIST_HEAD(entry);
 
 	if (cfids == NULL)
 		return;
 
+	/*
+	 * Mark all the cfids as closed, and move them to the cfids->dying list.
+	 * They'll be cleaned up later by cfids_invalidation_worker. Take
+	 * a reference to each cfid during this process.
+	 */
 	spin_lock(&cfids->cfid_list_lock);
 	list_for_each_entry_safe(cfid, q, &cfids->entries, entry) {
-		list_move(&cfid->entry, &entry);
+		list_move(&cfid->entry, &cfids->dying);
 		cfids->num_entries--;
 		cfid->is_open = false;
 		cfid->on_list = false;
 		if (cfid->has_lease) {
 			/*
@@ -513,30 +544,51 @@ void invalidate_all_cached_dirs(struct cifs_tcon *tcon)
 			 */
 			cfid->has_lease = false;
 		} else
 			kref_get(&cfid->refcount);
 	}
+	/*
+	 * Queue dropping of the dentries once locks have been dropped
+	 */
+	if (!list_empty(&cfids->dying))
+		queue_work(cfid_put_wq, &cfids->invalidation_work);
 	spin_unlock(&cfids->cfid_list_lock);
-
-	list_for_each_entry_safe(cfid, q, &entry, entry) {
-		list_del(&cfid->entry);
-		cancel_work_sync(&cfid->lease_break);
-		/*
-		 * Drop the ref-count from above, either the lease-ref (if there
-		 * was one) or the extra one acquired.
-		 */
-		kref_put(&cfid->refcount, smb2_close_cached_fid);
-	}
 }
 
 static void
-smb2_cached_lease_break(struct work_struct *work)
+cached_dir_offload_close(struct work_struct *work)
 {
 	struct cached_fid *cfid = container_of(work,
-					struct cached_fid, lease_break);
+					struct cached_fid, close_work);
+	struct cifs_tcon *tcon = cfid->tcon;
+
+	WARN_ON(cfid->on_list);
 
 	kref_put(&cfid->refcount, smb2_close_cached_fid);
+	cifs_put_tcon(tcon, netfs_trace_tcon_ref_put_cached_close);
+}
+
+/*
+ * Release the cached directory's dentry, and then queue work to drop cached
+ * directory itself (closing on server if needed).
+ *
+ * Must be called with a reference to the cached_fid and a reference to the
+ * tcon.
+ */
+static void cached_dir_put_work(struct work_struct *work)
+{
+	struct cached_fid *cfid = container_of(work, struct cached_fid,
+					       put_work);
+	struct dentry *dentry;
+
+	spin_lock(&cfid->fid_lock);
+	dentry = cfid->dentry;
+	cfid->dentry = NULL;
+	spin_unlock(&cfid->fid_lock);
+
+	dput(dentry);
+	queue_work(serverclose_wq, &cfid->close_work);
 }
 
 int cached_dir_lease_break(struct cifs_tcon *tcon, __u8 lease_key[16])
 {
 	struct cached_fids *cfids = tcon->cfids;
@@ -559,12 +611,14 @@ int cached_dir_lease_break(struct cifs_tcon *tcon, __u8 lease_key[16])
 			 */
 			list_del(&cfid->entry);
 			cfid->on_list = false;
 			cfids->num_entries--;
 
-			queue_work(cifsiod_wq,
-				   &cfid->lease_break);
+			++tcon->tc_count;
+			trace_smb3_tcon_ref(tcon->debug_id, tcon->tc_count,
+					    netfs_trace_tcon_ref_get_cached_lease_break);
+			queue_work(cfid_put_wq, &cfid->put_work);
 			spin_unlock(&cfids->cfid_list_lock);
 			return true;
 		}
 	}
 	spin_unlock(&cfids->cfid_list_lock);
@@ -582,11 +636,12 @@ static struct cached_fid *init_cached_dir(const char *path)
 	if (!cfid->path) {
 		kfree(cfid);
 		return NULL;
 	}
 
-	INIT_WORK(&cfid->lease_break, smb2_cached_lease_break);
+	INIT_WORK(&cfid->close_work, cached_dir_offload_close);
+	INIT_WORK(&cfid->put_work, cached_dir_put_work);
 	INIT_LIST_HEAD(&cfid->entry);
 	INIT_LIST_HEAD(&cfid->dirents.entries);
 	mutex_init(&cfid->dirents.de_mutex);
 	spin_lock_init(&cfid->fid_lock);
 	kref_init(&cfid->refcount);
@@ -595,10 +650,13 @@ static struct cached_fid *init_cached_dir(const char *path)
 static void free_cached_dir(struct cached_fid *cfid)
 {
 	struct cached_dirent *dirent, *q;
 
+	WARN_ON(work_pending(&cfid->close_work));
+	WARN_ON(work_pending(&cfid->put_work));
+
 	dput(cfid->dentry);
 	cfid->dentry = NULL;
 
 	/*
 	 * Delete all cached dirent names
@@ -612,14 +670,34 @@ static void free_cached_dir(struct cached_fid *cfid)
 	kfree(cfid->path);
 	cfid->path = NULL;
 	kfree(cfid);
 }
 
+static void cfids_invalidation_worker(struct work_struct *work)
+{
+	struct cached_fids *cfids = container_of(work, struct cached_fids,
+						 invalidation_work);
+	struct cached_fid *cfid, *q;
+	LIST_HEAD(entry);
+
+	spin_lock(&cfids->cfid_list_lock);
+	/* move cfids->dying to the local list */
+	list_cut_before(&entry, &cfids->dying, &cfids->dying);
+	spin_unlock(&cfids->cfid_list_lock);
+
+	list_for_each_entry_safe(cfid, q, &entry, entry) {
+		list_del(&cfid->entry);
+		/* Drop the ref-count acquired in invalidate_all_cached_dirs */
+		kref_put(&cfid->refcount, smb2_close_cached_fid);
+	}
+}
+
 static void cfids_laundromat_worker(struct work_struct *work)
 {
 	struct cached_fids *cfids;
 	struct cached_fid *cfid, *q;
+	struct dentry *dentry;
 	LIST_HEAD(entry);
 
 	cfids = container_of(work, struct cached_fids, laundromat_work.work);
 
 	spin_lock(&cfids->cfid_list_lock);
@@ -641,22 +719,32 @@ static void cfids_laundromat_worker(struct work_struct *work)
 	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
 		list_del(&cfid->entry);
-		/*
-		 * Cancel and wait for the work to finish in case we are racing
-		 * with it.
-		 */
-		cancel_work_sync(&cfid->lease_break);
-		/*
-		 * Drop the ref-count from above, either the lease-ref (if there
-		 * was one) or the extra one acquired.
-		 */
-		kref_put(&cfid->refcount, smb2_close_cached_fid);
+
+		spin_lock(&cfid->fid_lock);
+		dentry = cfid->dentry;
+		cfid->dentry = NULL;
+		spin_unlock(&cfid->fid_lock);
+
+		dput(dentry);
+		if (cfid->is_open) {
+			spin_lock(&cifs_tcp_ses_lock);
+			++cfid->tcon->tc_count;
+			trace_smb3_tcon_ref(cfid->tcon->debug_id, cfid->tcon->tc_count,
+					    netfs_trace_tcon_ref_get_cached_laundromat);
+			spin_unlock(&cifs_tcp_ses_lock);
+			queue_work(serverclose_wq, &cfid->close_work);
+		} else
+			/*
+			 * Drop the ref-count from above, either the lease-ref (if there
+			 * was one) or the extra one acquired.
+			 */
+			kref_put(&cfid->refcount, smb2_close_cached_fid);
 	}
-	queue_delayed_work(cifsiod_wq, &cfids->laundromat_work,
+	queue_delayed_work(cfid_put_wq, &cfids->laundromat_work,
 			   dir_cache_timeout * HZ);
 }
 
 struct cached_fids *init_cached_dirs(void)
 {
@@ -665,13 +753,15 @@ struct cached_fids *init_cached_dirs(void)
 	cfids = kzalloc(sizeof(*cfids), GFP_KERNEL);
 	if (!cfids)
 		return NULL;
 	spin_lock_init(&cfids->cfid_list_lock);
 	INIT_LIST_HEAD(&cfids->entries);
+	INIT_LIST_HEAD(&cfids->dying);
 
+	INIT_WORK(&cfids->invalidation_work, cfids_invalidation_worker);
 	INIT_DELAYED_WORK(&cfids->laundromat_work, cfids_laundromat_worker);
-	queue_delayed_work(cifsiod_wq, &cfids->laundromat_work,
+	queue_delayed_work(cfid_put_wq, &cfids->laundromat_work,
 			   dir_cache_timeout * HZ);
 
 	return cfids;
 }
 
@@ -686,17 +776,23 @@ void free_cached_dirs(struct cached_fids *cfids)
 
 	if (cfids == NULL)
 		return;
 
 	cancel_delayed_work_sync(&cfids->laundromat_work);
+	cancel_work_sync(&cfids->invalidation_work);
 
 	spin_lock(&cfids->cfid_list_lock);
 	list_for_each_entry_safe(cfid, q, &cfids->entries, entry) {
 		cfid->on_list = false;
 		cfid->is_open = false;
 		list_move(&cfid->entry, &entry);
 	}
+	list_for_each_entry_safe(cfid, q, &cfids->dying, entry) {
+		cfid->on_list = false;
+		cfid->is_open = false;
+		list_move(&cfid->entry, &entry);
+	}
 	spin_unlock(&cfids->cfid_list_lock);
 
 	list_for_each_entry_safe(cfid, q, &entry, entry) {
 		list_del(&cfid->entry);
 		free_cached_dir(cfid);
diff --git a/fs/smb/client/cached_dir.h b/fs/smb/client/cached_dir.h
index 81ba0fd5cc16..1dfe79d947a6 100644
--- a/fs/smb/client/cached_dir.h
+++ b/fs/smb/client/cached_dir.h
@@ -42,23 +42,27 @@ struct cached_fid {
 	struct kref refcount;
 	struct cifs_fid fid;
 	spinlock_t fid_lock;
 	struct cifs_tcon *tcon;
 	struct dentry *dentry;
-	struct work_struct lease_break;
+	struct work_struct put_work;
+	struct work_struct close_work;
 	struct smb2_file_all_info file_all_info;
 	struct cached_dirents dirents;
 };
 
 /* default MAX_CACHED_FIDS is 16 */
 struct cached_fids {
 	/* Must be held when:
 	 * - accessing the cfids->entries list
+	 * - accessing the cfids->dying list
 	 */
 	spinlock_t cfid_list_lock;
 	int num_entries;
 	struct list_head entries;
+	struct list_head dying;
+	struct work_struct invalidation_work;
 	struct delayed_work laundromat_work;
 };
 
 extern struct cached_fids *init_cached_dirs(void);
 extern void free_cached_dirs(struct cached_fids *cfids);
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index 20cafdff5081..bf909c2f6b96 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -155,10 +155,11 @@ struct workqueue_struct *cifsiod_wq;
 struct workqueue_struct *decrypt_wq;
 struct workqueue_struct *fileinfo_put_wq;
 struct workqueue_struct *cifsoplockd_wq;
 struct workqueue_struct *deferredclose_wq;
 struct workqueue_struct *serverclose_wq;
+struct workqueue_struct *cfid_put_wq;
 __u32 cifs_lock_secret;
 
 /*
  * Bumps refcount for cifs super block.
  * Note that it should be only called if a reference to VFS super block is
@@ -1893,13 +1894,20 @@ init_cifs(void)
 	if (!serverclose_wq) {
 		rc = -ENOMEM;
 		goto out_destroy_deferredclose_wq;
 	}
 
-	rc = cifs_init_inodecache();
-	if (rc)
+	cfid_put_wq = alloc_workqueue("cfid_put_wq",
+				      WQ_FREEZABLE|WQ_MEM_RECLAIM, 0);
+	if (!cfid_put_wq) {
+		rc = -ENOMEM;
 		goto out_destroy_serverclose_wq;
+	}
+
+	rc = cifs_init_inodecache();
+	if (rc)
+		goto out_destroy_cfid_put_wq;
 
 	rc = cifs_init_netfs();
 	if (rc)
 		goto out_destroy_inodecache;
@@ -1963,10 +1971,12 @@ init_cifs(void)
 	destroy_mids();
 out_destroy_netfs:
 	cifs_destroy_netfs();
 out_destroy_inodecache:
 	cifs_destroy_inodecache();
+out_destroy_cfid_put_wq:
+	destroy_workqueue(cfid_put_wq);
 out_destroy_serverclose_wq:
 	destroy_workqueue(serverclose_wq);
 out_destroy_deferredclose_wq:
 	destroy_workqueue(deferredclose_wq);
 out_destroy_cifsoplockd_wq:
diff --git a/fs/smb/client/cifsglob.h b/fs/smb/client/cifsglob.h
index 5041b1ffc244..31ea19e7b998 100644
--- a/fs/smb/client/cifsglob.h
+++ b/fs/smb/client/cifsglob.h
@@ -1981,11 +1981,11 @@ require use of the stronger protocol */
 * cifsInodeInfo->open_file_lock	cifsInodeInfo->openFileList	cifs_alloc_inode
 * cifsInodeInfo->writers_lock	cifsInodeInfo->writers	cifsInodeInfo_alloc
 * cifsInodeInfo->lock_sem	cifsInodeInfo->llist	cifs_init_once
 *				->can_cache_brlcks
 * cifsInodeInfo->deferred_lock	cifsInodeInfo->deferred_closes	cifsInodeInfo_alloc
- * cached_fid->fid_mutex		cifs_tcon->crfid		tcon_info_alloc
+ * cached_fids->cfid_list_lock	cifs_tcon->cfids->entries	init_cached_dirs
 * cifsFileInfo->fh_mutex		cifsFileInfo	cifs_new_fileinfo
 * cifsFileInfo->file_info_lock	cifsFileInfo->count	cifs_new_fileinfo
 *				->invalidHandle	initiate_cifs_search
 *				->oplock_break_cancelled
 ****************************************************************************/
@@ -2069,10 +2069,11 @@ extern struct workqueue_struct *cifsiod_wq;
 extern struct workqueue_struct *decrypt_wq;
 extern struct workqueue_struct *fileinfo_put_wq;
 extern struct workqueue_struct *cifsoplockd_wq;
 extern struct workqueue_struct *deferredclose_wq;
 extern struct workqueue_struct *serverclose_wq;
+extern struct workqueue_struct *cfid_put_wq;
 extern __u32 cifs_lock_secret;
 
 extern mempool_t *cifs_sm_req_poolp;
 extern mempool_t *cifs_req_poolp;
 extern mempool_t *cifs_mid_poolp;
diff --git a/fs/smb/client/inode.c b/fs/smb/client/inode.c
index eff3f57235ee..20484853fb6b 100644
--- a/fs/smb/client/inode.c
+++ b/fs/smb/client/inode.c
@@ -2471,17 +2471,14 @@ cifs_dentry_needs_reval(struct dentry *dentry)
 	if (!lookupCacheEnabled)
 		return true;
 
 	if (!open_cached_dir_by_dentry(tcon, dentry->d_parent, &cfid)) {
-		spin_lock(&cfid->fid_lock);
 		if (cfid->time && cifs_i->time > cfid->time) {
-			spin_unlock(&cfid->fid_lock);
 			close_cached_dir(cfid);
 			return false;
 		}
-		spin_unlock(&cfid->fid_lock);
 		close_cached_dir(cfid);
 	}
 	/*
 	 * depending on inode type, check if attribute caching disabled for
 	 * files or directories
diff --git a/fs/smb/client/trace.h b/fs/smb/client/trace.h
index 0b52d22a91a0..12cbd3428a6d 100644
--- a/fs/smb/client/trace.h
+++ b/fs/smb/client/trace.h
@@ -42,18 +42,21 @@
 	EM(netfs_trace_tcon_ref_free,			"FRE       ") \
 	EM(netfs_trace_tcon_ref_free_fail,		"FRE Fail  ") \
 	EM(netfs_trace_tcon_ref_free_ipc,		"FRE Ipc   ") \
 	EM(netfs_trace_tcon_ref_free_ipc_fail,		"FRE Ipc-F ") \
 	EM(netfs_trace_tcon_ref_free_reconnect_server,	"FRE Reconn") \
+	EM(netfs_trace_tcon_ref_get_cached_laundromat,	"GET Ch-Lau") \
+	EM(netfs_trace_tcon_ref_get_cached_lease_break,	"GET Ch-Lea") \
 	EM(netfs_trace_tcon_ref_get_cancelled_close,	"GET Cn-Cls") \
 	EM(netfs_trace_tcon_ref_get_dfs_refer,		"GET DfsRef") \
 	EM(netfs_trace_tcon_ref_get_find,		"GET Find  ") \
 	EM(netfs_trace_tcon_ref_get_find_sess_tcon,	"GET FndSes") \
 	EM(netfs_trace_tcon_ref_get_reconnect_server,	"GET Reconn") \
 	EM(netfs_trace_tcon_ref_new,			"NEW       ") \
 	EM(netfs_trace_tcon_ref_new_ipc,		"NEW Ipc   ") \
 	EM(netfs_trace_tcon_ref_new_reconnect_server,	"NEW Reconn") \
+	EM(netfs_trace_tcon_ref_put_cached_close,	"PUT Ch-Cls") \
 	EM(netfs_trace_tcon_ref_put_cancelled_close,	"PUT Cn-Cls") \
 	EM(netfs_trace_tcon_ref_put_cancelled_close_fid, "PUT Cn-Fid") \
 	EM(netfs_trace_tcon_ref_put_cancelled_mid,	"PUT Cn-Mid") \
 	EM(netfs_trace_tcon_ref_put_mnt_ctx,		"PUT MntCtx") \
 	EM(netfs_trace_tcon_ref_put_reconnect_server,	"PUT Reconn") \
The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can
race with various cached directory operations, which ultimately results
in dentries not being dropped and these kernel BUGs:

BUG: Dentry ffff88814f37e358{i=1000000000080,n=/} still in use (2) [unmount of cifs cifs]
VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!

This happens when a cfid is in the process of being cleaned up and has
been removed from the cfids->entries list, including:

- Receiving a lease break from the server
- Server reconnection triggers invalidate_all_cached_dirs(), which
  removes all the cfids from the list
- The laundromat thread decides to expire an old cfid.

To solve these problems, dropping the dentry is handled by work queued
on a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
flushes that workqueue after it drops all the dentries of which it's
aware. This is a global workqueue (rather than scoped to a mount), but
the queued work is minimal.

The final cleanup work for a cfid is performed via work queued on the
serverclose_wq workqueue; this is done separately from dropping the
dentries so that close_all_cached_dirs() doesn't block on any server
operations.

Both of these queued works expect to be invoked with a cfid reference
and a tcon reference, to prevent those objects from being freed while
the work is ongoing.

While we're here, add proper locking to close_all_cached_dirs(), and
locking around the freeing of cfid->dentry.

Fixes: ebe98f1447bb ("cifs: enable caching of directories for which a lease is held")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
---
 fs/smb/client/cached_dir.c | 156 ++++++++++++++++++++++++++++++-------
 fs/smb/client/cached_dir.h |   6 +-
 fs/smb/client/cifsfs.c     |  14 +++-
 fs/smb/client/cifsglob.h   |   3 +-
 fs/smb/client/inode.c      |   3 -
 fs/smb/client/trace.h      |   3 +
 6 files changed, 148 insertions(+), 37 deletions(-)
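One invariant in the changelog above deserves emphasis: any work queued to cfid_put_wq or serverclose_wq is handed its own cfid and tcon references, so a racing unmount can drop its references without freeing either object out from under the worker. A minimal userspace analogue of that rule (illustrative only, not the kernel code; all names here are hypothetical, with pthreads and C11 atomics standing in for workqueues and krefs):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for a refcounted object such as a tcon or cfid. */
struct obj {
	atomic_int refcount;
	const char *name;
};

static struct obj *obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);	/* like kref_get() */
	return o;
}

static void obj_put(struct obj *o)
{
	/* Whichever thread drops the last reference frees the object,
	 * like kref_put(). */
	if (atomic_fetch_sub(&o->refcount, 1) == 1) {
		printf("freeing %s\n", o->name);
		free(o);
	}
}

/* Stands in for the queued work, e.g. closing a handle on the server. */
static void *deferred_close(void *arg)
{
	struct obj *o = arg;
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	nanosleep(&delay, NULL);	/* simulate a slow server round-trip */
	printf("deferred close of %s done\n", o->name);
	obj_put(o);			/* drop the ref taken at queue time */
	return NULL;
}

int main(void)
{
	struct obj *o = malloc(sizeof(*o));
	pthread_t worker;

	if (!o)
		return 1;
	atomic_init(&o->refcount, 1);	/* the "mount" holds the initial ref */
	o->name = "tcon";

	/* Queue the work *with its own reference*, as the changelog requires. */
	pthread_create(&worker, NULL, deferred_close, obj_get(o));

	/* The "unmount" path drops its reference concurrently; the object
	 * survives until the in-flight work completes. */
	obj_put(o);

	pthread_join(worker, NULL);
	return 0;
}

Whichever of the two obj_put() calls runs last performs the free, which is exactly why the SMB2_close-vs-cifs_put_tcon interleaving sketched earlier in the thread stops being a use-after-free once the queued work holds its own references.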