| Message ID | 1435687950-22037-1-git-send-email-jeff.layton@primarydata.com (mailing list archive) |
|---|---|
| State | New, archived |
On Tue, 30 Jun 2015 14:12:30 -0400 Jeff Layton <jlayton@poochiereds.net> wrote:

> Jean reported another crash, similar to the one fixed by feaff8e5b2cf:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
> IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
> PGD 0
> Oops: 0000 [#1] SMP
> Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
> CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
> Workqueue: rpciod rpc_async_schedule [sunrpc]
> 30ec000
> RIP: 0010:[<ffffffff8124ef7f>] [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
> RSP: 0018:ffff8802330efc08 EFLAGS: 00010296
> RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
> RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
> R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
> R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
> FS: 0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
> Stack:
>  ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
>  ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
>  ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
> Call Trace:
>  [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
>  [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
>  [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
>  [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
>  [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
>  [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
>  [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
>  [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
>  [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
>  [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
>  [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
>  [<ffffffff810add67>] process_one_work+0x147/0x400
>  [<ffffffff810ae42b>] worker_thread+0x11b/0x460
>  [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
>  [<ffffffff810b35d9>] kthread+0xc9/0xe0
>  [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
>  [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
>  [<ffffffff8173c222>] ret_from_fork+0x42/0x70
>  [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
> Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
> RIP [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
> RSP <ffff8802330efc08>
> CR2: 0000000000000148
> ---[ end trace 64484f16250de7ef ]---
>
> The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> We must take a reference to the struct file when running the LOCKU
> compound to prevent the final fput from running until the operation is
> complete.
>
> Reported-by: Jean Spector <jean@primarydata.com>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> ---
>  fs/nfs/nfs4proc.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 605c203de556..5e7638d8e31b 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5484,6 +5484,7 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
>  	atomic_inc(&lsp->ls_count);
>  	/* Ensure we don't close file until we're done freeing locks! */
>  	p->ctx = get_nfs_open_context(ctx);
> +	get_file(fl->fl_file);
>  	memcpy(&p->fl, fl, sizeof(p->fl));
>  	p->server = NFS_SERVER(inode);
>  	return p;
> @@ -5495,6 +5496,7 @@ static void nfs4_locku_release_calldata(void *data)
>  	nfs_free_seqid(calldata->arg.seqid);
>  	nfs4_put_lock_state(calldata->lsp);
>  	put_nfs_open_context(calldata->ctx);
> +	fput(calldata->fl.fl_file);
>  	kfree(calldata);
>  }

Oops, I forgot to Cc stable on this one...

Trond, can you add that?

Thanks,
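The two hunks in the patch above follow a standard asynchronous-lifetime pattern: take a reference on the object before the async operation is dispatched, and drop it only in the release callback once the operation is fully complete. A minimal user-space sketch of that pattern, for illustration only (`file_obj`, `calldata`, and the helpers are hypothetical stand-ins, not kernel API):

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative refcounted object standing in for struct file. */
struct file_obj {
	int count;
	int released;	/* set when the final put "frees" the object */
};

static void get_file_obj(struct file_obj *f) { f->count++; }

static void put_file_obj(struct file_obj *f)
{
	if (--f->count == 0)
		f->released = 1;	/* stands in for __fput() */
}

/* Calldata for an async operation, analogous to nfs4_unlockdata. */
struct calldata {
	struct file_obj *file;
};

/* Setup: pin the file so it outlives the caller's own reference. */
static struct calldata *calldata_alloc(struct file_obj *f)
{
	struct calldata *d = malloc(sizeof(*d));
	if (!d)
		return NULL;
	get_file_obj(f);	/* mirrors get_file(fl->fl_file) */
	d->file = f;
	return d;
}

/* Completion: drop the pin only when the operation is fully done. */
static void calldata_release(struct calldata *d)
{
	put_file_obj(d->file);	/* mirrors fput(calldata->fl.fl_file) */
	free(d);
}
```

Without the extra pin, the caller's final put could free the object while the async completion still dereferences it, which is exactly the NULL/garbage `fl_file->f_inode` dereference seen in the oops above.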
On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:

> > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
>
> Oops, I forgot to Cc stable on this one...
> Trond, can you add that?

Is the commit mentioned also targeted for stable?

commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
nfs: take extra reference to fl->fl_file when running a setlk

Regards,
On Thu, 2 Jul 2015 10:58:59 +0200 William Dauchy <wdauchy@gmail.com> wrote:

> On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:
> >
> > > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> >
> > Oops, I forgot to Cc stable on this one...
> > Trond, can you add that?
>
> Is the commit mentioned also targeted for stable?
> commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
> nfs: take extra reference to fl->fl_file when running a setlk
>
> Regards,

Oh! It wasn't marked as such but probably should be. I'll resend it to
the stable list a little later.

Thanks,
Hello,

I don't know if it's related, but after applying these two patches I got a crash; will try to get more info.

BUG: unable to handle kernel NULL pointer dereference at (nil)
IP: [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
PGD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 32013 Comm: umount.nfs4 Tainted: G W 3.14.46 #1
task: ffff880f6044ecc0 ti: ffff880f6044f248 task.ti: ffff880f6044f248
RIP: 0010:[<ffffffff810ee2b3>] [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
RSP: 0000:ffff880f4e56fcc8 EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff881ff95d4480 RCX: ffff880f4e5749c8
RDX: ffffffff817b60c0 RSI: ffff880f4e56fd50 RDI: ffff881ff95d4480
RBP: ffff880f4e56fd18 R08: 0000000000000007 R09: 00000000000000a8
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: ffff880f5b884700 R14: 000002d4a72be5fc R15: 0000000000000000
FS: 000002d4a80c47e0(0000) GS:ffff88103fc40000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000160c000 CR4: 00000000001607f0
Stack:
 ffff880f6044ecc0 0000000000000009 0000000000000082 0000000000000000
 ffff880f6044f448 ffff881ff95d4480 00000000000000a8 0000000000000007
 000002d4a72be5fc 0000000000000000 ffff880f4e56fd98 ffffffff81111f88
Call Trace:
 [<ffffffff81111f88>] __do_fault+0x78/0x5e0
 [<ffffffff81116f0c>] handle_mm_fault+0x39c/0xcb0
 [<ffffffff81033e23>] __do_page_fault+0x1b3/0x620
 [<ffffffff815fc01e>] ? retint_swapgs_pax+0x10/0x15
 [<ffffffff810a889d>] ? trace_hardirqs_on_caller+0x13d/0x1e0
 [<ffffffff812c827e>] ? trace_hardirqs_off_thunk+0x41/0x43
 [<ffffffff812c8238>] ? trace_hardirqs_on_thunk+0x41/0x46
 [<ffffffff81178c8b>] ? SyS_umount+0x8b/0x4a0
 [<ffffffff815fca33>] ? system_call_fastpath+0x16/0x1b
 [<ffffffff810342cc>] do_page_fault+0xc/0x20
 [<ffffffff815fc272>] page_fault+0x22/0x30
Code: fe ff ff 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 28 4c 8b af a0 00 00 00 4c 8b 66 08 49 8b 85 b8 01 00 00 <4c> 8b 38 48 89 45 c8 49 8b 47 40 48 8d 90 ff 0f 00 00 b8 02 00
RIP [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
RSP <ffff880f4e56fcc8>
CR2: 0000000000000000
---[ end trace 59f46e48035e53e4 ]---
On Thu, 2 Jul 2015 06:08:27 -0400 Jeff Layton <jeff.layton@primarydata.com> wrote:

> On Thu, 2 Jul 2015 10:58:59 +0200 William Dauchy <wdauchy@gmail.com> wrote:
>
> > On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:
> > >
> > > > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> > >
> > > Oops, I forgot to Cc stable on this one...
> > > Trond, can you add that?
> >
> > Is the commit mentioned also targeted for stable?
> > commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
> > nfs: take extra reference to fl->fl_file when running a setlk
> >
> > Regards,
>
> Oh! It wasn't marked as such but probably should be. I'll resend it to
> the stable list a little later.
>
> Thanks,

So, William has done some testing and hit some problems with this patch.
I suspect that it's because we can end up running an unlock after the
filp->f_count has already gone to zero, while we are in __fput, so we
take an extra reference and end up with a use-after-free.

I think it'd be best to revert this patch from all kernels for now
(mainline and stable). I don't think the one that changes the setlk
codepath is susceptible to this, but it's probably fine to hold off on
applying both until I can sort out a better way to fix this one.

Thanks!
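The failure mode described here, blindly incrementing a refcount that has already dropped to zero, is exactly what "inc-not-zero" style guards exist to prevent. A minimal user-space sketch with C11 atomics (the `obj` type and its helpers are illustrative only, loosely modeled on the kernel's `atomic_long_inc_not_zero()` idiom, not actual kernel code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative object: once the count reaches zero the object is dead,
 * and a plain increment would "resurrect" already-freed memory. */
struct obj {
	atomic_long count;
	bool freed;
};

/* Unguarded get, analogous to get_file(): assumes count > 0. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->count, 1);
}

/* Guarded get: refuses to take a reference once the count has hit zero,
 * so the caller learns the object is already on its way to being freed. */
static bool obj_get_not_zero(struct obj *o)
{
	long c = atomic_load(&o->count);
	while (c != 0) {
		if (atomic_compare_exchange_weak(&o->count, &c, c + 1))
			return true;
	}
	return false;
}

static void obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->count, 1) == 1)
		o->freed = true;	/* stands in for the real free */
}
```

In the crash scenario above, the unlock path effectively does the unguarded `obj_get()` after the final put has already run; in real code that increment writes into freed memory, which is the use-after-free Jeff suspects.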
On Fri, Jul 10, 2015 at 5:02 PM, Jeff Layton <jeff.layton@primarydata.com> wrote:

> So, William has done some testing and hit some problems with this
> patch. I suspect that it's because we can end up running an unlock
> after the filp->f_count has already gone to zero and are in __fput, so
> we take an extra reference and end up with a use-after-free.
>
> I think it'd be best to revert this patch from all kernels for now
> (mainline and stable). I don't think the one that changes the setlk
> codepath is susceptible to this, but it's probably fine to hold off on
> applying both until I can sort out a better way to fix this one.

I also think it's safer to revert both of them.
On Fri, 10 Jul 2015 17:56:57 +0200 William Dauchy <wdauchy@gmail.com> wrote:

> On Fri, Jul 10, 2015 at 5:02 PM, Jeff Layton
> <jeff.layton@primarydata.com> wrote:
> > So, William has done some testing and hit some problems with this
> > patch. I suspect that it's because we can end up running an unlock
> > after the filp->f_count has already gone to zero and are in __fput, so
> > we take an extra reference and end up with a use-after-free.
> >
> > I think it'd be best to revert this patch from all kernels for now
> > (mainline and stable). I don't think the one that changes the setlk
> > codepath is susceptible to this, but it's probably fine to hold off on
> > applying both until I can sort out a better way to fix this one.
>
> I also think it's safer to revert both of them.

Oh? Do you have some reason to suspect the setlk patch to be
problematic? If not, then I'd rather not revert that one (at least not
from mainline) since I don't think it's likely to be a problem and it
does fix a real bug. It's your call on what you do in stable of course.
Either way, I added the warning that I described before and got this, so I do suspect that the unlck patch is the cause of the problem:

[  373.144955] ------------[ cut here ]------------
[  373.146000] WARNING: CPU: 1 PID: 897 at fs/nfs/nfs4proc.c:5489 nfs4_do_unlck+0x294/0x2e0 [nfsv4]()
[  373.147975] Modules linked in: cts rpcsec_gss_krb5 nfsv4(OE) dns_resolver nfs(OE) fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq snd_seq_device snd_pcm snd_timer snd joydev soundcore e1000 virtio_balloon i2c_piix4 pvpanic parport_pc parport acpi_cpufreq nfsd nfs_acl lockd grace auth_rpcgss sunrpc virtio_console virtio_blk qxl drm_kms_helper ttm drm ata_generic pata_acpi serio_raw virtio_pci virtio_ring virtio
[  373.156863] CPU: 1 PID: 897 Comm: flock Tainted: G OE 4.2.0-0.rc1.git2.1.fc23.x86_64 #1
[  373.158455] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  373.159496]  0000000000000000 000000003297e63c ffff8800cd27fad8 ffffffff81864405
[  373.160856]  0000000000000000 0000000000000000 ffff8800cd27fb18 ffffffff810ab446
[  373.162295]  ffff8800cd27faf8 ffff880118d17a00 ffff880118d17a80 0000000000000000
[  373.163682] Call Trace:
[  373.164130]  [<ffffffff81864405>] dump_stack+0x4c/0x65
[  373.165053]  [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0
[  373.166262]  [<ffffffff810ab57a>] warn_slowpath_null+0x1a/0x20
[  373.167544]  [<ffffffffa046f854>] nfs4_do_unlck+0x294/0x2e0 [nfsv4]
[  373.168627]  [<ffffffffa0477a25>] nfs4_proc_lock+0x2d5/0x880 [nfsv4]
[  373.169734]  [<ffffffffa042dfe8>] do_unlk+0xa8/0xc0 [nfs]
[  373.170676]  [<ffffffffa042e311>] nfs_flock+0x71/0xa0 [nfs]
[  373.171651]  [<ffffffff812c99e6>] locks_remove_flock+0xa6/0xf0
[  373.172660]  [<ffffffff812cc9da>] locks_remove_file+0x5a/0x100
[  373.173685]  [<ffffffff8126ff23>] __fput+0xd3/0x200
[  373.174508]  [<ffffffff8127009e>] ____fput+0xe/0x10
[  373.175356]  [<ffffffff810d093d>] task_work_run+0x8d/0xc0
[  373.176339]  [<ffffffff8101cadd>] do_notify_resume+0x8d/0x90
[  373.177315]  [<ffffffff8186dda6>] int_signal+0x12/0x17
[  373.178167] ---[ end trace b7fce2dedc7eda37 ]---

I'll see if I can cook up a better way to fix this. It's a little tricky, since in the event that the task had a signal pending, the filp may be long gone by the time the reply to the LOCKU request comes in.

So, we may need to change the prototype of locks_remove_flock to take an inode pointer and take a reference to that instead of the filp. That should be safe, AFAICT, since we definitely still hold a reference to the inode at that point.
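Pinning the inode instead of the filp might look roughly like the following against the earlier patch. This is a rough, untested sketch only: the `p->inode` field does not exist in `struct nfs4_unlockdata` as posted, and the exact call sites would depend on how the `locks_remove_flock` prototype change shakes out.

```diff
 /* HYPOTHETICAL sketch -- field names and call sites are illustrative */
 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
 	/* Ensure we don't close file until we're done freeing locks! */
 	p->ctx = get_nfs_open_context(ctx);
-	get_file(fl->fl_file);
+	p->inode = igrab(inode);	/* pin the inode, not the filp */
 	memcpy(&p->fl, fl, sizeof(p->fl));
 @@ static void nfs4_locku_release_calldata(void *data)
 	put_nfs_open_context(calldata->ctx);
-	fput(calldata->fl.fl_file);
+	iput(calldata->inode);
 	kfree(calldata);
```

Because the inode reference is held independently of the file's `f_count`, this avoids the problem of resurrecting a filp whose count has already dropped to zero during `__fput`.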
On Fri, Jul 10, 2015 at 6:07 PM, Jeff Layton <jeff.layton@primarydata.com> wrote:

> Oh? Do you have some reason to suspect the setlk patch to be
> problematic? If not, then I'd rather not revert that one (at least not
> from mainline) since I don't think it's likely to be a problem and it
> does fix a real bug. It's your call on what you do in stable of course.

Yes, I also had some instabilities with only the setlk patch applied.
Same trace as mentioned in the other thread.
On Sat, 11 Jul 2015 00:35:27 +0200 William Dauchy <wdauchy@gmail.com> wrote:

> On Fri, Jul 10, 2015 at 6:07 PM, Jeff Layton
> <jeff.layton@primarydata.com> wrote:
> > Oh? Do you have some reason to suspect the setlk patch to be
> > problematic? If not, then I'd rather not revert that one (at least not
> > from mainline) since I don't think it's likely to be a problem and it
> > does fix a real bug. It's your call on what you do in stable of course.
>
> Yes, I also had some instabilities with the setlk patch only applied.
> Same trace as mentioned in the other thread.

That, I have no explanation for... We clearly hold a reference to the
filp already when setting a lock.

Hmm... unless there was maybe some reclaim involved? I wouldn't think
that we'd try to reclaim locks for a filp that was being torn down, but
I'd have to go over that code in detail to be sure. Can you give any
insight into what you were doing when it was having problems? Did the
server reboot while you were testing?

In any case, we can probably get rid of the extra filp reference in
that code too if/when the RFC series is merged. We definitely hold a
reference to the inode already, so we shouldn't need to take the extra
filp reference once that's merged. I'm not sure whether that patchset
will be stable material, though, since it is a more invasive fix.