nfs: take extra reference to fl->fl_file when running a LOCKU operation

Message ID 1435687950-22037-1-git-send-email-jeff.layton@primarydata.com (mailing list archive)
State New, archived

Commit Message

Jeff Layton June 30, 2015, 6:12 p.m. UTC
Jean reported another crash, similar to the one fixed by feaff8e5b2cf:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
    IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
    CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    30ec000
    RIP: 0010:[<ffffffff8124ef7f>]  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    RSP: 0018:ffff8802330efc08  EFLAGS: 00010296
    RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
    R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
    FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
    Stack:
     ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
     ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
     ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
    Call Trace:
     [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
     [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
     [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
     [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
     [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
     [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
     [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
     [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
     [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810add67>] process_one_work+0x147/0x400
     [<ffffffff810ae42b>] worker_thread+0x11b/0x460
     [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
     [<ffffffff810b35d9>] kthread+0xc9/0xe0
     [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
     [<ffffffff8173c222>] ret_from_fork+0x42/0x70
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
    Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
    RIP  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
     RSP <ffff8802330efc08>
    CR2: 0000000000000148
    ---[ end trace 64484f16250de7ef ]---

The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
We must take a reference to the struct file when running the LOCKU
compound to prevent the final fput from running until the operation is
complete.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
---
 fs/nfs/nfs4proc.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jeff Layton July 1, 2015, 1:35 p.m. UTC | #1
On Tue, 30 Jun 2015 14:12:30 -0400
Jeff Layton <jlayton@poochiereds.net> wrote:

> Jean reported another crash, similar to the one fixed by feaff8e5b2cf:
> 
>     BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
>     IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
>     PGD 0
>     Oops: 0000 [#1] SMP
>     Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
>     CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
>     Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
>     Workqueue: rpciod rpc_async_schedule [sunrpc]
>     30ec000
>     RIP: 0010:[<ffffffff8124ef7f>]  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
>     RSP: 0018:ffff8802330efc08  EFLAGS: 00010296
>     RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
>     RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
>     RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
>     R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
>     R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
>     FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
>     Stack:
>      ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
>      ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
>      ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
>     Call Trace:
>      [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
>      [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
>      [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
>      [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
>      [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
>      [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
>      [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
>      [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>      [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>      [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
>      [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
>      [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
>      [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
>      [<ffffffff810add67>] process_one_work+0x147/0x400
>      [<ffffffff810ae42b>] worker_thread+0x11b/0x460
>      [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
>      [<ffffffff810b35d9>] kthread+0xc9/0xe0
>      [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
>      [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
>      [<ffffffff8173c222>] ret_from_fork+0x42/0x70
>      [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
>     Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
>     RIP  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
>      RSP <ffff8802330efc08>
>     CR2: 0000000000000148
>     ---[ end trace 64484f16250de7ef ]---
> 
> The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> We must take a reference to the struct file when running the LOCKU
> compound to prevent the final fput from running until the operation is
> complete.
> 
> Reported-by: Jean Spector <jean@primarydata.com>
> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
> ---
>  fs/nfs/nfs4proc.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index 605c203de556..5e7638d8e31b 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -5484,6 +5484,7 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
>  	atomic_inc(&lsp->ls_count);
>  	/* Ensure we don't close file until we're done freeing locks! */
>  	p->ctx = get_nfs_open_context(ctx);
> +	get_file(fl->fl_file);
>  	memcpy(&p->fl, fl, sizeof(p->fl));
>  	p->server = NFS_SERVER(inode);
>  	return p;
> @@ -5495,6 +5496,7 @@ static void nfs4_locku_release_calldata(void *data)
>  	nfs_free_seqid(calldata->arg.seqid);
>  	nfs4_put_lock_state(calldata->lsp);
>  	put_nfs_open_context(calldata->ctx);
> +	fput(calldata->fl.fl_file);
>  	kfree(calldata);
>  }
>  

Oops, I forgot to Cc stable on this one...

Trond, can you add that?

Thanks,
William Dauchy July 2, 2015, 8:58 a.m. UTC | #2
On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:
>
> > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
>
> Oops, I forgot to Cc stable on this one...
> Trond, can you add that?

Is the commit mentioned also targeted for stable?
commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
nfs: take extra reference to fl->fl_file when running a setlk

Regards,
Jeff Layton July 2, 2015, 10:08 a.m. UTC | #3
On Thu, 2 Jul 2015 10:58:59 +0200
William Dauchy <wdauchy@gmail.com> wrote:

> On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:
> >
> > > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> >
> > Oops, I forgot to Cc stable on this one...
> > Trond, can you add that?
> 
> Is the commit mentioned also targeted for stable?
> commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
> nfs: take extra reference to fl->fl_file when running a setlk
> 
> Regards,

Oh! It wasn't marked as such but probably should be. I'll resend it to
stable list a little later.

Thanks,
William Dauchy July 6, 2015, 9:46 a.m. UTC | #4
Hello,

I don't know if it's related but after applying these two patches, I
got a crash; will try to get more info.

BUG: unable to handle kernel NULL pointer dereference at            (nil)
IP: [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
PGD 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 32013 Comm: umount.nfs4 Tainted: G        W    3.14.46 #1
task: ffff880f6044ecc0 ti: ffff880f6044f248 task.ti: ffff880f6044f248
RIP: 0010:[<ffffffff810ee2b3>]  [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
RSP: 0000:ffff880f4e56fcc8  EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff881ff95d4480 RCX: ffff880f4e5749c8
RDX: ffffffff817b60c0 RSI: ffff880f4e56fd50 RDI: ffff881ff95d4480
RBP: ffff880f4e56fd18 R08: 0000000000000007 R09: 00000000000000a8
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000007
R13: ffff880f5b884700 R14: 000002d4a72be5fc R15: 0000000000000000
FS:  000002d4a80c47e0(0000) GS:ffff88103fc40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000000160c000 CR4: 00000000001607f0
Stack:
ffff880f6044ecc0 0000000000000009 0000000000000082 0000000000000000
ffff880f6044f448 ffff881ff95d4480 00000000000000a8 0000000000000007
000002d4a72be5fc 0000000000000000 ffff880f4e56fd98 ffffffff81111f88
Call Trace:
[<ffffffff81111f88>] __do_fault+0x78/0x5e0
[<ffffffff81116f0c>] handle_mm_fault+0x39c/0xcb0
[<ffffffff81033e23>] __do_page_fault+0x1b3/0x620
[<ffffffff815fc01e>] ? retint_swapgs_pax+0x10/0x15
[<ffffffff810a889d>] ? trace_hardirqs_on_caller+0x13d/0x1e0
[<ffffffff812c827e>] ? trace_hardirqs_off_thunk+0x41/0x43
[<ffffffff812c8238>] ? trace_hardirqs_on_thunk+0x41/0x46
[<ffffffff81178c8b>] ? SyS_umount+0x8b/0x4a0
[<ffffffff815fca33>] ? system_call_fastpath+0x16/0x1b
[<ffffffff810342cc>] do_page_fault+0xc/0x20
[<ffffffff815fc272>] page_fault+0x22/0x30
Code: fe ff ff 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 83 ec 28 4c 8b af a0 00 00 00 4c 8b 66 08 49 8b 85 b8 01 00 00 <4c> 8b 38 48 89 45 c8 49 8b 47 40 48 8d 90 ff 0f 00 00 b8 02 00
RIP  [<ffffffff810ee2b3>] filemap_fault+0x23/0x430
RSP <ffff880f4e56fcc8>
CR2: 0000000000000000
---[ end trace 59f46e48035e53e4 ]---
Jeff Layton July 10, 2015, 3:02 p.m. UTC | #5
On Thu, 2 Jul 2015 06:08:27 -0400
Jeff Layton <jeff.layton@primarydata.com> wrote:

> On Thu, 2 Jul 2015 10:58:59 +0200
> William Dauchy <wdauchy@gmail.com> wrote:
> 
> > On Wed, Jul 1, 2015 at 3:37 PM Jeff Layton <jeff.layton@primarydata.com> wrote:
> > >
> > > > The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
> > >
> > > Oops, I forgot to Cc stable on this one...
> > > Trond, can you add that?
> > 
> > Is the commit mentioned also targeted for stable?
> > commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596
> > nfs: take extra reference to fl->fl_file when running a setlk
> > 
> > Regards,
> 
> Oh! It wasn't marked as such but probably should be. I'll resend it to
> stable list a little later.
> 
> Thanks,

So, William has done some testing and hit some problems with this
patch. I suspect that it's because we can end up running an unlock
after filp->f_count has already gone to zero and we are in __fput, so
we take an extra reference and end up with a use-after-free.
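
To make the suspected failure mode concrete, here is a minimal user-space
sketch (not kernel code). It assumes a plain counter can stand in for
filp->f_count. struct model_file and every model_* function are
hypothetical names invented for this illustration. It shows why an
unconditional get after the count has reached zero leads to a second
"final" put, and how a conditional get, similar in spirit to the kernel's
inc-not-zero helpers, would refuse instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the count matters here. */
struct model_file {
	atomic_long count;  /* plays the role of filp->f_count */
	int teardowns;      /* how many times the "final put" path ran */
};

static void model_file_init(struct model_file *f)
{
	atomic_init(&f->count, 1);
	f->teardowns = 0;
}

/* Unconditional get: only valid while the caller already holds a reference. */
static void model_get(struct model_file *f)
{
	atomic_fetch_add(&f->count, 1);
}

/* Conditional get: takes a reference only if the count is still nonzero. */
static bool model_get_unless_zero(struct model_file *f)
{
	long old = atomic_load(&f->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&f->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this put was the final one. */
static bool model_put(struct model_file *f)
{
	if (atomic_fetch_sub(&f->count, 1) == 1) {
		f->teardowns++;
		return true;
	}
	return false;
}

/* Walk through the sequence suspected in this thread. */
static void model_demo(void)
{
	struct model_file f;
	bool final;

	model_file_init(&f);

	final = model_put(&f);           /* last close: teardown begins */
	assert(final);
	assert(!model_get_unless_zero(&f)); /* conditional get refuses */

	model_get(&f);                   /* unconditional get: 0 -> 1 */
	final = model_put(&f);           /* so a second "final" put happens */
	assert(final);
	assert(f.teardowns == 2);        /* teardown ran twice: the bug */
}
```

In this model the second teardown is the analogue of the use-after-free:
the object is released once by the original final put and again when the
late reference is dropped.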

I think it'd be best to revert this patch from all kernels for now
(mainline and stable). I don't think the one that changes the setlk
codepath is susceptible to this, but it's probably fine to hold off on
applying both until I can sort out a better way to fix this one.

Thanks!
William Dauchy July 10, 2015, 3:56 p.m. UTC | #6
On Fri, Jul 10, 2015 at 5:02 PM, Jeff Layton
<jeff.layton@primarydata.com> wrote:
> So, William has done some testing and hit some problems with this
> patch. I suspect that it's because we can end up running an unlock
> after filp->f_count has already gone to zero and we are in __fput, so
> we take an extra reference and end up with a use-after-free.
>
> I think it'd be best to revert this patch from all kernels for now
> (mainline and stable). I don't think the one that changes the setlk
> codepath is susceptible to this, but it's probably fine to hold off on
> applying both until I can sort out a better way to fix this one.

I also think it's safer to revert both of them.
Jeff Layton July 10, 2015, 4:07 p.m. UTC | #7
On Fri, 10 Jul 2015 17:56:57 +0200
William Dauchy <wdauchy@gmail.com> wrote:

> On Fri, Jul 10, 2015 at 5:02 PM, Jeff Layton
> <jeff.layton@primarydata.com> wrote:
> > So, William has done some testing and hit some problems with this
> > patch. I suspect that it's because we can end up running an unlock
> > after filp->f_count has already gone to zero and we are in __fput, so
> > we take an extra reference and end up with a use-after-free.
> >
> > I think it'd be best to revert this patch from all kernels for now
> > (mainline and stable). I don't think the one that changes the setlk
> > codepath is susceptible to this, but it's probably fine to hold off on
> > applying both until I can sort out a better way to fix this one.
> 
> I also think it's safer to revert both of them.
> 

Oh? Do you have some reason to suspect the setlk patch to be
problematic? If not, then I'd rather not revert that one (at least not
from mainline) since I don't think it's likely to be a problem and it
does fix a real bug. It's your call on what you do in stable of course.

Either way, I added the warning that I described before and got this,
so I do suspect that the unlck patch is the cause of the problem:

[  373.144955] ------------[ cut here ]------------
[  373.146000] WARNING: CPU: 1 PID: 897 at fs/nfs/nfs4proc.c:5489 nfs4_do_unlck+0x294/0x2e0 [nfsv4]()
[  373.147975] Modules linked in: cts rpcsec_gss_krb5 nfsv4(OE) dns_resolver nfs(OE) fscache xfs libcrc32c snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep ppdev snd_seq snd_seq_device snd_pcm snd_timer snd joydev soundcore e1000 virtio_balloon i2c_piix4 pvpanic parport_pc parport acpi_cpufreq nfsd nfs_acl lockd grace auth_rpcgss sunrpc virtio_console virtio_blk qxl drm_kms_helper ttm drm ata_generic pata_acpi serio_raw virtio_pci virtio_ring virtio
[  373.156863] CPU: 1 PID: 897 Comm: flock Tainted: G           OE   4.2.0-0.rc1.git2.1.fc23.x86_64 #1
[  373.158455] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  373.159496]  0000000000000000 000000003297e63c ffff8800cd27fad8 ffffffff81864405
[  373.160856]  0000000000000000 0000000000000000 ffff8800cd27fb18 ffffffff810ab446
[  373.162295]  ffff8800cd27faf8 ffff880118d17a00 ffff880118d17a80 0000000000000000
[  373.163682] Call Trace:
[  373.164130]  [<ffffffff81864405>] dump_stack+0x4c/0x65
[  373.165053]  [<ffffffff810ab446>] warn_slowpath_common+0x86/0xc0
[  373.166262]  [<ffffffff810ab57a>] warn_slowpath_null+0x1a/0x20
[  373.167544]  [<ffffffffa046f854>] nfs4_do_unlck+0x294/0x2e0 [nfsv4]
[  373.168627]  [<ffffffffa0477a25>] nfs4_proc_lock+0x2d5/0x880 [nfsv4]
[  373.169734]  [<ffffffffa042dfe8>] do_unlk+0xa8/0xc0 [nfs]
[  373.170676]  [<ffffffffa042e311>] nfs_flock+0x71/0xa0 [nfs]
[  373.171651]  [<ffffffff812c99e6>] locks_remove_flock+0xa6/0xf0
[  373.172660]  [<ffffffff812cc9da>] locks_remove_file+0x5a/0x100
[  373.173685]  [<ffffffff8126ff23>] __fput+0xd3/0x200
[  373.174508]  [<ffffffff8127009e>] ____fput+0xe/0x10
[  373.175356]  [<ffffffff810d093d>] task_work_run+0x8d/0xc0
[  373.176339]  [<ffffffff8101cadd>] do_notify_resume+0x8d/0x90
[  373.177315]  [<ffffffff8186dda6>] int_signal+0x12/0x17
[  373.178167] ---[ end trace b7fce2dedc7eda37 ]---

I'll see if I can cook up a better way to fix this. It's a little
tricky since in the event that the task had a signal pending, the filp
may be long gone by the time the reply to the LOCKU request comes in.

So, we may need to change the prototype of locks_remove_flock to take
an inode pointer and take a reference to that instead of the filp. That
should be safe, AFAICT since we definitely still hold a reference to
the inode at that point.
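
The idea of pinning the inode rather than the filp can be sketched with a
small user-space model. This is only an illustration under stated
assumptions, not the eventual kernel change: the model is single-threaded
and uses plain counters where the kernel uses atomic reference counts.
struct model_inode, struct model_filp, struct model_unlock_call and all
model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded models of the objects involved. */
struct model_inode {
	long refs;
	bool evicted;               /* set when the last reference is dropped */
};

struct model_filp {
	struct model_inode *inode;  /* an open file pins its inode */
	bool released;
};

struct model_unlock_call {
	struct model_inode *inode;  /* pinned for the lifetime of the request */
};

static void model_inode_get(struct model_inode *inode)
{
	inode->refs++;
}

static void model_inode_put(struct model_inode *inode)
{
	if (--inode->refs == 0)
		inode->evicted = true;
}

/* Start the asynchronous unlock: take an inode reference, not a filp one. */
static struct model_unlock_call model_unlock_start(struct model_inode *inode)
{
	struct model_unlock_call call = { .inode = inode };

	model_inode_get(inode);
	return call;
}

/* Reply arrives: report whether the inode was still alive, then unpin it. */
static bool model_unlock_done(struct model_unlock_call *call)
{
	bool inode_alive = !call->inode->evicted;

	model_inode_put(call->inode);
	call->inode = NULL;
	return inode_alive;
}

/* Final put of the file: issue the unlock, then finish tearing the file down. */
static struct model_unlock_call model_final_fput(struct model_filp *filp)
{
	struct model_unlock_call call = model_unlock_start(filp->inode);

	filp->released = true;
	model_inode_put(filp->inode);
	filp->inode = NULL;
	return call;
}

static void model_inode_demo(void)
{
	struct model_inode inode = { .refs = 1, .evicted = false };
	struct model_filp filp = { .inode = &inode, .released = false };
	struct model_unlock_call call = model_final_fput(&filp);
	bool ok;

	assert(filp.released);    /* the file is gone... */
	assert(!inode.evicted);   /* ...but the pending unlock keeps the inode */

	ok = model_unlock_done(&call);
	assert(ok);               /* the reply handler saw a live inode */
	assert(inode.evicted);    /* the last reference is dropped afterwards */
}
```

In the model, the file can finish its teardown immediately because the
in-flight request never needs it again. Only the inode has to outlive
the reply.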
William Dauchy July 10, 2015, 10:35 p.m. UTC | #8
On Fri, Jul 10, 2015 at 6:07 PM, Jeff Layton
<jeff.layton@primarydata.com> wrote:
> Oh? Do you have some reason to suspect the setlk patch to be
> problematic? If not, then I'd rather not revert that one (at least not
> from mainline) since I don't think it's likely to be a problem and it
> does fix a real bug. It's your call on what you do in stable of course.

Yes, I also had some instabilities with the setlk patch only applied.
Same trace as mentioned in the other thread.
Jeff Layton July 10, 2015, 10:51 p.m. UTC | #9
On Sat, 11 Jul 2015 00:35:27 +0200
William Dauchy <wdauchy@gmail.com> wrote:

> On Fri, Jul 10, 2015 at 6:07 PM, Jeff Layton
> <jeff.layton@primarydata.com> wrote:
> > Oh? Do you have some reason to suspect the setlk patch to be
> > problematic? If not, then I'd rather not revert that one (at least not
> > from mainline) since I don't think it's likely to be a problem and it
> > does fix a real bug. It's your call on what you do in stable of course.
> 
> Yes, I also had some instabilities with the setlk patch only applied.
> Same trace as mentioned in the other thread.
> 

That, I have no explanation for...

We clearly hold a reference to the filp already when setting a lock.

Hmm...unless there was maybe some reclaim involved? I wouldn't think
that we'd try to reclaim locks for a filp that was being torn down, but
I'd have to go over that code in detail to be sure. Can you give any
insight into what you were doing when it was having problems? Did the
server reboot while you were testing?

In any case, we can probably get rid of the extra filp reference in
that code too if/when the RFC series is merged. We definitely hold a
reference to the inode already, so we shouldn't need to take the extra
filp reference once that's merged.

Not sure whether that patchset will be stable material though since it
is a more invasive fix.

Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 605c203de556..5e7638d8e31b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -5484,6 +5484,7 @@ static struct nfs4_unlockdata *nfs4_alloc_unlockdata(struct file_lock *fl,
 	atomic_inc(&lsp->ls_count);
 	/* Ensure we don't close file until we're done freeing locks! */
 	p->ctx = get_nfs_open_context(ctx);
+	get_file(fl->fl_file);
 	memcpy(&p->fl, fl, sizeof(p->fl));
 	p->server = NFS_SERVER(inode);
 	return p;
@@ -5495,6 +5496,7 @@ static void nfs4_locku_release_calldata(void *data)
 	nfs_free_seqid(calldata->arg.seqid);
 	nfs4_put_lock_state(calldata->lsp);
 	put_nfs_open_context(calldata->ctx);
+	fput(calldata->fl.fl_file);
 	kfree(calldata);
 }