diff mbox series

nfsd: return -EINVAL when namelen is 0

Message ID 20240903111446.659884-1-lilingfeng3@huawei.com (mailing list archive)
State New
Headers show
Series nfsd: return -EINVAL when namelen is 0 | expand

Commit Message

Li Lingfeng Sept. 3, 2024, 11:14 a.m. UTC
When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
result in namelen being 0, which will cause memdup_user() to return
ZERO_SIZE_PTR.
When we access the name.data that has been assigned the value of
ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
triggered.

[ T1205] ==================================================================
[ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
[ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
[ T1205]
[ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
[ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
[ T1205] Call Trace:
[ T1205]  dump_stack+0x9a/0xd0
[ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
[ T1205]  __kasan_report.cold+0x34/0x84
[ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
[ T1205]  kasan_report+0x3a/0x50
[ T1205]  nfs4_client_to_reclaim+0xe9/0x260
[ T1205]  ? nfsd4_release_lockowner+0x410/0x410
[ T1205]  cld_pipe_downcall+0x5ca/0x760
[ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
[ T1205]  ? down_write_killable_nested+0x170/0x170
[ T1205]  ? avc_policy_seqno+0x28/0x40
[ T1205]  ? selinux_file_permission+0x1b4/0x1e0
[ T1205]  rpc_pipe_write+0x84/0xb0
[ T1205]  vfs_write+0x143/0x520
[ T1205]  ksys_write+0xc9/0x170
[ T1205]  ? __ia32_sys_read+0x50/0x50
[ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
[ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
[ T1205]  do_syscall_64+0x33/0x40
[ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
[ T1205] RIP: 0033:0x7fdbdb761bc7
[ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
[ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
[ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
[ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
[ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
[ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
[ T1205] ==================================================================

Fix it by checking namelen.

Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
---
 fs/nfsd/nfs4recover.c | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Jeff Layton Sept. 3, 2024, 11:23 a.m. UTC | #1
On Tue, 2024-09-03 at 19:14 +0800, Li Lingfeng wrote:
> When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> result in namelen being 0, which will cause memdup_user() to return
> ZERO_SIZE_PTR.
> When we access the name.data that has been assigned the value of
> ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> triggered.
> 
> [ T1205] ==================================================================
> [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> [ T1205]
> [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ T1205] Call Trace:
> [ T1205]  dump_stack+0x9a/0xd0
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  __kasan_report.cold+0x34/0x84
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  kasan_report+0x3a/0x50
> [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> [ T1205]  cld_pipe_downcall+0x5ca/0x760
> [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> [ T1205]  ? down_write_killable_nested+0x170/0x170
> [ T1205]  ? avc_policy_seqno+0x28/0x40
> [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> [ T1205]  rpc_pipe_write+0x84/0xb0
> [ T1205]  vfs_write+0x143/0x520
> [ T1205]  ksys_write+0xc9/0x170
> [ T1205]  ? __ia32_sys_read+0x50/0x50
> [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> [ T1205]  do_syscall_64+0x33/0x40
> [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> [ T1205] RIP: 0033:0x7fdbdb761bc7
> [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> [ T1205] ==================================================================
> 
> Fix it by checking namelen.
> 
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>  fs/nfsd/nfs4recover.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 67d8673a9391..69a3a84e159e 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>  			ci = &cmsg->cm_u.cm_clntinfo;
>  			if (get_user(namelen, &ci->cc_name.cn_len))
>  				return -EFAULT;
> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
>  			if (IS_ERR(name.data))
>  				return PTR_ERR(name.data);
> @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>  			cnm = &cmsg->cm_u.cm_name;
>  			if (get_user(namelen, &cnm->cn_len))
>  				return -EFAULT;
> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>  			name.data = memdup_user(&cnm->cn_id, namelen);
>  			if (IS_ERR(name.data))
>  				return PTR_ERR(name.data);

This error will come back to nfsdcld in its downcall write(). -EINVAL
is a bit generic. It might be better to go with a more distinct error,
but I don't think it matters too much.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Chuck Lever Sept. 3, 2024, 3:27 p.m. UTC | #2
On Tue, Sep 03, 2024 at 07:23:18AM -0400, Jeff Layton wrote:
> On Tue, 2024-09-03 at 19:14 +0800, Li Lingfeng wrote:
> > When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> > result in namelen being 0, which will cause memdup_user() to return
> > ZERO_SIZE_PTR.
> > When we access the name.data that has been assigned the value of
> > ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> > triggered.
> > 
> > [ T1205] ==================================================================
> > [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> > [ T1205]
> > [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> > [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > [ T1205] Call Trace:
> > [ T1205]  dump_stack+0x9a/0xd0
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  __kasan_report.cold+0x34/0x84
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  kasan_report+0x3a/0x50
> > [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> > [ T1205]  cld_pipe_downcall+0x5ca/0x760
> > [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> > [ T1205]  ? down_write_killable_nested+0x170/0x170
> > [ T1205]  ? avc_policy_seqno+0x28/0x40
> > [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> > [ T1205]  rpc_pipe_write+0x84/0xb0
> > [ T1205]  vfs_write+0x143/0x520
> > [ T1205]  ksys_write+0xc9/0x170
> > [ T1205]  ? __ia32_sys_read+0x50/0x50
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> > [ T1205]  do_syscall_64+0x33/0x40
> > [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > [ T1205] RIP: 0033:0x7fdbdb761bc7
> > [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> > [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> > [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> > [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> > [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> > [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> > [ T1205] ==================================================================
> > 
> > Fix it by checking namelen.
> > 
> > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > ---
> >  fs/nfsd/nfs4recover.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index 67d8673a9391..69a3a84e159e 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> >  			ci = &cmsg->cm_u.cm_clntinfo;
> >  			if (get_user(namelen, &ci->cc_name.cn_len))
> >  				return -EFAULT;
> > +			if (!namelen) {
> > +				dprintk("%s: namelen should not be zero", __func__);
> > +				return -EINVAL;
> > +			}
> >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> >  			if (IS_ERR(name.data))
> >  				return PTR_ERR(name.data);
> > @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> >  			cnm = &cmsg->cm_u.cm_name;
> >  			if (get_user(namelen, &cnm->cn_len))
> >  				return -EFAULT;
> > +			if (!namelen) {
> > +				dprintk("%s: namelen should not be zero", __func__);
> > +				return -EINVAL;
> > +			}
> >  			name.data = memdup_user(&cnm->cn_id, namelen);
> >  			if (IS_ERR(name.data))
> >  				return PTR_ERR(name.data);
> 
> This error will come back to nfsdcld in its downcall write(). -EINVAL
> is a bit generic. It might be better to go with a more distinct error,
> but I don't think it matters too much.
> 
> Reviewed-by: Jeff Layton <jlayton@kernel.org>

Wondering if

   Fixes: 74725959c33c ("nfsd: un-deprecate nfsdcld")

would be appropriate.
Jeff Layton Sept. 3, 2024, 3:35 p.m. UTC | #3
On Tue, 2024-09-03 at 11:27 -0400, Chuck Lever wrote:
> On Tue, Sep 03, 2024 at 07:23:18AM -0400, Jeff Layton wrote:
> > On Tue, 2024-09-03 at 19:14 +0800, Li Lingfeng wrote:
> > > When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> > > result in namelen being 0, which will cause memdup_user() to return
> > > ZERO_SIZE_PTR.
> > > When we access the name.data that has been assigned the value of
> > > ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> > > triggered.
> > > 
> > > [ T1205] ==================================================================
> > > [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> > > [ T1205]
> > > [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> > > [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > > [ T1205] Call Trace:
> > > [ T1205]  dump_stack+0x9a/0xd0
> > > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  __kasan_report.cold+0x34/0x84
> > > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  kasan_report+0x3a/0x50
> > > [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> > > [ T1205]  cld_pipe_downcall+0x5ca/0x760
> > > [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> > > [ T1205]  ? down_write_killable_nested+0x170/0x170
> > > [ T1205]  ? avc_policy_seqno+0x28/0x40
> > > [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> > > [ T1205]  rpc_pipe_write+0x84/0xb0
> > > [ T1205]  vfs_write+0x143/0x520
> > > [ T1205]  ksys_write+0xc9/0x170
> > > [ T1205]  ? __ia32_sys_read+0x50/0x50
> > > [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> > > [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> > > [ T1205]  do_syscall_64+0x33/0x40
> > > [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > > [ T1205] RIP: 0033:0x7fdbdb761bc7
> > > [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> > > [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > > [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> > > [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> > > [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> > > [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> > > [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> > > [ T1205] ==================================================================
> > > 
> > > Fix it by checking namelen.
> > > 
> > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > > ---
> > >  fs/nfsd/nfs4recover.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > index 67d8673a9391..69a3a84e159e 100644
> > > --- a/fs/nfsd/nfs4recover.c
> > > +++ b/fs/nfsd/nfs4recover.c
> > > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> > >  			ci = &cmsg->cm_u.cm_clntinfo;
> > >  			if (get_user(namelen, &ci->cc_name.cn_len))
> > >  				return -EFAULT;
> > > +			if (!namelen) {
> > > +				dprintk("%s: namelen should not be zero", __func__);
> > > +				return -EINVAL;
> > > +			}
> > >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> > >  			if (IS_ERR(name.data))
> > >  				return PTR_ERR(name.data);
> > > @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> > >  			cnm = &cmsg->cm_u.cm_name;
> > >  			if (get_user(namelen, &cnm->cn_len))
> > >  				return -EFAULT;
> > > +			if (!namelen) {
> > > +				dprintk("%s: namelen should not be zero", __func__);
> > > +				return -EINVAL;
> > > +			}
> > >  			name.data = memdup_user(&cnm->cn_id, namelen);
> > >  			if (IS_ERR(name.data))
> > >  				return PTR_ERR(name.data);
> > 
> > This error will come back to nfsdcld in its downcall write(). -EINVAL
> > is a bit generic. It might be better to go with a more distinct error,
> > but I don't think it matters too much.
> > 
> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
> Wondering if
> 
>    Fixes: 74725959c33c ("nfsd: un-deprecate nfsdcld")
> 
> would be appropriate.
> 
> 

Hmm, 2019 patch. Are there any supported release streams that don't
have that commit?

Oh well, I guess it won't hurt anything to add it.
Chuck Lever Sept. 3, 2024, 9:35 p.m. UTC | #4
On Tue, 03 Sep 2024 19:14:46 +0800, Li Lingfeng wrote:
> When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> result in namelen being 0, which will cause memdup_user() to return
> ZERO_SIZE_PTR.
> When we access the name.data that has been assigned the value of
> ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> triggered.
> 
> [...]

Kept the new dprintk call sites since this is not a hot path and
there needs to be some observability here rather than a silent
failure. I'm not convinced the error text is especially clear, but
I don't have a better suggestion at the moment.

Applied to nfsd-next for v6.12, thanks!

[1/1] nfsd: return -EINVAL when namelen is 0
      commit: e492841732bbce2b2dd19cd285d5e7f61b1bdaee
Guoqing Jiang Sept. 4, 2024, 1:06 p.m. UTC | #5
On 9/3/24 19:14, Li Lingfeng wrote:
> When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> result in namelen being 0, which will cause memdup_user() to return
> ZERO_SIZE_PTR.
> When we access the name.data that has been assigned the value of
> ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> triggered.
>
> [ T1205] ==================================================================
> [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> [ T1205]
> [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ T1205] Call Trace:
> [ T1205]  dump_stack+0x9a/0xd0
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  __kasan_report.cold+0x34/0x84
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  kasan_report+0x3a/0x50
> [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> [ T1205]  cld_pipe_downcall+0x5ca/0x760
> [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> [ T1205]  ? down_write_killable_nested+0x170/0x170
> [ T1205]  ? avc_policy_seqno+0x28/0x40
> [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> [ T1205]  rpc_pipe_write+0x84/0xb0
> [ T1205]  vfs_write+0x143/0x520
> [ T1205]  ksys_write+0xc9/0x170
> [ T1205]  ? __ia32_sys_read+0x50/0x50
> [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> [ T1205]  do_syscall_64+0x33/0x40
> [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> [ T1205] RIP: 0033:0x7fdbdb761bc7
> [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> [ T1205] ==================================================================
>
> Fix it by checking namelen.
>
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>   fs/nfsd/nfs4recover.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 67d8673a9391..69a3a84e159e 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>   			ci = &cmsg->cm_u.cm_clntinfo;
>   			if (get_user(namelen, &ci->cc_name.cn_len))
>   				return -EFAULT;

If namelen is 0, I think the func should return -EFAULT above per below 
comment in [1]. Or can
get_user return success with x was set to zero?

  * Return: zero on success, or -EFAULT on error.
  * On error, the variable @x is set to zero.
  */
#define get_user(x,ptr) ({ might_fault(); 
do_get_user_call(get_user,x,ptr); })

[1]. 
https://elixir.bootlin.com/linux/v6.11-rc6/source/arch/x86/include/asm/uaccess.h#L108

> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>   			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
>   			if (IS_ERR(name.data))
>   				return PTR_ERR(name.data);
> @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>   			cnm = &cmsg->cm_u.cm_name;
>   			if (get_user(namelen, &cnm->cn_len))
>   				return -EFAULT;
> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>   			name.data = memdup_user(&cnm->cn_id, namelen);
>   			if (IS_ERR(name.data))
>   				return PTR_ERR(name.data);

Thanks,
Guoqing
Chuck Lever Sept. 4, 2024, 2:16 p.m. UTC | #6
On Wed, Sep 04, 2024 at 09:06:07PM +0800, Guoqing Jiang wrote:
> 
> 
> On 9/3/24 19:14, Li Lingfeng wrote:
> > When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> > result in namelen being 0, which will cause memdup_user() to return
> > ZERO_SIZE_PTR.
> > When we access the name.data that has been assigned the value of
> > ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> > triggered.
> > 
> > [ T1205] ==================================================================
> > [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> > [ T1205]
> > [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> > [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > [ T1205] Call Trace:
> > [ T1205]  dump_stack+0x9a/0xd0
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  __kasan_report.cold+0x34/0x84
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  kasan_report+0x3a/0x50
> > [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> > [ T1205]  cld_pipe_downcall+0x5ca/0x760
> > [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> > [ T1205]  ? down_write_killable_nested+0x170/0x170
> > [ T1205]  ? avc_policy_seqno+0x28/0x40
> > [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> > [ T1205]  rpc_pipe_write+0x84/0xb0
> > [ T1205]  vfs_write+0x143/0x520
> > [ T1205]  ksys_write+0xc9/0x170
> > [ T1205]  ? __ia32_sys_read+0x50/0x50
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> > [ T1205]  do_syscall_64+0x33/0x40
> > [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > [ T1205] RIP: 0033:0x7fdbdb761bc7
> > [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> > [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> > [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> > [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> > [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> > [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> > [ T1205] ==================================================================
> > 
> > Fix it by checking namelen.
> > 
> > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > ---
> >   fs/nfsd/nfs4recover.c | 8 ++++++++
> >   1 file changed, 8 insertions(+)
> > 
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index 67d8673a9391..69a3a84e159e 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> >   			ci = &cmsg->cm_u.cm_clntinfo;
> >   			if (get_user(namelen, &ci->cc_name.cn_len))
> >   				return -EFAULT;
> 
> If namelen is 0, I think the func should return -EFAULT above per below
> comment in [1].  Or can get_user return success with x was set to zero?

It's legitimate in general to find a zero stored at &ptr. x can also
be set to zero if a fault occurs.

Clearly, get_user() succeeds in the crashing case; otherwise the
flow cannot reach the memdup_user() call that returns ZERO_SIZE_PTR.


>  * Return: zero on success, or -EFAULT on error.
>  * On error, the variable @x is set to zero.
>  */
> #define get_user(x,ptr) ({ might_fault(); do_get_user_call(get_user,x,ptr);
> })
> 
> [1]. https://elixir.bootlin.com/linux/v6.11-rc6/source/arch/x86/include/asm/uaccess.h#L108
> 
> > +			if (!namelen) {
> > +				dprintk("%s: namelen should not be zero", __func__);
> > +				return -EINVAL;
> > +			}
> >   			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> >   			if (IS_ERR(name.data))
> >   				return PTR_ERR(name.data);
> > @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> >   			cnm = &cmsg->cm_u.cm_name;
> >   			if (get_user(namelen, &cnm->cn_len))
> >   				return -EFAULT;
> > +			if (!namelen) {
> > +				dprintk("%s: namelen should not be zero", __func__);
> > +				return -EINVAL;
> > +			}
> >   			name.data = memdup_user(&cnm->cn_id, namelen);
> >   			if (IS_ERR(name.data))
> >   				return PTR_ERR(name.data);
> 
> Thanks,
> Guoqing
Scott Mayhew Sept. 4, 2024, 2:48 p.m. UTC | #7
On Tue, 03 Sep 2024, Li Lingfeng wrote:

> When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> result in namelen being 0, which will cause memdup_user() to return
> ZERO_SIZE_PTR.
> When we access the name.data that has been assigned the value of
> ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> triggered.
> 
> [ T1205] ==================================================================
> [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> [ T1205]
> [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> [ T1205] Call Trace:
> [ T1205]  dump_stack+0x9a/0xd0
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  __kasan_report.cold+0x34/0x84
> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  kasan_report+0x3a/0x50
> [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> [ T1205]  cld_pipe_downcall+0x5ca/0x760
> [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> [ T1205]  ? down_write_killable_nested+0x170/0x170
> [ T1205]  ? avc_policy_seqno+0x28/0x40
> [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> [ T1205]  rpc_pipe_write+0x84/0xb0
> [ T1205]  vfs_write+0x143/0x520
> [ T1205]  ksys_write+0xc9/0x170
> [ T1205]  ? __ia32_sys_read+0x50/0x50
> [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> [ T1205]  do_syscall_64+0x33/0x40
> [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> [ T1205] RIP: 0033:0x7fdbdb761bc7
> [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> [ T1205] ==================================================================
> 
> Fix it by checking namelen.
> 
> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> ---
>  fs/nfsd/nfs4recover.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> index 67d8673a9391..69a3a84e159e 100644
> --- a/fs/nfsd/nfs4recover.c
> +++ b/fs/nfsd/nfs4recover.c
> @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>  			ci = &cmsg->cm_u.cm_clntinfo;
>  			if (get_user(namelen, &ci->cc_name.cn_len))
>  				return -EFAULT;
> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
>  			if (IS_ERR(name.data))
>  				return PTR_ERR(name.data);
> @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>  			cnm = &cmsg->cm_u.cm_name;
>  			if (get_user(namelen, &cnm->cn_len))
>  				return -EFAULT;
> +			if (!namelen) {
> +				dprintk("%s: namelen should not be zero", __func__);
> +				return -EINVAL;
> +			}
>  			name.data = memdup_user(&cnm->cn_id, namelen);
>  			if (IS_ERR(name.data))
>  				return PTR_ERR(name.data);
> -- 
> 2.31.1

Huh, so that would mean sqlite allows null in a primary key.  Any idea
how the corruption occurs in the first place?

Reviewed-and-tested-by: Scott Mayhew <smayhew@redhat.com>

-Scott
> 
>
Li Lingfeng Sept. 5, 2024, 1:25 a.m. UTC | #8
在 2024/9/4 22:48, Scott Mayhew 写道:
> On Tue, 03 Sep 2024, Li Lingfeng wrote:
>
>> When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
>> result in namelen being 0, which will cause memdup_user() to return
>> ZERO_SIZE_PTR.
>> When we access the name.data that has been assigned the value of
>> ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
>> triggered.
>>
>> [ T1205] ==================================================================
>> [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
>> [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
>> [ T1205]
>> [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
>> [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
>> [ T1205] Call Trace:
>> [ T1205]  dump_stack+0x9a/0xd0
>> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
>> [ T1205]  __kasan_report.cold+0x34/0x84
>> [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
>> [ T1205]  kasan_report+0x3a/0x50
>> [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
>> [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
>> [ T1205]  cld_pipe_downcall+0x5ca/0x760
>> [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
>> [ T1205]  ? down_write_killable_nested+0x170/0x170
>> [ T1205]  ? avc_policy_seqno+0x28/0x40
>> [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
>> [ T1205]  rpc_pipe_write+0x84/0xb0
>> [ T1205]  vfs_write+0x143/0x520
>> [ T1205]  ksys_write+0xc9/0x170
>> [ T1205]  ? __ia32_sys_read+0x50/0x50
>> [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
>> [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
>> [ T1205]  do_syscall_64+0x33/0x40
>> [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
>> [ T1205] RIP: 0033:0x7fdbdb761bc7
>> [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
>> [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
>> [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
>> [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
>> [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
>> [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
>> [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
>> [ T1205] ==================================================================
>>
>> Fix it by checking namelen.
>>
>> Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
>> ---
>>   fs/nfsd/nfs4recover.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
>> index 67d8673a9391..69a3a84e159e 100644
>> --- a/fs/nfsd/nfs4recover.c
>> +++ b/fs/nfsd/nfs4recover.c
>> @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>>   			ci = &cmsg->cm_u.cm_clntinfo;
>>   			if (get_user(namelen, &ci->cc_name.cn_len))
>>   				return -EFAULT;
>> +			if (!namelen) {
>> +				dprintk("%s: namelen should not be zero", __func__);
>> +				return -EINVAL;
>> +			}
>>   			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
>>   			if (IS_ERR(name.data))
>>   				return PTR_ERR(name.data);
>> @@ -831,6 +835,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
>>   			cnm = &cmsg->cm_u.cm_name;
>>   			if (get_user(namelen, &cnm->cn_len))
>>   				return -EFAULT;
>> +			if (!namelen) {
>> +				dprintk("%s: namelen should not be zero", __func__);
>> +				return -EINVAL;
>> +			}
>>   			name.data = memdup_user(&cnm->cn_id, namelen);
>>   			if (IS_ERR(name.data))
>>   				return PTR_ERR(name.data);
>> -- 
>> 2.31.1
> Huh, so that would mean sqlite allows null in a primary key.  Any idea
> how the corruption occurs in the first place?
During the test on the VM, the machine was hung. After the VM was powered
off, the corrupted sqlite was obtained. But we don't know how to
theoretically trigger this corruption.
>
> Reviewed-and-tested-by: Scott Mayhew <smayhew@redhat.com>
>
> -Scott
>>
>
>
David Laight Sept. 8, 2024, 8:25 p.m. UTC | #9
From: Scott Mayhew 
> Sent: 04 September 2024 15:48
> 
> On Tue, 03 Sep 2024, Li Lingfeng wrote:
> 
> > When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> > result in namelen being 0, which will cause memdup_user() to return
> > ZERO_SIZE_PTR.
> > When we access the name.data that has been assigned the value of
> > ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> > triggered.
> >
> > [ T1205] ==================================================================
> > [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> > [ T1205]
> > [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> > [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-
> ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > [ T1205] Call Trace:
> > [ T1205]  dump_stack+0x9a/0xd0
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  __kasan_report.cold+0x34/0x84
> > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  kasan_report+0x3a/0x50
> > [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> > [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> > [ T1205]  cld_pipe_downcall+0x5ca/0x760
> > [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> > [ T1205]  ? down_write_killable_nested+0x170/0x170
> > [ T1205]  ? avc_policy_seqno+0x28/0x40
> > [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> > [ T1205]  rpc_pipe_write+0x84/0xb0
> > [ T1205]  vfs_write+0x143/0x520
> > [ T1205]  ksys_write+0xc9/0x170
> > [ T1205]  ? __ia32_sys_read+0x50/0x50
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> > [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> > [ T1205]  do_syscall_64+0x33/0x40
> > [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > [ T1205] RIP: 0033:0x7fdbdb761bc7
> > [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18
> 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> > [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> > [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> > [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> > [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> > [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> > [ T1205] ==================================================================
> >
> > Fix it by checking namelen.
> >
> > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > ---
> >  fs/nfsd/nfs4recover.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > index 67d8673a9391..69a3a84e159e 100644
> > --- a/fs/nfsd/nfs4recover.c
> > +++ b/fs/nfsd/nfs4recover.c
> > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> >  			ci = &cmsg->cm_u.cm_clntinfo;
> >  			if (get_user(namelen, &ci->cc_name.cn_len))
> >  				return -EFAULT;
> > +			if (!namelen) {
> > +				dprintk("%s: namelen should not be zero", __func__);
> > +				return -EINVAL;
> > +			}
> >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);

Don't you also want an upper bound sanity check?
(or is cn_len only 8 bit?)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Scott Mayhew Sept. 9, 2024, 11:24 a.m. UTC | #10
On Sun, 08 Sep 2024, David Laight wrote:

> From: Scott Mayhew 
> > Sent: 04 September 2024 15:48
> > 
> > On Tue, 03 Sep 2024, Li Lingfeng wrote:
> > 
> > > When we have a corrupted main.sqlite in /var/lib/nfs/nfsdcld/, it may
> > > result in namelen being 0, which will cause memdup_user() to return
> > > ZERO_SIZE_PTR.
> > > When we access the name.data that has been assigned the value of
> > > ZERO_SIZE_PTR in nfs4_client_to_reclaim(), null pointer dereference is
> > > triggered.
> > >
> > > [ T1205] ==================================================================
> > > [ T1205] BUG: KASAN: null-ptr-deref in nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205] Read of size 1 at addr 0000000000000010 by task nfsdcld/1205
> > > [ T1205]
> > > [ T1205] CPU: 11 PID: 1205 Comm: nfsdcld Not tainted 5.10.0-00003-g2c1423731b8d #406
> > > [ T1205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-
> > ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014
> > > [ T1205] Call Trace:
> > > [ T1205]  dump_stack+0x9a/0xd0
> > > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  __kasan_report.cold+0x34/0x84
> > > [ T1205]  ? nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  kasan_report+0x3a/0x50
> > > [ T1205]  nfs4_client_to_reclaim+0xe9/0x260
> > > [ T1205]  ? nfsd4_release_lockowner+0x410/0x410
> > > [ T1205]  cld_pipe_downcall+0x5ca/0x760
> > > [ T1205]  ? nfsd4_cld_tracking_exit+0x1d0/0x1d0
> > > [ T1205]  ? down_write_killable_nested+0x170/0x170
> > > [ T1205]  ? avc_policy_seqno+0x28/0x40
> > > [ T1205]  ? selinux_file_permission+0x1b4/0x1e0
> > > [ T1205]  rpc_pipe_write+0x84/0xb0
> > > [ T1205]  vfs_write+0x143/0x520
> > > [ T1205]  ksys_write+0xc9/0x170
> > > [ T1205]  ? __ia32_sys_read+0x50/0x50
> > > [ T1205]  ? ktime_get_coarse_real_ts64+0xfe/0x110
> > > [ T1205]  ? ktime_get_coarse_real_ts64+0xa2/0x110
> > > [ T1205]  do_syscall_64+0x33/0x40
> > > [ T1205]  entry_SYSCALL_64_after_hwframe+0x67/0xd1
> > > [ T1205] RIP: 0033:0x7fdbdb761bc7
> > > [ T1205] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18
> > 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 514
> > > [ T1205] RSP: 002b:00007fff8c4b7248 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> > > [ T1205] RAX: ffffffffffffffda RBX: 000000000000042b RCX: 00007fdbdb761bc7
> > > [ T1205] RDX: 000000000000042b RSI: 00007fff8c4b75f0 RDI: 0000000000000008
> > > [ T1205] RBP: 00007fdbdb761bb0 R08: 0000000000000000 R09: 0000000000000001
> > > [ T1205] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000042b
> > > [ T1205] R13: 0000000000000008 R14: 00007fff8c4b75f0 R15: 0000000000000000
> > > [ T1205] ==================================================================
> > >
> > > Fix it by checking namelen.
> > >
> > > Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
> > > ---
> > >  fs/nfsd/nfs4recover.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > >
> > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > index 67d8673a9391..69a3a84e159e 100644
> > > --- a/fs/nfsd/nfs4recover.c
> > > +++ b/fs/nfsd/nfs4recover.c
> > > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> > >  			ci = &cmsg->cm_u.cm_clntinfo;
> > >  			if (get_user(namelen, &ci->cc_name.cn_len))
> > >  				return -EFAULT;
> > > +			if (!namelen) {
> > > +				dprintk("%s: namelen should not be zero", __func__);
> > > +				return -EINVAL;
> > > +			}
> > >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> 
> Don't you also want an upper bound sanity check?
> (or is cn_len only 8 bit?)

Yeah, actually it should probably be checking for namelen >
NFS4_OPAQUE_LIMIT.

-Scott
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
David Laight Sept. 9, 2024, 11:33 a.m. UTC | #11
From: Scott Mayhew
> Sent: 09 September 2024 12:24
...
> > > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > > index 67d8673a9391..69a3a84e159e 100644
> > > > --- a/fs/nfsd/nfs4recover.c
> > > > +++ b/fs/nfsd/nfs4recover.c
> > > > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> > > >  			ci = &cmsg->cm_u.cm_clntinfo;
> > > >  			if (get_user(namelen, &ci->cc_name.cn_len))
> > > >  				return -EFAULT;
> > > > +			if (!namelen) {
> > > > +				dprintk("%s: namelen should not be zero", __func__);
> > > > +				return -EINVAL;
> > > > +			}
> > > >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> >
> > Don't you also want an upper bound sanity check?
> > (or is cn_len only 8 bit?)
> 
> Yeah, actually it should probably be checking for namelen >
> NFS4_OPAQUE_LIMIT.

I suspect memdup_user() itself should have a third 'maxlen' argument.
And probably one that is required to be a compile-time constant.

Oh, and is dprintk() rate-limited?
Not that the message looks very helpful.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Chuck Lever Sept. 9, 2024, 2:10 p.m. UTC | #12
On Mon, Sep 09, 2024 at 11:33:04AM +0000, David Laight wrote:
> From: Scott Mayhew
> > Sent: 09 September 2024 12:24
> ...
> > > > > diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
> > > > > index 67d8673a9391..69a3a84e159e 100644
> > > > > --- a/fs/nfsd/nfs4recover.c
> > > > > +++ b/fs/nfsd/nfs4recover.c
> > > > > @@ -809,6 +809,10 @@ __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
> > > > >  			ci = &cmsg->cm_u.cm_clntinfo;
> > > > >  			if (get_user(namelen, &ci->cc_name.cn_len))
> > > > >  				return -EFAULT;
> > > > > +			if (!namelen) {
> > > > > +				dprintk("%s: namelen should not be zero", __func__);
> > > > > +				return -EINVAL;
> > > > > +			}
> > > > >  			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
> > >
> > > Don't you also want an upper bound sanity check?
> > > (or is cn_len only 8 bit?)
> > 
> > Yeah, actually it should probably be checking for namelen >
> > NFS4_OPAQUE_LIMIT.

Scott, can you send me a tested patch? I can squash that in.


> I suspect memdup_user() itself should have a third 'maxlen' argument.
> And probably one that is required to be a compile-time constant.
> 
> Oh, and is dprintk() rate-limited?

No dprintk call site is rate-limited, and they are everywhere in
this code. Since they are off by default, rate-limiting is not a
concern. Also, journald will rate-limit any output from the kernel
to prevent flooding the system lock.


> Not that the message looks very helpful.

The specific message is not helpful; suggestions welcome. But I
prefer having a warning of some kind rather than a silent failure
mode.
diff mbox series

Patch

diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 67d8673a9391..69a3a84e159e 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -809,6 +809,10 @@  __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
 			ci = &cmsg->cm_u.cm_clntinfo;
 			if (get_user(namelen, &ci->cc_name.cn_len))
 				return -EFAULT;
+			if (!namelen) {
+				dprintk("%s: namelen should not be zero", __func__);
+				return -EINVAL;
+			}
 			name.data = memdup_user(&ci->cc_name.cn_id, namelen);
 			if (IS_ERR(name.data))
 				return PTR_ERR(name.data);
@@ -831,6 +835,10 @@  __cld_pipe_inprogress_downcall(const struct cld_msg_v2 __user *cmsg,
 			cnm = &cmsg->cm_u.cm_name;
 			if (get_user(namelen, &cnm->cn_len))
 				return -EFAULT;
+			if (!namelen) {
+				dprintk("%s: namelen should not be zero", __func__);
+				return -EINVAL;
+			}
 			name.data = memdup_user(&cnm->cn_id, namelen);
 			if (IS_ERR(name.data))
 				return PTR_ERR(name.data);