diff mbox series

[v2,2/5] NFSD: Fix NFSv3 SETATTR/CREATE's handling of large file sizes

Message ID 164365349299.3304.4161554101383665486.stgit@bazille.1015granger.net (mailing list archive)
State New, archived
Headers show
Series NFSD size, offset, and count sanity | expand

Commit Message

Chuck Lever Jan. 31, 2022, 6:24 p.m. UTC
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.

Silently capping the value results in storing a different value
than the client passed in which is unexpected behavior, so remove
the min_t() check in decode_sattr3().

Moreover, a large file size is not an XDR error, since anything up
to U64_MAX is permitted for NFSv3 file size values. So it has to be
dealt with in nfs3proc.c, not in the XDR decoder.

Size comparisons like in inode_newsize_ok should now work as
expected -- the VFS returns -EFBIG if the new size is larger than
the underlying filesystem's s_maxbytes.

However, RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
a specific status code for either procedure to indicate this
specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
NFS3ERR_IO for CREATE.

Applications and NFS clients might be better served if the server
stuck with NFS3ERR_FBIG despite what RFC 1813 says.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs3proc.c |    9 +++++++++
 fs/nfsd/nfs3xdr.c  |    2 +-
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Trond Myklebust Jan. 31, 2022, 6:37 p.m. UTC | #1
On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> careful to deal with incoming client size values that are larger
> than s64_max without corrupting the value.
> 
> Silently capping the value results in storing a different value
> than the client passed in which is unexpected behavior, so remove
> the min_t() check in decode_sattr3().
> 
> Moreover, a large file size is not an XDR error, since anything up
> to U64_MAX is permitted for NFSv3 file size values. So it has to be
> dealt with in nfs3proc.c, not in the XDR decoder.
> 
> Size comparisons like in inode_newsize_ok should now work as
> expected -- the VFS returns -EFBIG if the new size is larger than
> the underlying filesystem's s_maxbytes.
> 
> However, RFC 1813 permits only the WRITE procedure to return
> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
> a specific status code for either procedure to indicate this
> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> NFS3ERR_IO for CREATE.
> 
> Applications and NFS clients might be better served if the server
> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfsd/nfs3proc.c |    9 +++++++++
>  fs/nfsd/nfs3xdr.c  |    2 +-
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> index 8ef53f6726ec..02edc7074d06 100644
> --- a/fs/nfsd/nfs3proc.c
> +++ b/fs/nfsd/nfs3proc.c
> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>         fh_copy(&resp->fh, &argp->fh);
>         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>                                     argp->check_guard, argp-
> >guardtime);
> +
> +       if (resp->status == nfserr_fbig)
> +               resp->status = nfserr_inval;
> +
>         return rpc_success;
>  }
>  
> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> argp->len,
>                                       attr, newfhp, argp->createmode,
>                                       (u32 *)argp->verf, NULL, NULL);
> +
> +       /* CREATE must not return NFS3ERR_FBIG */
> +       if (resp->status == nfserr_fbig)
> +               resp->status = nfserr_io;
> +
>         return rpc_success;
>  }
>  
> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> index 7c45ba4db61b..2e47a07029f1 100644
> --- a/fs/nfsd/nfs3xdr.c
> +++ b/fs/nfsd/nfs3xdr.c
> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> struct xdr_stream *xdr,
>                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>                         return false;
>                 iap->ia_valid |= ATTR_SIZE;
> -               iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
> +               iap->ia_size = newsize;
>         }
>         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>                 return false;
> 
> 

NACK.

Unlike NFSV4, NFSv3 has reference implementations, not a reference
specification document. There is no need to change those
implementations to deal with the fact that RFC1813 is underspecified.

This change would just serve to break client behaviour, for no good
reason.
Trond Myklebust Jan. 31, 2022, 6:47 p.m. UTC | #2
On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > careful to deal with incoming client size values that are larger
> > than s64_max without corrupting the value.
> > 
> > Silently capping the value results in storing a different value
> > than the client passed in which is unexpected behavior, so remove
> > the min_t() check in decode_sattr3().
> > 
> > Moreover, a large file size is not an XDR error, since anything up
> > to U64_MAX is permitted for NFSv3 file size values. So it has to be
> > dealt with in nfs3proc.c, not in the XDR decoder.
> > 
> > Size comparisons like in inode_newsize_ok should now work as
> > expected -- the VFS returns -EFBIG if the new size is larger than
> > the underlying filesystem's s_maxbytes.
> > 
> > However, RFC 1813 permits only the WRITE procedure to return
> > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
> > CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
> > a specific status code for either procedure to indicate this
> > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > NFS3ERR_IO for CREATE.
> > 
> > Applications and NFS clients might be better served if the server
> > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> > 
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > ---
> >  fs/nfsd/nfs3proc.c |    9 +++++++++
> >  fs/nfsd/nfs3xdr.c  |    2 +-
> >  2 files changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > index 8ef53f6726ec..02edc7074d06 100644
> > --- a/fs/nfsd/nfs3proc.c
> > +++ b/fs/nfsd/nfs3proc.c
> > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> >         fh_copy(&resp->fh, &argp->fh);
> >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
> >                                     argp->check_guard, argp-
> > > guardtime);
> > +
> > +       if (resp->status == nfserr_fbig)
> > +               resp->status = nfserr_inval;
> > +
> >         return rpc_success;
> >  }
> >  
> > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> >         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> > argp->len,
> >                                       attr, newfhp, argp-
> > >createmode,
> >                                       (u32 *)argp->verf, NULL,
> > NULL);
> > +
> > +       /* CREATE must not return NFS3ERR_FBIG */
> > +       if (resp->status == nfserr_fbig)
> > +               resp->status = nfserr_io;

BTW: This EFBIG / EOVERFLOW case could only possibly happen due to an
internal server error.

       EFBIG  See EOVERFLOW.

       EOVERFLOW
              pathname  refers  to  a  regular  file  that  is too large to be
              opened.  The usual scenario here is that an application compiled
              on  a  32-bit  platform  without -D_FILE_OFFSET_BITS=64 tried to
              open a  file  whose  size  exceeds  (1<<31)-1  bytes;  see  also
              O_LARGEFILE  above.   This is the error specified by POSIX.1; in
              kernels before 2.6.24, Linux gave the error EFBIG for this case.


> > +
> >         return rpc_success;
> >  }
> >  
> > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> > index 7c45ba4db61b..2e47a07029f1 100644
> > --- a/fs/nfsd/nfs3xdr.c
> > +++ b/fs/nfsd/nfs3xdr.c
> > @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> > struct xdr_stream *xdr,
> >                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
> >                         return false;
> >                 iap->ia_valid |= ATTR_SIZE;
> > -               iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
> > +               iap->ia_size = newsize;
> >         }
> >         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
> >                 return false;
> > 
> > 
> 
> NACK.
> 
> Unlike NFSV4, NFSv3 has reference implementations, not a reference
> specification document. There is no need to change those
> implementations to deal with the fact that RFC1813 is underspecified.
> 
> This change would just serve to break client behaviour, for no good
> reason.
>
Chuck Lever Jan. 31, 2022, 6:49 p.m. UTC | #3
> On Jan 31, 2022, at 1:37 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>> careful to deal with incoming client size values that are larger
>> than s64_max without corrupting the value.
>> 
>> Silently capping the value results in storing a different value
>> than the client passed in which is unexpected behavior, so remove
>> the min_t() check in decode_sattr3().
>> 
>> Moreover, a large file size is not an XDR error, since anything up
>> to U64_MAX is permitted for NFSv3 file size values. So it has to be
>> dealt with in nfs3proc.c, not in the XDR decoder.
>> 
>> Size comparisons like in inode_newsize_ok should now work as
>> expected -- the VFS returns -EFBIG if the new size is larger than
>> the underlying filesystem's s_maxbytes.
>> 
>> However, RFC 1813 permits only the WRITE procedure to return
>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
>> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
>> a specific status code for either procedure to indicate this
>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>> NFS3ERR_IO for CREATE.
>> 
>> Applications and NFS clients might be better served if the server
>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>>  fs/nfsd/nfs3proc.c |    9 +++++++++
>>  fs/nfsd/nfs3xdr.c  |    2 +-
>>  2 files changed, 10 insertions(+), 1 deletion(-)
>> 
>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>> index 8ef53f6726ec..02edc7074d06 100644
>> --- a/fs/nfsd/nfs3proc.c
>> +++ b/fs/nfsd/nfs3proc.c
>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>>         fh_copy(&resp->fh, &argp->fh);
>>         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>>                                     argp->check_guard, argp-
>>> guardtime);
>> +
>> +       if (resp->status == nfserr_fbig)
>> +               resp->status = nfserr_inval;
>> +
>>         return rpc_success;
>>  }
>>  
>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>>         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>> argp->len,
>>                                       attr, newfhp, argp->createmode,
>>                                       (u32 *)argp->verf, NULL, NULL);
>> +
>> +       /* CREATE must not return NFS3ERR_FBIG */
>> +       if (resp->status == nfserr_fbig)
>> +               resp->status = nfserr_io;
>> +
>>         return rpc_success;
>>  }
>>  
>> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
>> index 7c45ba4db61b..2e47a07029f1 100644
>> --- a/fs/nfsd/nfs3xdr.c
>> +++ b/fs/nfsd/nfs3xdr.c
>> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
>> struct xdr_stream *xdr,
>>                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>>                         return false;
>>                 iap->ia_valid |= ATTR_SIZE;
>> -               iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
>> +               iap->ia_size = newsize;
>>         }
>>         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>>                 return false;
>> 
>> 
> 
> NACK.
> 
> Unlike NFSV4, NFSv3 has reference implementations, not a reference
> specification document. There is no need to change those
> implementations to deal with the fact that RFC1813 is underspecified.
> 
> This change would just serve to break client behaviour, for no good
> reason.

So, I _have_ been asking around. This is not a change that
I'm proposing blithely.

Which part of the change is wrong, and which clients would
break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
in this case, I believe. NFSD could return NFS3ERR_FBIG in
these cases instead.

Is there somewhere that the behavior of the reference
implementation is documented? If the current XDR decoder
behavior is a de facto standard, that should be noted in a
comment here.


--
Chuck Lever
Trond Myklebust Jan. 31, 2022, 6:58 p.m. UTC | #4
On Mon, 2022-01-31 at 18:49 +0000, Chuck Lever III wrote:
> 
> 
> > On Jan 31, 2022, at 1:37 PM, Trond Myklebust
> > <trondmy@hammerspace.com> wrote:
> > 
> > On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > > careful to deal with incoming client size values that are larger
> > > than s64_max without corrupting the value.
> > > 
> > > Silently capping the value results in storing a different value
> > > than the client passed in which is unexpected behavior, so remove
> > > the min_t() check in decode_sattr3().
> > > 
> > > Moreover, a large file size is not an XDR error, since anything
> > > up
> > > to U64_MAX is permitted for NFSv3 file size values. So it has to
> > > be
> > > dealt with in nfs3proc.c, not in the XDR decoder.
> > > 
> > > Size comparisons like in inode_newsize_ok should now work as
> > > expected -- the VFS returns -EFBIG if the new size is larger than
> > > the underlying filesystem's s_maxbytes.
> > > 
> > > However, RFC 1813 permits only the WRITE procedure to return
> > > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
> > > and
> > > CREATE from returning FBIG. Unfortunately RFC 1813 does not
> > > provide
> > > a specific status code for either procedure to indicate this
> > > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > > NFS3ERR_IO for CREATE.
> > > 
> > > Applications and NFS clients might be better served if the server
> > > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> > > 
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > ---
> > >  fs/nfsd/nfs3proc.c |    9 +++++++++
> > >  fs/nfsd/nfs3xdr.c  |    2 +-
> > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > > index 8ef53f6726ec..02edc7074d06 100644
> > > --- a/fs/nfsd/nfs3proc.c
> > > +++ b/fs/nfsd/nfs3proc.c
> > > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> > >         fh_copy(&resp->fh, &argp->fh);
> > >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
> > > >attrs,
> > >                                     argp->check_guard, argp-
> > > > guardtime);
> > > +
> > > +       if (resp->status == nfserr_fbig)
> > > +               resp->status = nfserr_inval;
> > > +
> > >         return rpc_success;
> > >  }
> > >  
> > > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> > >         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
> > > argp->len,
> > >                                       attr, newfhp, argp-
> > > >createmode,
> > >                                       (u32 *)argp->verf, NULL,
> > > NULL);
> > > +
> > > +       /* CREATE must not return NFS3ERR_FBIG */
> > > +       if (resp->status == nfserr_fbig)
> > > +               resp->status = nfserr_io;
> > > +
> > >         return rpc_success;
> > >  }
> > >  
> > > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
> > > index 7c45ba4db61b..2e47a07029f1 100644
> > > --- a/fs/nfsd/nfs3xdr.c
> > > +++ b/fs/nfsd/nfs3xdr.c
> > > @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
> > > struct xdr_stream *xdr,
> > >                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
> > >                         return false;
> > >                 iap->ia_valid |= ATTR_SIZE;
> > > -               iap->ia_size = min_t(u64, newsize,
> > > NFS_OFFSET_MAX);
> > > +               iap->ia_size = newsize;
> > >         }
> > >         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
> > >                 return false;
> > > 
> > > 
> > 
> > NACK.
> > 
> > Unlike NFSV4, NFSv3 has reference implementations, not a reference
> > specification document. There is no need to change those
> > implementations to deal with the fact that RFC1813 is
> > underspecified.
> > 
> > This change would just serve to break client behaviour, for no good
> > reason.
> 
> So, I _have_ been asking around. This is not a change that
> I'm proposing blithely.
> 
> Which part of the change is wrong, and which clients would
> break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
> in this case, I believe. NFSD could return NFS3ERR_FBIG in
> these cases instead.
> 
> Is there somewhere that the behavior of the reference
> implementation is documented? If the current XDR decoder
> behavior is a de facto standard, that should be noted in a
> comment here.
> 
> 

Please return NFS3ERR_FBIG in the setattr case, and just drop the
create change (do_nfsd_create() can never return EFBIG given that nfsd
always opens the file with O_LARGEFILE).

There is no document other than the Solaris and Linux NFS code. RFC1813
was never intended as an IETF standard, and never saw any follow up.
Nothing else was published following the Connectathon testing events
which determined the wire protocol.
Chuck Lever Jan. 31, 2022, 7:04 p.m. UTC | #5
> On Jan 31, 2022, at 1:47 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
>> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>>> careful to deal with incoming client size values that are larger
>>> than s64_max without corrupting the value.
>>> 
>>> Silently capping the value results in storing a different value
>>> than the client passed in which is unexpected behavior, so remove
>>> the min_t() check in decode_sattr3().
>>> 
>>> Moreover, a large file size is not an XDR error, since anything up
>>> to U64_MAX is permitted for NFSv3 file size values. So it has to be
>>> dealt with in nfs3proc.c, not in the XDR decoder.
>>> 
>>> Size comparisons like in inode_newsize_ok should now work as
>>> expected -- the VFS returns -EFBIG if the new size is larger than
>>> the underlying filesystem's s_maxbytes.
>>> 
>>> However, RFC 1813 permits only the WRITE procedure to return
>>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR and
>>> CREATE from returning FBIG. Unfortunately RFC 1813 does not provide
>>> a specific status code for either procedure to indicate this
>>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>>> NFS3ERR_IO for CREATE.
>>> 
>>> Applications and NFS clients might be better served if the server
>>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>>> 
>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>> ---
>>>  fs/nfsd/nfs3proc.c |    9 +++++++++
>>>  fs/nfsd/nfs3xdr.c  |    2 +-
>>>  2 files changed, 10 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>>> index 8ef53f6726ec..02edc7074d06 100644
>>> --- a/fs/nfsd/nfs3proc.c
>>> +++ b/fs/nfsd/nfs3proc.c
>>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>>>         fh_copy(&resp->fh, &argp->fh);
>>>         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
>>>                                     argp->check_guard, argp-
>>>> guardtime);
>>> +
>>> +       if (resp->status == nfserr_fbig)
>>> +               resp->status = nfserr_inval;
>>> +
>>>         return rpc_success;
>>>  }
>>>  
>>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>>>         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>>> argp->len,
>>>                                       attr, newfhp, argp-
>>>> createmode,
>>>                                       (u32 *)argp->verf, NULL,
>>> NULL);
>>> +
>>> +       /* CREATE must not return NFS3ERR_FBIG */
>>> +       if (resp->status == nfserr_fbig)
>>> +               resp->status = nfserr_io;
> 
> BTW: This EFBIG / EOVERFLOW case could only possibly happen due to an
> internal server error.
> 
>       EFBIG  See EOVERFLOW.
> 
>       EOVERFLOW
>              pathname  refers  to  a  regular  file  that  is too large to be
>              opened.  The usual scenario here is that an application compiled
>              on  a  32-bit  platform  without -D_FILE_OFFSET_BITS=64 tried to
>              open a  file  whose  size  exceeds  (1<<31)-1  bytes;  see  also
>              O_LARGEFILE  above.   This is the error specified by POSIX.1; in
>              kernels before 2.6.24, Linux gave the error EFBIG for this case.

What if the client has sent a CREATE with attributes that
has a filesize that is smaller than OFFSET_MAX but larger
than the filesystem's s_maxbytes? I believe notify_change()
will return -EFBIG in this case, and correctly so.

NFSD's NFSv3 SETATTR implementation will leak FBIG in
some cases. If that's going to be a problem for certain
important clients, then I'd like it not to do that.


--
Chuck Lever
Chuck Lever Jan. 31, 2022, 7:09 p.m. UTC | #6
> On Jan 31, 2022, at 1:58 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Mon, 2022-01-31 at 18:49 +0000, Chuck Lever III wrote:
>> 
>> 
>>> On Jan 31, 2022, at 1:37 PM, Trond Myklebust
>>> <trondmy@hammerspace.com> wrote:
>>> 
>>> On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
>>>> iattr::ia_size is a loff_t, so these NFSv3 procedures must be
>>>> careful to deal with incoming client size values that are larger
>>>> than s64_max without corrupting the value.
>>>> 
>>>> Silently capping the value results in storing a different value
>>>> than the client passed in which is unexpected behavior, so remove
>>>> the min_t() check in decode_sattr3().
>>>> 
>>>> Moreover, a large file size is not an XDR error, since anything
>>>> up
>>>> to U64_MAX is permitted for NFSv3 file size values. So it has to
>>>> be
>>>> dealt with in nfs3proc.c, not in the XDR decoder.
>>>> 
>>>> Size comparisons like in inode_newsize_ok should now work as
>>>> expected -- the VFS returns -EFBIG if the new size is larger than
>>>> the underlying filesystem's s_maxbytes.
>>>> 
>>>> However, RFC 1813 permits only the WRITE procedure to return
>>>> NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
>>>> and
>>>> CREATE from returning FBIG. Unfortunately RFC 1813 does not
>>>> provide
>>>> a specific status code for either procedure to indicate this
>>>> specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
>>>> NFS3ERR_IO for CREATE.
>>>> 
>>>> Applications and NFS clients might be better served if the server
>>>> stuck with NFS3ERR_FBIG despite what RFC 1813 says.
>>>> 
>>>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>>>> ---
>>>>  fs/nfsd/nfs3proc.c |    9 +++++++++
>>>>  fs/nfsd/nfs3xdr.c  |    2 +-
>>>>  2 files changed, 10 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
>>>> index 8ef53f6726ec..02edc7074d06 100644
>>>> --- a/fs/nfsd/nfs3proc.c
>>>> +++ b/fs/nfsd/nfs3proc.c
>>>> @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
>>>>         fh_copy(&resp->fh, &argp->fh);
>>>>         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
>>>>> attrs,
>>>>                                     argp->check_guard, argp-
>>>>> guardtime);
>>>> +
>>>> +       if (resp->status == nfserr_fbig)
>>>> +               resp->status = nfserr_inval;
>>>> +
>>>>         return rpc_success;
>>>>  }
>>>>  
>>>> @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
>>>>         resp->status = do_nfsd_create(rqstp, dirfhp, argp->name,
>>>> argp->len,
>>>>                                       attr, newfhp, argp-
>>>>> createmode,
>>>>                                       (u32 *)argp->verf, NULL,
>>>> NULL);
>>>> +
>>>> +       /* CREATE must not return NFS3ERR_FBIG */
>>>> +       if (resp->status == nfserr_fbig)
>>>> +               resp->status = nfserr_io;
>>>> +
>>>>         return rpc_success;
>>>>  }
>>>>  
>>>> diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
>>>> index 7c45ba4db61b..2e47a07029f1 100644
>>>> --- a/fs/nfsd/nfs3xdr.c
>>>> +++ b/fs/nfsd/nfs3xdr.c
>>>> @@ -254,7 +254,7 @@ svcxdr_decode_sattr3(struct svc_rqst *rqstp,
>>>> struct xdr_stream *xdr,
>>>>                 if (xdr_stream_decode_u64(xdr, &newsize) < 0)
>>>>                         return false;
>>>>                 iap->ia_valid |= ATTR_SIZE;
>>>> -               iap->ia_size = min_t(u64, newsize,
>>>> NFS_OFFSET_MAX);
>>>> +               iap->ia_size = newsize;
>>>>         }
>>>>         if (xdr_stream_decode_u32(xdr, &set_it) < 0)0
>>>>                 return false;
>>>> 
>>>> 
>>> 
>>> NACK.
>>> 
>>> Unlike NFSV4, NFSv3 has reference implementations, not a reference
>>> specification document. There is no need to change those
>>> implementations to deal with the fact that RFC1813 is
>>> underspecified.
>>> 
>>> This change would just serve to break client behaviour, for no good
>>> reason.
>> 
>> So, I _have_ been asking around. This is not a change that
>> I'm proposing blithely.
>> 
>> Which part of the change is wrong, and which clients would
>> break? Solaris NFSv3 server is supposed to return NFS3ERR_FBIG
>> in this case, I believe. NFSD could return NFS3ERR_FBIG in
>> these cases instead.
>> 
>> Is there somewhere that the behavior of the reference
>> implementation is documented? If the current XDR decoder
>> behavior is a de facto standard, that should be noted in a
>> comment here.
>> 
>> 
> 
> Please return NFS3ERR_FBIG in the setattr case, and just drop the
> create change (do_nfsd_create() can never return EFBIG given that nfsd
> always opens the file with O_LARGEFILE).
> 
> There is no document other than the Solaris and Linux NFS code. RFC1813
> was never intended as an IETF standard, and never saw any follow up.
> Nothing else was published following the Connectathon testing events
> which determined the wire protocol.

So to make sure I understand you: Drop the hunks that
modify nfsd3_proc_setattr() and nfsd3_proc_create().

I'm fine with that.


--
Chuck Lever
Trond Myklebust Jan. 31, 2022, 7:14 p.m. UTC | #7
On Mon, 2022-01-31 at 19:04 +0000, Chuck Lever III wrote:
> 
> 
> > On Jan 31, 2022, at 1:47 PM, Trond Myklebust
> > <trondmy@hammerspace.com> wrote:
> > 
> > On Mon, 2022-01-31 at 13:37 -0500, Trond Myklebust wrote:
> > > On Mon, 2022-01-31 at 13:24 -0500, Chuck Lever wrote:
> > > > iattr::ia_size is a loff_t, so these NFSv3 procedures must be
> > > > careful to deal with incoming client size values that are
> > > > larger
> > > > than s64_max without corrupting the value.
> > > > 
> > > > Silently capping the value results in storing a different value
> > > > than the client passed in which is unexpected behavior, so
> > > > remove
> > > > the min_t() check in decode_sattr3().
> > > > 
> > > > Moreover, a large file size is not an XDR error, since anything
> > > > up
> > > > to U64_MAX is permitted for NFSv3 file size values. So it has
> > > > to be
> > > > dealt with in nfs3proc.c, not in the XDR decoder.
> > > > 
> > > > Size comparisons like in inode_newsize_ok should now work as
> > > > expected -- the VFS returns -EFBIG if the new size is larger
> > > > than
> > > > the underlying filesystem's s_maxbytes.
> > > > 
> > > > However, RFC 1813 permits only the WRITE procedure to return
> > > > NFS3ERR_FBIG. Extra checks are needed to prevent NFSv3 SETATTR
> > > > and
> > > > CREATE from returning FBIG. Unfortunately RFC 1813 does not
> > > > provide
> > > > a specific status code for either procedure to indicate this
> > > > specific failure, so I've chosen NFS3ERR_INVAL for SETATTR and
> > > > NFS3ERR_IO for CREATE.
> > > > 
> > > > Applications and NFS clients might be better served if the
> > > > server
> > > > stuck with NFS3ERR_FBIG despite what RFC 1813 says.
> > > > 
> > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > ---
> > > >  fs/nfsd/nfs3proc.c |    9 +++++++++
> > > >  fs/nfsd/nfs3xdr.c  |    2 +-
> > > >  2 files changed, 10 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
> > > > index 8ef53f6726ec..02edc7074d06 100644
> > > > --- a/fs/nfsd/nfs3proc.c
> > > > +++ b/fs/nfsd/nfs3proc.c
> > > > @@ -73,6 +73,10 @@ nfsd3_proc_setattr(struct svc_rqst *rqstp)
> > > >         fh_copy(&resp->fh, &argp->fh);
> > > >         resp->status = nfsd_setattr(rqstp, &resp->fh, &argp-
> > > > >attrs,
> > > >                                     argp->check_guard, argp-
> > > > > guardtime);
> > > > +
> > > > +       if (resp->status == nfserr_fbig)
> > > > +               resp->status = nfserr_inval;
> > > > +
> > > >         return rpc_success;
> > > >  }
> > > >  
> > > > @@ -245,6 +249,11 @@ nfsd3_proc_create(struct svc_rqst *rqstp)
> > > >         resp->status = do_nfsd_create(rqstp, dirfhp, argp-
> > > > >name,
> > > > argp->len,
> > > >                                       attr, newfhp, argp-
> > > > > createmode,
> > > >                                       (u32 *)argp->verf, NULL,
> > > > NULL);
> > > > +
> > > > +       /* CREATE must not return NFS3ERR_FBIG */
> > > > +       if (resp->status == nfserr_fbig)
> > > > +               resp->status = nfserr_io;
> > 
> > BTW: This EFBIG / EOVERFLOW case could only possibly happen due to
> > an
> > internal server error.
> > 
> >       EFBIG  See EOVERFLOW.
> > 
> >       EOVERFLOW
> >              pathname  refers  to  a  regular  file  that  is too
> > large to be
> >              opened.  The usual scenario here is that an
> > application compiled
> >              on  a  32-bit  platform  without -
> > D_FILE_OFFSET_BITS=64 tried to
> >              open a  file  whose  size  exceeds  (1<<31)-1  bytes; 
> > see  also
> >              O_LARGEFILE  above.   This is the error specified by
> > POSIX.1; in
> >              kernels before 2.6.24, Linux gave the error EFBIG for
> > this case.
> 
> What if the client has sent a CREATE with attributes that
> has a filesize that is smaller than OFFSET_MAX but larger
> than the filesystem's s_maxbytes? I believe notify_change()
> will return -EFBIG in this case, and correctly so.
> 

There is no POSIX or BSD function that allows you to do this, so that
would be a very unusual client. Pretty sure that Windows won't allow it
either.

> NFSD's NFSv3 SETATTR implementation will leak FBIG in
> some cases. If that's going to be a problem for certain
> important clients, then I'd like it not to do that.
> 

As I said, changing the behaviour at this point, it far more likely to
cause breakage than keeping it would. So I strongly disagree with this
argument.
diff mbox series

Patch

diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c
index 8ef53f6726ec..02edc7074d06 100644
--- a/fs/nfsd/nfs3proc.c
+++ b/fs/nfsd/nfs3proc.c
@@ -73,6 +73,10 @@  nfsd3_proc_setattr(struct svc_rqst *rqstp)
 	fh_copy(&resp->fh, &argp->fh);
 	resp->status = nfsd_setattr(rqstp, &resp->fh, &argp->attrs,
 				    argp->check_guard, argp->guardtime);
+
+	if (resp->status == nfserr_fbig)
+		resp->status = nfserr_inval;
+
 	return rpc_success;
 }
 
@@ -245,6 +249,11 @@  nfsd3_proc_create(struct svc_rqst *rqstp)
 	resp->status = do_nfsd_create(rqstp, dirfhp, argp->name, argp->len,
 				      attr, newfhp, argp->createmode,
 				      (u32 *)argp->verf, NULL, NULL);
+
+	/* CREATE must not return NFS3ERR_FBIG */
+	if (resp->status == nfserr_fbig)
+		resp->status = nfserr_io;
+
 	return rpc_success;
 }
 
diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c
index 7c45ba4db61b..2e47a07029f1 100644
--- a/fs/nfsd/nfs3xdr.c
+++ b/fs/nfsd/nfs3xdr.c
@@ -254,7 +254,7 @@  svcxdr_decode_sattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr,
 		if (xdr_stream_decode_u64(xdr, &newsize) < 0)
 			return false;
 		iap->ia_valid |= ATTR_SIZE;
-		iap->ia_size = min_t(u64, newsize, NFS_OFFSET_MAX);
+		iap->ia_size = newsize;
 	}
 	if (xdr_stream_decode_u32(xdr, &set_it) < 0)
 		return false;