[v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations

Message ID	20250121103954.415462-1-amir73il@gmail.com (mailing list archive)
State	New
Headers	show Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com [209.85.218.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29DB81BD018; Tue, 21 Jan 2025 10:39:59 +0000 (UTC) From: Amir Goldstein <amir73il@gmail.com> To: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@kernel.org>, linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org, Trond Myklebust <trondmy@hammerspace.com>, NeilBrown <neilb@suse.de> Subject: [PATCH v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations Date: Tue, 21 Jan 2025 11:39:54 +0100 Message-Id: <20250121103954.415462-1-amir73il@gmail.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	[v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations \| expand [v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations

Amir Goldstein Jan. 21, 2025, 10:39 a.m. UTC

Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
mapped EBUSY host error from rmdir/unlink operation to avoid unknown
error server warning.

The same reason that casued the reported EBUSY on rmdir() (dir is a
local mount point in some other bind mount) could also cause EBUSY on
rename and some filesystems (e.g. FUSE) can return EBUSY on other
operations like open().

Therefore, to avoid unknown error warning in server, we need to map
EBUSY for all operations.

The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
to NFS4ERR_ACCESS in v2/v3 server.

During the discussion on this issue, Trond claimed that the mapping
made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
for directories.

To keep things simple and consistent and avoid the server warning,
map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.

Note that the mapping of NFS4ERR_FILE_OPEN to NFSERR_ACCESS in
nfsd3_map_status() and nfsd_map_status() remains for possible future
return of NFS4ERR_FILE_OPEN in a more specific use case (e.g. an unlink
of a sillyrenamed non-dir).

Fixes: 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
Link: https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@gmail.com/
Cc: Trond Myklebust <trondmy@hammerspace.com>
Cc: NeilBrown <neilb@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/nfsd/vfs.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

Jeff Layton Jan. 21, 2025, 12:21 p.m. UTC | #1

On Tue, 2025-01-21 at 11:39 +0100, Amir Goldstein wrote:
> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> error server warning.
> 
> The same reason that casued the reported EBUSY on rmdir() (dir is a
> local mount point in some other bind mount) could also cause EBUSY on
> rename and some filesystems (e.g. FUSE) can return EBUSY on other
> operations like open().
> 
> Therefore, to avoid unknown error warning in server, we need to map
> EBUSY for all operations.
> 
> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> to NFS4ERR_ACCESS in v2/v3 server.
> 
> During the discussion on this issue, Trond claimed that the mapping
> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> for directories.
> 
> To keep things simple and consistent and avoid the server warning,
> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
> 
> Note that the mapping of NFS4ERR_FILE_OPEN to NFSERR_ACCESS in
> nfsd3_map_status() and nfsd_map_status() remains for possible future
> return of NFS4ERR_FILE_OPEN in a more specific use case (e.g. an unlink
> of a sillyrenamed non-dir).
> 
> Fixes: 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> Link: https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@gmail.com/
> Cc: Trond Myklebust <trondmy@hammerspace.com>
> Cc: NeilBrown <neilb@suse.de>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/nfsd/vfs.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 29cb7b812d713..290c7db8a6180 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -69,6 +69,7 @@ nfserrno (int errno)
>  		{ nfserr_fbig, -E2BIG },
>  		{ nfserr_stale, -EBADF },
>  		{ nfserr_acces, -EACCES },
> +		{ nfserr_acces, -EBUSY},
>  		{ nfserr_exist, -EEXIST },
>  		{ nfserr_xdev, -EXDEV },
>  		{ nfserr_mlink, -EMLINK },
> @@ -2006,14 +2007,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
>  out_drop_write:
>  	fh_drop_write(fhp);
>  out_nfserr:
> -	if (host_err == -EBUSY) {
> -		/* name is mounted-on. There is no perfect
> -		 * error status.
> -		 */
> -		err = nfserr_file_open;
> -	} else {
> -		err = nfserrno(host_err);
> -	}
> +	err = nfserrno(host_err);
>  out:
>  	return err;
>  out_unlock:

Reviewed-by: Jeff Layton <jlayton@kernel.org>

Chuck Lever Jan. 21, 2025, 7:45 p.m. UTC | #2

Please send patches To: the NFSD reviewers listed in MAINTAINERS and
Cc: linux-nfs and others. Thanks!


On 1/21/25 5:39 AM, Amir Goldstein wrote:
> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> error server warning.

> The same reason that casued the reported EBUSY on rmdir() (dir is a
> local mount point in some other bind mount) could also cause EBUSY on
> rename and some filesystems (e.g. FUSE) can return EBUSY on other
> operations like open().
> 
> Therefore, to avoid unknown error warning in server, we need to map
> EBUSY for all operations.
> 
> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> to NFS4ERR_ACCESS in v2/v3 server.
> 
> During the discussion on this issue, Trond claimed that the mapping
> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> for directories.

NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
file system objects. Here's what I find in RFC 8881 Section 18.25.4:

 > If a file has an outstanding OPEN and this prevents the removal of the
 > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.

It's not normative, but it does suggest that any object that cannot be
associated with an OPEN state ID should never cause REMOVE to return
NFS4ERR_FILE_OPEN.


> To keep things simple and consistent and avoid the server warning,
> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.

Generally a "one size fits all" mapping for these status codes is
not going to cut it. That's why we have nfsd3_map_status() and
nfsd_map_status() -- the set of permitted status codes for each
operation is different for each NFS version.

NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.

NFSv4 has only REMOVE, and it removes the directory entry for the
object no matter its type. The set of failure modes is different for
this operation compared to NFSv3 REMOVE.

Adding a specific mapping for -EBUSY in nfserrno() is going to have
unintended consequences for any VFS call NFSD might make that returns
-EBUSY.

I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
trigger a warning.


> Note that the mapping of NFS4ERR_FILE_OPEN to NFSERR_ACCESS in
> nfsd3_map_status() and nfsd_map_status() remains for possible future
> return of NFS4ERR_FILE_OPEN in a more specific use case (e.g. an unlink
> of a sillyrenamed non-dir).
> 
> Fixes: 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> Link: https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@gmail.com/
> Cc: Trond Myklebust <trondmy@hammerspace.com>
> Cc: NeilBrown <neilb@suse.de>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>   fs/nfsd/vfs.c | 10 ++--------
>   1 file changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
> index 29cb7b812d713..290c7db8a6180 100644
> --- a/fs/nfsd/vfs.c
> +++ b/fs/nfsd/vfs.c
> @@ -69,6 +69,7 @@ nfserrno (int errno)
>   		{ nfserr_fbig, -E2BIG },
>   		{ nfserr_stale, -EBADF },
>   		{ nfserr_acces, -EACCES },
> +		{ nfserr_acces, -EBUSY},
>   		{ nfserr_exist, -EEXIST },
>   		{ nfserr_xdev, -EXDEV },
>   		{ nfserr_mlink, -EMLINK },

> @@ -2006,14 +2007,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct svc_fh *fhp, int type,
>   out_drop_write:
>   	fh_drop_write(fhp);
>   out_nfserr:
> -	if (host_err == -EBUSY) {
> -		/* name is mounted-on. There is no perfect
> -		 * error status.
> -		 */
> -		err = nfserr_file_open;
> -	} else {
> -		err = nfserrno(host_err);
> -	}
> +	err = nfserrno(host_err);
>   out:
>   	return err;
>   out_unlock:

Amir Goldstein Jan. 21, 2025, 9:44 p.m. UTC | #3

On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Please send patches To: the NFSD reviewers listed in MAINTAINERS and
> Cc: linux-nfs and others. Thanks!
>
>
> On 1/21/25 5:39 AM, Amir Goldstein wrote:
> > Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> > mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> > error server warning.
>
> > The same reason that casued the reported EBUSY on rmdir() (dir is a
> > local mount point in some other bind mount) could also cause EBUSY on
> > rename and some filesystems (e.g. FUSE) can return EBUSY on other
> > operations like open().
> >
> > Therefore, to avoid unknown error warning in server, we need to map
> > EBUSY for all operations.
> >
> > The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> > to NFS4ERR_ACCESS in v2/v3 server.
> >
> > During the discussion on this issue, Trond claimed that the mapping
> > made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> > protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> > for directories.
>
> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
> file system objects. Here's what I find in RFC 8881 Section 18.25.4:
>
>  > If a file has an outstanding OPEN and this prevents the removal of the
>  > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
>
> It's not normative, but it does suggest that any object that cannot be
> associated with an OPEN state ID should never cause REMOVE to return
> NFS4ERR_FILE_OPEN.
>
>
> > To keep things simple and consistent and avoid the server warning,
> > map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
>
> Generally a "one size fits all" mapping for these status codes is
> not going to cut it. That's why we have nfsd3_map_status() and
> nfsd_map_status() -- the set of permitted status codes for each
> operation is different for each NFS version.
>
> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
>
> NFSv4 has only REMOVE, and it removes the directory entry for the
> object no matter its type. The set of failure modes is different for
> this operation compared to NFSv3 REMOVE.
>
> Adding a specific mapping for -EBUSY in nfserrno() is going to have
> unintended consequences for any VFS call NFSD might make that returns
> -EBUSY.
>
> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
> trigger a warning.
>
>

Sorry, I didn't understand what you are suggesting.

FUSE can return EBUSY for open().
What do you suggest to do when nfsd encounters EBUSY on open()?

vfs_rename() can return EBUSY.
What do you suggest to do when nfsd v3 encounters EBUSY on rename()?

This sort of assertion:
        WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);

Is a code assertion for a situation that should not be possible in the
code and certainly not possible to trigger by userspace.

Both cases above could trigger the warning from userspace.
If you want to leave the warning it should not be a WARN_ONCE()
assertion, but I must say that I did not understand the explanation
for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().

Thanks,
Amir.

NeilBrown Jan. 21, 2025, 10:59 p.m. UTC | #4

On Wed, 22 Jan 2025, Amir Goldstein wrote:
> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > Please send patches To: the NFSD reviewers listed in MAINTAINERS and
> > Cc: linux-nfs and others. Thanks!
> >
> >
> > On 1/21/25 5:39 AM, Amir Goldstein wrote:
> > > Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> > > mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> > > error server warning.
> >
> > > The same reason that casued the reported EBUSY on rmdir() (dir is a
> > > local mount point in some other bind mount) could also cause EBUSY on
> > > rename and some filesystems (e.g. FUSE) can return EBUSY on other
> > > operations like open().
> > >
> > > Therefore, to avoid unknown error warning in server, we need to map
> > > EBUSY for all operations.
> > >
> > > The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> > > to NFS4ERR_ACCESS in v2/v3 server.
> > >
> > > During the discussion on this issue, Trond claimed that the mapping
> > > made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> > > protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> > > for directories.
> >
> > NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
> > file system objects. Here's what I find in RFC 8881 Section 18.25.4:
> >
> >  > If a file has an outstanding OPEN and this prevents the removal of the
> >  > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
> >
> > It's not normative, but it does suggest that any object that cannot be
> > associated with an OPEN state ID should never cause REMOVE to return
> > NFS4ERR_FILE_OPEN.
> >
> >
> > > To keep things simple and consistent and avoid the server warning,
> > > map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
> >
> > Generally a "one size fits all" mapping for these status codes is
> > not going to cut it. That's why we have nfsd3_map_status() and
> > nfsd_map_status() -- the set of permitted status codes for each
> > operation is different for each NFS version.
> >
> > NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
> >
> > NFSv4 has only REMOVE, and it removes the directory entry for the
> > object no matter its type. The set of failure modes is different for
> > this operation compared to NFSv3 REMOVE.
> >
> > Adding a specific mapping for -EBUSY in nfserrno() is going to have
> > unintended consequences for any VFS call NFSD might make that returns
> > -EBUSY.
> >
> > I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
> > nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
> > trigger a warning.
> >
> >
> 
> Sorry, I didn't understand what you are suggesting.
> 
> FUSE can return EBUSY for open().
> What do you suggest to do when nfsd encounters EBUSY on open()?
> 
> vfs_rename() can return EBUSY.
> What do you suggest to do when nfsd v3 encounters EBUSY on rename()?
> 
> This sort of assertion:
>         WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
> 
> Is a code assertion for a situation that should not be possible in the
> code and certainly not possible to trigger by userspace.
> 
> Both cases above could trigger the warning from userspace.
> If you want to leave the warning it should not be a WARN_ONCE()
> assertion, but I must say that I did not understand the explanation
> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().

My answer to this last question is that it isn't obvious that EBUSY
should map to NFS4ERR_ACCESS.
I would rather that nfsd explicitly checked the error from unlink/rmdir and
mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
comment (like we have now) explaining why it is best.
And nfsd should explicitly check the error from open() and map EBUSY to
whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
is, the reason is likely different to the reason that it is best for
rmdir.
So again, I would like a comment in the code explaining the choice with
a reference to FUSE.

Then if some other function that we haven't thought about starts
returning EBUSY, we'll get warning and have a change to think about it.

Thanks,
NeilBrown

Amir Goldstein Jan. 22, 2025, 9:05 a.m. UTC | #5

On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@suse.de> wrote:
>
> On Wed, 22 Jan 2025, Amir Goldstein wrote:
> > On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> > >
> > > Please send patches To: the NFSD reviewers listed in MAINTAINERS and
> > > Cc: linux-nfs and others. Thanks!
> > >
> > >
> > > On 1/21/25 5:39 AM, Amir Goldstein wrote:
> > > > Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> > > > mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> > > > error server warning.
> > >
> > > > The same reason that casued the reported EBUSY on rmdir() (dir is a
> > > > local mount point in some other bind mount) could also cause EBUSY on
> > > > rename and some filesystems (e.g. FUSE) can return EBUSY on other
> > > > operations like open().
> > > >
> > > > Therefore, to avoid unknown error warning in server, we need to map
> > > > EBUSY for all operations.
> > > >
> > > > The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> > > > to NFS4ERR_ACCESS in v2/v3 server.
> > > >
> > > > During the discussion on this issue, Trond claimed that the mapping
> > > > made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> > > > protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> > > > for directories.
> > >
> > > NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
> > > file system objects. Here's what I find in RFC 8881 Section 18.25.4:
> > >
> > >  > If a file has an outstanding OPEN and this prevents the removal of the
> > >  > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
> > >
> > > It's not normative, but it does suggest that any object that cannot be
> > > associated with an OPEN state ID should never cause REMOVE to return
> > > NFS4ERR_FILE_OPEN.
> > >
> > >
> > > > To keep things simple and consistent and avoid the server warning,
> > > > map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
> > >
> > > Generally a "one size fits all" mapping for these status codes is
> > > not going to cut it. That's why we have nfsd3_map_status() and
> > > nfsd_map_status() -- the set of permitted status codes for each
> > > operation is different for each NFS version.
> > >
> > > NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
> > >
> > > NFSv4 has only REMOVE, and it removes the directory entry for the
> > > object no matter its type. The set of failure modes is different for
> > > this operation compared to NFSv3 REMOVE.
> > >
> > > Adding a specific mapping for -EBUSY in nfserrno() is going to have
> > > unintended consequences for any VFS call NFSD might make that returns
> > > -EBUSY.
> > >
> > > I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
> > > nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
> > > trigger a warning.
> > >
> > >
> >
> > Sorry, I didn't understand what you are suggesting.
> >
> > FUSE can return EBUSY for open().
> > What do you suggest to do when nfsd encounters EBUSY on open()?
> >
> > vfs_rename() can return EBUSY.
> > What do you suggest to do when nfsd v3 encounters EBUSY on rename()?
> >
> > This sort of assertion:
> >         WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
> >
> > Is a code assertion for a situation that should not be possible in the
> > code and certainly not possible to trigger by userspace.
> >
> > Both cases above could trigger the warning from userspace.
> > If you want to leave the warning it should not be a WARN_ONCE()
> > assertion, but I must say that I did not understand the explanation
> > for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().
>
> My answer to this last question is that it isn't obvious that EBUSY
> should map to NFS4ERR_ACCESS.
> I would rather that nfsd explicitly checked the error from unlink/rmdir and
> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
> comment (like we have now) explaining why it is best.

Can you please suggest the text for this comment because I did not
understand the reasoning for the error.
All I argued for is conformity to NFSv2/3, but you are the one who chose
NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
reasoning for this error code. All I have is:
"For NFSv3, the best we can do is probably NFS3ERR_ACCES,
  which isn't true, but is not less true than the other options."

> And nfsd should explicitly check the error from open() and map EBUSY to
> whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
> is, the reason is likely different to the reason that it is best for
> rmdir.
> So again, I would like a comment in the code explaining the choice with
> a reference to FUSE.

My specific FUSE filesystem can return EBUSY for open(), but FUSE
filesystem in general can return EBUSY for any operation if that is what
the userspace server returns.

>
> Then if some other function that we haven't thought about starts
> returning EBUSY, we'll get warning and have a change to think about it.
>

I have no objection to that, but I think that the WARN_ONCE should be
converted to pr_warn_once() or pr_warn_ratelimited() because userspace
should not be able to trigger a WARN_ON in any case.

I realize the great value of the stack trace that WARN_ON provides in
this scenario, but if we include in the warning the operation id and the
filesystem sid I think that would be enough information to understand
where the unmapped error is coming from.

This is not expected stop the whack-a-mole of patches like mine and this one:
340e61e44c1d2 ("nfsd: map the EBADMSG to nfserr_io to avoid warning")
but at least the severity of the issues will be reduced without the scary
WARN_ON splat.

I can write a patch if there is an agreement on that.

Thanks,
Amir.

Chuck Lever Jan. 22, 2025, 3:04 p.m. UTC | #6

On 1/22/25 4:05 AM, Amir Goldstein wrote:
> On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@suse.de> wrote:
>>
>> On Wed, 22 Jan 2025, Amir Goldstein wrote:
>>> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> Please send patches To: the NFSD reviewers listed in MAINTAINERS and
>>>> Cc: linux-nfs and others. Thanks!
>>>>
>>>>
>>>> On 1/21/25 5:39 AM, Amir Goldstein wrote:
>>>>> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
>>>>> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
>>>>> error server warning.
>>>>
>>>>> The same reason that casued the reported EBUSY on rmdir() (dir is a
>>>>> local mount point in some other bind mount) could also cause EBUSY on
>>>>> rename and some filesystems (e.g. FUSE) can return EBUSY on other
>>>>> operations like open().
>>>>>
>>>>> Therefore, to avoid unknown error warning in server, we need to map
>>>>> EBUSY for all operations.
>>>>>
>>>>> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
>>>>> to NFS4ERR_ACCESS in v2/v3 server.
>>>>>
>>>>> During the discussion on this issue, Trond claimed that the mapping
>>>>> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
>>>>> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
>>>>> for directories.
>>>>
>>>> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
>>>> file system objects. Here's what I find in RFC 8881 Section 18.25.4:
>>>>
>>>>   > If a file has an outstanding OPEN and this prevents the removal of the
>>>>   > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
>>>>
>>>> It's not normative, but it does suggest that any object that cannot be
>>>> associated with an OPEN state ID should never cause REMOVE to return
>>>> NFS4ERR_FILE_OPEN.
>>>>
>>>>
>>>>> To keep things simple and consistent and avoid the server warning,
>>>>> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
>>>>
>>>> Generally a "one size fits all" mapping for these status codes is
>>>> not going to cut it. That's why we have nfsd3_map_status() and
>>>> nfsd_map_status() -- the set of permitted status codes for each
>>>> operation is different for each NFS version.
>>>>
>>>> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
>>>>
>>>> NFSv4 has only REMOVE, and it removes the directory entry for the
>>>> object no matter its type. The set of failure modes is different for
>>>> this operation compared to NFSv3 REMOVE.
>>>>
>>>> Adding a specific mapping for -EBUSY in nfserrno() is going to have
>>>> unintended consequences for any VFS call NFSD might make that returns
>>>> -EBUSY.
>>>>
>>>> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
>>>> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
>>>> trigger a warning.
>>>>
>>>>
>>>
>>> Sorry, I didn't understand what you are suggesting.

I'm saying that we need to consider the errno -> NFS status code
mapping on a case-by-case basis first.


>>> FUSE can return EBUSY for open().
>>> What do you suggest to do when nfsd encounters EBUSY on open()?
>>>
>>> vfs_rename() can return EBUSY.
>>> What do you suggest to do when nfsd v3 encounters EBUSY on rename()?

I totally agree that we do not want NFSD to leak -EBUSY to NFS clients.

But we do need to examine all the ways -EBUSY can leak through to the
NFS protocol layers (nfs?proc.c). The mapping is not going to be the
same for every NFS operation in every NFS version. (or, at least we
need to examine these cases closely and decide that nfserr_access is
the closest we can get for /every/ case).


>>> This sort of assertion:
>>>          WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
>>>
>>> Is a code assertion for a situation that should not be possible in the
>>> code and certainly not possible to trigger by userspace.
>>>
>>> Both cases above could trigger the warning from userspace.
>>> If you want to leave the warning it should not be a WARN_ONCE()
>>> assertion, but I must say that I did not understand the explanation
>>> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().
>>
>> My answer to this last question is that it isn't obvious that EBUSY
>> should map to NFS4ERR_ACCESS.
>> I would rather that nfsd explicitly checked the error from unlink/rmdir and
>> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
>> comment (like we have now) explaining why it is best.
> 
> Can you please suggest the text for this comment because I did not
> understand the reasoning for the error.
> All I argued for is conformity to NFSv2/3, but you are the one who chose
> NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
> reasoning for this error code. All I have is:
> "For NFSv3, the best we can do is probably NFS3ERR_ACCES,
>    which isn't true, but is not less true than the other options."

You're proposing to change the behavior of NFSv4 to match NFSv2/3, and
that's where we might need to take a moment. The NFSv4 protocol provides
a richer set of status codes to report this situation.


>> And nfsd should explicitly check the error from open() and map EBUSY to
>> whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
>> is, the reason is likely different to the reason that it is best for
>> rmdir.
>> So again, I would like a comment in the code explaining the choice with
>> a reference to FUSE.
> 
> My specific FUSE filesystem can return EBUSY for open(), but FUSE
> filesystem in general can return EBUSY for any operation if that is what
> the userspace server returns.

Fair, that suggests that eventually we might need the general nfserrno
mapping in addition to some individual checking in NFS operation- and
version-specific code. I'm not ready to leap to that conclusion yet.


>> Then if some other function that we haven't thought about starts
>> returning EBUSY, we'll get warning and have a change to think about it.
>>
> 
> I have no objection to that, but I think that the WARN_ONCE should be
> converted to pr_warn_once() or pr_warn_ratelimited() because userspace
> should not be able to trigger a WARN_ON in any case.

It isn't user space that's the concern -- it's that NFS clients can
trigger this warning. If a client accidentally or maliciously triggers
it repeatedly, it can fill the NFS server's system journal.

Our general policy is that we use the _ONCE variants to prevent a remote
attack from overflowing the server's root partition.


> I realize the great value of the stack trace that WARN_ON provides in
> this scenario, but if we include in the warning the operation id and the
> filesystem sid I think that would be enough information to understand
> where the unmapped error is coming from.

Hm. The stack trace tells us all that without having to add the extra
(rarely used) arguments to nfserrno. I'm OK with a stack trace here
because this is information for developers, who can make sense of it.
It's not a message that a system admin or user needs to understand.


> This is not expected stop the whack-a-mole of patches like mine and this one:

It will be whack-a-mole. Unfortunately there is no way around that
because of the case-by-case auditing that is necessary.


> 340e61e44c1d2 ("nfsd: map the EBADMSG to nfserr_io to avoid warning")
> but at least the severity of the issues will be reduced without the scary
> WARN_ON splat.

340e61e44c1d2 is one tactic for dealing with these issues, but we do
need case-by-case audit to have confidence that this tactic is
appropriate for each errno value.


> I can write a patch if there is an agreement on that.
> 
> Thanks,
> Amir.

Amir Goldstein Jan. 22, 2025, 3:29 p.m. UTC | #7

On Wed, Jan 22, 2025 at 4:04 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 1/22/25 4:05 AM, Amir Goldstein wrote:
> > On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@suse.de> wrote:
> >>
> >> On Wed, 22 Jan 2025, Amir Goldstein wrote:
> >>> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>>>
> >>>> Please send patches To: the NFSD reviewers listed in MAINTAINERS and
> >>>> Cc: linux-nfs and others. Thanks!
> >>>>
> >>>>
> >>>> On 1/21/25 5:39 AM, Amir Goldstein wrote:
> >>>>> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
> >>>>> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
> >>>>> error server warning.
> >>>>
> >>>>> The same reason that casued the reported EBUSY on rmdir() (dir is a
> >>>>> local mount point in some other bind mount) could also cause EBUSY on
> >>>>> rename and some filesystems (e.g. FUSE) can return EBUSY on other
> >>>>> operations like open().
> >>>>>
> >>>>> Therefore, to avoid unknown error warning in server, we need to map
> >>>>> EBUSY for all operations.
> >>>>>
> >>>>> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
> >>>>> to NFS4ERR_ACCESS in v2/v3 server.
> >>>>>
> >>>>> During the discussion on this issue, Trond claimed that the mapping
> >>>>> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
> >>>>> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
> >>>>> for directories.
> >>>>
> >>>> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
> >>>> file system objects. Here's what I find in RFC 8881 Section 18.25.4:
> >>>>
> >>>>   > If a file has an outstanding OPEN and this prevents the removal of the
> >>>>   > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
> >>>>
> >>>> It's not normative, but it does suggest that any object that cannot be
> >>>> associated with an OPEN state ID should never cause REMOVE to return
> >>>> NFS4ERR_FILE_OPEN.
> >>>>
> >>>>
> >>>>> To keep things simple and consistent and avoid the server warning,
> >>>>> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
> >>>>
> >>>> Generally a "one size fits all" mapping for these status codes is
> >>>> not going to cut it. That's why we have nfsd3_map_status() and
> >>>> nfsd_map_status() -- the set of permitted status codes for each
> >>>> operation is different for each NFS version.
> >>>>
> >>>> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
> >>>>
> >>>> NFSv4 has only REMOVE, and it removes the directory entry for the
> >>>> object no matter its type. The set of failure modes is different for
> >>>> this operation compared to NFSv3 REMOVE.
> >>>>
> >>>> Adding a specific mapping for -EBUSY in nfserrno() is going to have
> >>>> unintended consequences for any VFS call NFSD might make that returns
> >>>> -EBUSY.
> >>>>
> >>>> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
> >>>> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
> >>>> trigger a warning.
> >>>>
> >>>>
> >>>
> >>> Sorry, I didn't understand what you are suggesting.
>
> I'm saying that we need to consider the errno -> NFS status code
> mapping on a case-by-case basis first.
>
>
> >>> FUSE can return EBUSY for open().
> >>> What do you suggest to do when nfsd encounters EBUSY on open()?
> >>>
> >>> vfs_rename() can return EBUSY.
> >>> What do you suggest to do when nfsd v3 encounters EBUSY on rename()?
>
> I totally agree that we do not want NFSD to leak -EBUSY to NFS clients.
>
> But we do need to examine all the ways -EBUSY can leak through to the
> NFS protocol layers (nfs?proc.c). The mapping is not going to be the
> same for every NFS operation in every NFS version. (or, at least we
> need to examine these cases closely and decide that nfserr_access is
> the closest we can get for /every/ case).
>
>
> >>> This sort of assertion:
> >>>          WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
> >>>
> >>> Is a code assertion for a situation that should not be possible in the
> >>> code and certainly not possible to trigger by userspace.
> >>>
> >>> Both cases above could trigger the warning from userspace.
> >>> If you want to leave the warning it should not be a WARN_ONCE()
> >>> assertion, but I must say that I did not understand the explanation
> >>> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().
> >>
> >> My answer to this last question is that it isn't obvious that EBUSY
> >> should map to NFS4ERR_ACCESS.
> >> I would rather that nfsd explicitly checked the error from unlink/rmdir and
> >> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
> >> comment (like we have now) explaining why it is best.
> >
> > Can you please suggest the text for this comment because I did not
> > understand the reasoning for the error.
> > All I argued for is conformity to NFSv2/3, but you are the one who chose
> > NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
> > reasoning for this error code. All I have is:
> > "For NFSv3, the best we can do is probably NFS3ERR_ACCES,
> >    which isn't true, but is not less true than the other options."
>
> You're proposing to change the behavior of NFSv4 to match NFSv2/3, and
> that's where we might need to take a moment. The NFSv4 protocol provides
> a richer set of status codes to report this situation.
>
>

To be fair, I did not propose that in patch v1:
https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@gmail.com/

I proposed to keep the EBUSY -> NFS4ERR_FILE_OPEN mapping for v4
and extend the operations that it applies to.
Trond had reservations about his mapping.
I have no problem with going back to v1 mapping and reducing the
mapped operations to rmdir/unlink/rename/open or any other mapping
that you prefer.

> >> And nfsd should explicitly check the error from open() and map EBUSY to
> >> whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
> >> is, the reason is likely different to the reason that it is best for
> >> rmdir.
> >> So again, I would like a comment in the code explaining the choice with
> >> a reference to FUSE.
> >
> > My specific FUSE filesystem can return EBUSY for open(), but FUSE
> > filesystem in general can return EBUSY for any operation if that is what
> > the userspace server returns.
>
> Fair, that suggests that eventually we might need the general nfserrno
> mapping in addition to some individual checking in NFS operation- and
> version-specific code. I'm not ready to leap to that conclusion yet.
>
>

I am fine with handling EBUSY in unlink/rmdir/rename/open
only for now if that is what everyone prefers.

> >> Then if some other function that we haven't thought about starts
> >> returning EBUSY, we'll get warning and have a change to think about it.
> >>
> >
> > I have no objection to that, but I think that the WARN_ONCE should be
> > converted to pr_warn_once() or pr_warn_ratelimited() because userspace
> > should not be able to trigger a WARN_ON in any case.
>
> It isn't user space that's the concern -- it's that NFS clients can
> trigger this warning. If a client accidentally or maliciously triggers
> it repeatedly, it can fill the NFS server's system journal.
>
> Our general policy is that we use the _ONCE variants to prevent a remote
> attack from overflowing the server's root partition.
>
>

This is what Documentation/process/coding-style.rst has to say:

Do not WARN lightly
*******************

WARN*() is intended for unexpected, this-should-never-happen situations.
WARN*() macros are not to be used for anything that is expected to happen
during normal operation. These are not pre- or post-condition asserts, for
example. Again: WARN*() must not be used for a condition that is expected
to trigger easily, for example, by user space actions. pr_warn_once() is a
possible alternative, if you need to notify the user of a problem.

---

But it's not even that - I find that syzbot and other testers treat any WARN_ON
as a bug (as they should according to coding style).
This WARN_ON in nfsd is really easy to trigger from userspace and from
malicious nfs client.
If you do not replace this WARN_ON, I anticipate that the day will come when
protocol fuzzers will start bombing you with bug reports.

If it is "possible" to hit this assertion - it should not be an assertion.

> > I realize the great value of the stack trace that WARN_ON provides in
> > this scenario, but if we include in the warning the operation id and the
> > filesystem sid I think that would be enough information to understand
> > where the unmapped error is coming from.
>
> Hm. The stack trace tells us all that without having to add the extra
> (rarely used) arguments to nfserrno. I'm OK with a stack trace here
> because this is information for developers, who can make sense of it.
> It's not a message that a system admin or user needs to understand.
>
>

It's your call. If you are not bothered by the bug reports.

Thanks,
Amir.

Chuck Lever Jan. 22, 2025, 4:50 p.m. UTC | #8

On 1/22/25 10:29 AM, Amir Goldstein wrote:
> On Wed, Jan 22, 2025 at 4:04 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 1/22/25 4:05 AM, Amir Goldstein wrote:
>>> On Tue, Jan 21, 2025 at 11:59 PM NeilBrown <neilb@suse.de> wrote:
>>>>
>>>> On Wed, 22 Jan 2025, Amir Goldstein wrote:
>>>>> On Tue, Jan 21, 2025 at 8:45 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>>>
>>>>>> Please send patches To: the NFSD reviewers listed in MAINTAINERS and
>>>>>> Cc: linux-nfs and others. Thanks!
>>>>>>
>>>>>>
>>>>>> On 1/21/25 5:39 AM, Amir Goldstein wrote:
>>>>>>> Commit 466e16f0920f3 ("nfsd: check for EBUSY from vfs_rmdir/vfs_unink.")
>>>>>>> mapped EBUSY host error from rmdir/unlink operation to avoid unknown
>>>>>>> error server warning.
>>>>>>
>>>>>>> The same reason that casued the reported EBUSY on rmdir() (dir is a
>>>>>>> local mount point in some other bind mount) could also cause EBUSY on
>>>>>>> rename and some filesystems (e.g. FUSE) can return EBUSY on other
>>>>>>> operations like open().
>>>>>>>
>>>>>>> Therefore, to avoid unknown error warning in server, we need to map
>>>>>>> EBUSY for all operations.
>>>>>>>
>>>>>>> The original fix mapped EBUSY to NFS4ERR_FILE_OPEN in v4 server and
>>>>>>> to NFS4ERR_ACCESS in v2/v3 server.
>>>>>>>
>>>>>>> During the discussion on this issue, Trond claimed that the mapping
>>>>>>> made from EBUSY to NFS4ERR_FILE_OPEN was incorrect according to the
>>>>>>> protocol spec and specifically, NFS4ERR_FILE_OPEN is not expected
>>>>>>> for directories.
>>>>>>
>>>>>> NFS4ERR_FILE_OPEN might be incorrect when removing certain types of
>>>>>> file system objects. Here's what I find in RFC 8881 Section 18.25.4:
>>>>>>
>>>>>>    > If a file has an outstanding OPEN and this prevents the removal of the
>>>>>>    > file's directory entry, the error NFS4ERR_FILE_OPEN is returned.
>>>>>>
>>>>>> It's not normative, but it does suggest that any object that cannot be
>>>>>> associated with an OPEN state ID should never cause REMOVE to return
>>>>>> NFS4ERR_FILE_OPEN.
>>>>>>
>>>>>>
>>>>>>> To keep things simple and consistent and avoid the server warning,
>>>>>>> map EBUSY to NFS4ERR_ACCESS for all operations in all protocol versions.
>>>>>>
>>>>>> Generally a "one size fits all" mapping for these status codes is
>>>>>> not going to cut it. That's why we have nfsd3_map_status() and
>>>>>> nfsd_map_status() -- the set of permitted status codes for each
>>>>>> operation is different for each NFS version.
>>>>>>
>>>>>> NFSv3 has REMOVE and RMDIR. You can't pass a directory to NFSv3 REMOVE.
>>>>>>
>>>>>> NFSv4 has only REMOVE, and it removes the directory entry for the
>>>>>> object no matter its type. The set of failure modes is different for
>>>>>> this operation compared to NFSv3 REMOVE.
>>>>>>
>>>>>> Adding a specific mapping for -EBUSY in nfserrno() is going to have
>>>>>> unintended consequences for any VFS call NFSD might make that returns
>>>>>> -EBUSY.
>>>>>>
>>>>>> I think I prefer that the NFSv4 cases be dealt with in nfsd4_remove(),
>>>>>> nfsd4_rename(), and nfsd4_link(), and that -EBUSY should continue to
>>>>>> trigger a warning.
>>>>>>
>>>>>>
>>>>>
>>>>> Sorry, I didn't understand what you are suggesting.
>>
>> I'm saying that we need to consider the errno -> NFS status code
>> mapping on a case-by-case basis first.
>>
>>
>>>>> FUSE can return EBUSY for open().
>>>>> What do you suggest to do when nfsd encounters EBUSY on open()?
>>>>>
>>>>> vfs_rename() can return EBUSY.
>>>>> What do you suggest to do when nfsd v3 encounters EBUSY on rename()?
>>
>> I totally agree that we do not want NFSD to leak -EBUSY to NFS clients.
>>
>> But we do need to examine all the ways -EBUSY can leak through to the
>> NFS protocol layers (nfs?proc.c). The mapping is not going to be the
>> same for every NFS operation in every NFS version. (or, at least we
>> need to examine these cases closely and decide that nfserr_access is
>> the closest we can get for /every/ case).
>>
>>
>>>>> This sort of assertion:
>>>>>           WARN_ONCE(1, "nfsd: non-standard errno: %d\n", errno);
>>>>>
>>>>> Is a code assertion for a situation that should not be possible in the
>>>>> code and certainly not possible to trigger by userspace.
>>>>>
>>>>> Both cases above could trigger the warning from userspace.
>>>>> If you want to leave the warning it should not be a WARN_ONCE()
>>>>> assertion, but I must say that I did not understand the explanation
>>>>> for not mapping EBUSY by default to NFS4ERR_ACCESS in nfserrno().
>>>>
>>>> My answer to this last question is that it isn't obvious that EBUSY
>>>> should map to NFS4ERR_ACCESS.
>>>> I would rather that nfsd explicitly checked the error from unlink/rmdir and
>>>> mapped EBUSY to NFS4ERR_ACCESS (if we all agree that is best) with a
>>>> comment (like we have now) explaining why it is best.
>>>
>>> Can you please suggest the text for this comment because I did not
>>> understand the reasoning for the error.
>>> All I argued for is conformity to NFSv2/3, but you are the one who chose
>>> NFS3ERR_ACCES for v2/3 mapping and I don't know what is the
>>> reasoning for this error code. All I have is:
>>> "For NFSv3, the best we can do is probably NFS3ERR_ACCES,
>>>     which isn't true, but is not less true than the other options."
>>
>> You're proposing to change the behavior of NFSv4 to match NFSv2/3, and
>> that's where we might need to take a moment. The NFSv4 protocol provides
>> a richer set of status codes to report this situation.
>>
>>
> 
> To be fair, I did not propose that in patch v1:
> https://lore.kernel.org/linux-nfs/20250120172016.397916-1-amir73il@gmail.com/
> 
> I proposed to keep the EBUSY -> NFS4ERR_FILE_OPEN mapping for v4
> and extend the operations that it applies to.
> Trond had reservations about his mapping.

Well, Trond said that FILE_OPEN is wrong to return if the object
being removed is a directory. It is the correct status code if
the target object is rather a regular file.


> I have no problem with going back to v1 mapping and reducing the
> mapped operations to rmdir/unlink/rename/open or any other mapping
> that you prefer.

v1 doesn't fix Trond's issue, IIUC.


>>>> And nfsd should explicitly check the error from open() and map EBUSY to
>>>> whatever seems appropriate.  Maybe that is also NS4ERR_ACCESS but if it
>>>> is, the reason is likely different to the reason that it is best for
>>>> rmdir.
>>>> So again, I would like a comment in the code explaining the choice with
>>>> a reference to FUSE.
>>>
>>> My specific FUSE filesystem can return EBUSY for open(), but FUSE
>>> filesystem in general can return EBUSY for any operation if that is what
>>> the userspace server returns.
>>
>> Fair, that suggests that eventually we might need the general nfserrno
>> mapping in addition to some individual checking in NFS operation- and
>> version-specific code. I'm not ready to leap to that conclusion yet.
> 
> I am fine with handling EBUSY in unlink/rmdir/rename/open
> only for now if that is what everyone prefers.

As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
correctly. NFSv4 REMOVE needs to return a status code that depends
on whether the target object is a file or not. Probably not much more
than something like this:

	status = vfs_unlink( ... );
+	/* RFC 8881 Section 18.25.4 paragraph 5 */
+	if (status == nfserr_file_open && !S_ISREG(...))
+		status = nfserr_access;

added to nfsd4_remove().

Let's visit RENAME once that is addressed.

Then handle OPEN as a third patch, because I bet we are going to meet
some complications there.


>>>> Then if some other function that we haven't thought about starts
>>>> returning EBUSY, we'll get warning and have a change to think about it.
>>>>
>>>
>>> I have no objection to that, but I think that the WARN_ONCE should be
>>> converted to pr_warn_once() or pr_warn_ratelimited() because userspace
>>> should not be able to trigger a WARN_ON in any case.
>>
>> It isn't user space that's the concern -- it's that NFS clients can
>> trigger this warning. If a client accidentally or maliciously triggers
>> it repeatedly, it can fill the NFS server's system journal.
>>
>> Our general policy is that we use the _ONCE variants to prevent a remote
>> attack from overflowing the server's root partition.
>>
>>
> 
> This is what Documentation/process/coding-style.rst has to say:
> 
> Do not WARN lightly
> *******************
> 
> WARN*() is intended for unexpected, this-should-never-happen situations.
> WARN*() macros are not to be used for anything that is expected to happen
> during normal operation. These are not pre- or post-condition asserts, for
> example. Again: WARN*() must not be used for a condition that is expected
> to trigger easily, for example, by user space actions. pr_warn_once() is a
> possible alternative, if you need to notify the user of a problem.
> 
> ---
> 
> But it's not even that - I find that syzbot and other testers treat any WARN_ON
> as a bug (as they should according to coding style).
> This WARN_ON in nfsd is really easy to trigger from userspace and from
> malicious nfs client.
> If you do not replace this WARN_ON, I anticipate that the day will come when
> protocol fuzzers will start bombing you with bug reports.
> 
> If it is "possible" to hit this assertion - it should not be an assertion.

I know some people (eg, Linus) do not approve of this code development
tactic. However, it's not a BUG() and the NFS server will continue to
operate.

We are using WARN_ONCE() here. We'll get exactly one of these warnings
for each boot epoch that encounters this issue.


>>> I realize the great value of the stack trace that WARN_ON provides in
>>> this scenario, but if we include in the warning the operation id and the
>>> filesystem sid I think that would be enough information to understand
>>> where the unmapped error is coming from.
>>
>> Hm. The stack trace tells us all that without having to add the extra
>> (rarely used) arguments to nfserrno. I'm OK with a stack trace here
>> because this is information for developers, who can make sense of it.
>> It's not a message that a system admin or user needs to understand.
> 
> It's your call. If you are not bothered by the bug reports.

I expect to get bug reports for these issues, because these are real
issues in the NFSD implementation that need to be addressed, as you are
doing right now.

Amir Goldstein Jan. 22, 2025, 6:53 p.m. UTC | #9

> > I am fine with handling EBUSY in unlink/rmdir/rename/open
> > only for now if that is what everyone prefers.
>
> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
> correctly. NFSv4 REMOVE needs to return a status code that depends
> on whether the target object is a file or not. Probably not much more
> than something like this:
>
>         status = vfs_unlink( ... );
> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
> +       if (status == nfserr_file_open && !S_ISREG(...))
> +               status = nfserr_access;
>
> added to nfsd4_remove().

Don't you think it's a bit awkward mapping back and forth like this?
Don't you think something like this is a more sane way to keep the
mapping rules in one place:

@@ -111,6 +111,26 @@ nfserrno (int errno)
        return nfserr_io;
 }

+static __be32
+nfsd_map_errno(int host_err, int may_flags, int type)
+{
+       switch (host_err) {
+       case -EBUSY:
+               /*
+                * According to RFC 8881 Section 18.25.4 paragraph 5,
+                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
+                * For failure to remove directory we return NFS4ERR_ACCESS,
+                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
+                */
+               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
+                       return nfserr_file_open;
+               else
+                       return nfserr_acces;
+       }
+
+       return nfserrno(host_err);
+}
+
 /*
  * Called from nfsd_lookup and encode_dirent. Check if we have crossed
  * a mount point.
@@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
svc_fh *fhp, int type,
 out_drop_write:
        fh_drop_write(fhp);
 out_nfserr:
-       if (host_err == -EBUSY) {
-               /* name is mounted-on. There is no perfect
-                * error status.
-                */
-               err = nfserr_file_open;
-       } else {
-               err = nfserrno(host_err);
-       }
+       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
 out:
        return err;

>
> Let's visit RENAME once that is addressed.

And then next patch would be:

@@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
svc_fh *ffhp, char *fname, int flen,
        __be32          err;
        int             host_err;
        bool            close_cached = false;
+       int             type;

        err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
        if (err)
@@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
svc_fh *ffhp, char *fname, int flen,
  out_dput_new:
        dput(ndentry);
  out_dput_old:
+       type = d_inode(odentry)->i_mode & S_IFMT;
        dput(odentry);
  out_nfserr:
-        err = nfserrno(host_err);
+       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);

>
> Then handle OPEN as a third patch, because I bet we are going to meet
> some complications there.
>
>

Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?

Thanks,
Amir.

Chuck Lever Jan. 22, 2025, 7:20 p.m. UTC | #10

On 1/22/25 1:53 PM, Amir Goldstein wrote:
>>> I am fine with handling EBUSY in unlink/rmdir/rename/open
>>> only for now if that is what everyone prefers.
>>
>> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
>> correctly. NFSv4 REMOVE needs to return a status code that depends
>> on whether the target object is a file or not. Probably not much more
>> than something like this:
>>
>>          status = vfs_unlink( ... );
>> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
>> +       if (status == nfserr_file_open && !S_ISREG(...))
>> +               status = nfserr_access;
>>
>> added to nfsd4_remove().
> 
> Don't you think it's a bit awkward mapping back and forth like this?

Yes, it's awkward. It's an artifact of the way NFSD's VFS helpers have
been co-opted for new versions of the NFS protocol over the years.

With NFSv2 and NFSv3, the operations and their permitted status codes
are roughly similar so that these VFS helpers can be re-used without
a lot of fuss. This is also why, internally, the symbolic status codes
are named without the version number in them (ie, nfserr_inval).

With NFSv4, the world is more complicated.

The NFSv4 code was prototyped 20 years ago using these NFSv2/3 helpers, 
and is never revisited until there's a bug. Thus there is quite a bit of 
technical debt in fs/nfsd/vfs.c that we're replacing over time.

IMO it would be better if these VFS helpers returned errno values and
then the callers should figure out the conversion to an NFS status code.
I suspect that's difficult because some of the functions invoked by the
VFS helpers (like fh_verify() ) also return NFS status codes. We just
spent some time extracting NFS version-specific code from fh_verify().

> Don't you think something like this is a more sane way to keep the
> mapping rules in one place:
> 
> @@ -111,6 +111,26 @@ nfserrno (int errno)
>          return nfserr_io;
>   }
> 
> +static __be32
> +nfsd_map_errno(int host_err, int may_flags, int type)
> +{
> +       switch (host_err) {
> +       case -EBUSY:
> +               /*
> +                * According to RFC 8881 Section 18.25.4 paragraph 5,
> +                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
> +                * For failure to remove directory we return NFS4ERR_ACCESS,
> +                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
> +                */
> +               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
> +                       return nfserr_file_open;
> +               else
> +                       return nfserr_acces;
> +       }
> +
> +       return nfserrno(host_err);
> +}
> +
>   /*
>    * Called from nfsd_lookup and encode_dirent. Check if we have crossed
>    * a mount point.
> @@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
> svc_fh *fhp, int type,
>   out_drop_write:
>          fh_drop_write(fhp);
>   out_nfserr:
> -       if (host_err == -EBUSY) {
> -               /* name is mounted-on. There is no perfect
> -                * error status.
> -                */
> -               err = nfserr_file_open;
> -       } else {
> -               err = nfserrno(host_err);
> -       }
> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>   out:
>          return err;

No, I don't.

NFSD has Kconfig options that disable support for some versions of NFS.
The code that manages which status code to return really needs to be
inside the functions that are enabled or disabled by Kconfig.

As I keep repeating: there is no good way to handle the NFS status codes
in one set of functions. Each NFS version has its variations that
require special handling.

>> Let's visit RENAME once that is addressed.
> 
> And then next patch would be:
> 
> @@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> svc_fh *ffhp, char *fname, int flen,
>          __be32          err;
>          int             host_err;
>          bool            close_cached = false;
> +       int             type;
> 
>          err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
>          if (err)
> @@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> svc_fh *ffhp, char *fname, int flen,
>    out_dput_new:
>          dput(ndentry);
>    out_dput_old:
> +       type = d_inode(odentry)->i_mode & S_IFMT;
>          dput(odentry);
>    out_nfserr:
> -        err = nfserrno(host_err);
> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);

Same problem here: the NFS version-specific status codes have to be
figured out in the callers, not in nfsd_rename(). The status codes
are not common to all NFS versions.

>> Then handle OPEN as a third patch, because I bet we are going to meet
>> some complications there.
> 
> Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?

I haven't even started to think about that yet.

Amir Goldstein Jan. 22, 2025, 8:11 p.m. UTC | #11

On Wed, Jan 22, 2025 at 8:20 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 1/22/25 1:53 PM, Amir Goldstein wrote:
> >>> I am fine with handling EBUSY in unlink/rmdir/rename/open
> >>> only for now if that is what everyone prefers.
> >>
> >> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
> >> correctly. NFSv4 REMOVE needs to return a status code that depends
> >> on whether the target object is a file or not. Probably not much more
> >> than something like this:
> >>
> >>          status = vfs_unlink( ... );
> >> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
> >> +       if (status == nfserr_file_open && !S_ISREG(...))
> >> +               status = nfserr_access;
> >>
> >> added to nfsd4_remove().
> >
> > Don't you think it's a bit awkward mapping back and forth like this?
>
> Yes, it's awkward. It's an artifact of the way NFSD's VFS helpers have
> been co-opted for new versions of the NFS protocol over the years.
>
> With NFSv2 and NFSv3, the operations and their permitted status codes
> are roughly similar so that these VFS helpers can be re-used without
> a lot of fuss. This is also why, internally, the symbolic status codes
> are named without the version number in them (ie, nfserr_inval).
>
> With NFSv4, the world is more complicated.
>
> The NFSv4 code was prototyped 20 years ago using these NFSv2/3 helpers,
> and is never revisited until there's a bug. Thus there is quite a bit of
> technical debt in fs/nfsd/vfs.c that we're replacing over time.
>
> IMO it would be better if these VFS helpers returned errno values and
> then the callers should figure out the conversion to an NFS status code.
> I suspect that's difficult because some of the functions invoked by the
> VFS helpers (like fh_verify() ) also return NFS status codes. We just
> spent some time extracting NFS version-specific code from fh_verify().
>
>
> > Don't you think something like this is a more sane way to keep the
> > mapping rules in one place:
> >
> > @@ -111,6 +111,26 @@ nfserrno (int errno)
> >          return nfserr_io;
> >   }
> >
> > +static __be32
> > +nfsd_map_errno(int host_err, int may_flags, int type)
> > +{
> > +       switch (host_err) {
> > +       case -EBUSY:
> > +               /*
> > +                * According to RFC 8881 Section 18.25.4 paragraph 5,
> > +                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
> > +                * For failure to remove directory we return NFS4ERR_ACCESS,
> > +                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
> > +                */
> > +               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
> > +                       return nfserr_file_open;
> > +               else
> > +                       return nfserr_acces;
> > +       }
> > +
> > +       return nfserrno(host_err);
> > +}
> > +
> >   /*
> >    * Called from nfsd_lookup and encode_dirent. Check if we have crossed
> >    * a mount point.
> > @@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
> > svc_fh *fhp, int type,
> >   out_drop_write:
> >          fh_drop_write(fhp);
> >   out_nfserr:
> > -       if (host_err == -EBUSY) {
> > -               /* name is mounted-on. There is no perfect
> > -                * error status.
> > -                */
> > -               err = nfserr_file_open;
> > -       } else {
> > -               err = nfserrno(host_err);
> > -       }
> > +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
> >   out:
> >          return err;
>
> No, I don't.
>
> NFSD has Kconfig options that disable support for some versions of NFS.
> The code that manages which status code to return really needs to be
> inside the functions that are enabled or disabled by Kconfig.
>
> As I keep repeating: there is no good way to handle the NFS status codes
> in one set of functions. Each NFS version has its variations that
> require special handling.
>
>

ok.

> >> Let's visit RENAME once that is addressed.
> >
> > And then next patch would be:
> >
> > @@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> > svc_fh *ffhp, char *fname, int flen,
> >          __be32          err;
> >          int             host_err;
> >          bool            close_cached = false;
> > +       int             type;
> >
> >          err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
> >          if (err)
> > @@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> > svc_fh *ffhp, char *fname, int flen,
> >    out_dput_new:
> >          dput(ndentry);
> >    out_dput_old:
> > +       type = d_inode(odentry)->i_mode & S_IFMT;
> >          dput(odentry);
> >    out_nfserr:
> > -        err = nfserrno(host_err);
> > +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>
> Same problem here: the NFS version-specific status codes have to be
> figured out in the callers, not in nfsd_rename(). The status codes
> are not common to all NFS versions.
>
>

ok.

> >> Then handle OPEN as a third patch, because I bet we are going to meet
> >> some complications there.
> >
> > Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?
>
> I haven't even started to think about that yet.
>

ok. Let me know when you have any ideas about that.

My goal is to fix EBUSY WARN for open from FUSE.
The rest is cleanup that I don't mind doing on the way.

Thanks,
Amir.

Chuck Lever Jan. 23, 2025, 2:59 p.m. UTC | #12

On 1/22/25 3:11 PM, Amir Goldstein wrote:
> On Wed, Jan 22, 2025 at 8:20 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 1/22/25 1:53 PM, Amir Goldstein wrote:
>>>>> I am fine with handling EBUSY in unlink/rmdir/rename/open
>>>>> only for now if that is what everyone prefers.
>>>>
>>>> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
>>>> correctly. NFSv4 REMOVE needs to return a status code that depends
>>>> on whether the target object is a file or not. Probably not much more
>>>> than something like this:
>>>>
>>>>           status = vfs_unlink( ... );
>>>> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
>>>> +       if (status == nfserr_file_open && !S_ISREG(...))
>>>> +               status = nfserr_access;
>>>>
>>>> added to nfsd4_remove().
>>>
>>> Don't you think it's a bit awkward mapping back and forth like this?
>>
>> Yes, it's awkward. It's an artifact of the way NFSD's VFS helpers have
>> been co-opted for new versions of the NFS protocol over the years.
>>
>> With NFSv2 and NFSv3, the operations and their permitted status codes
>> are roughly similar so that these VFS helpers can be re-used without
>> a lot of fuss. This is also why, internally, the symbolic status codes
>> are named without the version number in them (ie, nfserr_inval).
>>
>> With NFSv4, the world is more complicated.
>>
>> The NFSv4 code was prototyped 20 years ago using these NFSv2/3 helpers,
>> and is never revisited until there's a bug. Thus there is quite a bit of
>> technical debt in fs/nfsd/vfs.c that we're replacing over time.
>>
>> IMO it would be better if these VFS helpers returned errno values and
>> then the callers should figure out the conversion to an NFS status code.
>> I suspect that's difficult because some of the functions invoked by the
>> VFS helpers (like fh_verify() ) also return NFS status codes. We just
>> spent some time extracting NFS version-specific code from fh_verify().
>>
>>
>>> Don't you think something like this is a more sane way to keep the
>>> mapping rules in one place:
>>>
>>> @@ -111,6 +111,26 @@ nfserrno (int errno)
>>>           return nfserr_io;
>>>    }
>>>
>>> +static __be32
>>> +nfsd_map_errno(int host_err, int may_flags, int type)
>>> +{
>>> +       switch (host_err) {
>>> +       case -EBUSY:
>>> +               /*
>>> +                * According to RFC 8881 Section 18.25.4 paragraph 5,
>>> +                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
>>> +                * For failure to remove directory we return NFS4ERR_ACCESS,
>>> +                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
>>> +                */
>>> +               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
>>> +                       return nfserr_file_open;
>>> +               else
>>> +                       return nfserr_acces;
>>> +       }
>>> +
>>> +       return nfserrno(host_err);
>>> +}
>>> +
>>>    /*
>>>     * Called from nfsd_lookup and encode_dirent. Check if we have crossed
>>>     * a mount point.
>>> @@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
>>> svc_fh *fhp, int type,
>>>    out_drop_write:
>>>           fh_drop_write(fhp);
>>>    out_nfserr:
>>> -       if (host_err == -EBUSY) {
>>> -               /* name is mounted-on. There is no perfect
>>> -                * error status.
>>> -                */
>>> -               err = nfserr_file_open;
>>> -       } else {
>>> -               err = nfserrno(host_err);
>>> -       }
>>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>>>    out:
>>>           return err;
>>
>> No, I don't.
>>
>> NFSD has Kconfig options that disable support for some versions of NFS.
>> The code that manages which status code to return really needs to be
>> inside the functions that are enabled or disabled by Kconfig.
>>
>> As I keep repeating: there is no good way to handle the NFS status codes
>> in one set of functions. Each NFS version has its variations that
>> require special handling.
>>
>>
> 
> ok.
> 
>>>> Let's visit RENAME once that is addressed.
>>>
>>> And then next patch would be:
>>>
>>> @@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
>>> svc_fh *ffhp, char *fname, int flen,
>>>           __be32          err;
>>>           int             host_err;
>>>           bool            close_cached = false;
>>> +       int             type;
>>>
>>>           err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
>>>           if (err)
>>> @@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
>>> svc_fh *ffhp, char *fname, int flen,
>>>     out_dput_new:
>>>           dput(ndentry);
>>>     out_dput_old:
>>> +       type = d_inode(odentry)->i_mode & S_IFMT;
>>>           dput(odentry);
>>>     out_nfserr:
>>> -        err = nfserrno(host_err);
>>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>>
>> Same problem here: the NFS version-specific status codes have to be
>> figured out in the callers, not in nfsd_rename(). The status codes
>> are not common to all NFS versions.
>>
>>
> 
> ok.
> 
>>>> Then handle OPEN as a third patch, because I bet we are going to meet
>>>> some complications there.
>>>
>>> Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?
>>
>> I haven't even started to think about that yet.
>>
> 
> ok. Let me know when you have any ideas about that.
> 
> My goal is to fix EBUSY WARN for open from FUSE.
> The rest is cleanup that I don't mind doing on the way.

I've poked at nfsd4_remove(). It's not going to work the way I prefer.
But I'll take care of the clean up for remove, rename, and link.

Understood that fixing OPEN is your main priority.

Amir Goldstein Jan. 23, 2025, 3:29 p.m. UTC | #13

On Thu, Jan 23, 2025 at 3:59 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>
> On 1/22/25 3:11 PM, Amir Goldstein wrote:
> > On Wed, Jan 22, 2025 at 8:20 PM Chuck Lever <chuck.lever@oracle.com> wrote:
> >>
> >> On 1/22/25 1:53 PM, Amir Goldstein wrote:
> >>>>> I am fine with handling EBUSY in unlink/rmdir/rename/open
> >>>>> only for now if that is what everyone prefers.
> >>>>
> >>>> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
> >>>> correctly. NFSv4 REMOVE needs to return a status code that depends
> >>>> on whether the target object is a file or not. Probably not much more
> >>>> than something like this:
> >>>>
> >>>>           status = vfs_unlink( ... );
> >>>> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
> >>>> +       if (status == nfserr_file_open && !S_ISREG(...))
> >>>> +               status = nfserr_access;
> >>>>
> >>>> added to nfsd4_remove().
> >>>
> >>> Don't you think it's a bit awkward mapping back and forth like this?
> >>
> >> Yes, it's awkward. It's an artifact of the way NFSD's VFS helpers have
> >> been co-opted for new versions of the NFS protocol over the years.
> >>
> >> With NFSv2 and NFSv3, the operations and their permitted status codes
> >> are roughly similar so that these VFS helpers can be re-used without
> >> a lot of fuss. This is also why, internally, the symbolic status codes
> >> are named without the version number in them (ie, nfserr_inval).
> >>
> >> With NFSv4, the world is more complicated.
> >>
> >> The NFSv4 code was prototyped 20 years ago using these NFSv2/3 helpers,
> >> and is never revisited until there's a bug. Thus there is quite a bit of
> >> technical debt in fs/nfsd/vfs.c that we're replacing over time.
> >>
> >> IMO it would be better if these VFS helpers returned errno values and
> >> then the callers should figure out the conversion to an NFS status code.
> >> I suspect that's difficult because some of the functions invoked by the
> >> VFS helpers (like fh_verify() ) also return NFS status codes. We just
> >> spent some time extracting NFS version-specific code from fh_verify().
> >>
> >>
> >>> Don't you think something like this is a more sane way to keep the
> >>> mapping rules in one place:
> >>>
> >>> @@ -111,6 +111,26 @@ nfserrno (int errno)
> >>>           return nfserr_io;
> >>>    }
> >>>
> >>> +static __be32
> >>> +nfsd_map_errno(int host_err, int may_flags, int type)
> >>> +{
> >>> +       switch (host_err) {
> >>> +       case -EBUSY:
> >>> +               /*
> >>> +                * According to RFC 8881 Section 18.25.4 paragraph 5,
> >>> +                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
> >>> +                * For failure to remove directory we return NFS4ERR_ACCESS,
> >>> +                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
> >>> +                */
> >>> +               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
> >>> +                       return nfserr_file_open;
> >>> +               else
> >>> +                       return nfserr_acces;
> >>> +       }
> >>> +
> >>> +       return nfserrno(host_err);
> >>> +}
> >>> +
> >>>    /*
> >>>     * Called from nfsd_lookup and encode_dirent. Check if we have crossed
> >>>     * a mount point.
> >>> @@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
> >>> svc_fh *fhp, int type,
> >>>    out_drop_write:
> >>>           fh_drop_write(fhp);
> >>>    out_nfserr:
> >>> -       if (host_err == -EBUSY) {
> >>> -               /* name is mounted-on. There is no perfect
> >>> -                * error status.
> >>> -                */
> >>> -               err = nfserr_file_open;
> >>> -       } else {
> >>> -               err = nfserrno(host_err);
> >>> -       }
> >>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
> >>>    out:
> >>>           return err;
> >>
> >> No, I don't.
> >>
> >> NFSD has Kconfig options that disable support for some versions of NFS.
> >> The code that manages which status code to return really needs to be
> >> inside the functions that are enabled or disabled by Kconfig.
> >>
> >> As I keep repeating: there is no good way to handle the NFS status codes
> >> in one set of functions. Each NFS version has its variations that
> >> require special handling.
> >>
> >>
> >
> > ok.
> >
> >>>> Let's visit RENAME once that is addressed.
> >>>
> >>> And then next patch would be:
> >>>
> >>> @@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> >>> svc_fh *ffhp, char *fname, int flen,
> >>>           __be32          err;
> >>>           int             host_err;
> >>>           bool            close_cached = false;
> >>> +       int             type;
> >>>
> >>>           err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
> >>>           if (err)
> >>> @@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
> >>> svc_fh *ffhp, char *fname, int flen,
> >>>     out_dput_new:
> >>>           dput(ndentry);
> >>>     out_dput_old:
> >>> +       type = d_inode(odentry)->i_mode & S_IFMT;
> >>>           dput(odentry);
> >>>     out_nfserr:
> >>> -        err = nfserrno(host_err);
> >>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
> >>
> >> Same problem here: the NFS version-specific status codes have to be
> >> figured out in the callers, not in nfsd_rename(). The status codes
> >> are not common to all NFS versions.
> >>
> >>
> >
> > ok.
> >
> >>>> Then handle OPEN as a third patch, because I bet we are going to meet
> >>>> some complications there.
> >>>
> >>> Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?
> >>
> >> I haven't even started to think about that yet.
> >>
> >
> > ok. Let me know when you have any ideas about that.
> >
> > My goal is to fix EBUSY WARN for open from FUSE.
> > The rest is cleanup that I don't mind doing on the way.
>
> I've poked at nfsd4_remove(). It's not going to work the way I prefer.

Do you mean because the file type is not available there?

> But I'll take care of the clean up for remove, rename, and link.
>

FWIW, this is how I was going to solve this,
but I admit it is quite awkward:

https://github.com/amir73il/linux/commits/nfsd-fixes/

> Understood that fixing OPEN is your main priority.
>

I now realized that truncate can also return EBUSY in my FUSE fs :/
That's why I am disappointed that there is no "fall back"
mapping for EBUSY that fits all without a warning, but I will
wait to see how the cleanup goes and we will take it from there.

Thanks,
Amir.

Chuck Lever Jan. 23, 2025, 5:37 p.m. UTC | #14

On 1/23/25 10:29 AM, Amir Goldstein wrote:
> On Thu, Jan 23, 2025 at 3:59 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>
>> On 1/22/25 3:11 PM, Amir Goldstein wrote:
>>> On Wed, Jan 22, 2025 at 8:20 PM Chuck Lever <chuck.lever@oracle.com> wrote:
>>>>
>>>> On 1/22/25 1:53 PM, Amir Goldstein wrote:
>>>>>>> I am fine with handling EBUSY in unlink/rmdir/rename/open
>>>>>>> only for now if that is what everyone prefers.
>>>>>>
>>>>>> As far as I can tell, NFSv2 and NFSv3 REMOVE/RMDIR are working
>>>>>> correctly. NFSv4 REMOVE needs to return a status code that depends
>>>>>> on whether the target object is a file or not. Probably not much more
>>>>>> than something like this:
>>>>>>
>>>>>>            status = vfs_unlink( ... );
>>>>>> +       /* RFC 8881 Section 18.25.4 paragraph 5 */
>>>>>> +       if (status == nfserr_file_open && !S_ISREG(...))
>>>>>> +               status = nfserr_access;
>>>>>>
>>>>>> added to nfsd4_remove().
>>>>>
>>>>> Don't you think it's a bit awkward mapping back and forth like this?
>>>>
>>>> Yes, it's awkward. It's an artifact of the way NFSD's VFS helpers have
>>>> been co-opted for new versions of the NFS protocol over the years.
>>>>
>>>> With NFSv2 and NFSv3, the operations and their permitted status codes
>>>> are roughly similar so that these VFS helpers can be re-used without
>>>> a lot of fuss. This is also why, internally, the symbolic status codes
>>>> are named without the version number in them (ie, nfserr_inval).
>>>>
>>>> With NFSv4, the world is more complicated.
>>>>
>>>> The NFSv4 code was prototyped 20 years ago using these NFSv2/3 helpers,
>>>> and is never revisited until there's a bug. Thus there is quite a bit of
>>>> technical debt in fs/nfsd/vfs.c that we're replacing over time.
>>>>
>>>> IMO it would be better if these VFS helpers returned errno values and
>>>> then the callers should figure out the conversion to an NFS status code.
>>>> I suspect that's difficult because some of the functions invoked by the
>>>> VFS helpers (like fh_verify() ) also return NFS status codes. We just
>>>> spent some time extracting NFS version-specific code from fh_verify().
>>>>
>>>>
>>>>> Don't you think something like this is a more sane way to keep the
>>>>> mapping rules in one place:
>>>>>
>>>>> @@ -111,6 +111,26 @@ nfserrno (int errno)
>>>>>            return nfserr_io;
>>>>>     }
>>>>>
>>>>> +static __be32
>>>>> +nfsd_map_errno(int host_err, int may_flags, int type)
>>>>> +{
>>>>> +       switch (host_err) {
>>>>> +       case -EBUSY:
>>>>> +               /*
>>>>> +                * According to RFC 8881 Section 18.25.4 paragraph 5,
>>>>> +                * removal of regular file can fail with NFS4ERR_FILE_OPEN.
>>>>> +                * For failure to remove directory we return NFS4ERR_ACCESS,
>>>>> +                * same as NFS4ERR_FILE_OPEN is mapped in v3 and v2.
>>>>> +                */
>>>>> +               if (may_flags == NFSD_MAY_REMOVE && type == S_IFREG)
>>>>> +                       return nfserr_file_open;
>>>>> +               else
>>>>> +                       return nfserr_acces;
>>>>> +       }
>>>>> +
>>>>> +       return nfserrno(host_err);
>>>>> +}
>>>>> +
>>>>>     /*
>>>>>      * Called from nfsd_lookup and encode_dirent. Check if we have crossed
>>>>>      * a mount point.
>>>>> @@ -2006,14 +2026,7 @@ nfsd_unlink(struct svc_rqst *rqstp, struct
>>>>> svc_fh *fhp, int type,
>>>>>     out_drop_write:
>>>>>            fh_drop_write(fhp);
>>>>>     out_nfserr:
>>>>> -       if (host_err == -EBUSY) {
>>>>> -               /* name is mounted-on. There is no perfect
>>>>> -                * error status.
>>>>> -                */
>>>>> -               err = nfserr_file_open;
>>>>> -       } else {
>>>>> -               err = nfserrno(host_err);
>>>>> -       }
>>>>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>>>>>     out:
>>>>>            return err;
>>>>
>>>> No, I don't.
>>>>
>>>> NFSD has Kconfig options that disable support for some versions of NFS.
>>>> The code that manages which status code to return really needs to be
>>>> inside the functions that are enabled or disabled by Kconfig.
>>>>
>>>> As I keep repeating: there is no good way to handle the NFS status codes
>>>> in one set of functions. Each NFS version has its variations that
>>>> require special handling.
>>>>
>>>>
>>>
>>> ok.
>>>
>>>>>> Let's visit RENAME once that is addressed.
>>>>>
>>>>> And then next patch would be:
>>>>>
>>>>> @@ -1828,6 +1828,7 @@ nfsd_rename(struct svc_rqst *rqstp, struct
>>>>> svc_fh *ffhp, char *fname, int flen,
>>>>>            __be32          err;
>>>>>            int             host_err;
>>>>>            bool            close_cached = false;
>>>>> +       int             type;
>>>>>
>>>>>            err = fh_verify(rqstp, ffhp, S_IFDIR, NFSD_MAY_REMOVE);
>>>>>            if (err)
>>>>> @@ -1922,8 +1923,10 @@ nfsd_rename(struct svc_rqst *rqstp, struct
>>>>> svc_fh *ffhp, char *fname, int flen,
>>>>>      out_dput_new:
>>>>>            dput(ndentry);
>>>>>      out_dput_old:
>>>>> +       type = d_inode(odentry)->i_mode & S_IFMT;
>>>>>            dput(odentry);
>>>>>      out_nfserr:
>>>>> -        err = nfserrno(host_err);
>>>>> +       err = nfsd_map_errno(host_err, NFSD_MAY_REMOVE, type);
>>>>
>>>> Same problem here: the NFS version-specific status codes have to be
>>>> figured out in the callers, not in nfsd_rename(). The status codes
>>>> are not common to all NFS versions.
>>>>
>>>>
>>>
>>> ok.
>>>
>>>>>> Then handle OPEN as a third patch, because I bet we are going to meet
>>>>>> some complications there.
>>>>>
>>>>> Did you think of anything better to do for OPEN other than NFS4ERR_ACCESS?
>>>>
>>>> I haven't even started to think about that yet.
>>>>
>>>
>>> ok. Let me know when you have any ideas about that.
>>>
>>> My goal is to fix EBUSY WARN for open from FUSE.
>>> The rest is cleanup that I don't mind doing on the way.
>>
>> I've poked at nfsd4_remove(). It's not going to work the way I prefer.
> 
> Do you mean because the file type is not available there?

Yes.


>> But I'll take care of the clean up for remove, rename, and link.
>>
> 
> FWIW, this is how I was going to solve this,
> but I admit it is quite awkward:

I'll deal with it. In the long run, making fh_verify() return an errno
so all of these helpers can return an errno rather than status code
is where I want to take this. But for now, a simple approach is best
because that can be cleanly backported.


> I now realized that truncate can also return EBUSY in my FUSE fs :/
> That's why I am disappointed that there is no "fall back"
> mapping for EBUSY that fits all without a warning, but I will
> wait to see how the cleanup goes and we will take it from there.

It's better for us if we can identify the particular system call
that returns -EBUSY. Auditing these cases might show that a blanket
approach is fine, but we still need to do the audit no matter what.

[v2] nfsd: map EBUSY to NFS4ERR_ACCESS for all operations

Commit Message

Comments

Patch