diff mbox series

[v2] NFS: Zero-stateid SETATTR should first return delegation

Message ID 159872131590.1096729.3952588635826859724.stgit@manet.1015granger.net
State New
Headers show
Series [v2] NFS: Zero-stateid SETATTR should first return delegation | expand

Commit Message

Chuck Lever Aug. 29, 2020, 5:16 p.m. UTC
If a write delegation isn't available, the Linux NFS client uses
a zero-stateid when performing a SETATTR.

NFSv4.0 provides no mechanism for an NFS server to match such a
request to a particular client. It recalls all delegations for that
file, even delegations held by the client issuing the request. If
that client happens to hold a read delegation, the server will
recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
DELEGRETURN sequence.

Optimize out this pipeline bubble by having the client return any
delegations it may hold on a file before it issues a
SETATTR(zero-stateid) on that file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfs/nfs4proc.c |    2 ++
 1 file changed, 2 insertions(+)

Changes since v1:
- Return the delegation only for NFSv4.0 mounts

Comments

Trond Myklebust Aug. 29, 2020, 9:17 p.m. UTC | #1
On Sat, 2020-08-29 at 13:16 -0400, Chuck Lever wrote:
> If a write delegation isn't available, the Linux NFS client uses
> a zero-stateid when performing a SETATTR.
> 
> NFSv4.0 provides no mechanism for an NFS server to match such a
> request to a particular client. It recalls all delegations for that
> file, even delegations held by the client issuing the request. If
> that client happens to hold a read delegation, the server will
> recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
> DELEGRETURN sequence.
> 
> Optimize out this pipeline bubble by having the client return any
> delegations it may hold on a file before it issues a
> SETATTR(zero-stateid) on that file.
> 
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
>  fs/nfs/nfs4proc.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> Changes since v1:
> - Return the delegation only for NFSv4.0 mounts
> 
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index dbd01548335b..bca7245f1e78 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -3314,6 +3314,8 @@ static int _nfs4_do_setattr(struct inode
> *inode,
>  			goto zero_stateid;
>  	} else {
>  zero_stateid:
> +		if (server->nfs_client->cl_minorversion == 0)
> +			nfs4_inode_return_delegation(inode);

So, the intention is that nfs4_inode_make_writeable() takes care of
this, and in principle it is done in the cases that matter in
nfs4_proc_setattr().

I agree that the zero_stateid case is not currently being taken care
of, but that only matters for the case of truncate. So perhaps we can
just add a single call to nfs4_inode_make_writeable() above the
zero_stateid label instead of adding redundancy for all the other
cases?

>  		nfs4_stateid_copy(&arg->stateid, &zero_stateid);
>  	}
>  	if (delegation_cred)
> 
>
Chuck Lever Aug. 29, 2020, 10:28 p.m. UTC | #2
> On Aug 29, 2020, at 5:17 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
> On Sat, 2020-08-29 at 13:16 -0400, Chuck Lever wrote:
>> If a write delegation isn't available, the Linux NFS client uses
>> a zero-stateid when performing a SETATTR.
>> 
>> NFSv4.0 provides no mechanism for an NFS server to match such a
>> request to a particular client. It recalls all delegations for that
>> file, even delegations held by the client issuing the request. If
>> that client happens to hold a read delegation, the server will
>> recall it immediately, resulting in an NFS4ERR_DELAY/CB_RECALL/
>> DELEGRETURN sequence.
>> 
>> Optimize out this pipeline bubble by having the client return any
>> delegations it may hold on a file before it issues a
>> SETATTR(zero-stateid) on that file.
>> 
>> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
>> ---
>> fs/nfs/nfs4proc.c |    2 ++
>> 1 file changed, 2 insertions(+)
>> 
>> Changes since v1:
>> - Return the delegation only for NFSv4.0 mounts
>> 
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index dbd01548335b..bca7245f1e78 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -3314,6 +3314,8 @@ static int _nfs4_do_setattr(struct inode
>> *inode,
>> 			goto zero_stateid;
>> 	} else {
>> zero_stateid:
>> +		if (server->nfs_client->cl_minorversion == 0)
>> +			nfs4_inode_return_delegation(inode);
> 
> So, the intention is that nfs4_inode_make_writeable() takes care of
> this, and in principle it is done in the cases that matter in
> nfs4_proc_setattr().

Thanks for pointing out the nfs4_inode_make_writeable call site!

 4219         /* Return any delegations if we're going to change ACLs */
 4220         if ((sattr->ia_valid & (ATTR_MODE|ATTR_UID|ATTR_GID)) != 0)
 4221                 nfs4_inode_make_writeable(inode);


> I agree that the zero_stateid case is not currently being taken care
> of, but that only matters for the case of truncate. So perhaps we can
> just add a single call to nfs4_inode_make_writeable() above the
> zero_stateid label instead of adding redundancy for all the other
> cases?

I'm willing to consider other solutions, but something else is going
on here. I've added some instrumentation to nfsd_setattr. It shows
that the iattr mask does not have the SIZE bit set:

nfsd_compound:        xid=0x8ffc7b48 opcnt=3
nfsd_compound_status: op=1/3 OP_PUTFH status=0

nfsd_setattr:         xid=0x8ffc7b48 fh_hash=0x2aed0c4d valid=ATIME|MTIME|ATIME_SET|MTIME_SET

time_out_leases:      fl=0xffff8887006c7ea0 dev=0x0:0x23 ino=0x16d825 fl_blocker=(nil) fl_owner=0xffff88872928e000 fl_flags=FL_DELEG fl_type=F_RDLCK fl_break_time=0 fl_downgrade
_time=0
leases_conflict:      conflict 1: lease=0xffff8887006c7ea0 fl_flags=FL_DELEG fl_type=F_RDLCK; breaker=0xffff8887006c6e38 fl_flags=FL_DELEG fl_type=F_WRLCK
leases_conflict:      conflict 1: lease=0xffff8887006c7ea0 fl_flags=FL_DELEG fl_type=F_RDLCK; breaker=0xffff8887006c6e38 fl_flags=FL_DELEG fl_type=F_WRLCK
nfsd_deleg_break:     client 5f4a926c:cd5044d5 stateid 00063dd9:00000001
break_lease_noblock:  fl=0xffff8887006c6e38 dev=0x0:0x23 ino=0x16d825 fl_blocker=(nil) fl_owner=(nil) fl_flags=FL_DELEG fl_type=F_WRLCK fl_break_time=0 fl_downgrade_time=0
nfsd_cb_work:         addr=192.168.2.51:35037 client 5f4a926c:cd5044d5 procedure=CB_RECALL
nfsd_compound_status: op=2/3 OP_SETATTR status=10008



--
Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dbd01548335b..bca7245f1e78 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -3314,6 +3314,8 @@  static int _nfs4_do_setattr(struct inode *inode,
 			goto zero_stateid;
 	} else {
 zero_stateid:
+		if (server->nfs_client->cl_minorversion == 0)
+			nfs4_inode_return_delegation(inode);
 		nfs4_stateid_copy(&arg->stateid, &zero_stateid);
 	}
 	if (delegation_cred)