diff mbox series

NFS: Default change_attr_type to NFS4_CHANGE_TYPE_IS_UNDEFINED

Message ID 20210926181622.81474-1-trondmy@kernel.org (mailing list archive)
State New
Headers show
Series NFS: Default change_attr_type to NFS4_CHANGE_TYPE_IS_UNDEFINED | expand

Commit Message

trondmy@kernel.org Sept. 26, 2021, 6:16 p.m. UTC
From: Trond Myklebust <trond.myklebust@hammerspace.com>

Both NFSv3 and NFSv2 generate their change attribute from the ctime
value that was supplied by the server. However the problem is that there
are plenty of servers out there with ctime resolutions of 1ms or worse.
In a modern performance system, this is insufficient when trying to
decide which is the most recent set of attributes when, for instance, a
READ or GETATTR call races with a WRITE or SETATTR.

For this reason, let's revert to labelling the NFSv2/v3 change
attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
against such races.

Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
---
 fs/nfs/nfs3xdr.c | 2 +-
 fs/nfs/proc.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Chuck Lever Sept. 26, 2021, 7:23 p.m. UTC | #1
Thanks for the quick response! Comments inline.


> On Sep 26, 2021, at 2:16 PM, trondmy@kernel.org wrote:
> 
> From: Trond Myklebust <trond.myklebust@hammerspace.com>
> 
> Both NFSv3 and NFSv2 generate their change attribute from the ctime
> value that was supplied by the server. However the problem is that there
> are plenty of servers out there with ctime resolutions of 1ms or worse.
> In a modern performance system, this is insufficient when trying to
> decide which is the most recent set of attributes when, for instance, a
> READ or GETATTR call races with a WRITE or SETATTR.

I agree 100% that a legacy NFS client shouldn't rely on ctime
alone for tracking server-side changes, exactly because of
the timestamp resolution issue.


> For this reason, let's revert to labelling the NFSv2/v3 change
> attributes as NFS4_CHANGE_TYPE_IS_UNDEFINED. This will ensure we protect
> against such races.
> 
> Fixes: 7b24dacf0840 ("NFS: Another inode revalidation improvement")

Perhaps this should be:

Fixes: 7f08a3359a3c ("NFSv4: Add support for the NFSv4.2 "change_attr_type" attribute")


> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> ---
> fs/nfs/nfs3xdr.c | 2 +-
> fs/nfs/proc.c    | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
> index e6eca1d7481b..9274c9c5efea 100644
> --- a/fs/nfs/nfs3xdr.c
> +++ b/fs/nfs/nfs3xdr.c
> @@ -2227,7 +2227,7 @@ static int decode_fsinfo3resok(struct xdr_stream *xdr,
> 
> 	/* ignore properties */
> 	result->lease_time = 0;
> -	result->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
> +	result->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
> 	return 0;
> }
> 
> diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
> index ea19dbf12301..ecc4e717808c 100644
> --- a/fs/nfs/proc.c
> +++ b/fs/nfs/proc.c
> @@ -91,7 +91,7 @@ nfs_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
> 	info->dtpref = fsinfo.tsize;
> 	info->maxfilesize = 0x7FFFFFFF;
> 	info->lease_time = 0;
> -	info->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
> +	info->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
> 	return 0;
> }
> 

Unfortunately this patch did not change the behavior of fsx, and I
still observe readahead racing with truncation. That's not too
surprising since nfs_inode_attrs_cmp() is already returning zero
in the racing case (see previous e-mail).


             fsx-284916 [011]  1238.734038: nfs_aops_readpages:   fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 nr_pages=8
             fsx-284916 [011]  1238.734047: nfs_initiate_read:    fileid=00:28:2 fhandle=0x36fbbe51 offset=102400 count=32768

             fsx-284916 [011]  1238.734055: nfs_setattr_enter:    fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 
             fsx-284916 [011]  1238.734055: nfs_writeback_inode_enter: fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 
             fsx-284916 [011]  1238.734056: nfs_writeback_inode_exit: error=0 (OK) fileid=00:28:2 fhandle=0x36fbbe51 type=8 (REG) version=1753079961650632466 size=206226 cache_validity=0x2000 (DATA_INVAL_DEFER) nfs_
flags=0x4 (ACL_LRU_SET)
             fsx-284916 [011]  1238.734080: bprint:               nfs3_xdr_dec_setattr3res: task:53611@5 size=0x3ff5a
             fsx-284916 [011]  1238.734082: nfs_size_truncate:    fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 cursize=206226 newsize=261978
             fsx-284916 [011]  1238.734082: nfs_inode_attrs_cmp:  fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 res=0
             fsx-284916 [011]  1238.734083: nfs_refresh_inode_enter: fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 
             fsx-284916 [011]  1238.734083: nfs_partial_attr_update: fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 cache_validity=DATA_INVAL_DEFER fattr_validity=TYPE|MODE|NLINK|OWNER|GROUP|RDEV|SIZE
|PRE_SIZE|SPACE_USED|FSID|FILEID|ATIME|MTIME|CTIME|PRE_MTIME|PRE_CTIME|CHANGE|PRE_CHANGE
             fsx-284916 [011]  1238.734083: nfs_check_attrs:      fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 valid_flags=
             fsx-284916 [011]  1238.734083: nfs_refresh_inode_exit: error=0 (OK) fileid=00:28:2 fhandle=0x36fbbe51 type=8 (REG) version=1753079961650632466 size=261978 cache_validity=0x2000 (DATA_INVAL_DEFER) nfs_flags=0x4 (ACL_LRU_SET)
             fsx-284916 [011]  1238.734083: nfs_setattr_exit:     error=0 (OK) fileid=00:28:2 fhandle=0x36fbbe51 type=8 (REG) version=1753079961650632466 size=261978 cache_validity=0x2000 (DATA_INVAL_DEFER) nfs_flags=0x4 (ACL_LRU_SET)

  kworker/u24:13-17203 [002]  1238.734088: bprint:               nfs3_xdr_dec_read3res: task:53610@5 size=0x32592
  kworker/u24:13-17203 [002]  1238.734088: nfs_inode_attrs_cmp:  fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 res=0
  kworker/u24:13-17203 [002]  1238.734089: nfs_refresh_inode_enter: fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 
  kworker/u24:13-17203 [002]  1238.734089: nfs_partial_attr_update: fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 cache_validity=INVALID_ATIME|DATA_INVAL_DEFER fattr_validity=TYPE|MODE|NLINK|OWNER|GROUP|RDEV|SIZE|SPACE_USED|FSID|FILEID|ATIME|MTIME|CTIME|CHANGE
  kworker/u24:13-17203 [002]  1238.734089: nfs_size_update:      fileid=00:28:2 fhandle=0x36fbbe51 version=1753079961650632466 cursize=261978 newsize=206226
  kworker/u24:13-17203 [002]  1238.734089: nfs_refresh_inode_exit: error=0 (OK) fileid=00:28:2 fhandle=0x36fbbe51 type=8 (REG) version=1753079961650632466 size=206226 cache_validity=0x2000 (DATA_INVAL_DEFER) nfs_flags=0x4 (ACL_LRU_SET)
  kworker/u24:13-17203 [002]  1238.734089: nfs_readpage_done:    fileid=00:28:2 fhandle=0x36fbbe51 offset=102400 count=32768 res=32768 status=32768


--
Chuck Lever
diff mbox series

Patch

diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index e6eca1d7481b..9274c9c5efea 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -2227,7 +2227,7 @@  static int decode_fsinfo3resok(struct xdr_stream *xdr,
 
 	/* ignore properties */
 	result->lease_time = 0;
-	result->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
+	result->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
 	return 0;
 }
 
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index ea19dbf12301..ecc4e717808c 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -91,7 +91,7 @@  nfs_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 	info->dtpref = fsinfo.tsize;
 	info->maxfilesize = 0x7FFFFFFF;
 	info->lease_time = 0;
-	info->change_attr_type = NFS4_CHANGE_TYPE_IS_TIME_METADATA;
+	info->change_attr_type = NFS4_CHANGE_TYPE_IS_UNDEFINED;
 	return 0;
 }