From patchwork Tue Jan 19 19:24:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bruce Fields X-Patchwork-Id: 12030609 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C423C4332D for ; Tue, 19 Jan 2021 19:41:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 08F0323107 for ; Tue, 19 Jan 2021 19:41:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729872AbhASTcP (ORCPT ); Tue, 19 Jan 2021 14:32:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729533AbhASTZk (ORCPT ); Tue, 19 Jan 2021 14:25:40 -0500 Received: from fieldses.org (fieldses.org [IPv6:2600:3c00:e000:2f7::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 935DFC061574; Tue, 19 Jan 2021 11:25:00 -0800 (PST) Received: by fieldses.org (Postfix, from userid 2815) id 77F22AA2; Tue, 19 Jan 2021 14:24:59 -0500 (EST) DKIM-Filter: OpenDKIM Filter v2.11.0 fieldses.org 77F22AA2 From: "J. Bruce Fields" To: linux-nfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Trond Myklebust , Anna Schumaker , Chuck Lever , "J. Bruce Fields" Subject: [PATCH 1/3] nfs: use change attribute for NFS re-exports Date: Tue, 19 Jan 2021 14:24:55 -0500 Message-Id: <1611084297-27352-2-git-send-email-bfields@redhat.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611084297-27352-1-git-send-email-bfields@redhat.com> References: <1611084297-27352-1-git-send-email-bfields@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: "J. Bruce Fields" When exporting NFS, we may as well use the real change attribute returned by the original server instead of faking up a change attribute from the ctime. Note we can't do that by setting I_VERSION--that would also turn on the logic in iversion.h which treats the lower bit specially, and that doesn't make sense for NFS. So instead we define a new export operation for filesystems like NFS that want to manage the change attribute themselves. Signed-off-by: J. Bruce Fields --- fs/nfs/export.c | 18 ++++++++++++++++++ fs/nfsd/nfsfh.h | 5 ++++- include/linux/exportfs.h | 1 + 3 files changed, 23 insertions(+), 1 deletion(-) diff --git a/fs/nfs/export.c b/fs/nfs/export.c index 7412bb164fa7..f2b34cfe286c 100644 --- a/fs/nfs/export.c +++ b/fs/nfs/export.c @@ -167,10 +167,28 @@ nfs_get_parent(struct dentry *dentry) return parent; } +static u64 nfs_fetch_iversion(struct inode *inode) +{ + struct nfs_server *server = NFS_SERVER(inode); + + /* Is this the right call?: */ + nfs_revalidate_inode(server, inode); + /* + * Also, note we're ignoring any returned error. That seems to be + * the practice for cache consistency information elsewhere in + * the server, but I'm not sure why. + */ + if (server->nfs_client->rpc_ops->version >= 4) + return inode_peek_iversion_raw(inode); + else + return time_to_chattr(&inode->i_ctime); +} + const struct export_operations nfs_export_ops = { .encode_fh = nfs_encode_fh, .fh_to_dentry = nfs_fh_to_dentry, .get_parent = nfs_get_parent, + .fetch_iversion = nfs_fetch_iversion, .flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK| EXPORT_OP_CLOSE_BEFORE_UNLINK|EXPORT_OP_REMOTE_FS| EXPORT_OP_NOATOMIC_ATTR, diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index cb20c2cd3469..f58933519f38 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -12,6 +12,7 @@ #include #include #include +#include static inline __u32 ino_t_to_u32(ino_t ino) { @@ -264,7 +265,9 @@ fh_clear_wcc(struct svc_fh *fhp) static inline u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode) { - if (IS_I_VERSION(inode)) { + if (inode->i_sb->s_export_op->fetch_iversion) + return inode->i_sb->s_export_op->fetch_iversion(inode); + else if (IS_I_VERSION(inode)) { u64 chattr; chattr = stat->ctime.tv_sec; diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index 9f4d4bcbf251..fe848901fcc3 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h @@ -213,6 +213,7 @@ struct export_operations { bool write, u32 *device_generation); int (*commit_blocks)(struct inode *inode, struct iomap *iomaps, int nr_iomaps, struct iattr *iattr); + u64 (*fetch_iversion)(struct inode *); #define EXPORT_OP_NOWCC (0x1) /* don't collect v3 wcc data */ #define EXPORT_OP_NOSUBTREECHK (0x2) /* no subtree checking */ #define EXPORT_OP_CLOSE_BEFORE_UNLINK (0x4) /* close files before unlink */ From patchwork Tue Jan 19 19:24:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bruce Fields X-Patchwork-Id: 12030607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA33DC4332B for ; Tue, 19 Jan 2021 19:41:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AB65A23107 for ; Tue, 19 Jan 2021 19:41:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729983AbhASTcZ (ORCPT ); Tue, 19 Jan 2021 14:32:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60192 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729519AbhASTZk (ORCPT ); Tue, 19 Jan 2021 14:25:40 -0500 Received: from fieldses.org (fieldses.org [IPv6:2600:3c00:e000:2f7::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80CACC061573; Tue, 19 Jan 2021 11:25:00 -0800 (PST) Received: by fieldses.org (Postfix, from userid 2815) id AEA2E6C0D; Tue, 19 Jan 2021 14:24:59 -0500 (EST) DKIM-Filter: OpenDKIM Filter v2.11.0 fieldses.org AEA2E6C0D From: "J. Bruce Fields" To: linux-nfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Trond Myklebust , Anna Schumaker , Chuck Lever , "J. Bruce Fields" Subject: [PATCH 2/3] nfsd: move change attribute generation to filesystem Date: Tue, 19 Jan 2021 14:24:56 -0500 Message-Id: <1611084297-27352-3-git-send-email-bfields@redhat.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611084297-27352-1-git-send-email-bfields@redhat.com> References: <1611084297-27352-1-git-send-email-bfields@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: "J. Bruce Fields" After this, only filesystems lacking change attribute support will leave the fetch_iversion export op NULL. This seems cleaner to me, and will allow some minor optimizations in the nfsd code. Signed-off-by: J. Bruce Fields --- fs/btrfs/export.c | 2 ++ fs/ext4/super.c | 9 +++++++++ fs/nfsd/nfsfh.h | 25 +++---------------------- fs/xfs/xfs_export.c | 10 ++++++++++ include/linux/iversion.h | 26 ++++++++++++++++++++++++++ 5 files changed, 50 insertions(+), 22 deletions(-) diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c index 1d4c2397d0d6..9f9fe24b29cc 100644 --- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -7,6 +7,7 @@ #include "btrfs_inode.h" #include "print-tree.h" #include "export.h" +#include #define BTRFS_FID_SIZE_NON_CONNECTABLE (offsetof(struct btrfs_fid, \ parent_objectid) / 4) @@ -278,4 +279,5 @@ const struct export_operations btrfs_export_ops = { .fh_to_parent = btrfs_fh_to_parent, .get_parent = btrfs_get_parent, .get_name = btrfs_get_name, + .fetch_iversion = generic_fetch_iversion, }; diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 21121787c874..4a37b7fc55b6 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1616,11 +1616,20 @@ static const struct super_operations ext4_sops = { .bdev_try_to_free_page = bdev_try_to_free_page, }; +static u64 ext4_fetch_iversion(struct inode *inode) +{ + if (IS_I_VERSION(inode)) + return generic_fetch_iversion(inode); + else + return time_to_chattr(&inode->i_ctime); +} + static const struct export_operations ext4_export_ops = { .fh_to_dentry = ext4_fh_to_dentry, .fh_to_parent = ext4_fh_to_parent, .get_parent = ext4_get_parent, .commit_metadata = ext4_nfs_commit_metadata, + .fetch_iversion = ext4_fetch_iversion, }; enum { diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index f58933519f38..257ba5646239 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -52,8 +52,8 @@ typedef struct svc_fh { struct timespec64 fh_pre_mtime; /* mtime before oper */ struct timespec64 fh_pre_ctime; /* ctime before oper */ /* - * pre-op nfsv4 change attr: note must check IS_I_VERSION(inode) - * to find out if it is valid. + * pre-op nfsv4 change attr: note must check for fetch_iversion + * op to find out if it is valid. */ u64 fh_pre_change; @@ -251,31 +251,12 @@ fh_clear_wcc(struct svc_fh *fhp) fhp->fh_pre_saved = false; } -/* - * We could use i_version alone as the change attribute. However, - * i_version can go backwards after a reboot. On its own that doesn't - * necessarily cause a problem, but if i_version goes backwards and then - * is incremented again it could reuse a value that was previously used - * before boot, and a client who queried the two values might - * incorrectly assume nothing changed. - * - * By using both ctime and the i_version counter we guarantee that as - * long as time doesn't go backwards we never reuse an old value. - */ static inline u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode) { if (inode->i_sb->s_export_op->fetch_iversion) return inode->i_sb->s_export_op->fetch_iversion(inode); - else if (IS_I_VERSION(inode)) { - u64 chattr; - - chattr = stat->ctime.tv_sec; - chattr <<= 30; - chattr += stat->ctime.tv_nsec; - chattr += inode_query_iversion(inode); - return chattr; - } else + else return time_to_chattr(&stat->ctime); } diff --git a/fs/xfs/xfs_export.c b/fs/xfs/xfs_export.c index 465fd9e048d4..4a08b65c32aa 100644 --- a/fs/xfs/xfs_export.c +++ b/fs/xfs/xfs_export.c @@ -16,6 +16,7 @@ #include "xfs_inode_item.h" #include "xfs_icache.h" #include "xfs_pnfs.h" +#include /* * Note that we only accept fileids which are long enough rather than allow @@ -223,6 +224,14 @@ xfs_fs_nfs_commit_metadata( return xfs_log_force_inode(XFS_I(inode)); } +static u64 xfs_fetch_iversion(struct inode *inode) +{ + if (IS_I_VERSION(inode)) + return generic_fetch_iversion(inode); + else + return time_to_chattr(&inode->i_ctime); +} + const struct export_operations xfs_export_operations = { .encode_fh = xfs_fs_encode_fh, .fh_to_dentry = xfs_fs_fh_to_dentry, @@ -234,4 +243,5 @@ const struct export_operations xfs_export_operations = { .map_blocks = xfs_fs_map_blocks, .commit_blocks = xfs_fs_commit_blocks, #endif + .fetch_iversion = xfs_fetch_iversion, }; diff --git a/include/linux/iversion.h b/include/linux/iversion.h index 3bfebde5a1a6..ded74523c8a6 100644 --- a/include/linux/iversion.h +++ b/include/linux/iversion.h @@ -328,6 +328,32 @@ inode_query_iversion(struct inode *inode) return cur >> I_VERSION_QUERIED_SHIFT; } +/* + * We could use i_version alone as the NFSv4 change attribute. However, + * i_version can go backwards after a reboot. On its own that doesn't + * necessarily cause a problem, but if i_version goes backwards and then + * is incremented again it could reuse a value that was previously used + * before boot, and a client who queried the two values might + * incorrectly assume nothing changed. + * + * By using both ctime and the i_version counter we guarantee that as + * long as time doesn't go backwards we never reuse an old value. + * + * A filesystem that has an on-disk boot counter or similar might prefer + * to use that to avoid the risk of the change attribute going backwards + * if system time is set backwards. + */ +static inline u64 generic_fetch_iversion(struct inode *inode) +{ + u64 chattr; + + chattr = inode->i_ctime.tv_sec; + chattr <<= 30; + chattr += inode->i_ctime.tv_nsec; + chattr += inode_query_iversion(inode); + return chattr; +} + /* * For filesystems without any sort of change attribute, the best we can * do is fake one up from the ctime: From patchwork Tue Jan 19 19:24:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bruce Fields X-Patchwork-Id: 12030605 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 68042C433E0 for ; Tue, 19 Jan 2021 19:41:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2153622DD3 for ; Tue, 19 Jan 2021 19:41:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728185AbhASTbu (ORCPT ); Tue, 19 Jan 2021 14:31:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729548AbhASTZk (ORCPT ); Tue, 19 Jan 2021 14:25:40 -0500 Received: from fieldses.org (fieldses.org [IPv6:2600:3c00:e000:2f7::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 991FAC061575; Tue, 19 Jan 2021 11:25:00 -0800 (PST) Received: by fieldses.org (Postfix, from userid 2815) id 97B1A28E5; Tue, 19 Jan 2021 14:24:59 -0500 (EST) DKIM-Filter: OpenDKIM Filter v2.11.0 fieldses.org 97B1A28E5 From: "J. Bruce Fields" To: linux-nfs@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, Trond Myklebust , Anna Schumaker , Chuck Lever , "J. Bruce Fields" Subject: [PATCH 3/3] nfsd: skip some unnecessary stats in the v4 case Date: Tue, 19 Jan 2021 14:24:57 -0500 Message-Id: <1611084297-27352-4-git-send-email-bfields@redhat.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1611084297-27352-1-git-send-email-bfields@redhat.com> References: <1611084297-27352-1-git-send-email-bfields@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: "J. Bruce Fields" In the typical case of v4 and an i_version-supporting filesystem, we can skip a stat which is only required to fake up a change attribute from ctime. Signed-off-by: J. Bruce Fields --- fs/nfsd/nfs3xdr.c | 37 ++++++++++++++++++++----------------- 1 file changed, 20 insertions(+), 17 deletions(-) diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 821db21ba072..e22d5d209d08 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -260,24 +260,25 @@ void fill_pre_wcc(struct svc_fh *fhp) struct inode *inode; struct kstat stat; bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE); - __be32 err; if (fhp->fh_no_wcc || fhp->fh_pre_saved) return; inode = d_inode(fhp->fh_dentry); - err = fh_getattr(fhp, &stat); - if (err) { - /* Grab the times from inode anyway */ - stat.mtime = inode->i_mtime; - stat.ctime = inode->i_ctime; - stat.size = inode->i_size; + if (!v4 || !inode->i_sb->s_export_op->fetch_iversion) { + __be32 err = fh_getattr(fhp, &stat); + if (err) { + /* Grab the times from inode anyway */ + stat.mtime = inode->i_mtime; + stat.ctime = inode->i_ctime; + stat.size = inode->i_size; + } + fhp->fh_pre_mtime = stat.mtime; + fhp->fh_pre_ctime = stat.ctime; + fhp->fh_pre_size = stat.size; } if (v4) fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode); - fhp->fh_pre_mtime = stat.mtime; - fhp->fh_pre_ctime = stat.ctime; - fhp->fh_pre_size = stat.size; fhp->fh_pre_saved = true; } @@ -288,7 +289,6 @@ void fill_post_wcc(struct svc_fh *fhp) { bool v4 = (fhp->fh_maxsize == NFS4_FHSIZE); struct inode *inode = d_inode(fhp->fh_dentry); - __be32 err; if (fhp->fh_no_wcc) return; @@ -296,12 +296,15 @@ void fill_post_wcc(struct svc_fh *fhp) if (fhp->fh_post_saved) printk("nfsd: inode locked twice during operation.\n"); - err = fh_getattr(fhp, &fhp->fh_post_attr); - if (err) { - fhp->fh_post_saved = false; - fhp->fh_post_attr.ctime = inode->i_ctime; - } else - fhp->fh_post_saved = true; + fhp->fh_post_saved = true; + + if (!v4 || !inode->i_sb->s_export_op->fetch_iversion) { + __be32 err = fh_getattr(fhp, &fhp->fh_post_attr); + if (err) { + fhp->fh_post_saved = false; + fhp->fh_post_attr.ctime = inode->i_ctime; + } + } if (v4) fhp->fh_post_change = nfsd4_change_attribute(&fhp->fh_post_attr, inode);