From patchwork Mon Oct 17 10:57:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E0E0C43217 for ; Mon, 17 Oct 2022 10:57:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230356AbiJQK5W (ORCPT ); Mon, 17 Oct 2022 06:57:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230294AbiJQK5T (ORCPT ); Mon, 17 Oct 2022 06:57:19 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 63F115C9D7; Mon, 17 Oct 2022 03:57:18 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id EBE35B81334; Mon, 17 Oct 2022 10:57:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7D88C433D7; Mon, 17 Oct 2022 10:57:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004235; bh=tc1RUm4guOoZTsO0VdPQH+XnxC1sMtsO6DtotD5Lcrw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=gj66TCQZlj4T/7xQvycsgXgldB2+C5GmsetaE08O+PZ5gAUkzWUHVOK98ouxsHzjZ Xwn3L4ObaHc+uyEh38PjZ1hsqAQkfRWUm38K+ZVzH/fGc1crQtwwEsJ/LUHHWWuCfC Xox25k4xYLMRSkUoLX817xk9RTmzLpGtojcr5CVTtzXPIS2neyPt/XBZk2Lo39TJbf FXRhQ+XgeIlDn8J4I1zYcs2Z1D5vab4se0VXFyc0G5J3QvbJIl+hyGJxsgoGLw60sh AGwSr4YdxFpiJLh44isPsZXOwH6Jz24lHMLAREbQxaoqFo0M0tri4NVdNnl4Rfu0QX GacQr/afZRz7A== From: Jeff Layton To: tytso@mit.edu, 
adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 1/9] fs: uninline inode_query_iversion Date: Mon, 17 Oct 2022 06:57:01 -0400 Message-Id: <20221017105709.10830-2-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Reviewed-by: NeilBrown Signed-off-by: Jeff Layton --- fs/libfs.c | 36 ++++++++++++++++++++++++++++++++++++ include/linux/iversion.h | 38 ++------------------------------------ 2 files changed, 38 insertions(+), 36 deletions(-) diff --git a/fs/libfs.c b/fs/libfs.c index 682d56345a1c..5ae81466a422 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -1566,3 +1566,39 @@ bool inode_maybe_inc_iversion(struct inode *inode, bool force) return true; } EXPORT_SYMBOL(inode_maybe_inc_iversion); + +/** + * inode_query_iversion - read i_version for later use + * @inode: inode from which i_version should be read + * + * Read the inode i_version counter. This should be used by callers that wish + * to store the returned i_version for later comparison. This will guarantee + * that a later query of the i_version will result in a different value if + * anything has changed. + * + * In this implementation, we fetch the current value, set the QUERIED flag and + * then try to swap it into place with a cmpxchg, if it wasn't already set. If + * that fails, we try again with the newly fetched value from the cmpxchg. 
+ */ +u64 inode_query_iversion(struct inode *inode) +{ + u64 cur, new; + + cur = inode_peek_iversion_raw(inode); + do { + /* If flag is already set, then no need to swap */ + if (cur & I_VERSION_QUERIED) { + /* + * This barrier (and the implicit barrier in the + * cmpxchg below) pairs with the barrier in + * inode_maybe_inc_iversion(). + */ + smp_mb(); + break; + } + + new = cur | I_VERSION_QUERIED; + } while (!atomic64_try_cmpxchg(&inode->i_version, &cur, new)); + return cur >> I_VERSION_QUERIED_SHIFT; +} +EXPORT_SYMBOL(inode_query_iversion); diff --git a/include/linux/iversion.h b/include/linux/iversion.h index e27bd4f55d84..6755d8b4f20b 100644 --- a/include/linux/iversion.h +++ b/include/linux/iversion.h @@ -234,42 +234,6 @@ inode_peek_iversion(const struct inode *inode) return inode_peek_iversion_raw(inode) >> I_VERSION_QUERIED_SHIFT; } -/** - * inode_query_iversion - read i_version for later use - * @inode: inode from which i_version should be read - * - * Read the inode i_version counter. This should be used by callers that wish - * to store the returned i_version for later comparison. This will guarantee - * that a later query of the i_version will result in a different value if - * anything has changed. - * - * In this implementation, we fetch the current value, set the QUERIED flag and - * then try to swap it into place with a cmpxchg, if it wasn't already set. If - * that fails, we try again with the newly fetched value from the cmpxchg. - */ -static inline u64 -inode_query_iversion(struct inode *inode) -{ - u64 cur, new; - - cur = inode_peek_iversion_raw(inode); - do { - /* If flag is already set, then no need to swap */ - if (cur & I_VERSION_QUERIED) { - /* - * This barrier (and the implicit barrier in the - * cmpxchg below) pairs with the barrier in - * inode_maybe_inc_iversion(). 
- */ - smp_mb(); - break; - } - - new = cur | I_VERSION_QUERIED; - } while (!atomic64_try_cmpxchg(&inode->i_version, &cur, new)); - return cur >> I_VERSION_QUERIED_SHIFT; -} - /* * For filesystems without any sort of change attribute, the best we can * do is fake one up from the ctime: @@ -283,6 +247,8 @@ static inline u64 time_to_chattr(struct timespec64 *t) return chattr; } +u64 inode_query_iversion(struct inode *inode); + /** * inode_eq_iversion_raw - check whether the raw i_version counter has changed * @inode: inode to check From patchwork Mon Oct 17 10:57:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008596 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1714C43217 for ; Mon, 17 Oct 2022 10:57:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230043AbiJQK50 (ORCPT ); Mon, 17 Oct 2022 06:57:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230224AbiJQK5U (ORCPT ); Mon, 17 Oct 2022 06:57:20 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFC155C95A; Mon, 17 Oct 2022 03:57:18 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 83E3D6104E; Mon, 17 Oct 2022 10:57:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DDF16C43150; Mon, 17 Oct 2022 10:57:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; 
t=1666004237; bh=3UkiypkQG9xLfg+hS+eOV/U1/Tn/AJ7lMyTpKN5KYY4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=MwUTTIfzRGKwnFgR9CHvcDGvrugB2kBLT3/HZViQOmOp7i4frDTUrsArcqbmeeT2M sTqJu3GIYqpuq4flLKcAZOZFnPFa7zPj7A3w5PltdPpt4g1ZzSk2WPonaGztX+OYfl s2ipV6sWJ6VMpoWn3SEs/Dv6B190JWr14RE6elpX9WGw03rwUK8saBdvkHEBdaLrCE Ay07oOffO0gqSfvow5iHwu6fqhlxBDabrcNW+ajL/uq9yLeRHETcYAgRGLAc+77frz jH8jsPYTVyyTxXoSYe3j6MKWOwEybrQ85JBygblbRW09TQLEz9WB87blZu3Ne2dAqa ylsTh8PecGdCw== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Colin Walters Subject: [PATCH v7 2/9] fs: clarify when the i_version counter must be updated Date: Mon, 17 Oct 2022 06:57:02 -0400 Message-Id: <20221017105709.10830-3-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The i_version field in the kernel has had different semantics over the decades, but NFSv4 has certain expectations. Update the comments in iversion.h to describe when the i_version must change. 
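
As an illustration of when the counter has to appear to change, here is a minimal userspace sketch of the query/increment protocol. It is not the kernel code: it uses C11 atomics instead of atomic64_try_cmpxchg(), does not try to model the kernel's explicit memory barriers, and every model_* and MODEL_* name is hypothetical. It assumes the layout used in iversion.h, where the lowest bit is the QUERIED flag and the remaining bits are the counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model only: bit 0 is the "queried" flag, the rest is the counter. */
#define MODEL_QUERIED_SHIFT 1
#define MODEL_QUERIED       (1ULL << (MODEL_QUERIED_SHIFT - 1))
#define MODEL_INCREMENT     (1ULL << MODEL_QUERIED_SHIFT)

struct model_inode {
	_Atomic uint64_t i_version;	/* raw value: (counter << 1) | queried */
};

/* Read the counter and mark it as observed, in the spirit of inode_query_iversion(). */
uint64_t model_query_iversion(struct model_inode *inode)
{
	uint64_t cur = atomic_load(&inode->i_version);
	uint64_t new;

	do {
		/* Flag already set: nothing to swap in. */
		if (cur & MODEL_QUERIED)
			break;
		new = cur | MODEL_QUERIED;
	} while (!atomic_compare_exchange_weak(&inode->i_version, &cur, new));

	return cur >> MODEL_QUERIED_SHIFT;
}

/* Bump the counter only if it was queried since the last bump, or if forced. */
bool model_maybe_inc_iversion(struct model_inode *inode, bool force)
{
	uint64_t cur = atomic_load(&inode->i_version);
	uint64_t new;

	do {
		if (!force && !(cur & MODEL_QUERIED))
			return false;
		/* Clear the flag and step past it. */
		new = (cur & ~MODEL_QUERIED) + MODEL_INCREMENT;
	} while (!atomic_compare_exchange_weak(&inode->i_version, &cur, new));

	return true;
}

/* Walk through what a caching observer such as nfsd would see. */
int model_demo(void)
{
	struct model_inode inode;
	uint64_t before, after;
	bool bumped;

	atomic_init(&inode.i_version, 0);

	/* Nobody has looked yet, so a change may skip the increment. */
	bumped = model_maybe_inc_iversion(&inode, false);
	assert(!bumped);

	/* An observer stores the value; the next change must then be visible. */
	before = model_query_iversion(&inode);
	bumped = model_maybe_inc_iversion(&inode, false);
	after = model_query_iversion(&inode);
	assert(bumped && after > before);
	return 0;
}
```

The point of the flag is that an increment is only needed when somebody could notice its absence, so unobserved changes do not have to dirty the counter.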
Cc: Colin Walters Cc: NeilBrown Cc: Trond Myklebust Cc: Dave Chinner Signed-off-by: Jeff Layton --- include/linux/iversion.h | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/include/linux/iversion.h b/include/linux/iversion.h index 6755d8b4f20b..94f4dc620d01 100644 --- a/include/linux/iversion.h +++ b/include/linux/iversion.h @@ -9,8 +9,24 @@ * --------------------------- * The change attribute (i_version) is mandated by NFSv4 and is mostly for * knfsd, but is also used for other purposes (e.g. IMA). The i_version must - * appear different to observers if there was a change to the inode's data or - * metadata since it was last queried. + * appear larger to observers if there was an explicit change to the inode's + * data or metadata since it was last queried. + * + * An explicit change is one that would ordinarily result in a change to the + * inode status change time (aka ctime). i_version must appear to change, even + * if the ctime does not (since the whole point is to avoid missing updates due + * to timestamp granularity). If POSIX mandates that the ctime must change due + * to an operation, then the i_version counter must be incremented as well. + * + * Making the i_version update completely atomic with the operation itself would + * be prohibitively expensive. Traditionally the kernel has updated the times on + * directories after an operation that changes its contents. For regular files, + * the ctime is usually updated before the data is copied into the cache for a + * write. This means that there is a window of time when an observer can + * associate a new timestamp with old file contents. Since the purpose of the + * i_version is to allow for better cache coherency, the i_version must always + * be updated after the results of the operation are visible. Updating it before + * and after a change is also permitted. * * Observers see the i_version as a 64-bit number that never decreases. 
If it * remains the same since it was last checked, then nothing has changed in the From patchwork Mon Oct 17 10:57:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008597 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCB19C4332F for ; Mon, 17 Oct 2022 10:58:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230478AbiJQK57 (ORCPT ); Mon, 17 Oct 2022 06:57:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44534 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230434AbiJQK51 (ORCPT ); Mon, 17 Oct 2022 06:57:27 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 52FBD6173B; Mon, 17 Oct 2022 03:57:23 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 72ABFB81334; Mon, 17 Oct 2022 10:57:21 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3047BC43470; Mon, 17 Oct 2022 10:57:18 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004240; bh=8TWv5cnyxlkJnO3rzKxS18paTx4A0ODreD10US43z/E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=g0AtauxQEC+zWtDfBEeehuzIvC524NwIGl56NtZcSFJFyz2LPfoRugKIVLTkjOpCA mq1oZfuJyUL/JeybhMvf/pw0e8CyhwTVH+NaaQqewYqXgCfvcLmb6lARRimSqUOTh+ UIoFhvM/OenF85DGhvDWmzUhBTnW7/pgIxwGsuyM45IjgDxNExc9rBgPnQ2uW8qLnY 4+ISOJurDZtAT0aWPOQzLpiFzQz9ALQUNQnM+Kq1lFvD8pDDUHDa+DFgncHFxkWvt8 6OlNzeFnlT4mSS76FWZA8rvr+br72M+3mNQUUyHxGIPshQZwkGjRaXFuMrjxllWSFM 
hTAME/ekB05Uw== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Jeff Layton Subject: [PATCH v7 3/9] vfs: plumb i_version handling into struct kstat Date: Mon, 17 Oct 2022 06:57:03 -0400 Message-Id: <20221017105709.10830-4-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

The NFS server has a lot of special handling for different types of change attribute access, depending on the underlying filesystem. In most cases, it's doing a getattr anyway and then fetching that value after the fact.

Rather than do that, add a new STATX_VERSION flag that is a kernel-only symbol (for now). If requested and getattr can implement it, it can fill out this field. For IS_I_VERSION inodes, add a generic implementation in vfs_getattr_nosec. Take care to mask STATX_VERSION off in requests from userland and in the result mask.

Since not all filesystems can give the same guarantees of monotonicity, claim a STATX_ATTR_VERSION_MONOTONIC flag that filesystems can set to indicate that they offer an i_version value that can never go backward.

Eventually if we decide to make the i_version available to userland, we can just designate a field for it in struct statx, and move the STATX_VERSION definition to the uapi header.
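
The masking described above can be sketched in standalone C. This is a userspace model under stated assumptions, not the kernel implementation: the model_* structs and functions are hypothetical stand-ins for struct kstat, struct statx, do_statx(), vfs_getattr_nosec() and cp_statx(). The two version-related constants use the values from this patch, and MODEL_STATX_CTIME mirrors the uapi STATX_CTIME value purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_STATX_CTIME			0x00000080U	/* ordinary, userland-visible bit */
#define MODEL_STATX_VERSION			0x40000000U	/* kernel-only request/result bit */
#define MODEL_STATX_ATTR_VERSION_MONOTONIC	0x8000000000000000ULL	/* kernel-only attribute */

/* Hypothetical cut-down stand-ins for struct kstat and struct statx. */
struct model_kstat {
	uint32_t result_mask;
	uint64_t attributes;
	uint64_t version;
};

struct model_statx {
	uint32_t stx_mask;
	uint64_t stx_attributes;
};

/* Syscall entry: userland may not ask for the kernel-only bit. */
uint32_t model_sanitize_request(uint32_t mask)
{
	return mask & ~MODEL_STATX_VERSION;
}

/* In-kernel getattr: report the version only if requested and supported. */
void model_fill_version(struct model_kstat *stat, uint32_t request_mask,
			int is_i_version, uint64_t iversion)
{
	assert(stat);
	if ((request_mask & MODEL_STATX_VERSION) && is_i_version) {
		stat->result_mask |= MODEL_STATX_VERSION;
		stat->version = iversion;
	}
}

/* Copy-out to userland: strip the kernel-only bits from mask and attributes. */
void model_cp_statx(const struct model_kstat *stat, struct model_statx *out)
{
	assert(stat && out);
	out->stx_mask = stat->result_mask & ~MODEL_STATX_VERSION;
	out->stx_attributes = stat->attributes & ~MODEL_STATX_ATTR_VERSION_MONOTONIC;
}
```

Filtering on both the way in and the way out means an in-kernel caller such as nfsd can request the version through the same getattr path, while a statx() caller never sees a bit that has no uapi definition yet.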
Reviewed-by: NeilBrown Signed-off-by: Jeff Layton --- fs/stat.c | 17 +++++++++++++++-- include/linux/stat.h | 9 +++++++++ 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/fs/stat.c b/fs/stat.c index a7930d744483..e7f8cd4b24e1 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -118,6 +119,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | STATX_ATTR_DAX); + if ((request_mask & STATX_VERSION) && IS_I_VERSION(inode)) { + stat->result_mask |= STATX_VERSION; + stat->version = inode_query_iversion(inode); + } + mnt_userns = mnt_user_ns(path->mnt); if (inode->i_op->getattr) return inode->i_op->getattr(mnt_userns, path, stat, @@ -587,9 +593,11 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) memset(&tmp, 0, sizeof(tmp)); - tmp.stx_mask = stat->result_mask; + /* STATX_VERSION is kernel-only for now */ + tmp.stx_mask = stat->result_mask & ~STATX_VERSION; tmp.stx_blksize = stat->blksize; - tmp.stx_attributes = stat->attributes; + /* STATX_ATTR_VERSION_MONOTONIC is kernel-only for now */ + tmp.stx_attributes = stat->attributes & ~STATX_ATTR_VERSION_MONOTONIC; tmp.stx_nlink = stat->nlink; tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); @@ -628,6 +636,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; + /* STATX_VERSION is kernel-only for now. Ignore requests + * from userland. 
+ */ + mask &= ~STATX_VERSION; + error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; diff --git a/include/linux/stat.h b/include/linux/stat.h index ff277ced50e9..4e9428d86a3a 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -52,6 +52,15 @@ struct kstat { u64 mnt_id; u32 dio_mem_align; u32 dio_offset_align; + u64 version; }; +/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ + +/* mask values */ +#define STATX_VERSION 0x40000000U /* Want/got stx_change_attr */ + +/* file attribute values */ +#define STATX_ATTR_VERSION_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ + #endif From patchwork Mon Oct 17 10:57:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8ADA1C4332F for ; Mon, 17 Oct 2022 10:58:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231314AbiJQK6w (ORCPT ); Mon, 17 Oct 2022 06:58:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230349AbiJQK52 (ORCPT ); Mon, 17 Oct 2022 06:57:28 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1CB0061B0E; Mon, 17 Oct 2022 03:57:24 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A3FCEB812AC; Mon, 17 Oct 2022 10:57:23 +0000 (UTC) Received: by smtp.kernel.org 
(Postfix) with ESMTPSA id 7664FC433C1; Mon, 17 Oct 2022 10:57:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004242; bh=pekK2QX2KgEQVYMZRaffG2V7y/r770DCCIDCn7vh39M=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T3Ky4C9q3L69subjakIcY7GefggWpp4am8aahclTk9o5/rR2lV9HA6L8t1/sxry5+ ZJ0se7Xs9NyKsNFZdGHo+8LrlfexywLrAEeOiRCTHHngmUUmRF/C4MR3ldr/WslKGc vO72WVCpmN1rI4m6TTd8nCdu6UF7R+/nmoDnhSBMcNwGGOTwgZVwa2Xx45WJoHuP08 j0aCxtYUFeg4TIFVHmzBxwwCzwe8q5M6sIbD4+2lWzYdKXwESDj/UsxKMxrd2AZrcS F9KuS57VqwxPG0acFwpX4R9uC1YwgBlkg5Eu0+Qdzjm5s18tz2x0+AsK3SJ5PtGkpA mXF0mCfXaMaAQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 4/9] nfs: report the inode version in getattr if requested Date: Mon, 17 Oct 2022 06:57:04 -0400 Message-Id: <20221017105709.10830-5-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow NFS to report the i_version in getattr requests. Since the cost to fetch it is relatively cheap, do it unconditionally and just set the flag if it looks like it's valid. Also, conditionally enable the MONOTONIC flag when the server reports its change attr type as such. 
Reviewed-by: NeilBrown Signed-off-by: Jeff Layton --- fs/nfs/inode.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index bea7c005119c..7ed1b4c9260a 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -830,6 +830,8 @@ static u32 nfs_get_valid_attrmask(struct inode *inode) reply_mask |= STATX_UID | STATX_GID; if (!(cache_validity & NFS_INO_INVALID_BLOCKS)) reply_mask |= STATX_BLOCKS; + if (!(cache_validity & NFS_INO_INVALID_CHANGE)) + reply_mask |= STATX_VERSION; return reply_mask; } @@ -848,7 +850,8 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, request_mask &= STATX_TYPE | STATX_MODE | STATX_NLINK | STATX_UID | STATX_GID | STATX_ATIME | STATX_MTIME | STATX_CTIME | - STATX_INO | STATX_SIZE | STATX_BLOCKS; + STATX_INO | STATX_SIZE | STATX_BLOCKS | STATX_BTIME | + STATX_VERSION; if ((query_flags & AT_STATX_DONT_SYNC) && !force_sync) { if (readdirplus_enabled) @@ -856,8 +859,8 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, goto out_no_revalidate; } - /* Flush out writes to the server in order to update c/mtime. */ - if ((request_mask & (STATX_CTIME | STATX_MTIME)) && + /* Flush out writes to the server in order to update c/mtime/version. */ + if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_VERSION)) && S_ISREG(inode->i_mode)) filemap_write_and_wait(inode->i_mapping); @@ -877,7 +880,7 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, /* Is the user requesting attributes that might need revalidation? 
*/ if (!(request_mask & (STATX_MODE|STATX_NLINK|STATX_ATIME|STATX_CTIME| STATX_MTIME|STATX_UID|STATX_GID| - STATX_SIZE|STATX_BLOCKS))) + STATX_SIZE|STATX_BLOCKS|STATX_VERSION))) goto out_no_revalidate; /* Check whether the cached attributes are stale */ @@ -915,6 +918,10 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path, generic_fillattr(&init_user_ns, inode, stat); stat->ino = nfs_compat_user_ino64(NFS_FILEID(inode)); + stat->version = inode_peek_iversion_raw(inode); + stat->attributes_mask |= STATX_ATTR_VERSION_MONOTONIC; + if (server->change_attr_type != NFS4_CHANGE_TYPE_IS_UNDEFINED) + stat->attributes |= STATX_ATTR_VERSION_MONOTONIC; if (S_ISDIR(inode->i_mode)) stat->blksize = NFS_SERVER(inode)->dtsize; out: From patchwork Mon Oct 17 10:57:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008599 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 344FCC43219 for ; Mon, 17 Oct 2022 10:58:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231128AbiJQK6S (ORCPT ); Mon, 17 Oct 2022 06:58:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44472 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230469AbiJQK5a (ORCPT ); Mon, 17 Oct 2022 06:57:30 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BE0BB61B2D; Mon, 17 Oct 2022 03:57:28 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 945B0CE12AE; Mon, 17 Oct 
2022 10:57:26 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A8BA6C433D7; Mon, 17 Oct 2022 10:57:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004244; bh=xnrewwvldKIjoeh5ebiZu+onJb8qn6gj1nIuV6NIss8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bh5nTooKpjTJrP2pLkH49auOau9LSXv/ChZHfvENHLq5PdUL9u5sbQlNrDoOn6YMZ WieShPVHZCnZcKf+VDyEgIvtuU6KAlxhaWVpT76Z9ZzS9/oxmULozBKfr6FO16xefe SaJQTNLrEXX7EVZKB+K81f0SFCkXq+N/XuctzrgSZuvqymSzlXk+I63YEe8Nhdka/j PqtAo0CWb2fIZbyv/ddnTp8GvYNSUXZGssJNU7bn9yBifArha9ZucJCQ6Nkpeyoh4C OkU27OjjcdtmBAAuKhaAopo/ZzG0deuEkPTGjhEgM+SWTLNCXxr3trSWtODFA7bBXN qDAkOQ/0TBbbw== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 5/9] ceph: report the inode version in getattr if requested Date: Mon, 17 Oct 2022 06:57:05 -0400 Message-Id: <20221017105709.10830-6-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org

When getattr requests STATX_VERSION, request the full gamut of caps (similarly to how ctime is handled). When the change attribute seems to be valid, return it in the version field and set the flag in the reply mask. Also, unconditionally enable STATX_ATTR_VERSION_MONOTONIC.
Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 42351d7a0dd6..bcab855bf1ae 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -2415,10 +2415,10 @@ static int statx_to_caps(u32 want, umode_t mode) { int mask = 0; - if (want & (STATX_MODE|STATX_UID|STATX_GID|STATX_CTIME|STATX_BTIME)) + if (want & (STATX_MODE|STATX_UID|STATX_GID|STATX_CTIME|STATX_BTIME|STATX_VERSION)) mask |= CEPH_CAP_AUTH_SHARED; - if (want & (STATX_NLINK|STATX_CTIME)) { + if (want & (STATX_NLINK|STATX_CTIME|STATX_VERSION)) { /* * The link count for directories depends on inode->i_subdirs, * and that is only updated when Fs caps are held. @@ -2429,11 +2429,10 @@ static int statx_to_caps(u32 want, umode_t mode) mask |= CEPH_CAP_LINK_SHARED; } - if (want & (STATX_ATIME|STATX_MTIME|STATX_CTIME|STATX_SIZE| - STATX_BLOCKS)) + if (want & (STATX_ATIME|STATX_MTIME|STATX_CTIME|STATX_SIZE|STATX_BLOCKS|STATX_VERSION)) mask |= CEPH_CAP_FILE_SHARED; - if (want & (STATX_CTIME)) + if (want & (STATX_CTIME|STATX_VERSION)) mask |= CEPH_CAP_XATTR_SHARED; return mask; @@ -2475,6 +2474,11 @@ int ceph_getattr(struct user_namespace *mnt_userns, const struct path *path, valid_mask |= STATX_BTIME; } + if (request_mask & STATX_VERSION) { + stat->version = inode_peek_iversion_raw(inode); + valid_mask |= STATX_VERSION; + } + if (ceph_snap(inode) == CEPH_NOSNAP) stat->dev = inode->i_sb->s_dev; else @@ -2498,6 +2502,8 @@ int ceph_getattr(struct user_namespace *mnt_userns, const struct path *path, stat->nlink = 1 + 1 + ci->i_subdirs; } + stat->attributes_mask |= STATX_ATTR_VERSION_MONOTONIC; + stat->attributes |= STATX_ATTR_VERSION_MONOTONIC; stat->result_mask = request_mask & valid_mask; return err; } From patchwork Mon Oct 17 10:57:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 
13008600 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BC5C0C43219 for ; Mon, 17 Oct 2022 10:58:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231308AbiJQK6s (ORCPT ); Mon, 17 Oct 2022 06:58:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230464AbiJQK5a (ORCPT ); Mon, 17 Oct 2022 06:57:30 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A770D61708; Mon, 17 Oct 2022 03:57:28 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6E0D86105A; Mon, 17 Oct 2022 10:57:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DB448C433D6; Mon, 17 Oct 2022 10:57:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004246; bh=LBaEFSh5DOeyzDGsOyxhMwQpssm5JK6u2HESg5ybNII=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GcdT6b7mb+oJp742oGeuVIMOfw8oEZ0enQ18wg8Ay/lNdbFfxNrtDJeG5LwPho5dD mjrEEtQNE5tTDsZ3XJBuv/1zzQ2QeaSuOG/cCyWcsUY7eJIEeSErX+E+vAIFJQsC55 N0s4+1Hq8cIO6nsNFLSxEB+742ItXHY3Kedw1srG/T4sW8IkT4hPE77/++e78jtlwu /K5Vs5K3g2tUySh+SylFTQGTzwZWsWsW8g3y1rdekQXDShurEUkxEr/O+y+NwuXVZ7 V0SmxQbcJ+ulid2bOfa6XXx920Zn48b/9WwRFyMGnJFHPxLoEn+tBpw6zmF+eqKvTC tpSNnThfThorQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, 
lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 6/9] nfsd: move nfsd4_change_attribute to nfsfh.c Date: Mon, 17 Oct 2022 06:57:06 -0400 Message-Id: <20221017105709.10830-7-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org This is a pretty big function for inlining. Move it to being non-inlined. Reviewed-by: NeilBrown Signed-off-by: Jeff Layton Acked-by: Chuck Lever --- fs/nfsd/nfsfh.c | 27 +++++++++++++++++++++++++++ fs/nfsd/nfsfh.h | 29 +---------------------------- 2 files changed, 28 insertions(+), 28 deletions(-) diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index d73434200df9..7030d9209903 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -748,3 +748,30 @@ enum fsid_source fsid_source(const struct svc_fh *fhp) return FSIDSOURCE_UUID; return FSIDSOURCE_DEV; } + +/* + * We could use i_version alone as the change attribute. However, + * i_version can go backwards after a reboot. On its own that doesn't + * necessarily cause a problem, but if i_version goes backwards and then + * is incremented again it could reuse a value that was previously used + * before boot, and a client who queried the two values might + * incorrectly assume nothing changed. + * + * By using both ctime and the i_version counter we guarantee that as + * long as time doesn't go backwards we never reuse an old value. 
+ */ +u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode) +{ + if (inode->i_sb->s_export_op->fetch_iversion) + return inode->i_sb->s_export_op->fetch_iversion(inode); + else if (IS_I_VERSION(inode)) { + u64 chattr; + + chattr = stat->ctime.tv_sec; + chattr <<= 30; + chattr += stat->ctime.tv_nsec; + chattr += inode_query_iversion(inode); + return chattr; + } else + return time_to_chattr(&stat->ctime); +} diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index c3ae6414fc5c..4c223a7a91d4 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -291,34 +291,7 @@ static inline void fh_clear_pre_post_attrs(struct svc_fh *fhp) fhp->fh_pre_saved = false; } -/* - * We could use i_version alone as the change attribute. However, - * i_version can go backwards after a reboot. On its own that doesn't - * necessarily cause a problem, but if i_version goes backwards and then - * is incremented again it could reuse a value that was previously used - * before boot, and a client who queried the two values might - * incorrectly assume nothing changed. - * - * By using both ctime and the i_version counter we guarantee that as - * long as time doesn't go backwards we never reuse an old value. 
- */ -static inline u64 nfsd4_change_attribute(struct kstat *stat, - struct inode *inode) -{ - if (inode->i_sb->s_export_op->fetch_iversion) - return inode->i_sb->s_export_op->fetch_iversion(inode); - else if (IS_I_VERSION(inode)) { - u64 chattr; - - chattr = stat->ctime.tv_sec; - chattr <<= 30; - chattr += stat->ctime.tv_nsec; - chattr += inode_query_iversion(inode); - return chattr; - } else - return time_to_chattr(&stat->ctime); -} - +u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode); extern void fh_fill_pre_attrs(struct svc_fh *fhp); extern void fh_fill_post_attrs(struct svc_fh *fhp); extern void fh_fill_both_attrs(struct svc_fh *fhp); From patchwork Mon Oct 17 10:57:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008598 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F1DF2C43217 for ; Mon, 17 Oct 2022 10:58:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230380AbiJQK6R (ORCPT ); Mon, 17 Oct 2022 06:58:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45422 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230339AbiJQK5z (ORCPT ); Mon, 17 Oct 2022 06:57:55 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 218B361D4C; Mon, 17 Oct 2022 03:57:31 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 3ABDFB812AC; Mon, 17 Oct 2022 10:57:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
18FBEC433D7; Mon, 17 Oct 2022 10:57:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004249; bh=ETm0SfdogARXRGGJXl+FyttQWaCVWogaX3LXDgrfYj0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=sIvU0EyCOy6DbQPAmG4LnwHl9cRWX/adB+EWpr5MTgOi6HJ6qeYLhbROlgZqOxc9y yG2lImQEsERZI0jAlukm5WsIHCI0j4uJEpO2tOSo6yezl/SAKUR4AYSXn/IXT6zEe1 4Pijv+YSfHHocsBOHKBTsauMGovbSGYWZFpovSPQUztsR+qekfuuoO/hvW3II1kXOo +qehy8Al5tmwnNIoBVkeZ+kh2G82Zw7BZKSQqxsvZWhmDvxOdtnU2yJRHVKmv5175/ q+4ECQ+cUNfHzulE+Ge79dPHx9QJHQAqAjItxI10GzfWOh1HSz5W75OopaxbyYY4N7 UewcH/P4Dhubg== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 7/9] nfsd: use the getattr operation to fetch i_version Date: Mon, 17 Oct 2022 06:57:07 -0400 Message-Id: <20221017105709.10830-8-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Now that we can call into vfs_getattr to get the i_version field, use that facility to fetch it instead of doing it in nfsd4_change_attribute. Neil also pointed out recently that IS_I_VERSION directory operations are always logged, and so we only need to mitigate the rollback problem on regular files. Also, we don't need to factor in the ctime when reexporting NFS or Ceph. 
Set the STATX_VERSION (and BTIME) bits in the request when we're dealing with a v4 request. Then, instead of looking at IS_I_VERSION when generating the change attr, look at the result mask and only use it if STATX_VERSION is set. Change nfsd4_change_attribute to only factor in the ctime if it's a regular file and the fs doesn't advertise STATX_ATTR_VERSION_MONOTONIC. Reviewed-by: NeilBrown Signed-off-by: Jeff Layton Acked-by: Chuck Lever --- fs/nfsd/nfs4xdr.c | 4 +++- fs/nfsd/nfsfh.c | 53 +++++++++++++++++++++++++++++++---------------- fs/nfsd/vfs.h | 7 ++++++- 3 files changed, 44 insertions(+), 20 deletions(-) diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index bcfeb1a922c0..c19b6b00b620 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -2906,7 +2906,9 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, goto out; } - err = vfs_getattr(&path, &stat, STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT); + err = vfs_getattr(&path, &stat, + STATX_BASIC_STATS | STATX_BTIME | STATX_VERSION, + AT_STATX_SYNC_AS_STAT); if (err) goto out_nfserr; if (!(stat.result_mask & STATX_BTIME)) diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 7030d9209903..21b64ac97a06 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -628,6 +628,10 @@ void fh_fill_pre_attrs(struct svc_fh *fhp) stat.mtime = inode->i_mtime; stat.ctime = inode->i_ctime; stat.size = inode->i_size; + if (v4 && IS_I_VERSION(inode)) { + stat.version = inode_query_iversion(inode); + stat.result_mask |= STATX_VERSION; + } } if (v4) fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode); @@ -659,6 +663,10 @@ void fh_fill_post_attrs(struct svc_fh *fhp) if (err) { fhp->fh_post_saved = false; fhp->fh_post_attr.ctime = inode->i_ctime; + if (v4 && IS_I_VERSION(inode)) { + fhp->fh_post_attr.version = inode_query_iversion(inode); + fhp->fh_post_attr.result_mask |= STATX_VERSION; + } } else fhp->fh_post_saved = true; if (v4) @@ -750,28 +758,37 @@ enum fsid_source fsid_source(const struct svc_fh *fhp) } /* - 
* We could use i_version alone as the change attribute. However, - * i_version can go backwards after a reboot. On its own that doesn't - * necessarily cause a problem, but if i_version goes backwards and then - * is incremented again it could reuse a value that was previously used - * before boot, and a client who queried the two values might - * incorrectly assume nothing changed. + * We could use i_version alone as the change attribute. However, i_version + * can go backwards on a regular file after an unclean shutdown. On its own + * that doesn't necessarily cause a problem, but if i_version goes backwards + * and then is incremented again it could reuse a value that was previously + * used before boot, and a client who queried the two values might incorrectly + * assume nothing changed. + * + * By using both ctime and the i_version counter we guarantee that as long as + * time doesn't go backwards we never reuse an old value. If the filesystem + * advertises STATX_ATTR_VERSION_MONOTONIC, then this mitigation is not needed. * - * By using both ctime and the i_version counter we guarantee that as - * long as time doesn't go backwards we never reuse an old value. + * We only need to do this for regular files as well. For directories, we + * assume that the new change attr is always logged to stable storage in some + * fashion before the results can be seen. 
*/ u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode) { + u64 chattr; + if (inode->i_sb->s_export_op->fetch_iversion) return inode->i_sb->s_export_op->fetch_iversion(inode); - else if (IS_I_VERSION(inode)) { - u64 chattr; - - chattr = stat->ctime.tv_sec; - chattr <<= 30; - chattr += stat->ctime.tv_nsec; - chattr += inode_query_iversion(inode); - return chattr; - } else - return time_to_chattr(&stat->ctime); + if (stat->result_mask & STATX_VERSION) { + chattr = stat->version; + + if (S_ISREG(inode->i_mode) && + !(stat->attributes & STATX_ATTR_VERSION_MONOTONIC)) { + chattr += (u64)stat->ctime.tv_sec << 30; + chattr += stat->ctime.tv_nsec; + } + } else { + chattr = time_to_chattr(&stat->ctime); + } + return chattr; } diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index 120521bc7b24..c98e13ec37b2 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -168,9 +168,14 @@ static inline void fh_drop_write(struct svc_fh *fh) static inline __be32 fh_getattr(const struct svc_fh *fh, struct kstat *stat) { + u32 request_mask = STATX_BASIC_STATS; struct path p = {.mnt = fh->fh_export->ex_path.mnt, .dentry = fh->fh_dentry}; - return nfserrno(vfs_getattr(&p, stat, STATX_BASIC_STATS, + + if (fh->fh_maxsize == NFS4_FHSIZE) + request_mask |= (STATX_BTIME | STATX_VERSION); + + return nfserrno(vfs_getattr(&p, stat, request_mask, AT_STATX_SYNC_AS_STAT)); } From patchwork Mon Oct 17 10:57:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008602 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45256C43219 for ; Mon, 17 Oct 2022 10:58:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231326AbiJQK64 (ORCPT ); Mon, 17 Oct 2022 06:58:56 -0400 Received: from 
lindbergh.monkeyblade.net ([23.128.96.19]:44638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230433AbiJQK6J (ORCPT ); Mon, 17 Oct 2022 06:58:09 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9C9E761135; Mon, 17 Oct 2022 03:57:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 87ED8B81338; Mon, 17 Oct 2022 10:57:32 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4C0C6C43142; Mon, 17 Oct 2022 10:57:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004251; bh=IFSEjIJ9849haCXwjIMjlrBQFkMr+ez73GPRlwR8RA8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TTvGFNOCk9adYVcWoJdmaLqoLOr9mZe4vfjD2CR+bk5kFt3mQr3TRyZ9VeVqw3vWc 75uB9/qN74O3l06iOrAPwNi90tSGXI7QwycTpEVgKJE2n6fAk97o44zi+Y//SeCXgr F2SAomQaWPaJgfikV8Apjj9q+0URmKaViRA95YLqHzvGFiQgado/Mih8OM9wnOjY4X MUs8KDvXsb9qML6ZaHAFwfy4GW5sDqLZsvK8J+3b0gOBfZSGjEWolSsNVbuWbKcIm5 9mdCKS4IEoiCpphQqhMigKXqGaXwOLLFn9K6H/3hpF2G680esQ+7AyKwnn9kzXxLYS XPQhxSOBNuVUw== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v7 8/9] nfsd: remove fetch_iversion export operation Date: Mon, 17 Oct 2022 06:57:08 -0400 Message-Id: 
<20221017105709.10830-9-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Now that the i_version counter is reported in struct kstat, there is no need for this export operation. Reviewed-by: NeilBrown Signed-off-by: Jeff Layton Acked-by: Chuck Lever --- fs/nfs/export.c | 7 ------- fs/nfsd/nfsfh.c | 2 -- include/linux/exportfs.h | 1 - 3 files changed, 10 deletions(-) diff --git a/fs/nfs/export.c b/fs/nfs/export.c index 01596f2d0a1e..1a9d5aa51dfb 100644 --- a/fs/nfs/export.c +++ b/fs/nfs/export.c @@ -145,17 +145,10 @@ nfs_get_parent(struct dentry *dentry) return parent; } -static u64 nfs_fetch_iversion(struct inode *inode) -{ - nfs_revalidate_inode(inode, NFS_INO_INVALID_CHANGE); - return inode_peek_iversion_raw(inode); -} - const struct export_operations nfs_export_ops = { .encode_fh = nfs_encode_fh, .fh_to_dentry = nfs_fh_to_dentry, .get_parent = nfs_get_parent, - .fetch_iversion = nfs_fetch_iversion, .flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK| EXPORT_OP_CLOSE_BEFORE_UNLINK|EXPORT_OP_REMOTE_FS| EXPORT_OP_NOATOMIC_ATTR, diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index 21b64ac97a06..9c1f697ffc72 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -777,8 +777,6 @@ u64 nfsd4_change_attribute(struct kstat *stat, struct inode *inode) { u64 chattr; - if (inode->i_sb->s_export_op->fetch_iversion) - return inode->i_sb->s_export_op->fetch_iversion(inode); if (stat->result_mask & STATX_VERSION) { chattr = stat->version; diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h index fe848901fcc3..9f4d4bcbf251 100644 --- a/include/linux/exportfs.h +++ b/include/linux/exportfs.h @@ -213,7 +213,6 @@ struct export_operations { bool write, u32 *device_generation); int (*commit_blocks)(struct inode *inode, struct iomap *iomaps, int nr_iomaps, struct iattr *iattr); - 
u64 (*fetch_iversion)(struct inode *); #define EXPORT_OP_NOWCC (0x1) /* don't collect v3 wcc data */ #define EXPORT_OP_NOSUBTREECHK (0x2) /* no subtree checking */ #define EXPORT_OP_CLOSE_BEFORE_UNLINK (0x4) /* close files before unlink */ From patchwork Mon Oct 17 10:57:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 13008603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9445BC433FE for ; Mon, 17 Oct 2022 10:59:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231339AbiJQK65 (ORCPT ); Mon, 17 Oct 2022 06:58:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230496AbiJQK6M (ORCPT ); Mon, 17 Oct 2022 06:58:12 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EDFF561D90; Mon, 17 Oct 2022 03:57:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id BEABFB80CAD; Mon, 17 Oct 2022 10:57:34 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 81DDCC43144; Mon, 17 Oct 2022 10:57:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1666004253; bh=1oukB3rP9Xx+gRVJOYRNIFNTZFkHQH6wKGEOtYxV1lA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZLtIENBTLEh//XZqkkXjggNWvOQ0CZ0QaZWzgsuYdEyU3UP/cx9/TlAQlifA9F36r 4nLQqITN8SK330HszRafccpu3ehpNbV9qYYEwfR0oSGtX+1QjyfU/LxMjIFqVNNyeR 
iYRtAg/ue7BmO835kRO/mISgk/TR+rYT0pw18WRhbtZNMvmjj9ebAKM5Xw2NAReneU +uj6wUVpXiE2Xhxx21KWQDtTpIVnVHG73v8IPF2oB96fMm181Freq+CJQ5lWFNQQnF 8dCUSDhAvPPs5ESyFIxgKcxny9G4226aZPHn36wqrCZgZJhG6wiHKUhAk2VCDzmyq6 W+NlN4cyC7YdQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Jeff Layton Subject: [RFC PATCH v7 9/9] vfs: expose STATX_VERSION to userland Date: Mon, 17 Oct 2022 06:57:09 -0400 Message-Id: <20221017105709.10830-10-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221017105709.10830-1-jlayton@kernel.org> References: <20221017105709.10830-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Jeff Layton Claim one of the spare fields in struct statx to hold a 64-bit inode version attribute. When userland requests STATX_VERSION, copy the value from the kstat struct there, and stop masking off STATX_ATTR_VERSION_MONOTONIC. Update the test-statx sample program to output the change attr and MountId. Reviewed-by: NeilBrown Signed-off-by: Jeff Layton --- fs/stat.c | 12 +++--------- include/linux/stat.h | 9 --------- include/uapi/linux/stat.h | 6 ++++-- samples/vfs/test-statx.c | 8 ++++++-- 4 files changed, 13 insertions(+), 22 deletions(-) Posting this as an RFC as we're still trying to sort out what semantics we want to present to userland. In particular, this patch leaves the problem of crash resilience to userland applications on filesystems that don't report as MONOTONIC.
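For illustration only, here is a minimal sketch of what that userland handling might look like, assuming an application simply mirrors the ctime mitigation that nfsd4_change_attribute applies in patch 7/9. The struct and function names (ver_sample, change_cookie) are hypothetical and are not part of this series; the fields stand in for values a caller would have copied out of struct statx as proposed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for values copied out of struct statx. */
struct ver_sample {
	uint64_t version;    /* stx_version, as proposed in this RFC */
	int64_t  ctime_sec;  /* stx_ctime.tv_sec */
	uint32_t ctime_nsec; /* stx_ctime.tv_nsec */
	bool     is_regular; /* S_ISREG(stx_mode) */
	bool     monotonic;  /* STATX_ATTR_VERSION_MONOTONIC was set */
};

/*
 * Fold ctime into the version only for regular files on filesystems
 * that do not advertise a monotonic version, as nfsd does in 7/9.
 * Otherwise return the raw version unchanged.
 */
static uint64_t change_cookie(const struct ver_sample *s)
{
	uint64_t cookie = s->version;

	if (s->is_regular && !s->monotonic) {
		cookie += (uint64_t)s->ctime_sec << 30;
		cookie += s->ctime_nsec;
	}
	return cookie;
}
```

As with the nfsd version, this only avoids reusing an old value as long as time does not go backwards, so it is a mitigation rather than a guarantee.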
Trond is of the opinion that monotonicity is a hard requirement, and that we should not allow filesystems that can't provide that quality to report STATX_VERSION at all. His rationale is that one of the main uses for this is for backup applications, and for those a counter that could go backward is worse than useless. I don't have strong feelings either way, but if we want that then we will not be able to offload the crash counter handling to userland. Thoughts? diff --git a/fs/stat.c b/fs/stat.c index e7f8cd4b24e1..8396c372022f 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -593,11 +593,9 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) memset(&tmp, 0, sizeof(tmp)); - /* STATX_VERSION is kernel-only for now */ - tmp.stx_mask = stat->result_mask & ~STATX_VERSION; + tmp.stx_mask = stat->result_mask; tmp.stx_blksize = stat->blksize; - /* STATX_ATTR_VERSION_MONOTONIC is kernel-only for now */ - tmp.stx_attributes = stat->attributes & ~STATX_ATTR_VERSION_MONOTONIC; + tmp.stx_attributes = stat->attributes; tmp.stx_nlink = stat->nlink; tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); @@ -621,6 +619,7 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) tmp.stx_mnt_id = stat->mnt_id; tmp.stx_dio_mem_align = stat->dio_mem_align; tmp.stx_dio_offset_align = stat->dio_offset_align; + tmp.stx_version = stat->version; return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0; } @@ -636,11 +635,6 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; - /* STATX_VERSION is kernel-only for now. Ignore requests - * from userland. 
- */ - mask &= ~STATX_VERSION; - error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; diff --git a/include/linux/stat.h b/include/linux/stat.h index 4e9428d86a3a..69c79e4fd1b1 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -54,13 +54,4 @@ struct kstat { u32 dio_offset_align; u64 version; }; - -/* These definitions are internal to the kernel for now. Mainly used by nfsd. */ - -/* mask values */ -#define STATX_VERSION 0x40000000U /* Want/got stx_change_attr */ - -/* file attribute values */ -#define STATX_ATTR_VERSION_MONOTONIC 0x8000000000000000ULL /* version monotonically increases */ - #endif diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h index 7cab2c65d3d7..4a0a1f27c059 100644 --- a/include/uapi/linux/stat.h +++ b/include/uapi/linux/stat.h @@ -127,7 +127,8 @@ struct statx { __u32 stx_dio_mem_align; /* Memory buffer alignment for direct I/O */ __u32 stx_dio_offset_align; /* File offset alignment for direct I/O */ /* 0xa0 */ - __u64 __spare3[12]; /* Spare space for future expansion */ + __u64 stx_version; /* Inode change attribute */ + __u64 __spare3[11]; /* Spare space for future expansion */ /* 0x100 */ }; @@ -154,6 +155,7 @@ struct statx { #define STATX_BTIME 0x00000800U /* Want/got stx_btime */ #define STATX_MNT_ID 0x00001000U /* Got stx_mnt_id */ #define STATX_DIOALIGN 0x00002000U /* Want/got direct I/O alignment info */ +#define STATX_VERSION 0x00004000U /* Want/got stx_version */ #define STATX__RESERVED 0x80000000U /* Reserved for future struct statx expansion */ @@ -189,6 +191,6 @@ struct statx { #define STATX_ATTR_MOUNT_ROOT 0x00002000 /* Root of a mount */ #define STATX_ATTR_VERITY 0x00100000 /* [I] Verity protected file */ #define STATX_ATTR_DAX 0x00200000 /* File is currently in DAX state */ - +#define STATX_ATTR_VERSION_MONOTONIC 0x00400000 /* stx_version increases w/ every change */ #endif /* _UAPI_LINUX_STAT_H */ diff --git a/samples/vfs/test-statx.c b/samples/vfs/test-statx.c index 
49c7a46cee07..868c9394e038 100644 --- a/samples/vfs/test-statx.c +++ b/samples/vfs/test-statx.c @@ -107,6 +107,8 @@ static void dump_statx(struct statx *stx) printf("Device: %-15s", buffer); if (stx->stx_mask & STATX_INO) printf(" Inode: %-11llu", (unsigned long long) stx->stx_ino); + if (stx->stx_mask & STATX_MNT_ID) + printf(" MountId: %llx", stx->stx_mnt_id); if (stx->stx_mask & STATX_NLINK) printf(" Links: %-5u", stx->stx_nlink); if (stx->stx_mask & STATX_TYPE) { @@ -145,7 +147,9 @@ static void dump_statx(struct statx *stx) if (stx->stx_mask & STATX_CTIME) print_time("Change: ", &stx->stx_ctime); if (stx->stx_mask & STATX_BTIME) - print_time(" Birth: ", &stx->stx_btime); + print_time("Birth: ", &stx->stx_btime); + if (stx->stx_mask & STATX_VERSION) + printf("Inode Version: %llu\n", stx->stx_version); if (stx->stx_attributes_mask) { unsigned char bits, mbits; @@ -218,7 +222,7 @@ int main(int argc, char **argv) struct statx stx; int ret, raw = 0, atflag = AT_SYMLINK_NOFOLLOW; - unsigned int mask = STATX_BASIC_STATS | STATX_BTIME; + unsigned int mask = STATX_BASIC_STATS | STATX_BTIME | STATX_MNT_ID | STATX_VERSION; for (argv++; *argv; argv++) { if (strcmp(*argv, "-F") == 0) {