From patchwork Thu Sep 8 17:24:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeffrey Layton X-Patchwork-Id: 12970410 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25CECC6FA86 for ; Thu, 8 Sep 2022 17:25:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231709AbiIHRZJ (ORCPT ); Thu, 8 Sep 2022 13:25:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55516 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229799AbiIHRZB (ORCPT ); Thu, 8 Sep 2022 13:25:01 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C1AA34F398; Thu, 8 Sep 2022 10:24:56 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 740B9B821D4; Thu, 8 Sep 2022 17:24:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2C2E1C433D7; Thu, 8 Sep 2022 17:24:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1662657894; bh=5RwWtZAQsV8rQ6WEtGKbKdleHPjSdSyUVZNqRyCE/eM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tRlN0pfUD9xuLABWeALrjLUrDFjn0LMhSECknUUklQFg9zkp/9r8aM/Q7Id3OVPtZ wTynxxgyVyBdBBIjt1V0zEj6mHEZKpJtkOX17Tnk1UvJBHXbgyfKy6vMZIunUN5WDZ YuVXP2BivVgiZk8ygz+0WCZ+jqfH8bxcpCVyh3lmjHGV7YKVGfOUwcNDQ6OqP0kELw j/RVFd8NT0U19s6Dvx+nvvGIKyZNWRX6R5BaIiOX9PhvjOgKJW//Fdk5yYbonDRCiy Eg+sTQkJQS6rBD+xAmVD03xvgEDGHdI+8kmklMO6GU+9MUjxvHl5xGRDSMr/jA94BX RF0cg+hrT8NVA== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, 
djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Colin Walters Subject: [PATCH v5 1/8] iversion: clarify when the i_version counter must be updated Date: Thu, 8 Sep 2022 13:24:41 -0400 Message-Id: <20220908172448.208585-2-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org> References: <20220908172448.208585-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The i_version field in the kernel has had different semantics over the decades, but NFSv4 has certain expectations. Update the comments in iversion.h to describe when the i_version must change. Cc: Colin Walters Cc: NeilBrown Cc: Trond Myklebust Cc: Dave Chinner Link: https://lore.kernel.org/linux-xfs/166086932784.5425.17134712694961326033@noble.neil.brown.name/#t Signed-off-by: Jeff Layton --- include/linux/iversion.h | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/include/linux/iversion.h b/include/linux/iversion.h index 3bfebde5a1a6..0555a3851dbf 100644 --- a/include/linux/iversion.h +++ b/include/linux/iversion.h @@ -9,8 +9,14 @@ * --------------------------- * The change attribute (i_version) is mandated by NFSv4 and is mostly for * knfsd, but is also used for other purposes (e.g. IMA). The i_version must - * appear different to observers if there was a change to the inode's data or - * metadata since it was last queried. + * appear larger to observers if there was an explicit change to the inode's + * data or metadata since it was last queried. 
+ * + * An explicit change is one that would ordinarily result in a change to the + * inode status change time (aka ctime). i_version must appear to change, even + * if the ctime does not (since the whole point is to avoid missing updates due + * to timestamp granularity). If POSIX mandates that the ctime must change due + * to an operation, then the i_version counter must be incremented as well. * * Observers see the i_version as a 64-bit number that never decreases. If it * remains the same since it was last checked, then nothing has changed in the From patchwork Thu Sep 8 17:24:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeffrey Layton X-Patchwork-Id: 12970406 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9437EC6FA91 for ; Thu, 8 Sep 2022 17:25:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231804AbiIHRZL (ORCPT ); Thu, 8 Sep 2022 13:25:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230337AbiIHRZB (ORCPT ); Thu, 8 Sep 2022 13:25:01 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7AFC44F3AF; Thu, 8 Sep 2022 10:24:57 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1795D61DC4; Thu, 8 Sep 2022 17:24:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7D855C433C1; Thu, 8 Sep 2022 17:24:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=kernel.org; s=k20201202; t=1662657896; bh=WFvAjH2w5GVc6nFr7pNbwOG1HEvACl7dXWMHqViQw+4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t5lYUJGXCSDG0moMKIJfyQ2A2Yt+4MF2J7q6c6mbQ4X8DrtiiKies4mYwwNOYKglJ Buek3QIy25r5PP0Lf/65kbnpC+VzrBVqGg+lx+gKQvS8FWtb9tU9fD21RbPJUpHheg S7ugbf45n2gyt/YhAdN0AWRvfLVHnVBtlz67ukrugKuzxxO4+KwVKWaFuCL2b1jWWw G+J2y2ibTvFB+JsCVZEpGSk9z/5Yuhr96eW1oKGkBoQgusweeEkelLgOV6SkiC9Uib U4eOyXxaoR9sl5w1id+pofjHikcNLYZvCc6X4jUpjv1o3dtmW6gDHTY1eJ0dxwNe4q t8/QGSVacu8LQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v5 2/8] ext4: fix i_version handling in ext4 Date: Thu, 8 Sep 2022 13:24:42 -0400 Message-Id: <20220908172448.208585-3-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org> References: <20220908172448.208585-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org ext4 currently updates the i_version counter when the atime is updated during a read. This is less than ideal as it can cause unnecessary cache invalidations with NFSv4 and unnecessary remeasurements for IMA. The increment in ext4_mark_iloc_dirty is also problematic since it can corrupt the i_version counter for ea_inodes. We aren't bumping the file times in ext4_mark_iloc_dirty, so changing the i_version there seems wrong, and is the cause of both problems. 
Remove that callsite and add increments to the setattr, setxattr and ioctl codepaths, at the same times that we update the ctime. The i_version bump that already happens during timestamp updates should take care of the rest. In ext4_move_extents, increment the i_version on both inodes, and also add in missing ctime updates. Cc: Lukas Czerner Reviewed-by: Jan Kara Reviewed-by: Christian Brauner (Microsoft) Signed-off-by: Jeff Layton --- fs/ext4/inode.c | 15 +++++---------- fs/ext4/ioctl.c | 8 ++++++++ fs/ext4/move_extent.c | 8 ++++++++ fs/ext4/xattr.c | 2 ++ 4 files changed, 23 insertions(+), 10 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 2a220be34caa..aa37bce4c541 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5342,6 +5342,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, int error, rc = 0; int orphan = 0; const unsigned int ia_valid = attr->ia_valid; + bool inc_ivers = IS_I_VERSION(inode); if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) return -EIO; @@ -5425,8 +5426,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, return -EINVAL; } - if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size) - inode_inc_iversion(inode); + if (attr->ia_size == inode->i_size) + inc_ivers = false; if (shrink) { if (ext4_should_order_data(inode)) { @@ -5528,6 +5529,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, } if (!error) { + if (inc_ivers) + inode_inc_iversion(inode); setattr_copy(mnt_userns, inode, attr); mark_inode_dirty(inode); } @@ -5731,14 +5734,6 @@ int ext4_mark_iloc_dirty(handle_t *handle, } ext4_fc_track_inode(handle, inode); - /* - * ea_inodes are using i_version for storing reference count, don't - * mess with it - */ - if (IS_I_VERSION(inode) && - !(EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)) - inode_inc_iversion(inode); - /* the do_update_inode consumes one bh->b_count */ get_bh(iloc->bh); diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c 
index 3cf3ec4b1c21..60e77ae9342d 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -452,6 +452,8 @@ static long swap_inode_boot_loader(struct super_block *sb, swap_inode_data(inode, inode_bl); inode->i_ctime = inode_bl->i_ctime = current_time(inode); + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); inode->i_generation = prandom_u32(); inode_bl->i_generation = prandom_u32(); @@ -665,6 +667,8 @@ static int ext4_ioctl_setflags(struct inode *inode, ext4_set_inode_flags(inode, false); inode->i_ctime = current_time(inode); + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); err = ext4_mark_iloc_dirty(handle, inode, &iloc); flags_err: @@ -775,6 +779,8 @@ static int ext4_ioctl_setproject(struct inode *inode, __u32 projid) EXT4_I(inode)->i_projid = kprojid; inode->i_ctime = current_time(inode); + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); out_dirty: rc = ext4_mark_iloc_dirty(handle, inode, &iloc); if (!err) @@ -1257,6 +1263,8 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) err = ext4_reserve_inode_write(handle, inode, &iloc); if (err == 0) { inode->i_ctime = current_time(inode); + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); inode->i_generation = generation; err = ext4_mark_iloc_dirty(handle, inode, &iloc); } diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 701f1d6a217f..d73ab3153218 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -6,6 +6,7 @@ */ #include +#include #include #include #include @@ -683,6 +684,13 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, break; o_start += cur_len; d_start += cur_len; + + orig_inode->i_ctime = current_time(orig_inode); + donor_inode->i_ctime = current_time(donor_inode); + if (IS_I_VERSION(orig_inode)) + inode_inc_iversion(orig_inode); + if (IS_I_VERSION(donor_inode)) + inode_inc_iversion(donor_inode); } *moved_len = o_start - orig_blk; if (*moved_len > len) diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c 
index 533216e80fa2..e975442e4ab2 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2412,6 +2412,8 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index, if (!error) { ext4_xattr_update_super_block(handle, inode->i_sb); inode->i_ctime = current_time(inode); + if (IS_I_VERSION(inode)) + inode_inc_iversion(inode); if (!value) no_expand = 0; error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); From patchwork Thu Sep 8 17:24:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeffrey Layton X-Patchwork-Id: 12970404 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1B417C6FA86 for ; Thu, 8 Sep 2022 17:25:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231281AbiIHRZN (ORCPT ); Thu, 8 Sep 2022 13:25:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231517AbiIHRZD (ORCPT ); Thu, 8 Sep 2022 13:25:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E27214F3BF; Thu, 8 Sep 2022 10:24:59 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7DBE761DC5; Thu, 8 Sep 2022 17:24:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BD545C4314F; Thu, 8 Sep 2022 17:24:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1662657898; bh=5lWMrrwEKsq8jGQhZoDZ9khFF1Vla14D3rHsczyN2CA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=EOs8wFTxO72COnPqemECjhIBmQuv9Z5jVB+zqPUo9anTu9ijoKeStihQw7RTv01Qg mr5jqDJpxWGGgPCWMFBgY07C3B4VHYi5GTatzXiiizHSVUEcf89Cul0BjjHAaF+UxE EUO3uiL34ZVpkNwblybBBl6t9ClTuN3u07mIbJ5r2gkfzRAgoRbG/4MPqpKS0sZRki 3D3L/kBwolBLMFR5fF/oGR/G9QursZqKDZ8OrvJ7I5XfuS211nJbXbPrH9V+/CP5la 4aYVEuj2CyvDywvlTcJV2hT8h5k6Gbm3rkj78vFrMNqs+EHYWTZbUanenbstDQq82C mZYKVFsW+TrXQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Benjamin Coddington , Christoph Hellwig Subject: [PATCH v5 3/8] ext4: unconditionally enable the i_version counter Date: Thu, 8 Sep 2022 13:24:43 -0400 Message-Id: <20220908172448.208585-4-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org> References: <20220908172448.208585-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The original i_version implementation was pretty expensive, requiring a log flush on every change. Because of this, it was gated behind a mount option on ext4 (implemented via the MS_I_VERSION mount option flag). Commit ae5e165d855d (fs: new API for handling inode->i_version) made the i_version counter much less expensive, so there is no longer a performance penalty from enabling it. xfs and btrfs already enable it unconditionally when the on-disk format can support it. Have ext4 ignore the SB_I_VERSION flag, and just enable it unconditionally. While we're in here, remove the handling of Opt_i_version as well since it's due for deprecation anyway. 
Ideally, we'd couple this change with a way to disable the i_version counter (just in case), but the way the iversion mount option was implemented makes that difficult to do. We'd need to add a new mount option altogether or do something with tune2fs. That's probably best left to later patches if it turns out to be needed. Cc: Dave Chinner Cc: Lukas Czerner Cc: Benjamin Coddington Cc: Christoph Hellwig Cc: Darrick J. Wong Signed-off-by: Jeff Layton --- fs/ext4/inode.c | 2 +- fs/ext4/ioctl.c | 12 ++++-------- fs/ext4/move_extent.c | 6 ++---- fs/ext4/super.c | 13 ++++--------- fs/ext4/xattr.c | 3 +-- 5 files changed, 12 insertions(+), 24 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index aa37bce4c541..6ef37269e7c0 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -5342,7 +5342,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, int error, rc = 0; int orphan = 0; const unsigned int ia_valid = attr->ia_valid; - bool inc_ivers = IS_I_VERSION(inode); + bool inc_ivers = true; if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb)))) return -EIO; diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 60e77ae9342d..ad3a294a88eb 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -452,8 +452,7 @@ static long swap_inode_boot_loader(struct super_block *sb, swap_inode_data(inode, inode_bl); inode->i_ctime = inode_bl->i_ctime = current_time(inode); - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); + inode_inc_iversion(inode); inode->i_generation = prandom_u32(); inode_bl->i_generation = prandom_u32(); @@ -667,8 +666,7 @@ static int ext4_ioctl_setflags(struct inode *inode, ext4_set_inode_flags(inode, false); inode->i_ctime = current_time(inode); - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); + inode_inc_iversion(inode); err = ext4_mark_iloc_dirty(handle, inode, &iloc); flags_err: @@ -779,8 +777,7 @@ static int ext4_ioctl_setproject(struct inode *inode, __u32 projid) EXT4_I(inode)->i_projid = kprojid; inode->i_ctime = 
current_time(inode); - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); + inode_inc_iversion(inode); out_dirty: rc = ext4_mark_iloc_dirty(handle, inode, &iloc); if (!err) @@ -1263,8 +1260,7 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) err = ext4_reserve_inode_write(handle, inode, &iloc); if (err == 0) { inode->i_ctime = current_time(inode); - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); + inode_inc_iversion(inode); inode->i_generation = generation; err = ext4_mark_iloc_dirty(handle, inode, &iloc); } diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index d73ab3153218..285700b00d38 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -687,10 +687,8 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, orig_inode->i_ctime = current_time(orig_inode); donor_inode->i_ctime = current_time(donor_inode); - if (IS_I_VERSION(orig_inode)) - inode_inc_iversion(orig_inode); - if (IS_I_VERSION(donor_inode)) - inode_inc_iversion(donor_inode); + inode_inc_iversion(orig_inode); + inode_inc_iversion(donor_inode); } *moved_len = o_start - orig_blk; if (*moved_len > len) diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 9a66abcca1a8..e7cf5361245a 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1585,7 +1585,7 @@ enum { Opt_inlinecrypt, Opt_usrjquota, Opt_grpjquota, Opt_quota, Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err, - Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, + Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never, Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error, Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize, @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = { fsparam_flag ("barrier", Opt_barrier), fsparam_u32 ("barrier", Opt_barrier), fsparam_flag ("nobarrier", Opt_nobarrier), - fsparam_flag ("i_version", Opt_i_version), fsparam_flag ("dax", 
Opt_dax), fsparam_enum ("dax", Opt_dax_type, ext4_param_dax), fsparam_u32 ("stripe", Opt_stripe), @@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param) case Opt_abort: ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED); return 0; - case Opt_i_version: - ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20"); - ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n"); - ctx_set_flags(ctx, SB_I_VERSION); - return 0; case Opt_inlinecrypt: #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT ctx_set_flags(ctx, SB_INLINECRYPT); @@ -2970,8 +2964,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time); if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME) SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time); - if (sb->s_flags & SB_I_VERSION) - SEQ_OPTS_PUTS("i_version"); if (nodefs || sbi->s_stripe) SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe); if (nodefs || EXT4_MOUNT_DATA_FLAGS & @@ -4640,6 +4632,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb) sb->s_flags = (sb->s_flags & ~SB_POSIXACL) | (test_opt(sb, POSIX_ACL) ? 
SB_POSIXACL : 0); + /* i_version is always enabled now */ + sb->s_flags |= SB_I_VERSION; + if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV && (ext4_has_compat_features(sb) || ext4_has_ro_compat_features(sb) || diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index e975442e4ab2..36d6ba7190b6 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -2412,8 +2412,7 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index, if (!error) { ext4_xattr_update_super_block(handle, inode->i_sb); inode->i_ctime = current_time(inode); - if (IS_I_VERSION(inode)) - inode_inc_iversion(inode); + inode_inc_iversion(inode); if (!value) no_expand = 0; error = ext4_mark_iloc_dirty(handle, inode, &is.iloc); From patchwork Thu Sep 8 17:24:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeffrey Layton X-Patchwork-Id: 12970411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D185EC6FA8A for ; Thu, 8 Sep 2022 17:25:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232106AbiIHRZw (ORCPT ); Thu, 8 Sep 2022 13:25:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231522AbiIHRZD (ORCPT ); Thu, 8 Sep 2022 13:25:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 719DB54CAA; Thu, 8 Sep 2022 10:25:02 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D32B661C61; Thu, 8 Sep 2022 17:25:01 +0000 (UTC) Received: 
by smtp.kernel.org (Postfix) with ESMTPSA id 2F165C433D6; Thu, 8 Sep 2022 17:24:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1662657901; bh=UwpZrmq+vN+P0GFsdu64X7LMo1+EIMrukCLbnYcvUpA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dFwRW3oosWundQ5oWMFWA5H3hpqlmGbpV2eiIskY7WQweKj+xUnhmgl+gbpb0sQGe ZXAZU23BKc4FI10aF83CY/ZZFfK5l0gTMlA08eh1MPifKYmz0XgX0/k8AyI7xrZw1M hHhqVkhjDGkKmzB+0tPqTbc8y8wPJeFXyTbjoXZiTytFsRlGA/ep9fu/0hctskcKZj W+ndB0e2ODjjMZ5i3uIjLBLcBFv/0+BE89GfAF0iXD404rpZS1BVRc1H5SwmV+xJ8V peIHz2e1h1WoTjUkNjm1m5k3ItjMSDWBAOd2uxsGpUi9Z2xf3VTrer0BxDfnBzzZQD t7s16EzwBeQVQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, Jeff Layton Subject: [PATCH v5 4/8] vfs: plumb i_version handling into struct kstat Date: Thu, 8 Sep 2022 13:24:44 -0400 Message-Id: <20220908172448.208585-5-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org> References: <20220908172448.208585-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Jeff Layton The NFS server has a lot of special handling for different types of change attribute access, depending on what sort of inode we have. In most cases, it's doing a getattr anyway and then fetching that value after the fact. Rather than do that, add a new STATX_INO_VERSION flag that is a kernel-only symbol (for now). If requested and getattr can implement it, it can fill out this field. 
For IS_I_VERSION inodes, add a generic implementation in vfs_getattr_nosec. Take care to mask STATX_INO_VERSION off in requests from userland and in the result mask. Eventually if we decide to make this available to userland, we can just designate a field for it in struct statx, and move the STATX_INO_VERSION definition to the uapi header. Signed-off-by: Jeff Layton --- fs/stat.c | 14 +++++++++++++- include/linux/stat.h | 4 ++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/fs/stat.c b/fs/stat.c index 9ced8860e0f3..1a9c20ac5090 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -118,6 +119,11 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, stat->attributes_mask |= (STATX_ATTR_AUTOMOUNT | STATX_ATTR_DAX); + if ((request_mask & STATX_INO_VERSION) && IS_I_VERSION(inode)) { + stat->result_mask |= STATX_INO_VERSION; + stat->ino_version = inode_query_iversion(inode); + } + mnt_userns = mnt_user_ns(path->mnt); if (inode->i_op->getattr) return inode->i_op->getattr(mnt_userns, path, stat, @@ -587,7 +593,8 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer) memset(&tmp, 0, sizeof(tmp)); - tmp.stx_mask = stat->result_mask; + /* STATX_INO_VERSION is kernel-only for now */ + tmp.stx_mask = stat->result_mask & ~STATX_INO_VERSION; tmp.stx_blksize = stat->blksize; tmp.stx_attributes = stat->attributes; tmp.stx_nlink = stat->nlink; @@ -626,6 +633,11 @@ int do_statx(int dfd, struct filename *filename, unsigned int flags, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; + /* STATX_INO_VERSION is kernel-only for now. Ignore requests + * from userland. 
+ */ + mask &= ~STATX_INO_VERSION; + error = vfs_statx(dfd, filename, flags, &stat, mask); if (error) return error; diff --git a/include/linux/stat.h b/include/linux/stat.h index 7df06931f25d..d482bbfc1358 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -50,6 +50,10 @@ struct kstat { struct timespec64 btime; /* File creation time */ u64 blocks; u64 mnt_id; + u64 ino_version; }; +/* This definition is internal to the kernel for now. Mainly used by nfsd */ +#define STATX_INO_VERSION 0x40000000U /* Want/got stx_change_attr */ + #endif From patchwork Thu Sep 8 17:24:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeffrey Layton X-Patchwork-Id: 12970408 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97AB6C6FA86 for ; Thu, 8 Sep 2022 17:25:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232114AbiIHRZb (ORCPT ); Thu, 8 Sep 2022 13:25:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231727AbiIHRZJ (ORCPT ); Thu, 8 Sep 2022 13:25:09 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B33EEE09; Thu, 8 Sep 2022 10:25:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CFC7FB821DB; Thu, 8 Sep 2022 17:25:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7FFFFC433D7; Thu, 8 Sep 2022 17:25:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; 
s=k20201202; t=1662657903; bh=/YNPcN4rxFONundBtrDqmkeT83g0yGiQLxW6cbtCzZs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PoXh84IJy2h+PNMrM5x9Knf84XPB0xhoBe5WwW1p0waOAbkwhaFzWLuk+tV8dFiKq /mHeSchZSF+vFafsXNnXHc64od8sRT1goGnykUFJPq+lshswdYAqzSwyxvAhhwUjCl avqBhRiL908xOsnuPDU5cXkFMZM4niUktu1g37OtvrTJPYXsmQ9IbDfD00XZCXCN2S mgHxHvJT3mA3+gbtwRFfhf6hnhnsdOmuX9nK1SnS+xRFKe/RgWDK2UBhuJAHQO3AAf gzkYcaSkRlckHLWt+ahQAj7l1jq62SyeQmQbjpUvLfDPyt29PBj3IPAlVmI7GSglg6 B8DOBxLBiHlAQ== From: Jeff Layton To: tytso@mit.edu, adilger.kernel@dilger.ca, djwong@kernel.org, david@fromorbit.com, trondmy@hammerspace.com, neilb@suse.de, viro@zeniv.linux.org.uk, zohar@linux.ibm.com, xiubli@redhat.com, chuck.lever@oracle.com, lczerner@redhat.com, jack@suse.cz, bfields@fieldses.org, brauner@kernel.org, fweimer@redhat.com Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, ceph-devel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org Subject: [PATCH v5 5/8] nfs: report the inode version in getattr if requested Date: Thu, 8 Sep 2022 13:24:45 -0400 Message-Id: <20220908172448.208585-6-jlayton@kernel.org> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org> References: <20220908172448.208585-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow NFS to report the i_version in getattr requests. Since the cost to fetch it is relatively cheap, do it unconditionally and just set the flag if it looks like it's valid. 
Signed-off-by: Jeff Layton
---
 fs/nfs/inode.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index bea7c005119c..88c732a5c821 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -830,6 +830,8 @@ static u32 nfs_get_valid_attrmask(struct inode *inode)
 		reply_mask |= STATX_UID | STATX_GID;
 	if (!(cache_validity & NFS_INO_INVALID_BLOCKS))
 		reply_mask |= STATX_BLOCKS;
+	if (!(cache_validity & NFS_INO_INVALID_CHANGE))
+		reply_mask |= STATX_INO_VERSION;
 	return reply_mask;
 }
 
@@ -848,7 +850,7 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 
 	request_mask &= STATX_TYPE | STATX_MODE | STATX_NLINK | STATX_UID |
 			STATX_GID | STATX_ATIME | STATX_MTIME | STATX_CTIME |
-			STATX_INO | STATX_SIZE | STATX_BLOCKS;
+			STATX_INO | STATX_SIZE | STATX_BLOCKS | STATX_INO_VERSION;
 
 	if ((query_flags & AT_STATX_DONT_SYNC) && !force_sync) {
 		if (readdirplus_enabled)
@@ -877,7 +879,7 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 	/* Is the user requesting attributes that might need revalidation? */
 	if (!(request_mask & (STATX_MODE|STATX_NLINK|STATX_ATIME|STATX_CTIME|
 					STATX_MTIME|STATX_UID|STATX_GID|
-					STATX_SIZE|STATX_BLOCKS)))
+					STATX_SIZE|STATX_BLOCKS|STATX_INO_VERSION)))
 		goto out_no_revalidate;
 
 	/* Check whether the cached attributes are stale */
@@ -915,6 +917,7 @@ int nfs_getattr(struct user_namespace *mnt_userns, const struct path *path,
 
 	generic_fillattr(&init_user_ns, inode, stat);
 	stat->ino = nfs_compat_user_ino64(NFS_FILEID(inode));
+	stat->ino_version = inode_peek_iversion_raw(inode);
 	if (S_ISDIR(inode->i_mode))
 		stat->blksize = NFS_SERVER(inode)->dtsize;
 out:

From patchwork Thu Sep 8 17:24:46 2022
X-Patchwork-Submitter: Jeffrey Layton
X-Patchwork-Id: 12970407
From: Jeff Layton
Subject: [PATCH v5 6/8] ceph: report the inode version in getattr if requested
Date: Thu, 8 Sep 2022 13:24:46 -0400
Message-Id: <20220908172448.208585-7-jlayton@kernel.org>
In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org>
References: <20220908172448.208585-1-jlayton@kernel.org>

When getattr requests STATX_INO_VERSION, request the full gamut of caps
(similar to how ctime is handled). When the change attribute seems to be
valid, return it in the ino_version field and set the flag in the reply
mask.

Reviewed-by: Xiubo Li
Signed-off-by: Jeff Layton
---
 fs/ceph/inode.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 42351d7a0dd6..ccc926a7dcb0 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2415,10 +2415,10 @@ static int statx_to_caps(u32 want, umode_t mode)
 {
 	int mask = 0;
 
-	if (want & (STATX_MODE|STATX_UID|STATX_GID|STATX_CTIME|STATX_BTIME))
+	if (want & (STATX_MODE|STATX_UID|STATX_GID|STATX_CTIME|STATX_BTIME|STATX_INO_VERSION))
 		mask |= CEPH_CAP_AUTH_SHARED;
 
-	if (want & (STATX_NLINK|STATX_CTIME)) {
+	if (want & (STATX_NLINK|STATX_CTIME|STATX_INO_VERSION)) {
 		/*
 		 * The link count for directories depends on inode->i_subdirs,
 		 * and that is only updated when Fs caps are held.
@@ -2429,11 +2429,10 @@ static int statx_to_caps(u32 want, umode_t mode)
 			mask |= CEPH_CAP_LINK_SHARED;
 	}
 
-	if (want & (STATX_ATIME|STATX_MTIME|STATX_CTIME|STATX_SIZE|
-		    STATX_BLOCKS))
+	if (want & (STATX_ATIME|STATX_MTIME|STATX_CTIME|STATX_SIZE|STATX_BLOCKS|STATX_INO_VERSION))
 		mask |= CEPH_CAP_FILE_SHARED;
 
-	if (want & (STATX_CTIME))
+	if (want & (STATX_CTIME|STATX_INO_VERSION))
 		mask |= CEPH_CAP_XATTR_SHARED;
 
 	return mask;
@@ -2475,6 +2474,11 @@ int ceph_getattr(struct user_namespace *mnt_userns, const struct path *path,
 		valid_mask |= STATX_BTIME;
 	}
 
+	if (request_mask & STATX_INO_VERSION) {
+		stat->ino_version = inode_peek_iversion_raw(inode);
+		valid_mask |= STATX_INO_VERSION;
+	}
+
 	if (ceph_snap(inode) == CEPH_NOSNAP)
 		stat->dev = inode->i_sb->s_dev;
 	else

From patchwork Thu Sep 8 17:24:47 2022
X-Patchwork-Submitter: Jeffrey Layton
X-Patchwork-Id: 12970405
From: Jeff Layton
Subject: [PATCH v5 7/8] nfsd: use the getattr operation to fetch i_version
Date: Thu, 8 Sep 2022 13:24:47 -0400
Message-Id: <20220908172448.208585-8-jlayton@kernel.org>
In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org>
References: <20220908172448.208585-1-jlayton@kernel.org>

Now that we can call into vfs_getattr to get the i_version field, use
that facility to fetch it instead of doing it in nfsd4_change_attribute.

Set the STATX_INO_VERSION (and BTIME) bits in the request when we're
dealing with a v4 request. Then, instead of looking at IS_I_VERSION when
generating the change attr, look at the result mask and only use it if
STATX_INO_VERSION is set.

With this change, we can drop the fetch_iversion export operation as
well.

Signed-off-by: Jeff Layton
---
 fs/nfs/export.c          | 7 -------
 fs/nfsd/nfs4xdr.c        | 4 +++-
 fs/nfsd/nfsfh.c          | 6 ++++++
 fs/nfsd/nfsfh.h          | 9 ++++-----
 fs/nfsd/vfs.h            | 7 ++++++-
 include/linux/exportfs.h | 1 -
 6 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/fs/nfs/export.c b/fs/nfs/export.c
index 01596f2d0a1e..1a9d5aa51dfb 100644
--- a/fs/nfs/export.c
+++ b/fs/nfs/export.c
@@ -145,17 +145,10 @@ nfs_get_parent(struct dentry *dentry)
 	return parent;
 }
 
-static u64 nfs_fetch_iversion(struct inode *inode)
-{
-	nfs_revalidate_inode(inode, NFS_INO_INVALID_CHANGE);
-	return inode_peek_iversion_raw(inode);
-}
-
 const struct export_operations nfs_export_ops = {
 	.encode_fh = nfs_encode_fh,
 	.fh_to_dentry = nfs_fh_to_dentry,
 	.get_parent = nfs_get_parent,
-	.fetch_iversion = nfs_fetch_iversion,
 	.flags = EXPORT_OP_NOWCC|EXPORT_OP_NOSUBTREECHK|
 		EXPORT_OP_CLOSE_BEFORE_UNLINK|EXPORT_OP_REMOTE_FS|
 		EXPORT_OP_NOATOMIC_ATTR,
diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 5980df859c3a..4eec2ce05e7e 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2872,7 +2872,9 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
 			goto out;
 	}
 
-	err = vfs_getattr(&path, &stat, STATX_BASIC_STATS, AT_STATX_SYNC_AS_STAT);
+	err = vfs_getattr(&path, &stat,
+			  STATX_BASIC_STATS | STATX_BTIME | STATX_INO_VERSION,
+			  AT_STATX_SYNC_AS_STAT);
 	if (err)
 		goto out_nfserr;
 	if (!(stat.result_mask & STATX_BTIME))
diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
index a5b71526cee0..798f5d8d2055 100644
--- a/fs/nfsd/nfsfh.c
+++ b/fs/nfsd/nfsfh.c
@@ -634,6 +634,10 @@ void fh_fill_pre_attrs(struct svc_fh *fhp)
 		stat.mtime = inode->i_mtime;
 		stat.ctime = inode->i_ctime;
 		stat.size = inode->i_size;
+		if (v4 && IS_I_VERSION(inode)) {
+			stat.ino_version = inode_query_iversion(inode);
+			stat.result_mask |= STATX_INO_VERSION;
+		}
 	}
 	if (v4)
 		fhp->fh_pre_change = nfsd4_change_attribute(&stat, inode);
@@ -665,6 +669,8 @@ void fh_fill_post_attrs(struct svc_fh *fhp)
 	if (err) {
 		fhp->fh_post_saved = false;
 		fhp->fh_post_attr.ctime = inode->i_ctime;
+		if (v4 && IS_I_VERSION(inode))
+			fhp->fh_post_attr.ino_version = inode_query_iversion(inode);
 	} else
 		fhp->fh_post_saved = true;
 	if (v4)
diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h
index c3ae6414fc5c..3786b181d5ce 100644
--- a/fs/nfsd/nfsfh.h
+++ b/fs/nfsd/nfsfh.h
@@ -305,18 +305,17 @@ static inline void fh_clear_pre_post_attrs(struct svc_fh *fhp)
 static inline u64 nfsd4_change_attribute(struct kstat *stat,
 					 struct inode *inode)
 {
-	if (inode->i_sb->s_export_op->fetch_iversion)
-		return inode->i_sb->s_export_op->fetch_iversion(inode);
-	else if (IS_I_VERSION(inode)) {
+	if (stat->result_mask & STATX_INO_VERSION) {
 		u64 chattr;
 
 		chattr = stat->ctime.tv_sec;
 		chattr <<= 30;
 		chattr += stat->ctime.tv_nsec;
-		chattr += inode_query_iversion(inode);
+		chattr += stat->ino_version;
 		return chattr;
-	} else
+	} else {
 		return time_to_chattr(&stat->ctime);
+	}
 }
 
 extern void fh_fill_pre_attrs(struct svc_fh *fhp);
diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h
index c95cd414b4bb..8a3a5dbde5fa 100644
--- a/fs/nfsd/vfs.h
+++ b/fs/nfsd/vfs.h
@@ -168,9 +168,14 @@ static inline void fh_drop_write(struct svc_fh *fh)
 
 static inline __be32 fh_getattr(const struct svc_fh *fh, struct kstat *stat)
 {
+	u32 request_mask = STATX_BASIC_STATS;
 	struct path p = {.mnt = fh->fh_export->ex_path.mnt,
 			 .dentry = fh->fh_dentry};
-	return nfserrno(vfs_getattr(&p, stat, STATX_BASIC_STATS,
+
+	if (fh->fh_maxsize == NFS4_FHSIZE)
+		request_mask |= (STATX_BTIME | STATX_INO_VERSION);
+
+	return nfserrno(vfs_getattr(&p, stat, request_mask,
 				    AT_STATX_SYNC_AS_STAT));
 }
 
diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
index fe848901fcc3..9f4d4bcbf251 100644
--- a/include/linux/exportfs.h
+++ b/include/linux/exportfs.h
@@ -213,7 +213,6 @@ struct export_operations {
 			     bool write, u32 *device_generation);
 	int (*commit_blocks)(struct inode *inode, struct iomap *iomaps,
 			     int nr_iomaps, struct iattr *iattr);
-	u64 (*fetch_iversion)(struct inode *);
 #define EXPORT_OP_NOWCC (0x1) /* don't collect v3 wcc data */
 #define EXPORT_OP_NOSUBTREECHK (0x2) /* no subtree checking */
 #define EXPORT_OP_CLOSE_BEFORE_UNLINK (0x4) /* close files before unlink */

From patchwork Thu Sep 8 17:24:48 2022
X-Patchwork-Submitter: Jeffrey Layton
X-Patchwork-Id: 12970409
From: Jeff Layton
Subject: [RFC PATCH v5 8/8] nfsd: take inode_lock when querying for NFSv4 GETATTR
Date: Thu, 8 Sep 2022 13:24:48 -0400
Message-Id: <20220908172448.208585-9-jlayton@kernel.org>
In-Reply-To: <20220908172448.208585-1-jlayton@kernel.org>
References: <20220908172448.208585-1-jlayton@kernel.org>

The i_version counter for regular files is updated in update_time, and
that's usually done before copying the data to the pagecache. It's
possible that a reader and writer could race like this:

	reader			writer
	------			------
				i_version++
	read
	getattr
				update page cache

If that happens then the reader may associate the i_version value with
the wrong inode state.

All of the existing filesystems that implement i_version take the
i_rwsem in their write_iter operations before incrementing it. Take the
inode_lock when issuing a getattr for NFSv4 attributes to prevent the
above race.

Signed-off-by: Jeff Layton
---
 fs/nfsd/nfs4xdr.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 4eec2ce05e7e..f7951d8d55ca 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -2872,9 +2872,22 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp,
 			goto out;
 	}
 
+	/*
+	 * The inode lock is needed here to ensure that there is not a
+	 * write to the inode in progress that might change the size,
+	 * or an in-progress directory morphing operation for directory
+	 * inodes.
+	 *
+	 * READ and GETATTR are not guaranteed to be atomic, even when in
+	 * the same compound, but we do try to present attributes in the
+	 * GETATTR reply as representing a single point in time.
+	 */
+	inode_lock(d_inode(dentry));
 	err = vfs_getattr(&path, &stat,
 			  STATX_BASIC_STATS | STATX_BTIME | STATX_INO_VERSION,
 			  AT_STATX_SYNC_AS_STAT);
+	inode_unlock(d_inode(dentry));
+
 	if (err)
 		goto out_nfserr;
 	if (!(stat.result_mask & STATX_BTIME))
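
For readers following the series, the arithmetic that patch 7/8 leaves in
nfsd4_change_attribute() can be sketched in userspace C. This is only an
illustration under stated assumptions: struct demo_kstat,
DEMO_STATX_INO_VERSION and demo_change_attribute() are hypothetical
stand-ins, not kernel names, and the mask value is made up. It covers only
the branch taken when the filesystem reports STATX_INO_VERSION in the
result mask; the time_to_chattr() fallback is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the result-mask bit; not the kernel's value. */
#define DEMO_STATX_INO_VERSION 0x1u

/* Hypothetical, trimmed-down stand-in for the fields of struct kstat used here. */
struct demo_kstat {
	uint32_t result_mask;
	int64_t  ctime_sec;
	uint32_t ctime_nsec;
	uint64_t ino_version;
};

/*
 * Mirrors the STATX_INO_VERSION branch of nfsd4_change_attribute() in the
 * diff: chattr = (ctime.tv_sec << 30) + ctime.tv_nsec + ino_version.
 * Returns false when the version was not reported, in which case a caller
 * would need some ctime-only fallback (not modelled here).
 */
static inline bool demo_change_attribute(const struct demo_kstat *st,
					 uint64_t *out)
{
	uint64_t chattr;

	if (!(st->result_mask & DEMO_STATX_INO_VERSION))
		return false;

	/* Nanoseconds must be a valid sub-second value. */
	assert(st->ctime_nsec < 1000000000u);

	chattr = (uint64_t)st->ctime_sec;
	chattr <<= 30;
	chattr += st->ctime_nsec;
	chattr += st->ino_version;
	*out = chattr;
	return true;
}
```

With the mask bit set, a ctime of 2 s / 5 ns and an ino_version of 7, the
helper yields (2 << 30) + 5 + 7; with the bit clear it returns false and
leaves the output untouched.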