From patchwork Fri Jan 10 19:29:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328377 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6B0BD138D for ; Fri, 10 Jan 2020 19:31:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5104620880 for ; Fri, 10 Jan 2020 19:31:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728814AbgAJTbQ (ORCPT ); Fri, 10 Jan 2020 14:31:16 -0500 Received: from mga14.intel.com ([192.55.52.115]:10924 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728566AbgAJT35 (ORCPT ); Fri, 10 Jan 2020 14:29:57 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:29:56 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="272503818" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:29:56 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. 
Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 01/12] fs/stat: Define DAX statx attribute Date: Fri, 10 Jan 2020 11:29:31 -0800 Message-Id: <20200110192942.25021-2-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny In order for users to determine if a file is currently operating in DAX mode (effective DAX), define a statx attribute value and set that attribute if the effective DAX flag is set. To go along with this, we propose the following addition to the statx man page: STATX_ATTR_DAX DAX (CPU direct access) is a file mode that attempts to minimize software cache effects for both I/O and memory mappings of this file. It requires a capable device, a compatible filesystem block size, and filesystem opt-in. It generally assumes all accesses are via CPU load / store instructions, which can minimize overhead for small accesses but adversely affect CPU utilization for large transfers. File I/O is done directly to/from user-space buffers. While the DAX property tends to result in data being transferred synchronously, it does not give the guarantees of synchronous I/O that data and necessary metadata are transferred. Memory mapped I/O may be performed with direct mappings that bypass system memory buffering. Again, while memory-mapped I/O tends to result in data being transferred synchronously, it does not guarantee synchronous metadata updates. A DAX file may optionally support being mapped with the MAP_SYNC flag, which does allow CPU store operations to be considered synchronous modulo CPU cache effects. Signed-off-by: Ira Weiny Reviewed-by: Jan Kara Reviewed-by: Darrick J. 
Wong --- fs/stat.c | 3 +++ include/uapi/linux/stat.h | 1 + 2 files changed, 4 insertions(+) diff --git a/fs/stat.c b/fs/stat.c index 030008796479..894699c74dde 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -79,6 +79,9 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat, if (IS_AUTOMOUNT(inode)) stat->attributes |= STATX_ATTR_AUTOMOUNT; + if (IS_DAX(inode)) + stat->attributes |= STATX_ATTR_DAX; + if (inode->i_op->getattr) return inode->i_op->getattr(path, stat, request_mask, query_flags); diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h index ad80a5c885d5..e5f9d5517f6b 100644 --- a/include/uapi/linux/stat.h +++ b/include/uapi/linux/stat.h @@ -169,6 +169,7 @@ struct statx { #define STATX_ATTR_ENCRYPTED 0x00000800 /* [I] File requires key to decrypt in fs */ #define STATX_ATTR_AUTOMOUNT 0x00001000 /* Dir: Automount trigger */ #define STATX_ATTR_VERITY 0x00100000 /* [I] Verity protected file */ +#define STATX_ATTR_DAX 0x00002000 /* [I] File is DAX */ #endif /* _UAPI_LINUX_STAT_H */ From patchwork Fri Jan 10 19:29:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328333 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 86661138D for ; Fri, 10 Jan 2020 19:30:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6636F2087F for ; Fri, 10 Jan 2020 19:30:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728852AbgAJT37 (ORCPT ); Fri, 10 Jan 2020 14:29:59 -0500 Received: from mga05.intel.com ([192.55.52.43]:59396 "EHLO mga05.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727709AbgAJT36 (ORCPT ); Fri, 10 Jan 2020 14:29:58 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False 
Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:29:58 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="255125193" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:29:58 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 02/12] fs/xfs: Isolate the physical DAX flag from effective Date: Fri, 10 Jan 2020 11:29:32 -0800 Message-Id: <20200110192942.25021-3-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny xfs_ioctl_setattr_dax_invalidate() currently checks if the DAX flag is changing as a quick check. But the implementation mixes the physical (XFS_DIFLAG2_DAX) and effective (S_DAX) DAX flags. Remove the use of the effective flag when determining if a change of the physical flag is required. Signed-off-by: Ira Weiny --- fs/xfs/xfs_ioctl.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index 7b35d62ede9f..fe37708cea8f 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1195,9 +1195,11 @@ xfs_ioctl_setattr_dax_invalidate( } /* If the DAX state is not changing, we have nothing to do here. 
*/ - if ((fa->fsx_xflags & FS_XFLAG_DAX) && IS_DAX(inode)) + if ((fa->fsx_xflags & FS_XFLAG_DAX) && + (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) return 0; - if (!(fa->fsx_xflags & FS_XFLAG_DAX) && !IS_DAX(inode)) + if (!(fa->fsx_xflags & FS_XFLAG_DAX) && + !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) return 0; if (S_ISDIR(inode->i_mode)) From patchwork Fri Jan 10 19:29:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328371 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 574EA930 for ; Fri, 10 Jan 2020 19:31:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3DD922087F for ; Fri, 10 Jan 2020 19:31:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728719AbgAJTbC (ORCPT ); Fri, 10 Jan 2020 14:31:02 -0500 Received: from mga01.intel.com ([192.55.52.88]:21831 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728860AbgAJTaA (ORCPT ); Fri, 10 Jan 2020 14:30:00 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:00 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="216772415" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:29:59 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. 
Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 03/12] fs/xfs: Separate functionality of xfs_inode_supports_dax() Date: Fri, 10 Jan 2020 11:29:33 -0800 Message-Id: <20200110192942.25021-4-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny xfs_inode_supports_dax() should reflect whether the inode can support DAX, not whether DAX is enabled for it. Leave the latter to other helper functions. Change the caller of xfs_inode_supports_dax() to call xfs_inode_use_dax(), which reflects the new logic to override the effective DAX flag with either the mount option or the physical DAX flag. To make the logic clear, create two helper functions, one for the mount flag and one for the physical flag. Signed-off-by: Ira Weiny --- fs/xfs/xfs_iops.c | 32 ++++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 8afe69ca188b..0a0ea90259e9 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -1234,6 +1234,15 @@ static const struct inode_operations xfs_inline_symlink_inode_operations = { .update_time = xfs_vn_update_time, }; +static bool +xfs_inode_mount_is_dax( + struct xfs_inode *ip) +{ + struct xfs_mount *mp = ip->i_mount; + + return (mp->m_flags & XFS_MOUNT_DAX) == XFS_MOUNT_DAX; +} + /* Figure out if this file actually supports DAX. */ static bool xfs_inode_supports_dax( @@ -1245,11 +1254,6 @@ xfs_inode_supports_dax( if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip)) return false; - /* DAX mount option or DAX iflag must be set. 
*/ - if (!(mp->m_flags & XFS_MOUNT_DAX) && - !(ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) - return false; - /* Block size must match page size */ if (mp->m_sb.sb_blocksize != PAGE_SIZE) return false; @@ -1258,6 +1262,22 @@ xfs_inode_supports_dax( return xfs_inode_buftarg(ip)->bt_daxdev != NULL; } +static bool +xfs_inode_is_dax( + struct xfs_inode *ip) +{ + return (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX) == XFS_DIFLAG2_DAX; +} + +static bool +xfs_inode_use_dax( + struct xfs_inode *ip) +{ + return xfs_inode_supports_dax(ip) && + (xfs_inode_mount_is_dax(ip) || + xfs_inode_is_dax(ip)); +} + STATIC void xfs_diflags_to_iflags( struct inode *inode, @@ -1276,7 +1296,7 @@ xfs_diflags_to_iflags( inode->i_flags |= S_SYNC; if (flags & XFS_DIFLAG_NOATIME) inode->i_flags |= S_NOATIME; - if (xfs_inode_supports_dax(ip)) + if (xfs_inode_use_dax(ip)) inode->i_flags |= S_DAX; } From patchwork Fri Jan 10 19:29:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328367 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E9045930 for ; Fri, 10 Jan 2020 19:31:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id CECA9206ED for ; Fri, 10 Jan 2020 19:31:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728893AbgAJTaC (ORCPT ); Fri, 10 Jan 2020 14:30:02 -0500 Received: from mga09.intel.com ([134.134.136.24]:22265 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728883AbgAJTaC (ORCPT ); Fri, 10 Jan 2020 14:30:02 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:01 -0800 
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="212359212" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:01 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 04/12] fs/xfs: Clean up DAX support check Date: Fri, 10 Jan 2020 11:29:34 -0800 Message-Id: <20200110192942.25021-5-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny Rather than open coding xfs_inode_supports_dax() in xfs_ioctl_setattr_dax_invalidate() export xfs_inode_supports_dax() and call it in preparation for swapping dax flags. This also means updating xfs_inode_supports_dax() to return true for a directory. Signed-off-by: Ira Weiny --- fs/xfs/xfs_ioctl.c | 16 +++------------- fs/xfs/xfs_iops.c | 8 ++++++-- fs/xfs/xfs_iops.h | 2 ++ 3 files changed, 11 insertions(+), 15 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index fe37708cea8f..b5e00b67c297 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1176,23 +1176,13 @@ xfs_ioctl_setattr_dax_invalidate( int *join_flags) { struct inode *inode = VFS_I(ip); - struct super_block *sb = inode->i_sb; int error; *join_flags = 0; - /* - * It is only valid to set the DAX flag on regular files and - * directories on filesystems where the block size is equal to the page - * size. 
On directories it serves as an inherited hint so we don't - * have to check the device for dax support or flush pagecache. - */ - if (fa->fsx_xflags & FS_XFLAG_DAX) { - struct xfs_buftarg *target = xfs_inode_buftarg(ip); - - if (!bdev_dax_supported(target->bt_bdev, sb->s_blocksize)) - return -EINVAL; - } + if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX && + !xfs_inode_supports_dax(ip)) + return -EINVAL; /* If the DAX state is not changing, we have nothing to do here. */ if ((fa->fsx_xflags & FS_XFLAG_DAX) && diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c index 0a0ea90259e9..d6843cdb51d0 100644 --- a/fs/xfs/xfs_iops.c +++ b/fs/xfs/xfs_iops.c @@ -1244,14 +1244,18 @@ xfs_inode_mount_is_dax( } /* Figure out if this file actually supports DAX. */ -static bool +bool xfs_inode_supports_dax( struct xfs_inode *ip) { struct xfs_mount *mp = ip->i_mount; /* Only supported on non-reflinked files. */ - if (!S_ISREG(VFS_I(ip)->i_mode) || xfs_is_reflink_inode(ip)) + if (xfs_is_reflink_inode(ip)) + return false; + + /* Only supported on regular files and directories. 
*/ + if (!(S_ISREG(VFS_I(ip)->i_mode) || S_ISDIR(VFS_I(ip)->i_mode))) return false; /* Block size must match page size */ diff --git a/fs/xfs/xfs_iops.h b/fs/xfs/xfs_iops.h index 4d24ff309f59..f24fec8de1d6 100644 --- a/fs/xfs/xfs_iops.h +++ b/fs/xfs/xfs_iops.h @@ -24,4 +24,6 @@ extern int xfs_setattr_nonsize(struct xfs_inode *ip, struct iattr *vap, extern int xfs_vn_setattr_nonsize(struct dentry *dentry, struct iattr *vap); extern int xfs_vn_setattr_size(struct dentry *dentry, struct iattr *vap); +extern bool xfs_inode_supports_dax(struct xfs_inode *ip); + #endif /* __XFS_IOPS_H__ */ From patchwork Fri Jan 10 19:29:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328335 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 95377930 for ; Fri, 10 Jan 2020 19:30:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 7C6092084D for ; Fri, 10 Jan 2020 19:30:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728932AbgAJTaF (ORCPT ); Fri, 10 Jan 2020 14:30:05 -0500 Received: from mga06.intel.com ([134.134.136.31]:52408 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728881AbgAJTaE (ORCPT ); Fri, 10 Jan 2020 14:30:04 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:03 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="223886934" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:02 -0800 From: ira.weiny@intel.com To: 
linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 05/12] fs: remove unneeded IS_DAX() check Date: Fri, 10 Jan 2020 11:29:35 -0800 Message-Id: <20200110192942.25021-6-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny The IS_DAX() check in io_is_direct() causes a race between changing the DAX mode and creating the iocb flags. Remove the check because DAX now emulates the page cache API and therefore it does not matter if the file mode is DAX or not when the iocb flags are created. Signed-off-by: Ira Weiny Reviewed-by: Jan Kara --- include/linux/fs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index d7584bcef5d3..e11989502eac 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3365,7 +3365,7 @@ extern int file_update_time(struct file *file); static inline bool io_is_direct(struct file *filp) { - return (filp->f_flags & O_DIRECT) || IS_DAX(filp->f_mapping->host); + return (filp->f_flags & O_DIRECT); } static inline bool vma_is_dax(struct vm_area_struct *vma) From patchwork Fri Jan 10 19:29:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328337 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E87B76C1 for ; Fri, 10 Jan 2020 19:30:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org 
[209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C2D4220848 for ; Fri, 10 Jan 2020 19:30:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728946AbgAJTaH (ORCPT ); Fri, 10 Jan 2020 14:30:07 -0500 Received: from mga18.intel.com ([134.134.136.126]:33892 "EHLO mga18.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728924AbgAJTaG (ORCPT ); Fri, 10 Jan 2020 14:30:06 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga106.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:05 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="396531775" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:04 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 06/12] fs/xfs: Check if the inode supports DAX under lock Date: Fri, 10 Jan 2020 11:29:36 -0800 Message-Id: <20200110192942.25021-7-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny One of the checks for an inode supporting DAX is if the inode is reflinked. During a non-DAX to DAX mode change we could race with the file being reflinked and end up with a reflinked file being in DAX mode. Prevent this race by checking for DAX support under the MMAP_LOCK. 
Signed-off-by: Ira Weiny --- fs/xfs/xfs_ioctl.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c index b5e00b67c297..bc3654fe3b5d 100644 --- a/fs/xfs/xfs_ioctl.c +++ b/fs/xfs/xfs_ioctl.c @@ -1180,10 +1180,6 @@ xfs_ioctl_setattr_dax_invalidate( *join_flags = 0; - if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX && - !xfs_inode_supports_dax(ip)) - return -EINVAL; - /* If the DAX state is not changing, we have nothing to do here. */ if ((fa->fsx_xflags & FS_XFLAG_DAX) && (ip->i_d.di_flags2 & XFS_DIFLAG2_DAX)) @@ -1197,6 +1193,13 @@ xfs_ioctl_setattr_dax_invalidate( /* lock, flush and invalidate mapping in preparation for flag change */ xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL); + + if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX && + !xfs_inode_supports_dax(ip)) { + error = -EINVAL; + goto out_unlock; + } + error = filemap_write_and_wait(inode->i_mapping); if (error) goto out_unlock; From patchwork Fri Jan 10 19:29:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ira Weiny X-Patchwork-Id: 11328357 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5FF1F930 for ; Fri, 10 Jan 2020 19:30:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3458520842 for ; Fri, 10 Jan 2020 19:30:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729127AbgAJTao (ORCPT ); Fri, 10 Jan 2020 14:30:44 -0500 Received: from mga12.intel.com ([192.55.52.136]:38365 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728948AbgAJTaI (ORCPT ); Fri, 10 Jan 2020 14:30:08 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com 
([10.7.209.38]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:07 -0800 X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="371693760" Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:06 -0800 From: ira.weiny@intel.com To: linux-kernel@vger.kernel.org Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC PATCH V2 07/12] fs: Add locking for a dynamic inode 'mode' Date: Fri, 10 Jan 2020 11:29:37 -0800 Message-Id: <20200110192942.25021-8-ira.weiny@intel.com> X-Mailer: git-send-email 2.21.0 In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com> References: <20200110192942.25021-1-ira.weiny@intel.com> MIME-Version: 1.0 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org From: Ira Weiny DAX requires special address space operations but many other functions check the IS_DAX() mode. DAX is a property of the inode thus we define an inode mode lock as an inode operation which file systems can optionally define. This patch defines the core function callbacks as well as puts the locking calls in place. 
Signed-off-by: Ira Weiny --- Documentation/filesystems/vfs.rst | 30 ++++++++++++++++ fs/ioctl.c | 23 +++++++++---- fs/open.c | 4 +++ include/linux/fs.h | 57 +++++++++++++++++++++++++++++-- mm/fadvise.c | 10 ++++-- mm/khugepaged.c | 2 ++ mm/mmap.c | 7 ++++ 7 files changed, 123 insertions(+), 10 deletions(-) diff --git a/Documentation/filesystems/vfs.rst b/Documentation/filesystems/vfs.rst index 7d4d09dd5e6d..b945aa95f15a 100644 --- a/Documentation/filesystems/vfs.rst +++ b/Documentation/filesystems/vfs.rst @@ -59,6 +59,28 @@ like open(2) the file, or stat(2) it to peek at the inode data. The stat(2) operation is fairly simple: once the VFS has the dentry, it peeks at the inode data and passes some of it back to userspace. +Changing inode 'modes' dynamically +---------------------------------- + +Some file systems may have different modes for their inodes which require +dynamic changing. A specific example of this is DAX enabled files in XFS and +ext4. To switch the mode safely we lock the inode mode in all "normal" file +system operations and restrict mode changes to those operations. The specific +rules are. + +To do this a file system must follow the following rules. + + 1) the direct_IO address_space_operation must be supported in all + potential a_ops vectors for any mode supported by the inode. + 2) Filesystems must define the lock_mode() and unlock_mode() operations + in struct inode_operations. These functions are used by the core + vfs layers to ensure that the mode is stable before allowing the + core operations to proceed. + 3) Mode changes shall not be allowed while the file is mmap'ed + 4) While changing modes filesystems should take exclusive locks which + prevent the core vfs layer from proceeding. 
+ + The File Object --------------- @@ -437,6 +459,8 @@ As of kernel 2.6.22, the following members are defined: int (*atomic_open)(struct inode *, struct dentry *, struct file *, unsigned open_flag, umode_t create_mode); int (*tmpfile) (struct inode *, struct dentry *, umode_t); + void (*lock_mode)(struct inode *); + void (*unlock_mode)(struct inode *); }; Again, all methods are called without any locks being held, unless @@ -584,6 +608,12 @@ otherwise noted. atomically creating, opening and unlinking a file in given directory. +``lock_mode`` + called to prevent operations which depend on the inode's mode from + proceeding should a mode change be in progress + +``unlock_mode`` + called when critical mode dependent operation is complete The Address Space Object ======================== diff --git a/fs/ioctl.c b/fs/ioctl.c index 7c9a5df5a597..ed6ab5303a24 100644 --- a/fs/ioctl.c +++ b/fs/ioctl.c @@ -55,18 +55,29 @@ EXPORT_SYMBOL(vfs_ioctl); static int ioctl_fibmap(struct file *filp, int __user *p) { struct address_space *mapping = filp->f_mapping; + struct inode *inode = filp->f_inode; int res, block; + lock_inode_mode(inode); + /* do we support this mess? 
*/ - if (!mapping->a_ops->bmap) - return -EINVAL; - if (!capable(CAP_SYS_RAWIO)) - return -EPERM; + if (!mapping->a_ops->bmap) { + res = -EINVAL; + goto out; + } + if (!capable(CAP_SYS_RAWIO)) { + res = -EPERM; + goto out; + } res = get_user(block, p); if (res) - return res; + goto out; res = mapping->a_ops->bmap(mapping, block); - return put_user(res, p); + res = put_user(res, p); + +out: + unlock_inode_mode(inode); + return res; } /** diff --git a/fs/open.c b/fs/open.c index b0be77ea8f1b..c62428bbc525 100644 --- a/fs/open.c +++ b/fs/open.c @@ -59,10 +59,12 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs, if (ret) newattrs.ia_valid |= ret | ATTR_FORCE; + lock_inode_mode(dentry->d_inode); inode_lock(dentry->d_inode); /* Note any delegations or leases have already been broken: */ ret = notify_change(dentry, &newattrs, NULL); inode_unlock(dentry->d_inode); + unlock_inode_mode(dentry->d_inode); return ret; } @@ -306,7 +308,9 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len) return -EOPNOTSUPP; file_start_write(file); + lock_inode_mode(inode); ret = file->f_op->fallocate(file, mode, offset, len); + unlock_inode_mode(inode); /* * Create inotify and fanotify events. diff --git a/include/linux/fs.h b/include/linux/fs.h index e11989502eac..631f11d6246e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -359,6 +359,11 @@ typedef struct { typedef int (*read_actor_t)(read_descriptor_t *, struct page *, unsigned long, unsigned long); +/** + * NOTE: DO NOT define new functions in address_space_operations without first + * considering how dynamic inode modes can be supported. See the comment in + * struct inode_operations for the lock_mode() and unlock_mode() callbacks. 
+ */ struct address_space_operations { int (*writepage)(struct page *page, struct writeback_control *wbc); int (*readpage)(struct file *, struct page *); @@ -1817,6 +1822,11 @@ struct block_device_operations; struct iov_iter; +/** + * NOTE: DO NOT define new functions in file_operations without first + * considering how dynamic inode modes can be supported. See the comment in + * struct inode_operations for the lock_mode() and unlock_mode() callbacks. + */ struct file_operations { struct module *owner; loff_t (*llseek) (struct file *, loff_t, int); @@ -1859,6 +1869,20 @@ struct file_operations { int (*fadvise)(struct file *, loff_t, loff_t, int); } __randomize_layout; +/* + * Filesystems wishing to support dynamic inode modes must do the following. + * + * 1) the direct_IO address_space_operation must be supported in all + * potential a_ops vectors for any mode supported by the inode. + * 2) Filesystems must define the lock_mode() and unlock_mode() operations + * in struct inode_operations. These functions are used by the core + * vfs layers to ensure that the mode is stable before allowing the + * core operations to proceed. + * 3) Mode changes shall not be allowed while the file is mmap'ed + * 4) While changing modes filesystems should take exclusive locks which + * prevent the core vfs layer from proceeding. 
+ *
+ */
 struct inode_operations {
 	struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
 	const char * (*get_link) (struct dentry *, struct inode *, struct delayed_call *);
@@ -1887,18 +1911,47 @@ struct inode_operations {
 			   umode_t create_mode);
 	int (*tmpfile) (struct inode *, struct dentry *, umode_t);
 	int (*set_acl)(struct inode *, struct posix_acl *, int);
+	void (*lock_mode)(struct inode*);
+	void (*unlock_mode)(struct inode*);
 } ____cacheline_aligned;
 
+static inline void lock_inode_mode(struct inode *inode)
+{
+	WARN_ON_ONCE(inode->i_op->lock_mode &&
+		     !inode->i_op->unlock_mode);
+	if (inode->i_op->lock_mode)
+		inode->i_op->lock_mode(inode);
+}
+static inline void unlock_inode_mode(struct inode *inode)
+{
+	WARN_ON_ONCE(inode->i_op->unlock_mode &&
+		     !inode->i_op->lock_mode);
+	if (inode->i_op->unlock_mode)
+		inode->i_op->unlock_mode(inode);
+}
+
 static inline ssize_t call_read_iter(struct file *file, struct kiocb *kio,
 				     struct iov_iter *iter)
 {
-	return file->f_op->read_iter(kio, iter);
+	struct inode *inode = file_inode(kio->ki_filp);
+	ssize_t ret;
+
+	lock_inode_mode(inode);
+	ret = file->f_op->read_iter(kio, iter);
+	unlock_inode_mode(inode);
+	return ret;
 }
 
 static inline ssize_t call_write_iter(struct file *file, struct kiocb *kio,
 				      struct iov_iter *iter)
 {
-	return file->f_op->write_iter(kio, iter);
+	struct inode *inode = file_inode(kio->ki_filp);
+	ssize_t ret;
+
+	lock_inode_mode(inode);
+	ret = file->f_op->write_iter(kio, iter);
+	unlock_inode_mode(inode);
+	return ret;
 }
 
 static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/mm/fadvise.c b/mm/fadvise.c
index 4f17c83db575..a4095a5deac8 100644
--- a/mm/fadvise.c
+++ b/mm/fadvise.c
@@ -47,7 +47,10 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 
 	bdi = inode_to_bdi(mapping->host);
 
+	lock_inode_mode(inode);
 	if (IS_DAX(inode) || (bdi == &noop_backing_dev_info)) {
+		int ret = 0;
+
 		switch (advice) {
 		case POSIX_FADV_NORMAL:
 		case POSIX_FADV_RANDOM:
@@ -58,10 +61,13 @@ int generic_fadvise(struct file *file, loff_t offset, loff_t len, int advice)
 			/* no bad return value, but ignore advice */
 			break;
 		default:
-			return -EINVAL;
+			ret = -EINVAL;
 		}
-		return 0;
+
+		unlock_inode_mode(inode);
+		return ret;
 	}
+	unlock_inode_mode(inode);
 
 	/*
 	 * Careful about overflows. Len == 0 means "as much as possible". Use
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b679908743cb..ff49da065db0 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1592,9 +1592,11 @@ static void collapse_file(struct mm_struct *mm,
 		} else {	/* !is_shmem */
 			if (!page || xa_is_value(page)) {
 				xas_unlock_irq(&xas);
+				lock_inode_mode(file->f_inode);
 				page_cache_sync_readahead(mapping, &file->f_ra,
 							  file, index,
 							  PAGE_SIZE);
+				unlock_inode_mode(file->f_inode);
 				/* drain pagevecs to help isolate_lru_page() */
 				lru_add_drain();
 				page = find_lock_page(mapping, index);
diff --git a/mm/mmap.c b/mm/mmap.c
index 70f67c4515aa..dfaf1130e706 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1542,11 +1542,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
+	if (file)
+		lock_inode_mode(file_inode(file));
+
 	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
 		*populate = len;
+
+	if (file)
+		unlock_inode_mode(file_inode(file));
+
 	return addr;
 }
 

From patchwork Fri Jan 10 19:29:38 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11328359
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0BC68930 for ; Fri, 10 Jan 2020 19:30:53 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C6241206ED for ; Fri, 10 Jan 2020 19:30:52 +0000
(UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729118AbgAJTan (ORCPT ); Fri, 10 Jan 2020 14:30:43 -0500
Received: from mga14.intel.com ([192.55.52.115]:10951 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728966AbgAJTaJ (ORCPT ); Fri, 10 Jan 2020 14:30:09 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:08 -0800
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="224289175"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:08 -0800
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 08/12] fs/xfs: Add lock/unlock mode to xfs
Date: Fri, 10 Jan 2020 11:29:38 -0800
Message-Id: <20200110192942.25021-9-ira.weiny@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Ira Weiny

XFS requires regular files to be locked while changing to/from DAX mode.

Define a new DAX lock type and implement the [un]lock_mode() inode
operation callbacks.

We define a new XFS_DAX_* lock type to carry the lock through the
transaction because we don't want to use IOLOCK as that would cause
performance issues with locking of the inode itself.

Signed-off-by: Ira Weiny
---
 fs/xfs/xfs_icache.c |  2 ++
 fs/xfs/xfs_inode.c  | 37 +++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_inode.h  | 12 ++++++++++--
 fs/xfs/xfs_iops.c   | 24 +++++++++++++++++++++++-
 4 files changed, 70 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 8dc2e5414276..0288672e8902 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -74,6 +74,8 @@ xfs_inode_alloc(
 	INIT_LIST_HEAD(&ip->i_ioend_list);
 	spin_lock_init(&ip->i_ioend_lock);
 
+	percpu_init_rwsem(&ip->i_dax_sem);
+
 	return ip;
 }
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 401da197f012..e8fd95b75e5b 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -142,12 +142,12 @@ xfs_ilock_attr_map_shared(
  *
  * Basic locking order:
  *
- * i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
+ * i_rwsem -> i_dax_sem -> i_mmap_lock -> page_lock -> i_ilock
  *
  * mmap_sem locking order:
  *
  * i_rwsem -> page lock -> mmap_sem
- * mmap_sem -> i_mmap_lock -> page_lock
+ * mmap_sem -> i_dax_sem -> i_mmap_lock -> page_lock
  *
  * The difference in mmap_sem locking order mean that we cannot hold the
  * i_mmap_lock over syscall based read(2)/write(2) based IO. These IO paths can
@@ -181,6 +181,13 @@ xfs_ilock(
 	ASSERT((lock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)) !=
 	       (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
 	ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
+	ASSERT((lock_flags & (XFS_DAX_SHARED | XFS_DAX_EXCL)) !=
+	       (XFS_DAX_SHARED | XFS_DAX_EXCL));
+
+	if (lock_flags & XFS_DAX_EXCL)
+		percpu_down_write(&ip->i_dax_sem);
+	else if (lock_flags & XFS_DAX_SHARED)
+		percpu_down_read(&ip->i_dax_sem);
 
 	if (lock_flags & XFS_IOLOCK_EXCL) {
 		down_write_nested(&VFS_I(ip)->i_rwsem,
@@ -224,6 +231,8 @@ xfs_ilock_nowait(
 	 * You can't set both SHARED and EXCL for the same lock,
 	 * and only XFS_IOLOCK_SHARED, XFS_IOLOCK_EXCL, XFS_ILOCK_SHARED,
 	 * and XFS_ILOCK_EXCL are valid values to set in lock_flags.
+	 *
+	 * XFS_DAX_* is not allowed
 	 */
 	ASSERT((lock_flags & (XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL)) !=
 	       (XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL));
@@ -232,6 +241,7 @@ xfs_ilock_nowait(
 	ASSERT((lock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL)) !=
 	       (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
 	ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
+	ASSERT((lock_flags & (XFS_DAX_SHARED | XFS_DAX_EXCL)) == 0);
 
 	if (lock_flags & XFS_IOLOCK_EXCL) {
 		if (!down_write_trylock(&VFS_I(ip)->i_rwsem))
@@ -302,6 +312,8 @@ xfs_iunlock(
 	       (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
 	ASSERT((lock_flags & ~(XFS_LOCK_MASK | XFS_LOCK_SUBCLASS_MASK)) == 0);
 	ASSERT(lock_flags != 0);
+	ASSERT((lock_flags & (XFS_DAX_SHARED | XFS_DAX_EXCL)) !=
+	       (XFS_DAX_SHARED | XFS_DAX_EXCL));
 
 	if (lock_flags & XFS_IOLOCK_EXCL)
 		up_write(&VFS_I(ip)->i_rwsem);
@@ -318,6 +330,11 @@ xfs_iunlock(
 	else if (lock_flags & XFS_ILOCK_SHARED)
 		mrunlock_shared(&ip->i_lock);
 
+	if (lock_flags & XFS_DAX_EXCL)
+		percpu_up_write(&ip->i_dax_sem);
+	else if (lock_flags & XFS_DAX_SHARED)
+		percpu_up_read(&ip->i_dax_sem);
+
 	trace_xfs_iunlock(ip, lock_flags, _RET_IP_);
 }
 
@@ -333,6 +350,8 @@ xfs_ilock_demote(
 	ASSERT(lock_flags & (XFS_IOLOCK_EXCL|XFS_MMAPLOCK_EXCL|XFS_ILOCK_EXCL));
 	ASSERT((lock_flags &
 		~(XFS_IOLOCK_EXCL|XFS_MMAPLOCK_EXCL|XFS_ILOCK_EXCL)) == 0);
+	/* XFS_DAX_* is not allowed */
+	ASSERT((lock_flags & (XFS_DAX_SHARED | XFS_DAX_EXCL)) == 0);
 
 	if (lock_flags & XFS_ILOCK_EXCL)
 		mrdemote(&ip->i_lock);
@@ -369,6 +388,13 @@ xfs_isilocked(
 		return rwsem_is_locked(&VFS_I(ip)->i_rwsem);
 	}
 
+	if (lock_flags & (XFS_DAX_EXCL|XFS_DAX_SHARED)) {
+		if (!(lock_flags & XFS_DAX_SHARED))
+			return !debug_locks ||
+				percpu_rwsem_is_held(&ip->i_dax_sem, 0);
+		return rwsem_is_locked(&ip->i_dax_sem);
+	}
+
 	ASSERT(0);
 	return 0;
 }
@@ -465,6 +491,9 @@ xfs_lock_inodes(
 	ASSERT(!(lock_mode & XFS_ILOCK_EXCL) ||
 	       inodes <= XFS_ILOCK_MAX_SUBCLASS + 1);
 
+	/* XFS_DAX_* is not allowed */
+	ASSERT((lock_mode & (XFS_DAX_SHARED | XFS_DAX_EXCL)) == 0);
+
 	if (lock_mode & XFS_IOLOCK_EXCL) {
 		ASSERT(!(lock_mode & (XFS_MMAPLOCK_EXCL | XFS_ILOCK_EXCL)));
 	} else if (lock_mode & XFS_MMAPLOCK_EXCL)
@@ -566,6 +595,10 @@ xfs_lock_two_inodes(
 	ASSERT(!(ip0_mode & (XFS_MMAPLOCK_SHARED|XFS_MMAPLOCK_EXCL)) ||
 	       !(ip1_mode & (XFS_ILOCK_SHARED|XFS_ILOCK_EXCL)));
 
+	/* XFS_DAX_* is not allowed */
+	ASSERT((ip0_mode & (XFS_DAX_SHARED | XFS_DAX_EXCL)) == 0);
+	ASSERT((ip1_mode & (XFS_DAX_SHARED | XFS_DAX_EXCL)) == 0);
+
 	ASSERT(ip0->i_ino != ip1->i_ino);
 
 	if (ip0->i_ino > ip1->i_ino) {
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 492e53992fa9..693ca66bd89b 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -67,6 +67,9 @@ typedef struct xfs_inode {
 	spinlock_t		i_ioend_lock;
 	struct work_struct	i_ioend_work;
 	struct list_head	i_ioend_list;
+
+	/* protect changing the mode to/from DAX */
+	struct percpu_rw_semaphore i_dax_sem;
 } xfs_inode_t;
 
 /* Convert from vfs inode to xfs inode */
@@ -278,10 +281,13 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
 #define	XFS_ILOCK_SHARED	(1<<3)
 #define	XFS_MMAPLOCK_EXCL	(1<<4)
 #define	XFS_MMAPLOCK_SHARED	(1<<5)
+#define	XFS_DAX_EXCL		(1<<6)
+#define	XFS_DAX_SHARED		(1<<7)
 
 #define XFS_LOCK_MASK		(XFS_IOLOCK_EXCL | XFS_IOLOCK_SHARED \
 				| XFS_ILOCK_EXCL | XFS_ILOCK_SHARED \
-				| XFS_MMAPLOCK_EXCL | XFS_MMAPLOCK_SHARED)
+				| XFS_MMAPLOCK_EXCL | XFS_MMAPLOCK_SHARED \
+				| XFS_DAX_EXCL | XFS_DAX_SHARED)
 
 #define XFS_LOCK_FLAGS \
 	{ XFS_IOLOCK_EXCL,	"IOLOCK_EXCL" }, \
@@ -289,7 +295,9 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
 	{ XFS_ILOCK_EXCL,	"ILOCK_EXCL" }, \
 	{ XFS_ILOCK_SHARED,	"ILOCK_SHARED" }, \
 	{ XFS_MMAPLOCK_EXCL,	"MMAPLOCK_EXCL" }, \
-	{ XFS_MMAPLOCK_SHARED,	"MMAPLOCK_SHARED" }
+	{ XFS_MMAPLOCK_SHARED,	"MMAPLOCK_SHARED" }, \
+	{ XFS_DAX_EXCL,		"DAX_EXCL" }, \
+	{ XFS_DAX_SHARED,	"DAX_SHARED" }
 
 
 /*
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index d6843cdb51d0..a2f2604c3187 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1158,6 +1158,16 @@ xfs_vn_tmpfile(
 	return xfs_generic_create(dir, dentry, mode, 0, true);
 }
 
+static void xfs_lock_mode(struct inode *inode)
+{
+	xfs_ilock(XFS_I(inode), XFS_DAX_SHARED);
+}
+
+static void xfs_unlock_mode(struct inode *inode)
+{
+	xfs_iunlock(XFS_I(inode), XFS_DAX_SHARED);
+}
+
 static const struct inode_operations xfs_inode_operations = {
 	.get_acl		= xfs_get_acl,
 	.set_acl		= xfs_set_acl,
@@ -1168,6 +1178,18 @@ static const struct inode_operations xfs_inode_operations = {
 	.update_time		= xfs_vn_update_time,
 };
 
+static const struct inode_operations xfs_reg_inode_operations = {
+	.get_acl		= xfs_get_acl,
+	.set_acl		= xfs_set_acl,
+	.getattr		= xfs_vn_getattr,
+	.setattr		= xfs_vn_setattr,
+	.listxattr		= xfs_vn_listxattr,
+	.fiemap			= xfs_vn_fiemap,
+	.update_time		= xfs_vn_update_time,
+	.lock_mode		= xfs_lock_mode,
+	.unlock_mode		= xfs_unlock_mode,
+};
+
 static const struct inode_operations xfs_dir_inode_operations = {
 	.create			= xfs_vn_create,
 	.lookup			= xfs_vn_lookup,
@@ -1372,7 +1394,7 @@ xfs_setup_iops(
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFREG:
-		inode->i_op = &xfs_inode_operations;
+		inode->i_op = &xfs_reg_inode_operations;
 		inode->i_fop = &xfs_file_operations;
 		if (IS_DAX(inode))
 			inode->i_mapping->a_ops = &xfs_dax_aops;

From patchwork Fri Jan 10 19:29:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11328347
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5E4F56C1 for ; Fri, 10 Jan 2020 19:30:36 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 47ADD20880 for ; Fri, 10 Jan 2020 19:30:36 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728994AbgAJTaM (ORCPT ); Fri, 10 Jan 2020 14:30:12 -0500
Received: from mga17.intel.com ([192.55.52.151]:56599 "EHLO mga17.intel.com"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728977AbgAJTaL (ORCPT ); Fri, 10 Jan 2020 14:30:11 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:10 -0800
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="238521007"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:10 -0800
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 09/12] fs: Prevent mode change if file is mmap'ed
Date: Fri, 10 Jan 2020 11:29:39 -0800
Message-Id: <20200110192942.25021-10-ira.weiny@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Ira Weiny

Page faults need to ensure the inode mode is correct and consistent with
the vmf information at the time of the fault. There is no easy way to
ensure the vmf information is correct if a mode change is in progress.
Furthermore, there is no good use case to require a mode change while
the file is mmap'ed.

Track mmap's of the file and fail the mode change if the file is
mmap'ed.

Signed-off-by: Ira Weiny
---
 fs/inode.c         |  2 ++
 fs/xfs/xfs_ioctl.c |  8 ++++++++
 include/linux/fs.h |  1 +
 mm/mmap.c          | 19 +++++++++++++++++--
 4 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 2b0f51161918..944711aed6f8 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -245,6 +245,8 @@ static struct inode *alloc_inode(struct super_block *sb)
 		return NULL;
 	}
 
+	atomic64_set(&inode->i_mapped, 0);
+
 	return inode;
 }
 
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index bc3654fe3b5d..1ab0906c6c7f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1200,6 +1200,14 @@ xfs_ioctl_setattr_dax_invalidate(
 		goto out_unlock;
 	}
 
+	/*
+	 * If there is a mapping in place we must remain in our current mode.
+	 */
+	if (atomic64_read(&inode->i_mapped)) {
+		error = -EBUSY;
+		goto out_unlock;
+	}
+
 	error = filemap_write_and_wait(inode->i_mapping);
 	if (error)
 		goto out_unlock;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 631f11d6246e..6e7dc626b657 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -740,6 +740,7 @@ struct inode {
 #endif
 
 	void			*i_private; /* fs or device private pointer */
+	atomic64_t		i_mapped;
 } __randomize_layout;
 
 struct timespec64 timestamp_truncate(struct timespec64 t, struct inode *inode);
diff --git a/mm/mmap.c b/mm/mmap.c
index dfaf1130e706..e6b68924b7ca 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -171,12 +171,17 @@ void unlink_file_vma(struct vm_area_struct *vma)
 static struct vm_area_struct *remove_vma(struct vm_area_struct *vma)
 {
 	struct vm_area_struct *next = vma->vm_next;
+	struct file *f = vma->vm_file;
 
 	might_sleep();
 	if (vma->vm_ops && vma->vm_ops->close)
 		vma->vm_ops->close(vma);
-	if (vma->vm_file)
-		fput(vma->vm_file);
+	if (f) {
+		struct inode *inode = file_inode(f);
+		if (inode)
+			atomic64_dec(&inode->i_mapped);
+		fput(f);
+	}
 	mpol_put(vma_policy(vma));
 	vm_area_free(vma);
 	return next;
@@ -1837,6 +1842,16 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 
 	vma_set_page_prot(vma);
 
+	/*
+	 * Track if there is mapping in place such that a mode change
+	 * does not occur on a file which is mapped
+	 */
+	if (file) {
+		struct inode *inode = file_inode(file);
+
+		atomic64_inc(&inode->i_mapped);
+	}
+
 	return addr;
 
 unmap_and_free_vma:

From patchwork Fri Jan 10 19:29:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11328349
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C7ABD138D for ; Fri, 10 Jan 2020 19:30:36 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A5F2920880 for ; Fri, 10 Jan 2020 19:30:36 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728633AbgAJTaf (ORCPT ); Fri, 10 Jan 2020 14:30:35 -0500
Received: from mga11.intel.com ([192.55.52.93]:46986 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728924AbgAJTaM (ORCPT ); Fri, 10 Jan 2020 14:30:12 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:12 -0800
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="304262908"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:11 -0800
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 10/12] fs/xfs: Fix truncate up
Date: Fri, 10 Jan 2020 11:29:40 -0800
Message-Id: <20200110192942.25021-11-ira.weiny@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Ira Weiny

When zeroing the end of a file we must account for bytes contained in
the final page which are past EOF.

Extend the range passed to iomap_zero_range() to reach LLONG_MAX which
will include all bytes of the final page.

Signed-off-by: Ira Weiny
---
 fs/xfs/xfs_iops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a2f2604c3187..a34b04e8ac9c 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -910,7 +910,7 @@ xfs_setattr_size(
 	 */
 	if (newsize > oldsize) {
 		trace_xfs_zero_eof(ip, oldsize, newsize - oldsize);
-		error = iomap_zero_range(inode, oldsize, newsize - oldsize,
+		error = iomap_zero_range(inode, oldsize, LLONG_MAX - oldsize,
 				&did_zeroing, &xfs_buffered_write_iomap_ops);
 	} else {
 		error = iomap_truncate_page(inode, newsize, &did_zeroing,

From patchwork Fri Jan 10 19:29:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11328341
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 33CA7138D for ; Fri, 10 Jan 2020 19:30:18 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1C2262082E for ; Fri, 10 Jan 2020 19:30:18 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via
listexpand id S1729021AbgAJTaQ (ORCPT ); Fri, 10 Jan 2020 14:30:16 -0500
Received: from mga07.intel.com ([134.134.136.100]:43951 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729004AbgAJTaO (ORCPT ); Fri, 10 Jan 2020 14:30:14 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:14 -0800
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="218748606"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:13 -0800
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 11/12] fs/xfs: Clean up locking in dax invalidate
Date: Fri, 10 Jan 2020 11:29:41 -0800
Message-Id: <20200110192942.25021-12-ira.weiny@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Ira Weiny

Define a variable to hold the lock flags to ensure that the correct
locks are returned or released on error.

Signed-off-by: Ira Weiny
---
 fs/xfs/xfs_ioctl.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 1ab0906c6c7f..9a35bf83eaa1 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1176,7 +1176,7 @@ xfs_ioctl_setattr_dax_invalidate(
 	int			*join_flags)
 {
 	struct inode		*inode = VFS_I(ip);
-	int			error;
+	int			error, flags;
 
 	*join_flags = 0;
 
@@ -1191,8 +1191,10 @@ xfs_ioctl_setattr_dax_invalidate(
 	if (S_ISDIR(inode->i_mode))
 		return 0;
 
+	flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
+
 	/* lock, flush and invalidate mapping in preparation for flag change */
-	xfs_ilock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
+	xfs_ilock(ip, flags);
 
 	if ((fa->fsx_xflags & FS_XFLAG_DAX) == FS_XFLAG_DAX &&
 	    !xfs_inode_supports_dax(ip)) {
@@ -1215,11 +1217,11 @@ xfs_ioctl_setattr_dax_invalidate(
 	if (error)
 		goto out_unlock;
 
-	*join_flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
+	*join_flags = flags;
 	return 0;
 
 out_unlock:
-	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
+	xfs_iunlock(ip, flags);
 	return error;
 
 }

From patchwork Fri Jan 10 19:29:42 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Weiny
X-Patchwork-Id: 11328345
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA98D930 for ; Fri, 10 Jan 2020 19:30:21 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id A3EF52084D for ; Fri, 10 Jan 2020 19:30:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729036AbgAJTaR (ORCPT ); Fri, 10 Jan 2020 14:30:17 -0500
Received: from mga09.intel.com ([134.134.136.24]:22315 "EHLO mga09.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728977AbgAJTaP (ORCPT ); Fri, 10 Jan 2020 14:30:15 -0500
X-Amp-Result:
SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:15 -0800
X-IronPort-AV: E=Sophos;i="5.69,418,1571727600"; d="scan'208";a="218160906"
Received: from iweiny-desk2.sc.intel.com (HELO localhost) ([10.3.52.157]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 10 Jan 2020 11:30:14 -0800
From: ira.weiny@intel.com
To: linux-kernel@vger.kernel.org
Cc: Ira Weiny , Alexander Viro , "Darrick J. Wong" , Dan Williams , Dave Chinner , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: [RFC PATCH V2 12/12] fs/xfs: Allow toggle of effective DAX flag
Date: Fri, 10 Jan 2020 11:29:42 -0800
Message-Id: <20200110192942.25021-13-ira.weiny@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20200110192942.25021-1-ira.weiny@intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
MIME-Version: 1.0
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Ira Weiny

Now that locking of the inode is in place we can allow a mode change
while under the new lock.

Signed-off-by: Ira Weiny
---
 fs/xfs/xfs_inode.h |  1 +
 fs/xfs/xfs_ioctl.c |  9 ++++++---
 fs/xfs/xfs_iops.c  | 15 +++++++++++----
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 693ca66bd89b..b0d2e7da88c6 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -474,6 +474,7 @@ int xfs_break_layouts(struct inode *inode, uint *iolock,
 /* from xfs_iops.c */
 extern void xfs_setup_inode(struct xfs_inode *ip);
 extern void xfs_setup_iops(struct xfs_inode *ip);
+extern void xfs_setup_a_ops(struct xfs_inode *ip);
 
 /*
  * When setting up a newly allocated inode, we need to call
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 9a35bf83eaa1..e07559bf70a9 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1109,12 +1109,11 @@ xfs_diflags_to_linux(
 		inode->i_flags |= S_NOATIME;
 	else
 		inode->i_flags &= ~S_NOATIME;
-#if 0	/* disabled until the flag switching races are sorted out */
 	if (xflags & FS_XFLAG_DAX)
 		inode->i_flags |= S_DAX;
 	else
 		inode->i_flags &= ~S_DAX;
-#endif
+
 }
 
 static int
@@ -1191,7 +1190,7 @@ xfs_ioctl_setattr_dax_invalidate(
 	if (S_ISDIR(inode->i_mode))
 		return 0;
 
-	flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL;
+	flags = XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL | XFS_DAX_EXCL;
 
 	/* lock, flush and invalidate mapping in preparation for flag change */
 	xfs_ilock(ip, flags);
@@ -1512,6 +1511,8 @@ xfs_ioctl_setattr(
 	else
 		ip->i_d.di_cowextsize = 0;
 
+	xfs_setup_a_ops(ip);
+
 	code = xfs_trans_commit(tp);
 
 	/*
@@ -1621,6 +1622,8 @@ xfs_ioc_setxflags(
 		goto out_drop_write;
 	}
 
+	xfs_setup_a_ops(ip);
+
 	error = xfs_trans_commit(tp);
 out_drop_write:
 	mnt_drop_write_file(filp);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index a34b04e8ac9c..c70164a0df97 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1386,6 +1386,16 @@ xfs_setup_inode(
 	}
 }
 
+void xfs_setup_a_ops(struct xfs_inode *ip)
+{
+	struct inode *inode = &ip->i_vnode;
+
+	if (IS_DAX(inode))
+		inode->i_mapping->a_ops = &xfs_dax_aops;
+	else
+		inode->i_mapping->a_ops = &xfs_address_space_operations;
+}
+
 void
 xfs_setup_iops(
 	struct xfs_inode	*ip)
@@ -1396,10 +1406,7 @@ xfs_setup_iops(
 	case S_IFREG:
 		inode->i_op = &xfs_reg_inode_operations;
 		inode->i_fop = &xfs_file_operations;
-		if (IS_DAX(inode))
-			inode->i_mapping->a_ops = &xfs_dax_aops;
-		else
-			inode->i_mapping->a_ops = &xfs_address_space_operations;
+		xfs_setup_a_ops(ip);
 		break;
 	case S_IFDIR:
 		if (xfs_sb_version_hasasciici(&XFS_M(inode->i_sb)->m_sb))