From patchwork Thu Mar 15 15:52:45 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 10285007
Subject: [PATCH v6 14/15] xfs: prepare xfs_break_layouts() for another layout type
From: Dan Williams
To: linux-nvdimm@lists.01.org
Cc: "Darrick J. Wong", Ross Zwisler, Dave Chinner, Christoph Hellwig,
 linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, jack@suse.cz,
 ross.zwisler@linux.intel.com, hch@lst.de, linux-kernel@vger.kernel.org
Date: Thu, 15 Mar 2018 08:52:45 -0700
Message-ID: <152112916514.24669.8643877835071945330.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <152112908134.24669.10222746224538377035.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <152112908134.24669.10222746224538377035.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f
Sender: linux-fsdevel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

When xfs is operating as the back-end of a pNFS block server, it
prevents collisions between local and remote operations by requiring a
lease to be held for remotely accessed blocks. Local filesystem
operations break those leases before writing or mutating the extent map
of the file. A similar mechanism is needed to prevent operations on
pinned dax mappings, like device-DMA, from colliding with extent unmap
operations.

BREAK_WRITE and BREAK_TRUNCATE are introduced as two distinct levels of
layout breaking.

Layouts are broken in the BREAK_WRITE case to ensure that layout-holders
do not collide with local writes. Additionally, layouts are broken in
the BREAK_TRUNCATE case to make sure the layout-holder has a consistent
view of the file's extent map. While BREAK_WRITE breaks can be satisfied
by recalling FL_LAYOUT leases, BREAK_TRUNCATE breaks additionally
require waiting for busy dax-pages to go idle.

Cc: "Darrick J. Wong"
Cc: Ross Zwisler
Reported-by: Dave Chinner
Reported-by: Christoph Hellwig
Signed-off-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_file.c  |   23 +++++++++++++++++------
 fs/xfs/xfs_inode.h |   18 ++++++++++++++++--
 fs/xfs/xfs_ioctl.c |    2 +-
 fs/xfs/xfs_iops.c  |    5 +++--
 4 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5742d395a4e4..399c5221f101 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -352,7 +352,7 @@ xfs_file_aio_write_checks(
 
 		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
 		*iolock |= XFS_MMAPLOCK_EXCL;
-		error = xfs_break_layouts(inode, iolock);
+		error = xfs_break_layouts(inode, iolock, BREAK_WRITE);
 		if (error) {
 			*iolock &= ~XFS_MMAPLOCK_EXCL;
 			xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
@@ -762,7 +762,8 @@ xfs_file_write_iter(
 int
 xfs_break_layouts(
 	struct inode *inode,
-	uint *iolock)
+	uint *iolock,
+	enum layout_break_reason reason)
 {
 	struct xfs_inode *ip = XFS_I(inode);
 	int ret;
@@ -770,9 +771,19 @@ xfs_break_layouts(
 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL));
 
-	ret = xfs_break_leased_layouts(inode, iolock);
-	if (ret > 0)
-		ret = 0;
+	switch (reason) {
+	case BREAK_TRUNCATE:
+		/* fall through */
+	case BREAK_WRITE:
+		ret = xfs_break_leased_layouts(inode, iolock);
+		if (ret > 0)
+			ret = 0;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+
 	return ret;
 }
 
@@ -802,7 +813,7 @@ xfs_file_fallocate(
 		return -EOPNOTSUPP;
 
 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock);
+	error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 74c63f3a720f..1a66c7afcf45 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -379,6 +379,20 @@ static inline void xfs_ifunlock(struct xfs_inode *ip)
 				>> XFS_ILOCK_SHIFT)
 
 /*
+ * Layouts are broken in the BREAK_WRITE case to ensure that
+ * layout-holders do not collide with local writes. Additionally,
+ * layouts are broken in the BREAK_TRUNCATE case to make sure the
+ * layout-holder has a consistent view of the file's extent map. While
+ * BREAK_WRITE breaks can be satisfied by recalling FL_LAYOUT leases,
+ * BREAK_TRUNCATE breaks additionally require waiting for busy dax-pages
+ * to go idle.
+ */
+enum layout_break_reason {
+	BREAK_WRITE,
+	BREAK_TRUNCATE,
+};
+
+/*
  * For multiple groups support: if S_ISGID bit is set in the parent
  * directory, group of new file is set to that of the parent, and
  * new subdirectory gets S_ISGID bit from parent.
@@ -447,8 +461,8 @@ int xfs_zero_eof(struct xfs_inode *ip, xfs_off_t offset,
 		xfs_fsize_t isize, bool *did_zeroing);
 int xfs_zero_range(struct xfs_inode *ip, xfs_off_t pos, xfs_off_t count,
 		bool *did_zero);
-int xfs_break_layouts(struct inode *inode, uint *iolock);
-
+int xfs_break_layouts(struct inode *inode, uint *iolock,
+		enum layout_break_reason reason);
 
 /* from xfs_iops.c */
 extern void xfs_setup_inode(struct xfs_inode *ip);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index d70a1919e787..847a67186d95 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -643,7 +643,7 @@ xfs_ioc_space(
 		return error;
 
 	xfs_ilock(ip, iolock);
-	error = xfs_break_layouts(inode, &iolock);
+	error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
 	if (error)
 		goto out_unlock;
 
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 78eb56d447df..f9fcadb5b555 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1026,13 +1026,14 @@ xfs_vn_setattr(
 	int error;
 
 	if (iattr->ia_valid & ATTR_SIZE) {
-		struct xfs_inode *ip = XFS_I(d_inode(dentry));
+		struct inode *inode = d_inode(dentry);
+		struct xfs_inode *ip = XFS_I(inode);
 		uint iolock;
 
 		xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
 		iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
 
-		error = xfs_break_layouts(d_inode(dentry), &iolock);
+		error = xfs_break_layouts(inode, &iolock, BREAK_TRUNCATE);
 		if (error) {
 			xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
 			return error;
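
For readers who want to poke at the dispatch logic outside the kernel, the following is a minimal user-space sketch of the switch this patch adds to xfs_break_layouts(). It is not kernel code. The names stub_break_leased_layouts(), model_break_layouts() and model_self_check() are hypothetical and exist only for this illustration, and the stub simply echoes its argument in place of the real xfs_break_leased_layouts(). The sketch assumes, as the call site in the patch suggests, that a positive return from the lease-breaking helper is normalized to 0 and that negative errors pass through unchanged.

```c
#include <assert.h>
#include <errno.h>

/* Mirrors the enum this patch adds to fs/xfs/xfs_inode.h. */
enum layout_break_reason {
	BREAK_WRITE,
	BREAK_TRUNCATE,
};

/*
 * Hypothetical stand-in for xfs_break_leased_layouts(): it returns
 * whatever the caller passes in, so each return path can be exercised.
 */
static int stub_break_leased_layouts(int simulated_result)
{
	return simulated_result;
}

/* User-space model of the switch in xfs_break_layouts(). */
int model_break_layouts(enum layout_break_reason reason, int simulated_result)
{
	int ret;

	switch (reason) {
	case BREAK_TRUNCATE:
		/* fall through: this patch adds no dax-specific wait yet */
	case BREAK_WRITE:
		ret = stub_break_leased_layouts(simulated_result);
		if (ret > 0)
			ret = 0;	/* positive results are not errors */
		break;
	default:
		ret = -EINVAL;		/* unknown break reason */
		break;
	}

	return ret;
}

/* Exercises the success, error pass-through, and invalid-reason paths. */
void model_self_check(void)
{
	assert(model_break_layouts(BREAK_WRITE, 1) == 0);
	assert(model_break_layouts(BREAK_TRUNCATE, -EAGAIN) == -EAGAIN);
	assert(model_break_layouts((enum layout_break_reason)42, 0) == -EINVAL);
}
```

In this patch both reasons take the same path. The separate BREAK_TRUNCATE case only marks where the wait for busy dax-pages described in the changelog would be added.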