From patchwork Thu Mar 15 15:52:50 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dan Williams
X-Patchwork-Id: 10284997
Subject: [PATCH v6 15/15] xfs, dax: introduce xfs_break_dax_layouts()
From: Dan Williams
To: linux-nvdimm@lists.01.org
Cc: Jan Kara, Dave Chinner, "Darrick J. Wong", Ross Zwisler,
 Christoph Hellwig, linux-xfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, jack@suse.cz,
 ross.zwisler@linux.intel.com, hch@lst.de, linux-kernel@vger.kernel.org
Date: Thu, 15 Mar 2018 08:52:50 -0700
Message-ID: <152112917064.24669.8101553386217458496.stgit@dwillia2-desk3.amr.corp.intel.com>
In-Reply-To: <152112908134.24669.10222746224538377035.stgit@dwillia2-desk3.amr.corp.intel.com>
References: <152112908134.24669.10222746224538377035.stgit@dwillia2-desk3.amr.corp.intel.com>
User-Agent: StGit/0.18-2-gc94f

xfs_break_dax_layouts(), similar to xfs_break_leased_layouts(), scans
for busy / pinned dax pages and waits for those pages to go idle before
any potential extent unmap operation.

dax_layout_busy_page() handles synchronizing against new page-busy
events (get_user_pages). It invalidates all mappings to trigger the
get_user_pages slow path, which will eventually block on the xfs inode
lock held in XFS_MMAPLOCK_EXCL mode. If dax_layout_busy_page() finds a
busy page, it returns that page so that xfs can wait for the page-idle
event, which fires when the page reference count reaches 1 (recall that
ZONE_DEVICE pages are idle at count 1).

While waiting, the XFS_MMAPLOCK_EXCL lock is dropped so as not to
deadlock a process that might be trying to elevate the page count of
more pages before arranging for any of them to go idle. In the typical
I/O submission case, iov_iter_get_pages() elevates the reference count
of all pages in the I/O before starting I/O on the first page.

Cc: Jan Kara
Cc: Dave Chinner
Cc: "Darrick J. Wong"
Cc: Ross Zwisler
Cc: Christoph Hellwig
Signed-off-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_file.c | 67 +++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 54 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 399c5221f101..2ccdbb19e31a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -759,6 +759,38 @@ xfs_file_write_iter(
 	return ret;
 }
 
+static void
+xfs_wait_var_event(
+	struct inode		*inode,
+	uint			iolock)
+{
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	xfs_iunlock(ip, iolock);
+	schedule();
+	xfs_ilock(ip, iolock);
+}
+
+static int
+xfs_break_dax_layouts(
+	struct inode		*inode,
+	uint			iolock)
+{
+	struct page		*page;
+	int			ret;
+
+	page = dax_layout_busy_page(inode->i_mapping);
+	if (!page)
+		return 0;
+
+	ret = ___wait_var_event(&page->_refcount,
+			atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
+			0, 0, xfs_wait_var_event(inode, iolock));
+	if (ret < 0)
+		return ret;
+	return 1;
+}
+
 int
 xfs_break_layouts(
 	struct inode		*inode,
@@ -766,23 +798,32 @@ xfs_break_layouts(
 	enum layout_break_reason reason)
 {
 	struct xfs_inode	*ip = XFS_I(inode);
-	int			ret;
+	int			ret = 0;
 
 	ASSERT(xfs_isilocked(ip, XFS_IOLOCK_SHARED | XFS_IOLOCK_EXCL
 				| XFS_MMAPLOCK_EXCL));
 
-	switch (reason) {
-	case BREAK_TRUNCATE:
-		/* fall through */
-	case BREAK_WRITE:
-		ret = xfs_break_leased_layouts(inode, iolock);
-		if (ret > 0)
-			ret = 0;
-		break;
-	default:
-		ret = -EINVAL;
-		break;
-	}
+	do {
+		switch (reason) {
+		case BREAK_TRUNCATE:
+			ret = xfs_break_dax_layouts(inode, *iolock);
+			/* fall through */
+		case BREAK_WRITE:
+			if (ret != 0)
+				break;
+			ret = xfs_break_leased_layouts(inode, iolock);
+			break;
+		default:
+			ret = -EINVAL;
+			break;
+		}
+		/*
+		 * This loop terminates when either layout break attempt
+		 * returns an error, or both layout break attempts
+		 * return 0, i.e. layouts are verified broken while
+		 * holding all required locks.
+		 */
+	} while (ret > 0);
 
 	return ret;
 }
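
The retry convention described in the loop comment can be modelled in user space. What follows is a minimal sketch, not the kernel code: struct fake_inode, fake_break_dax_layouts(), fake_break_leased_layouts() and break_layouts_sketch() are hypothetical names. The assumed return convention is negative for an error, 0 for "nothing left to break", and positive for "the lock was dropped while waiting, so recheck everything". The sketch tests the DAX result before falling through, so a BREAK_WRITE caller never re-examines a positive value left over from the previous iteration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum layout_break_reason { BREAK_WRITE, BREAK_TRUNCATE };

/* Hypothetical stand-in for inode state; not a kernel structure. */
struct fake_inode {
	int dax_busy_rounds;	/* rounds for which a DAX page stays pinned */
	int lease_busy_rounds;	/* rounds for which a lease stays outstanding */
	int dax_wait_error;	/* if nonzero, the DAX wait fails with this */
	int dax_calls;		/* bookkeeping for the example */
	int lease_calls;
};

/* Assumed convention: <0 error, 0 idle, >0 waited (lock was cycled). */
static int fake_break_dax_layouts(struct fake_inode *ino)
{
	ino->dax_calls++;
	if (ino->dax_busy_rounds == 0)
		return 0;
	if (ino->dax_wait_error)
		return ino->dax_wait_error;	/* e.g. interrupted wait */
	ino->dax_busy_rounds--;
	return 1;
}

/* Same convention for the lease check. */
static int fake_break_leased_layouts(struct fake_inode *ino)
{
	ino->lease_calls++;
	if (ino->lease_busy_rounds == 0)
		return 0;
	ino->lease_busy_rounds--;
	return 1;
}

/*
 * Retry until one pass finds every applicable layout already broken
 * (return 0) or a check fails (return <0).
 */
static int break_layouts_sketch(struct fake_inode *ino,
				enum layout_break_reason reason)
{
	int ret = 0;

	assert(ino != NULL);
	do {
		switch (reason) {
		case BREAK_TRUNCATE:
			ret = fake_break_dax_layouts(ino);
			if (ret != 0)
				break;	/* error, or waited: restart the pass */
			/* fall through */
		case BREAK_WRITE:
			ret = fake_break_leased_layouts(ino);
			break;
		default:
			ret = -EINVAL;
			break;
		}
	} while (ret > 0);

	return ret;
}
```

In this model, a truncate against an inode whose DAX pages stay pinned for two rounds and whose lease stays outstanding for one round returns 0 only after a pass in which both checks report idle. A write never consults the DAX check at all.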