From patchwork Thu Feb 7 05:08:11 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 10800369
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0844917FB for ; Thu, 7 Feb 2019 05:08:39 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EB7FC2A778 for ; Thu, 7 Feb 2019 05:08:38 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E08B22CF61; Thu, 7 Feb 2019 05:08:38 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7919C2A778 for ; Thu, 7 Feb 2019 05:08:38 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726293AbfBGFIi (ORCPT ); Thu, 7 Feb 2019 00:08:38 -0500
Received: from ipmail03.adl2.internode.on.net ([150.101.137.141]:11812 "EHLO ipmail03.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726136AbfBGFIh (ORCPT ); Thu, 7 Feb 2019 00:08:37 -0500
Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail03.adl2.internode.on.net with ESMTP; 07 Feb 2019 15:38:18 +1030
Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1grbv6-0003pJ-Pc for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
Received: from dave by discord.disaster.area with local (Exim 4.92-RC5) (envelope-from ) id 1grbv6-0006KT-Nk for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Subject: [PATCH 1/3] xfs: Don't free EOF blocks on sync write close
Date: Thu, 7 Feb 2019 16:08:11 +1100
Message-Id: <20190207050813.24271-2-david@fromorbit.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190207050813.24271-1-david@fromorbit.com>
References: <20190207050813.24271-1-david@fromorbit.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Dave Chinner

When we have a workload that does open/sync write/close in parallel
with other allocation, the file becomes rapidly fragmented. This is
due to close() calling xfs_release() and removing the speculative
preallocation beyond EOF.

The existing open/write/close heuristic in xfs_release() does not
catch this as sync writes do not leave delayed allocation blocks
allocated on the inode for later writeback that can be detected in
xfs_release() and hence XFS_IDIRTY_RELEASE never gets set.

In xfs_file_release(), we can tell whether the release context was a
synchronous write context, and so we need to communicate this to
xfs_release() so it can do the right thing here and skip EOF block
truncation. This defers the EOF block cleanup for synchronous write
contexts to the background EOF block cleaner which will clean up
within a few minutes.
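The close-time decision described above can be modelled in userspace
roughly as follows. This is only a sketch: the DEMO_ constants and
demo_free_eof_blocks_on_release() are hypothetical stand-ins for the
kernel's FMODE_WRITE, O_DSYNC and the logic this patch adds to
xfs_file_release(), and the numeric values are illustrative rather
than the kernel's.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's FMODE_WRITE and O_DSYNC bits. */
#define DEMO_FMODE_WRITE	0x2u
#define DEMO_O_DSYNC		0x1000u

/*
 * Return true if close() should trim post-EOF blocks straight away, or
 * false if cleanup should be left to the background EOF block cleaner.
 */
bool demo_free_eof_blocks_on_release(unsigned int f_mode, unsigned int f_flags)
{
	/* A writable, synchronous context keeps its preallocation on close. */
	if ((f_mode & DEMO_FMODE_WRITE) && (f_flags & DEMO_O_DSYNC))
		return false;
	return true;
}

/* Usage example covering the three interesting open contexts. */
void demo_release_examples(void)
{
	/* open(O_WRONLY | O_DSYNC): leave the blocks for the background cleaner. */
	assert(!demo_free_eof_blocks_on_release(DEMO_FMODE_WRITE, DEMO_O_DSYNC));
	/* Plain buffered writer: the existing trim-on-close path still runs. */
	assert(demo_free_eof_blocks_on_release(DEMO_FMODE_WRITE, 0));
	/* O_DSYNC without write access does not count as a sync write context. */
	assert(demo_free_eof_blocks_on_release(0, DEMO_O_DSYNC));
}
```

Only the combination of write access and O_DSYNC suppresses the trim
in this model; every other context behaves as it did before.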

Before:

Test 1: sync write fragmentation counts

/mnt/scratch/file.0: 919
/mnt/scratch/file.1: 916
/mnt/scratch/file.2: 919
/mnt/scratch/file.3: 920
/mnt/scratch/file.4: 920
/mnt/scratch/file.5: 921
/mnt/scratch/file.6: 916
/mnt/scratch/file.7: 918

After:

Test 1: sync write fragmentation counts

/mnt/scratch/file.0: 24
/mnt/scratch/file.1: 24
/mnt/scratch/file.2: 11
/mnt/scratch/file.3: 24
/mnt/scratch/file.4: 3
/mnt/scratch/file.5: 24
/mnt/scratch/file.6: 24
/mnt/scratch/file.7: 23

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_file.c  | 15 +++++++++++++--
 fs/xfs/xfs_inode.c |  9 +++++----
 fs/xfs/xfs_inode.h |  2 +-
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index e47425071e65..02f76b8e6c03 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1019,12 +1019,23 @@ xfs_dir_open(
 	return error;
 }
 
+/*
+ * When we release the file, we don't want it to trim EOF blocks for synchronous
+ * write contexts as this leads to severe fragmentation when applications do
+ * repeated open/appending sync write/close to a file amongst other file IO.
+ */
 STATIC int
 xfs_file_release(
 	struct inode	*inode,
-	struct file	*filp)
+	struct file	*file)
 {
-	return xfs_release(XFS_I(inode));
+	bool		free_eof_blocks = true;
+
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (file->f_flags & O_DSYNC))
+		free_eof_blocks = false;
+
+	return xfs_release(XFS_I(inode), free_eof_blocks);
 }
 
 STATIC int
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ae667ba74a1c..a74dc240697f 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1603,10 +1603,11 @@ xfs_itruncate_extents_flags(
 
 int
 xfs_release(
-	xfs_inode_t	*ip)
+	struct xfs_inode	*ip,
+	bool			can_free_eofblocks)
 {
-	xfs_mount_t	*mp = ip->i_mount;
-	int		error;
+	struct xfs_mount	*mp = ip->i_mount;
+	int			error;
 
 	if (!S_ISREG(VFS_I(ip)->i_mode) || (VFS_I(ip)->i_mode == 0))
 		return 0;
@@ -1642,7 +1643,7 @@ xfs_release(
 	if (VFS_I(ip)->i_nlink == 0)
 		return 0;
 
-	if (xfs_can_free_eofblocks(ip, false)) {
+	if (can_free_eofblocks && xfs_can_free_eofblocks(ip, false)) {
 
 		/*
 		 * Check if the inode is being opened, written and closed
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index be2014520155..7e59b0e086d7 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -397,7 +397,7 @@ enum layout_break_reason {
 	(((pip)->i_mount->m_flags & XFS_MOUNT_GRPID) || \
 	 (VFS_I(pip)->i_mode & S_ISGID))
 
-int		xfs_release(struct xfs_inode *ip);
+int		xfs_release(struct xfs_inode *ip, bool can_free_eofblocks);
 void		xfs_inactive(struct xfs_inode *ip);
 int		xfs_lookup(struct xfs_inode *dp, struct xfs_name *name,
 			   struct xfs_inode **ipp, struct xfs_name *ci_name);

From patchwork Thu Feb 7 05:08:12 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 10800367
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 96BC6922 for ; Thu, 7 Feb 2019 05:08:37 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 873E82A778 for ; Thu, 7 Feb 2019 05:08:37 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 7B8162CF61; Thu, 7 Feb 2019 05:08:37 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1F3982A778 for ; Thu, 7 Feb 2019 05:08:37 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726294AbfBGFIg (ORCPT ); Thu, 7 Feb 2019 00:08:36 -0500
Received: from ipmail03.adl2.internode.on.net ([150.101.137.141]:55104 "EHLO ipmail03.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725910AbfBGFIg (ORCPT ); Thu, 7 Feb 2019 00:08:36 -0500
Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail03.adl2.internode.on.net with ESMTP; 07 Feb 2019 15:38:18 +1030
Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1grbv6-0003pK-Q6 for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
Received: from dave by discord.disaster.area with local (Exim 4.92-RC5) (envelope-from ) id 1grbv6-0006KW-Oq for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Subject: [PATCH 2/3] xfs: Don't free EOF blocks on close when extent size hints are set
Date: Thu, 7 Feb 2019 16:08:12 +1100
Message-Id: <20190207050813.24271-3-david@fromorbit.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190207050813.24271-1-david@fromorbit.com>
References: <20190207050813.24271-1-david@fromorbit.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

When we have a workload that does open/write/close on files with
extent size hints set in parallel with other allocation, the file
becomes rapidly fragmented. This is due to close() calling
xfs_release() and removing the preallocated extent beyond EOF. This
occurs for both buffered and direct writes that append to files with
extent size hints.

The existing open/write/close heuristic in xfs_release() does not
catch this as writes to files using extent size hints do not use
delayed allocation and hence do not leave delayed allocation blocks
allocated on the inode that can be detected in xfs_release(). Hence
XFS_IDIRTY_RELEASE never gets set.

In xfs_file_release(), we can tell whether the inode has extent size
hints set and skip EOF block truncation. We add this check to
xfs_can_free_eofblocks() so that the post-EOF preallocated extent is
treated like intentional preallocation and hence persists unless
directly removed by userspace.
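As a rough illustration of the check described above, the following
userspace sketch models only the final test in
xfs_can_free_eofblocks(). struct demo_inode and its fields are
hypothetical simplifications of the inode state (they are not the
kernel structures), and the earlier checks in the real function are
deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the inode state the check looks at. */
struct demo_inode {
	uint32_t	extsz_hint;	/* extent size hint in blocks, 0 if unset */
	bool		prealloc;	/* models XFS_DIFLAG_PREALLOC */
	bool		append_only;	/* models XFS_DIFLAG_APPEND */
	uint64_t	delayed_blks;	/* delalloc blocks held by the inode */
};

/*
 * Inodes with an extent size hint are treated like preallocated or
 * append-only inodes: their post-EOF blocks are only freed when the
 * caller forces it and there are delalloc blocks to remove.
 */
bool demo_can_free_eofblocks(const struct demo_inode *ip, bool force)
{
	if (ip->extsz_hint || ip->prealloc || ip->append_only) {
		if (!force || ip->delayed_blks == 0)
			return false;
	}
	return true;
}

/* Usage example for plain and hinted inodes. */
void demo_extsz_examples(void)
{
	struct demo_inode plain = { .extsz_hint = 0 };
	struct demo_inode hinted = { .extsz_hint = 16 };
	struct demo_inode hinted_delalloc = { .extsz_hint = 16, .delayed_blks = 8 };

	assert(demo_can_free_eofblocks(&plain, false));
	assert(!demo_can_free_eofblocks(&hinted, false));
	/* Forcing does not help when there are no delalloc blocks. */
	assert(!demo_can_free_eofblocks(&hinted, true));
	assert(demo_can_free_eofblocks(&hinted_delalloc, true));
}
```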

Before:

Test 2: Extent size hint fragmentation counts

/mnt/scratch/file.0: 1002
/mnt/scratch/file.1: 1002
/mnt/scratch/file.2: 1002
/mnt/scratch/file.3: 1002
/mnt/scratch/file.4: 1002
/mnt/scratch/file.5: 1002
/mnt/scratch/file.6: 1002
/mnt/scratch/file.7: 1002

After:

Test 2: Extent size hint fragmentation counts

/mnt/scratch/file.0: 4
/mnt/scratch/file.1: 4
/mnt/scratch/file.2: 4
/mnt/scratch/file.3: 4
/mnt/scratch/file.4: 4
/mnt/scratch/file.5: 4
/mnt/scratch/file.6: 4
/mnt/scratch/file.7: 4

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_bmap_util.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 1ee8c5539fa4..98e5e305b789 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -761,12 +761,15 @@ xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
 		return false;
 
 	/*
-	 * Do not free real preallocated or append-only files unless the file
-	 * has delalloc blocks and we are forced to remove them.
+	 * Do not free extent size hints, real preallocated or append-only files
+	 * unless the file has delalloc blocks and we are forced to remove
+	 * them.
 	 */
-	if (ip->i_d.di_flags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND))
+	if (xfs_get_extsz_hint(ip) ||
+	    (ip->i_d.di_flags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND))) {
 		if (!force || ip->i_delayed_blks == 0)
 			return false;
+	}
 
 	return true;
 }

From patchwork Thu Feb 7 05:08:13 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 10800365
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 131C3922 for ; Thu, 7 Feb 2019 05:08:36 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F41E02A778 for ; Thu, 7 Feb 2019 05:08:35 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E2D352CF61; Thu, 7 Feb 2019 05:08:35 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8008E2A778 for ; Thu, 7 Feb 2019 05:08:35 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726128AbfBGFIe (ORCPT ); Thu, 7 Feb 2019 00:08:34 -0500
Received: from ipmail03.adl2.internode.on.net ([150.101.137.141]:55104 "EHLO ipmail03.adl2.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725910AbfBGFIe (ORCPT ); Thu, 7 Feb 2019 00:08:34 -0500
Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail03.adl2.internode.on.net with ESMTP; 07 Feb 2019 15:38:18 +1030
Received: from discord.disaster.area ([192.168.1.111]) by dastard with esmtp (Exim 4.80) (envelope-from ) id 1grbv6-0003pN-R4 for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
Received: from dave by discord.disaster.area with local (Exim 4.92-RC5) (envelope-from ) id 1grbv6-0006KZ-Pw for linux-xfs@vger.kernel.org; Thu, 07 Feb 2019 16:08:16 +1100
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Subject: [PATCH 3/3] xfs: Don't free EOF blocks on sync write close
Date: Thu, 7 Feb 2019 16:08:13 +1100
Message-Id: <20190207050813.24271-4-david@fromorbit.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190207050813.24271-1-david@fromorbit.com>
References: <20190207050813.24271-1-david@fromorbit.com>
MIME-Version: 1.0
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Dave Chinner

When we have a workload that does open/read/close in parallel with
other synchronous buffered writes to long term open files, the file
becomes rapidly fragmented. This is due to close() after read calling
xfs_release() and removing the speculative preallocation beyond EOF.

The existing open/write/close heuristic in xfs_release() does not
catch this as sync writes do not leave delayed allocation blocks
allocated on the inode for later writeback that can be detected in
xfs_release() and hence XFS_IDIRTY_RELEASE never gets set.

Further, the close context here is for a file opened O_RDONLY, and so
/modifying/ the file metadata on close doesn't pass muster.
Fortunately, we can tell in xfs_file_release() whether the release
context was a read-only context, and so we need to communicate this
to xfs_release() so it can do the right thing here and skip EOF block
truncation, hence ensuring that only contexts with write permissions
will remove post-EOF blocks from the file.
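A minimal userspace sketch of the combined close-time decision (the
read-only check plus the sync write check) might look like this. The
DEMO_ constants and demo_release_frees_eofblocks() are hypothetical
stand-ins for FMODE_READ, FMODE_WRITE, O_DSYNC and the logic in
xfs_file_release(), with illustrative values. Note that in C the `&`
operator binds tighter than `|`, so the two-bit mask has to be
parenthesised for the read-only test to mean "READ set and WRITE
clear".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for FMODE_READ, FMODE_WRITE and O_DSYNC. */
#define DEMO_FMODE_READ		0x1u
#define DEMO_FMODE_WRITE	0x2u
#define DEMO_O_DSYNC		0x1000u

/* Return true if this close() context may trim post-EOF blocks. */
bool demo_release_frees_eofblocks(unsigned int f_mode, unsigned int f_flags)
{
	/* Read-only context: READ is set and WRITE is clear. */
	if ((f_mode & (DEMO_FMODE_WRITE | DEMO_FMODE_READ)) == DEMO_FMODE_READ)
		return false;
	/* Synchronous write context: leave cleanup to the background cleaner. */
	if ((f_mode & DEMO_FMODE_WRITE) && (f_flags & DEMO_O_DSYNC))
		return false;
	return true;
}

/* Usage example covering read-only, read-write and sync write opens. */
void demo_read_close_examples(void)
{
	unsigned int rdwr = DEMO_FMODE_READ | DEMO_FMODE_WRITE;

	assert(!demo_release_frees_eofblocks(DEMO_FMODE_READ, 0));
	assert(demo_release_frees_eofblocks(rdwr, 0));
	assert(!demo_release_frees_eofblocks(rdwr, DEMO_O_DSYNC));
	/* Neither READ nor WRITE set is not a read-only context. */
	assert(demo_release_frees_eofblocks(0, 0));
}
```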

Before:

Test 3: Open/read/close loop fragmentation counts

/mnt/scratch/file.0: 150
/mnt/scratch/file.1: 342
/mnt/scratch/file.2: 113
/mnt/scratch/file.3: 165
/mnt/scratch/file.4: 86
/mnt/scratch/file.5: 363
/mnt/scratch/file.6: 129
/mnt/scratch/file.7: 233

After:

Test 3: Open/read/close loop fragmentation counts

/mnt/scratch/file.0: 12
/mnt/scratch/file.1: 12
/mnt/scratch/file.2: 12
/mnt/scratch/file.3: 12
/mnt/scratch/file.4: 12
/mnt/scratch/file.5: 12
/mnt/scratch/file.6: 12
/mnt/scratch/file.7: 12

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_file.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 02f76b8e6c03..e2d8a0b7f891 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1023,6 +1023,10 @@ xfs_dir_open(
  * When we release the file, we don't want it to trim EOF blocks for synchronous
  * write contexts as this leads to severe fragmentation when applications do
  * repeated open/appending sync write/close to a file amongst other file IO.
+ *
+ * We also don't want to trim the EOF blocks if it is a read only context. This
+ * prevents open/read/close workloads from removing EOF blocks that other
+ * writers are depending on to prevent fragmentation.
  */
 STATIC int
 xfs_file_release(
@@ -1031,8 +1035,9 @@ xfs_file_release(
 {
 	bool		free_eof_blocks = true;
 
-	if ((file->f_mode & FMODE_WRITE) &&
-	    (file->f_flags & O_DSYNC))
+	if ((file->f_mode & (FMODE_WRITE|FMODE_READ)) == FMODE_READ)
+		free_eof_blocks = false;
+	else if ((file->f_mode & FMODE_WRITE) && (file->f_flags & O_DSYNC))
 		free_eof_blocks = false;
 
 	return xfs_release(XFS_I(inode), free_eof_blocks);

From patchwork Mon Feb 18 02:26:19 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 10817283
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 88CF61399 for ; Mon, 18 Feb 2019 02:26:26 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 6A7882A4BA for ; Mon, 18 Feb 2019 02:26:26 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 5D9BC2A4BE; Mon, 18 Feb 2019 02:26:26 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B935F2A4BA for ; Mon, 18 Feb 2019 02:26:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728174AbfBRC0Z (ORCPT ); Sun, 17 Feb 2019 21:26:25 -0500
Received: from ipmail03.adl6.internode.on.net ([150.101.137.143]:23080 "EHLO ipmail03.adl6.internode.on.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727266AbfBRC0Z (ORCPT ); Sun, 17 Feb 2019 21:26:25 -0500
Received: from ppp59-167-129-252.static.internode.on.net (HELO dastard) ([59.167.129.252]) by ipmail03.adl6.internode.on.net with ESMTP; 18 Feb 2019 12:56:21 +1030
Received: from dave by dastard with local (Exim 4.80) (envelope-from ) id 1gvYdP-0002p9-Fk for linux-xfs@vger.kernel.org; Mon, 18 Feb 2019 13:26:19 +1100
Date: Mon, 18 Feb 2019 13:26:19 +1100
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Subject: [PATCH 4/3] xfs: EOF blocks are not busy extents
Message-ID: <20190218022619.GD14116@dastard>
References: <20190207050813.24271-1-david@fromorbit.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20190207050813.24271-1-david@fromorbit.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Dave Chinner

Userdata extents are considered as "busy extents" when they are freed
to ensure that they are not reallocated and written to before the
transaction that frees the user data extent has been committed to
disk.

However, in the case of post EOF blocks, these blocks have never been
exposed to user applications and so don't contain valid data. Hence
they don't need to be considered "busy" when they've been freed
because there is no data in them that can be destroyed if they are
reallocated and have data written to them before the free transaction
is committed.

We already have XFS_BMAPI_NODISCARD to tell extent freeing that the
data extent has never been used, so it doesn't need discards issued on
it. This new functionality is just an extension of that concept - the
extent is actually unused, so doesn't even need to be marked busy.

Hence fix this by adding XFS_BMAPI_UNUSED and update the EOF block
data extent truncate with XFS_BMAPI_UNUSED and propagate that all the
way through the various structures and use it to avoid inserting the
extent into the busy list.

[ Note: this seems like a bit of a hack, but I just don't see the
point of inserting it into the busy list, then adding a new busy list
unused flag, then allowing every allocation type to use it in the
_trim code, then having every caller call _reuse to remove the range
from the busy tree. It just seems like complexity that doesn't need
to exist because anyone can reallocate an unused extent for immediate
use. ]

This avoids the problem of free space fragmentation when multiple
files are written sequentially via synchronous writes or post-write
fsync calls before the next file is written. In that workload the
post-EOF blocks are marked busy and can't be immediately reallocated,
so the files pack poorly and unnecessarily leave free space between
them.

Freespace fragmentation from sequential multi-file synchronous write
workload:

Before:
   from      to extents  blocks    pct
      1       1       7       7   0.00
      2       3      34      80   0.00
      4       7      65     345   0.01
      8      15     208    2417   0.05
     16      31     147    2982   0.06
     32      63       1      49   0.00
1048576 1310720       4 5185064  99.89

After:
   from      to extents  blocks    pct
      1       1       3       3   0.00
      2       3       1       3   0.00
1048576 1310720       4 5190871 100.00

Much better.
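
The idea of skipping the busy list for never-used extents can be
sketched in userspace as below. This is a toy model only: struct
demo_busy_list and demo_free_extent() are hypothetical names, the
fixed-size array stands in for the kernel's busy extent tracking, and
nothing here reflects the real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BUSY_MAX	8

/* Hypothetical extent record: start block and length in blocks. */
struct demo_extent {
	unsigned long	start;
	unsigned long	len;
};

/* Toy busy list: freed extents that must not be reused until commit. */
struct demo_busy_list {
	struct demo_extent	ext[DEMO_BUSY_MAX];
	size_t			count;
};

/*
 * Free an extent. Extents that never held user data ("unused") are not
 * tracked as busy, so they can be reallocated immediately. Returns false
 * only if the toy busy list is full.
 */
bool demo_free_extent(struct demo_busy_list *busy, unsigned long start,
		      unsigned long len, bool unused)
{
	if (unused)
		return true;
	if (busy->count == DEMO_BUSY_MAX)
		return false;
	busy->ext[busy->count].start = start;
	busy->ext[busy->count].len = len;
	busy->count++;
	return true;
}

/* Usage example: a data extent goes busy, a post-EOF extent does not. */
void demo_busy_examples(void)
{
	struct demo_busy_list busy = { .count = 0 };

	assert(demo_free_extent(&busy, 100, 8, false));
	assert(busy.count == 1);
	assert(demo_free_extent(&busy, 200, 4, true));
	assert(busy.count == 1);
}
```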

Signed-off-by: Dave Chinner
---
 fs/xfs/libxfs/xfs_alloc.c  | 7 +++----
 fs/xfs/libxfs/xfs_bmap.c   | 6 +++---
 fs/xfs/libxfs/xfs_bmap.h   | 8 ++++++--
 fs/xfs/xfs_bmap_util.c     | 2 +-
 fs/xfs/xfs_trans_extfree.c | 6 +++---
 5 files changed, 16 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 659bb9133955..729136ef0ed1 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -3014,7 +3014,7 @@ __xfs_free_extent(
 	xfs_extlen_t			len,
 	const struct xfs_owner_info	*oinfo,
 	enum xfs_ag_resv_type		type,
-	bool				skip_discard)
+	bool				unused)
 {
 	struct xfs_mount		*mp = tp->t_mountp;
 	struct xfs_buf			*agbp;
@@ -3045,9 +3045,8 @@ __xfs_free_extent(
 	if (error)
 		goto err;
 
-	if (skip_discard)
-		busy_flags |= XFS_EXTENT_BUSY_SKIP_DISCARD;
-	xfs_extent_busy_insert(tp, agno, agbno, len, busy_flags);
+	if (!unused)
+		xfs_extent_busy_insert(tp, agno, agbno, len, busy_flags);
 	return 0;
 
 err:
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 332eefa2700b..ba0fd80eede4 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -537,7 +537,7 @@ __xfs_bmap_add_free(
 	xfs_fsblock_t			bno,
 	xfs_filblks_t			len,
 	const struct xfs_owner_info	*oinfo,
-	bool				skip_discard)
+	bool				unused)
 {
 	struct xfs_extent_free_item	*new;		/* new element */
 #ifdef DEBUG
@@ -565,7 +565,7 @@ __xfs_bmap_add_free(
 		new->xefi_oinfo = *oinfo;
 	else
 		new->xefi_oinfo = XFS_RMAP_OINFO_SKIP_UPDATE;
-	new->xefi_skip_discard = skip_discard;
+	new->xefi_unused = unused;
 	trace_xfs_bmap_free_defer(tp->t_mountp,
 			XFS_FSB_TO_AGNO(tp->t_mountp, bno), 0,
 			XFS_FSB_TO_AGBNO(tp->t_mountp, bno), len);
@@ -5069,7 +5069,7 @@ xfs_bmap_del_extent_real(
 		} else {
 			__xfs_bmap_add_free(tp, del->br_startblock,
 					del->br_blockcount, NULL,
-					(bflags & XFS_BMAPI_NODISCARD) ||
+					(bflags & XFS_BMAPI_UNUSED) ||
 					del->br_state == XFS_EXT_UNWRITTEN);
 		}
 	}
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 09d3ea97cc15..33fb95f84ea0 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -54,7 +54,7 @@ struct xfs_extent_free_item
 	xfs_extlen_t		xefi_blockcount;/* number of blocks in extent */
 	struct list_head	xefi_list;
 	struct xfs_owner_info	xefi_oinfo;	/* extent owner */
-	bool			xefi_skip_discard;
+	bool			xefi_unused;
 };
 
 #define	XFS_BMAP_MAX_NMAP	4
@@ -107,6 +107,9 @@ struct xfs_extent_free_item
 /* Do not update the rmap btree.  Used for reconstructing bmbt from rmapbt. */
 #define XFS_BMAPI_NORMAP	0x2000
 
+/* Unused freed data extent, no need to mark busy */
+#define XFS_BMAPI_UNUSED	0x4000
+
 #define XFS_BMAPI_FLAGS \
 	{ XFS_BMAPI_ENTIRE,	"ENTIRE" }, \
 	{ XFS_BMAPI_METADATA,	"METADATA" }, \
@@ -120,7 +123,8 @@ struct xfs_extent_free_item
 	{ XFS_BMAPI_DELALLOC,	"DELALLOC" }, \
 	{ XFS_BMAPI_CONVERT_ONLY, "CONVERT_ONLY" }, \
 	{ XFS_BMAPI_NODISCARD,	"NODISCARD" }, \
-	{ XFS_BMAPI_NORMAP,	"NORMAP" }
+	{ XFS_BMAPI_NORMAP,	"NORMAP" }, \
+	{ XFS_BMAPI_UNUSED,	"UNUSED" }
 
 
 static inline int xfs_bmapi_aflag(int w)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index af2e30d33794..f5a8a4385512 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -841,7 +841,7 @@ xfs_free_eofblocks(
 		 * may be full of holes (ie NULL files bug).
 		 */
 		error = xfs_itruncate_extents_flags(&tp, ip, XFS_DATA_FORK,
-				XFS_ISIZE(ip), XFS_BMAPI_NODISCARD);
+				XFS_ISIZE(ip), XFS_BMAPI_UNUSED);
 		if (error) {
 			/*
 			 * If we get an error at this point we simply don't
diff --git a/fs/xfs/xfs_trans_extfree.c b/fs/xfs/xfs_trans_extfree.c
index 0710434eb240..d06fb2cd6ffb 100644
--- a/fs/xfs/xfs_trans_extfree.c
+++ b/fs/xfs/xfs_trans_extfree.c
@@ -58,7 +58,7 @@ xfs_trans_free_extent(
 	xfs_fsblock_t			start_block,
 	xfs_extlen_t			ext_len,
 	const struct xfs_owner_info	*oinfo,
-	bool				skip_discard)
+	bool				unused)
 {
 	struct xfs_mount		*mp = tp->t_mountp;
 	struct xfs_extent		*extp;
@@ -71,7 +71,7 @@ xfs_trans_free_extent(
 	trace_xfs_bmap_free_deferred(tp->t_mountp, agno, 0, agbno, ext_len);
 
 	error = __xfs_free_extent(tp, start_block, ext_len,
-				  oinfo, XFS_AG_RESV_NONE, skip_discard);
+				  oinfo, XFS_AG_RESV_NONE, unused);
 	/*
 	 * Mark the transaction dirty, even on error. This ensures the
 	 * transaction is aborted, which:
@@ -184,7 +184,7 @@ xfs_extent_free_finish_item(
 	error = xfs_trans_free_extent(tp, done_item,
 			free->xefi_startblock,
 			free->xefi_blockcount,
-			&free->xefi_oinfo, free->xefi_skip_discard);
+			&free->xefi_oinfo, free->xefi_unused);
 	kmem_free(free);
 	return error;
 }