From patchwork Wed Apr 27 00:51:58 2022
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 12828135
Subject: [PATCH 1/9] xfs: count EFIs when deciding to ask for a continuation of a refcount update
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: Dave Chinner , linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:51:58 -0700
Message-ID: <165102071799.3922658.11838016511226658958.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

A long time ago, I added to XFS the ability to use deferred reference count operations as part of a transaction chain.  This enabled us to avoid blowing out the transaction reservation when the blocks in a physical extent all had different reference counts because we could ask the deferred operation manager for a continuation, which would get us a clean transaction.

The refcount code asks for a continuation when the number of refcount record updates reaches the point where we think that the transaction has logged enough full btree blocks due to refcount (and free space) btree shape changes and refcount record updates that we're in danger of overflowing the transaction.

We did not previously count the EFIs logged to the refcount update transaction because the clamps on the length of a bunmap operation were sufficient to avoid overflowing the transaction reservation even in the worst case situation where every other block of the unmapped extent is shared.
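As background for the change described here, the continuation decision can be pictured as a running budget check: every refcount record update (and, with this patch, every EFI) logged to the transaction is charged a fixed overhead estimate, and once the running total gets close to the log reservation the remaining work is requeued into a fresh transaction.  The standalone C sketch below is illustrative only; the structure, names, and 75% threshold are assumptions loosely modeled on the ~32-byte per-item estimate and the xfs_refcount_still_have_space() idea described in this series, not the kernel code itself.

    /*
     * Illustrative sketch only, not the kernel implementation.
     * ITEM_OVERHEAD mirrors the ~32-byte per-item estimate; the struct,
     * function name, and 3/4 threshold are assumptions for illustration.
     */
    #include <stdbool.h>

    #define ITEM_OVERHEAD   32      /* bytes charged per record update or EFI */

    struct refc_progress {
            unsigned int nr_ops;            /* refcount record updates + EFIs logged */
            unsigned int shape_changes;     /* refcount btree splits/joins so far */
            unsigned int blocksize;         /* filesystem block size in bytes */
    };

    /* Return true if more refcount work still fits in this transaction. */
    bool still_have_space(const struct refc_progress *p,
                          unsigned int log_reservation)
    {
            unsigned int used;

            /* Each btree shape change may log a full btree block. */
            used = p->shape_changes * p->blocksize;
            /* Each record update or EFI is charged a fixed overhead estimate. */
            used += p->nr_ops * ITEM_OVERHEAD;

            /* Keep a cushion so the CUD and a replacement CUI still fit. */
            return used < log_reservation * 3 / 4;
    }

When this check fails, the deferred-ops machinery is asked for a continuation instead of pushing more updates into the already-crowded transaction.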
Unfortunately, the restrictions on bunmap length avoid failure in the worst case by imposing a maximum unmap length of ~3000 blocks, even for non-pathological cases.  This seriously limits performance when freeing large extents.

Therefore, track EFIs with the same counter as refcount record updates, and use that information as input into when we should ask for a continuation.  This enables the next patch to drop the clumsy bunmap limitation.

Depends: 27dada070d59 ("xfs: change the order in which child and parent defer ops are finished")
Depends: 74f4d6a1e065 ("xfs: only relog deferred intent items if free space in the log gets low")

Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
Reviewed-by: Christoph Hellwig
---
 fs/xfs/libxfs/xfs_refcount.c |    5 ++---
 fs/xfs/libxfs/xfs_refcount.h |   13 ++++++++-----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
index 327ba25e9e17..a07ebaecba73 100644
--- a/fs/xfs/libxfs/xfs_refcount.c
+++ b/fs/xfs/libxfs/xfs_refcount.c
@@ -960,6 +960,7 @@ xfs_refcount_adjust_extents(
 			 * Either cover the hole (increment) or
 			 * delete the range (decrement).
 			 */
+			cur->bc_ag.refc.nr_ops++;
 			if (tmp.rc_refcount) {
 				error = xfs_refcount_insert(cur, &tmp,
 						&found_tmp);
@@ -970,7 +971,6 @@ xfs_refcount_adjust_extents(
 					error = -EFSCORRUPTED;
 					goto out_error;
 				}
-				cur->bc_ag.refc.nr_ops++;
 			} else {
 				fsbno = XFS_AGB_TO_FSB(cur->bc_mp,
 						cur->bc_ag.pag->pag_agno,
@@ -1001,11 +1001,11 @@ xfs_refcount_adjust_extents(
 		ext.rc_refcount += adj;
 		trace_xfs_refcount_modify_extent(cur->bc_mp,
 				cur->bc_ag.pag->pag_agno, &ext);
+		cur->bc_ag.refc.nr_ops++;
 		if (ext.rc_refcount > 1) {
 			error = xfs_refcount_update(cur, &ext);
 			if (error)
 				goto out_error;
-			cur->bc_ag.refc.nr_ops++;
 		} else if (ext.rc_refcount == 1) {
 			error = xfs_refcount_delete(cur, &found_rec);
 			if (error)
@@ -1014,7 +1014,6 @@ xfs_refcount_adjust_extents(
 				error = -EFSCORRUPTED;
 				goto out_error;
 			}
-			cur->bc_ag.refc.nr_ops++;
 			goto advloop;
 		} else {
 			fsbno = XFS_AGB_TO_FSB(cur->bc_mp,
diff --git a/fs/xfs/libxfs/xfs_refcount.h b/fs/xfs/libxfs/xfs_refcount.h
index 9eb01edbd89d..e8b322de7f3d 100644
--- a/fs/xfs/libxfs/xfs_refcount.h
+++ b/fs/xfs/libxfs/xfs_refcount.h
@@ -67,14 +67,17 @@ extern int xfs_refcount_recover_cow_leftovers(struct xfs_mount *mp,
  * log (plus any key updates) so we'll conservatively assume 32 bytes
  * per record.  We must also leave space for btree splits on both ends
  * of the range and space for the CUD and a new CUI.
+ *
+ * Each EFI that we attach to the transaction is assumed to consume ~32 bytes.
+ * This is a low estimate for an EFI tracking a single extent (16 bytes for the
+ * EFI header, 16 for the extent, and 12 for the xlog op header), but the
+ * estimate is acceptable if there's more than one extent being freed.
+ * In the worst case of freeing every other block during a refcount decrease
+ * operation, we amortize the space used for one EFI log item across 16
+ * extents.
  */
 #define XFS_REFCOUNT_ITEM_OVERHEAD	32

-static inline xfs_fileoff_t xfs_refcount_max_unmap(int log_res)
-{
-	return (log_res * 3 / 4) / XFS_REFCOUNT_ITEM_OVERHEAD;
-}
-
 extern int xfs_refcount_has_record(struct xfs_btree_cur *cur,
 		xfs_agblock_t bno, xfs_extlen_t len, bool *exists);
 union xfs_btree_rec;

From patchwork Wed Apr 27 00:52:03 2022
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 12828136
Subject: [PATCH 2/9] xfs: stop artificially limiting the length of bunmap calls
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: Dave Chinner , linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:52:03 -0700
Message-ID: <165102072359.3922658.10110553526152188988.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

In commit e1a4e37cc7b6, we clamped the length of bunmapi calls on the data forks of shared files to avoid two failure scenarios: one where the extent being unmapped is so sparsely shared that we exceed the transaction reservation with the sheer number of refcount btree updates and EFI intent items; and the other where we attach so many deferred updates to the transaction that we pin the log tail and later the log head meets the tail, causing the log to livelock.

We avoid triggering the first problem by tracking the number of ops in the refcount btree cursor and forcing a requeue of the refcount intent item any time we think that we might be close to overflowing.  This has been baked into XFS since before the original e1a4 patch.

A recent patchset fixed the second problem by changing the deferred ops code to finish all the work items created by each round of trying to complete a refcount intent item, which eliminates the long chains of deferred items (27dad); and by causing long-running transactions to relog their intent log items when space in the log gets low (74f4d).
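For a sense of how small the old clamp was, the limit came from the xfs_refcount_max_unmap() helper removed in the previous patch, which spends three quarters of the log reservation at 32 bytes per item.  The sketch below works that arithmetic through; the 128KiB reservation is an assumed example value chosen only because it reproduces the "~3000 blocks" cap mentioned earlier, and real reservations vary with filesystem geometry.

    /*
     * Illustrative sketch only: the clamp formula from the removed
     * xfs_refcount_max_unmap() helper.  The 128KiB log reservation below
     * is an assumed example value, not a number taken from a real mount.
     */
    #include <stdio.h>

    #define XFS_REFCOUNT_ITEM_OVERHEAD      32      /* bytes per logged item */

    static unsigned int max_unmap_blocks(unsigned int log_res)
    {
            /* Spend 3/4 of the reservation, divided by the per-item cost. */
            return (log_res * 3 / 4) / XFS_REFCOUNT_ITEM_OVERHEAD;
    }

    int main(void)
    {
            unsigned int log_res = 128 * 1024;      /* assumed ~128KiB reservation */

            printf("max unmap per bunmapi call: %u blocks\n",
                   max_unmap_blocks(log_res));      /* prints 3072 */
            return 0;
    }

With the clamp gone, a large unmap no longer has to be fed to bunmapi in a few thousand blocks at a time.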
Because this clamp affects /any/ unmapping request regardless of the sharing factors of the component blocks, it degrades the performance of all large unmapping requests -- whereas with an unshared file we can unmap millions of blocks in one go, shared files are limited to unmapping a few thousand blocks at a time, which causes the upper level code to spin in a bunmapi loop even if it wasn't needed.

This also eliminates one more place where log recovery behavior can differ from online behavior, because bunmapi operations no longer need to requeue.  The fstest generic/447 was created to test the old fix, and it still passes with this applied.

Partial-revert-of: e1a4e37cc7b6 ("xfs: try to avoid blowing out the transaction reservation when bunmaping a shared extent")
Depends: 27dada070d59 ("xfs: change the order in which child and parent defer ops are finished")
Depends: 74f4d6a1e065 ("xfs: only relog deferred intent items if free space in the log gets low")

Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
Reviewed-by: Christoph Hellwig
---
 fs/xfs/libxfs/xfs_bmap.c |   22 +---------------------
 1 file changed, 1 insertion(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 24462bdfd8e7..6833110d1bd4 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -5280,7 +5280,6 @@ __xfs_bunmapi(
 	int			whichfork;	/* data or attribute fork */
 	xfs_fsblock_t		sum;
 	xfs_filblks_t		len = *rlen;	/* length to unmap in file */
-	xfs_fileoff_t		max_len;
 	xfs_fileoff_t		end;
 	struct xfs_iext_cursor	icur;
 	bool			done = false;
@@ -5299,16 +5298,6 @@ __xfs_bunmapi(
 	ASSERT(len > 0);
 	ASSERT(nexts >= 0);

-	/*
-	 * Guesstimate how many blocks we can unmap without running the risk of
-	 * blowing out the transaction with a mix of EFIs and reflink
-	 * adjustments.
-	 */
-	if (tp && xfs_is_reflink_inode(ip) && whichfork == XFS_DATA_FORK)
-		max_len = min(len, xfs_refcount_max_unmap(tp->t_log_res));
-	else
-		max_len = len;
-
 	error = xfs_iread_extents(tp, ip, whichfork);
 	if (error)
 		return error;
@@ -5347,7 +5336,7 @@ __xfs_bunmapi(

 	extno = 0;
 	while (end != (xfs_fileoff_t)-1 && end >= start &&
-	       (nexts == 0 || extno < nexts) && max_len > 0) {
+	       (nexts == 0 || extno < nexts)) {
 		/*
 		 * Is the found extent after a hole in which end lives?
 		 * Just back up to the previous extent, if so.
@@ -5381,14 +5370,6 @@ __xfs_bunmapi(
 		if (del.br_startoff + del.br_blockcount > end + 1)
 			del.br_blockcount = end + 1 - del.br_startoff;

-		/* How much can we safely unmap? */
-		if (max_len < del.br_blockcount) {
-			del.br_startoff += del.br_blockcount - max_len;
-			if (!wasdel)
-				del.br_startblock += del.br_blockcount - max_len;
-			del.br_blockcount = max_len;
-		}
-
 		if (!isrt)
 			goto delete;

@@ -5524,7 +5505,6 @@ __xfs_bunmapi(
 		if (error)
 			goto error0;

-		max_len -= del.br_blockcount;
 		end = del.br_startoff - 1;
 nodelete:
 		/*

From patchwork Wed Apr 27 00:52:09 2022
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 12828137
Subject: [PATCH 3/9] xfs: remove a __xfs_bunmapi call from reflink
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: Dave Chinner , Christoph Hellwig , linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:52:09 -0700
Message-ID: <165102072920.3922658.13220953312201833980.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

This raw call isn't necessary since we can always remove a full delalloc extent.

Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
Reviewed-by: Christoph Hellwig
---
 fs/xfs/xfs_reflink.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 1ae6d3434ad2..960917628a44 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1138,7 +1138,7 @@ xfs_reflink_remap_extent(
 		xfs_refcount_decrease_extent(tp, &smap);
 		qdelta -= smap.br_blockcount;
 	} else if (smap.br_startblock == DELAYSTARTBLOCK) {
-		xfs_filblks_t	len = smap.br_blockcount;
+		int		done;

 		/*
 		 * If the extent we're unmapping is a delalloc reservation,
 		 * we can use the regular bunmapi function to release the
 		 * incore state.  Dropping the delalloc reservation takes care
 		 * of the quota reservation for us.
*/ - error = __xfs_bunmapi(NULL, ip, smap.br_startoff, &len, 0, 1); + error = xfs_bunmapi(NULL, ip, smap.br_startoff, + smap.br_blockcount, 0, 1, &done); if (error) goto out_cancel; - ASSERT(len == 0); + ASSERT(done); } /* From patchwork Wed Apr 27 00:52:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12828138 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA7FCC433EF for ; Wed, 27 Apr 2022 00:52:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1356530AbiD0Az0 (ORCPT ); Tue, 26 Apr 2022 20:55:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1356539AbiD0AzZ (ORCPT ); Tue, 26 Apr 2022 20:55:25 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FC1013DC5 for ; Tue, 26 Apr 2022 17:52:16 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DE05161AED for ; Wed, 27 Apr 2022 00:52:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 409CBC385A0; Wed, 27 Apr 2022 00:52:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651020735; bh=MvO7w7GVO1gufFwt2917uH+tOdPJTrkraMOaMWZjVhE=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=g2vgqZTHm8Ji7/F+2vRzmQlsnDEBAm/EGvTPtvAd5amRv9W54obEDdMb+ufhshym6 j49kIZJj7godsdDex6xAmWGtIiR8jYdiAhtM/fW4UpqszRQ7WY0n1BqOm05SYmwB1E B5c3vZW3BkDxbkIQ8bE1p+StSONknAE+ufDPnbHBSvU3Q4Vm9uuFbXJSgzApIhIh/f lxYe6X+2HwfiP/bAlkO9cRbskcVv2v2wgT/kkfNNiTq04Bmemivcza+FWj7x5JYLdj 3zxNC+C9CIkqet+upQMrwXOuwFdthyW16nDNSZ0BuLfoBNCTxYp4y/F9nd0U7eHfur cpLm5e0MaIJkw== Subject: [PATCH 4/9] xfs: create shadow transaction reservations for computing minimum log size From: "Darrick J. Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Tue, 26 Apr 2022 17:52:14 -0700 Message-ID: <165102073482.3922658.3874181264513799865.stgit@magnolia> In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia> References: <165102071223.3922658.5241787533081256670.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Every time someone changes the transaction reservation sizes, they introduce potential compatibility problems if the changes affect the minimum log size that we validate at mount time. If the minimum log size gets larger (which should be avoided because doing so presents a serious risk of log livelock), filesystems created with old mkfs will not mount on a newer kernel; if the minimum size shrinks, filesystems created with newer mkfs will not mount on older kernels. Therefore, enable the creation of a shadow log reservation structure where we can "undo" the effects of tweaks when computing minimum log sizes. These shadow reservations should never be used in practice, but they insulate us from perturbations in minimum log size. Signed-off-by: Darrick J. 
Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_log_rlimit.c | 15 +++++++++++---- fs/xfs/xfs_trace.h | 12 ++++++++++-- 2 files changed, 21 insertions(+), 6 deletions(-) diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c index 67798ff5e14e..4d04568ab07e 100644 --- a/fs/xfs/libxfs/xfs_log_rlimit.c +++ b/fs/xfs/libxfs/xfs_log_rlimit.c @@ -14,6 +14,7 @@ #include "xfs_trans_space.h" #include "xfs_da_btree.h" #include "xfs_bmap_btree.h" +#include "xfs_trace.h" /* * Calculate the maximum length in bytes that would be required for a local @@ -46,19 +47,25 @@ xfs_log_get_max_trans_res( struct xfs_mount *mp, struct xfs_trans_res *max_resp) { + struct xfs_trans_resv resv; struct xfs_trans_res *resp; struct xfs_trans_res *end_resp; + unsigned int i; int log_space = 0; int attr_space; attr_space = xfs_log_calc_max_attrsetm_res(mp); - resp = (struct xfs_trans_res *)M_RES(mp); - end_resp = (struct xfs_trans_res *)(M_RES(mp) + 1); - for (; resp < end_resp; resp++) { + memcpy(&resv, M_RES(mp), sizeof(struct xfs_trans_resv)); + + resp = (struct xfs_trans_res *)&resv; + end_resp = (struct xfs_trans_res *)(&resv + 1); + for (i = 0; resp < end_resp; i++, resp++) { int tmp = resp->tr_logcount > 1 ? resp->tr_logres * resp->tr_logcount : resp->tr_logres; + + trace_xfs_trans_resv_calc_minlogsize(mp, i, resp); if (log_space < tmp) { log_space = tmp; *max_resp = *resp; /* struct copy */ @@ -66,7 +73,7 @@ xfs_log_get_max_trans_res( } if (attr_space > log_space) { - *max_resp = M_RES(mp)->tr_attrsetm; /* struct copy */ + *max_resp = resv.tr_attrsetm; /* struct copy */ max_resp->tr_logres = attr_space; } } diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 91b916e82364..9110bb5dd866 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3500,7 +3500,7 @@ DEFINE_GETFSMAP_EVENT(xfs_getfsmap_low_key); DEFINE_GETFSMAP_EVENT(xfs_getfsmap_high_key); DEFINE_GETFSMAP_EVENT(xfs_getfsmap_mapping); -TRACE_EVENT(xfs_trans_resv_calc, +DECLARE_EVENT_CLASS(xfs_trans_resv_class, TP_PROTO(struct xfs_mount *mp, unsigned int type, struct xfs_trans_res *res), TP_ARGS(mp, type, res), @@ -3524,7 +3524,15 @@ TRACE_EVENT(xfs_trans_resv_calc, __entry->logres, __entry->logcount, __entry->logflags) -); +) + +#define DEFINE_TRANS_RESV_EVENT(name) \ +DEFINE_EVENT(xfs_trans_resv_class, name, \ + TP_PROTO(struct xfs_mount *mp, unsigned int type, \ + struct xfs_trans_res *res), \ + TP_ARGS(mp, type, res)) +DEFINE_TRANS_RESV_EVENT(xfs_trans_resv_calc); +DEFINE_TRANS_RESV_EVENT(xfs_trans_resv_calc_minlogsize); DECLARE_EVENT_CLASS(xfs_trans_class, TP_PROTO(struct xfs_trans *tp, unsigned long caller_ip), From patchwork Wed Apr 27 00:52:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 12828139
Subject: [PATCH 5/9] xfs: report "max_resp" used for min log size computation
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:52:20 -0700
Message-ID: <165102074043.3922658.16450368163586208910.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

Move the tracepoint that computes the size of the transaction used to compute the minimum log size into xfs_log_get_max_trans_res so that we only have to compute this stuff once.

Leave xfs_log_get_max_trans_res as a non-static function so that xfs_db can call it to report the results of the userspace computation of the same value to diagnose mkfs/kernel misinteractions.

Signed-off-by: Darrick J.
Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_log_rlimit.c | 1 + fs/xfs/xfs_trace.h | 19 +++++++++++++++++++ fs/xfs/xfs_trans.c | 3 --- 3 files changed, 20 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c index 4d04568ab07e..1db27c3a1d16 100644 --- a/fs/xfs/libxfs/xfs_log_rlimit.c +++ b/fs/xfs/libxfs/xfs_log_rlimit.c @@ -76,6 +76,7 @@ xfs_log_get_max_trans_res( *max_resp = resv.tr_attrsetm; /* struct copy */ max_resp->tr_logres = attr_space; } + trace_xfs_log_get_max_trans_res(mp, max_resp); } /* diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 9110bb5dd866..a690987cc5f0 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3534,6 +3534,25 @@ DEFINE_EVENT(xfs_trans_resv_class, name, \ DEFINE_TRANS_RESV_EVENT(xfs_trans_resv_calc); DEFINE_TRANS_RESV_EVENT(xfs_trans_resv_calc_minlogsize); +TRACE_EVENT(xfs_log_get_max_trans_res, + TP_PROTO(struct xfs_mount *mp, const struct xfs_trans_res *res), + TP_ARGS(mp, res), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(uint, logres) + __field(int, logcount) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->logres = res->tr_logres; + __entry->logcount = res->tr_logcount; + ), + TP_printk("dev %d:%d logres %u logcount %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->logres, + __entry->logcount) +); + DECLARE_EVENT_CLASS(xfs_trans_class, TP_PROTO(struct xfs_trans *tp, unsigned long caller_ip), TP_ARGS(tp, caller_ip), diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index 836ce2beac53..82cf0189c0db 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -32,7 +32,6 @@ static void xfs_trans_trace_reservations( struct xfs_mount *mp) { - struct xfs_trans_res resv; struct xfs_trans_res *res; struct xfs_trans_res *end_res; int i; @@ -41,8 +40,6 @@ xfs_trans_trace_reservations( end_res = (struct xfs_trans_res *)(M_RES(mp) + 1); for (i = 0; res < end_res; i++, res++) trace_xfs_trans_resv_calc(mp, i, res); - xfs_log_get_max_trans_res(mp, &resv); - trace_xfs_trans_resv_calc(mp, -1, &resv); } #else # define xfs_trans_trace_reservations(mp) From patchwork Wed Apr 27 00:52:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 12828140 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CFE8C433EF for ; Wed, 27 Apr 2022 00:52:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1356547AbiD0Azh (ORCPT ); Tue, 26 Apr 2022 20:55:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1356541AbiD0Azh (ORCPT ); Tue, 26 Apr 2022 20:55:37 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B059D13DC5 for ; Tue, 26 Apr 2022 17:52:27 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3DE0B61B03 for ; Wed, 27 Apr 2022 00:52:27 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 931BCC385A4; Wed, 27 Apr 2022 00:52:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651020746; bh=ByoyYjVEGSAEk9m77UCxTXPfxLjkE/qZxecC2NTp/js=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=hhaLIvGc5YKmPC7bgdeqLipjqvDJlcNH1x4Hc4H/jjITSICk3DjWZxgIF130zW+V7 LIoK0FXksX/yLuzmVlX8101tx9sW8l0AyLQm2BioPw/Yn7MWA9dWjIPl83nQNM9ypr vhQygfaxKFUTMQXp439On4bZGMQLBskhW7CylK2X7GAOIfr/TCv5qRyClhPhDDOn6g EAXtPFuwg0GKsUdHo9swRlacG8EIoSgcZTLppW7z3QKoECDe7jAK94TEm2oPVCukqs 8kg3cCNXsqGMcpUvv6HtUpKwn08N1IAGHna1H4SiNiItCvvxC+2PyAOeHgJdFntTme ngWQrErm9rdEg== Subject: [PATCH 6/9] xfs: reduce the absurdly large log operation count From: "Darrick J. Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Tue, 26 Apr 2022 17:52:26 -0700 Message-ID: <165102074605.3922658.2732417123514234429.stgit@magnolia> In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia> References: <165102071223.3922658.5241787533081256670.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Back in the early days of reflink and rmap development I set the transaction reservation sizes to be overly generous for rmap+reflink filesystems, and a little under-generous for rmap-only filesystems. Since we don't need *eight* transaction rolls to handle three new log intent items, decrease the logcounts to what we actually need, and amend the shadow reservation computation function to reflect what we used to do so that the minimum log size doesn't change. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_log_rlimit.c | 51 ++++++++++++++++++++++++++++++++++++++-- fs/xfs/libxfs/xfs_trans_resv.c | 46 +++++++++++++++--------------------- fs/xfs/libxfs/xfs_trans_resv.h | 10 ++++++-- 3 files changed, 76 insertions(+), 31 deletions(-) diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c index 1db27c3a1d16..60fff8c6716f 100644 --- a/fs/xfs/libxfs/xfs_log_rlimit.c +++ b/fs/xfs/libxfs/xfs_log_rlimit.c @@ -37,6 +37,53 @@ xfs_log_calc_max_attrsetm_res( M_RES(mp)->tr_attrsetrt.tr_logres * nblks; } +/* + * Compute an alternate set of log reservation sizes for use exclusively with + * minimum log size calculations. 
+ */ +static void +xfs_log_calc_trans_resv_for_minlogblocks( + struct xfs_mount *mp, + struct xfs_trans_resv *resv) +{ + unsigned int rmap_maxlevels = mp->m_rmap_maxlevels; + + /* + * In the early days of rmap+reflink, we always set the rmap maxlevels + * to 9 even if the AG was small enough that it would never grow to + * that height. Transaction reservation sizes influence the minimum + * log size calculation, which influences the size of the log that mkfs + * creates. Use the old value here to ensure that newly formatted + * small filesystems will mount on older kernels. + */ + if (xfs_has_rmapbt(mp) && xfs_has_reflink(mp)) + mp->m_rmap_maxlevels = XFS_OLD_REFLINK_RMAP_MAXLEVELS; + + xfs_trans_resv_calc(mp, resv); + + if (xfs_has_reflink(mp)) { + /* + * In the early days of reflink, typical log operation counts + * were greatly overestimated. + */ + resv->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK; + resv->tr_itruncate.tr_logcount = + XFS_ITRUNCATE_LOG_COUNT_REFLINK; + resv->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK; + } else if (xfs_has_rmapbt(mp)) { + /* + * In the early days of non-reflink rmap, the impact of rmapbt + * updates on log counts were not taken into account at all. + */ + resv->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; + resv->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; + resv->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; + } + + /* Put everything back the way it was. This goes at the end. */ + mp->m_rmap_maxlevels = rmap_maxlevels; +} + /* * Iterate over the log space reservation table to figure out and return * the maximum one in terms of the pre-calculated values which were done @@ -47,7 +94,7 @@ xfs_log_get_max_trans_res( struct xfs_mount *mp, struct xfs_trans_res *max_resp) { - struct xfs_trans_resv resv; + struct xfs_trans_resv resv = {}; struct xfs_trans_res *resp; struct xfs_trans_res *end_resp; unsigned int i; @@ -56,7 +103,7 @@ xfs_log_get_max_trans_res( attr_space = xfs_log_calc_max_attrsetm_res(mp); - memcpy(&resv, M_RES(mp), sizeof(struct xfs_trans_resv)); + xfs_log_calc_trans_resv_for_minlogblocks(mp, &resv); resp = (struct xfs_trans_res *)&resv; end_resp = (struct xfs_trans_res *)(&resv + 1); diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 8e1d09e8cc9a..60be82cd491b 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -815,36 +815,18 @@ xfs_trans_resv_calc( struct xfs_mount *mp, struct xfs_trans_resv *resp) { - unsigned int rmap_maxlevels = mp->m_rmap_maxlevels; - - /* - * In the early days of rmap+reflink, we always set the rmap maxlevels - * to 9 even if the AG was small enough that it would never grow to - * that height. Transaction reservation sizes influence the minimum - * log size calculation, which influences the size of the log that mkfs - * creates. Use the old value here to ensure that newly formatted - * small filesystems will mount on older kernels. - */ - if (xfs_has_rmapbt(mp) && xfs_has_reflink(mp)) - mp->m_rmap_maxlevels = XFS_OLD_REFLINK_RMAP_MAXLEVELS; + int logcount_adj = 0; /* * The following transactions are logged in physical format and * require a permanent reservation on space. 
*/ resp->tr_write.tr_logres = xfs_calc_write_reservation(mp); - if (xfs_has_reflink(mp)) - resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK; - else - resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; + resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES; resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp); - if (xfs_has_reflink(mp)) - resp->tr_itruncate.tr_logcount = - XFS_ITRUNCATE_LOG_COUNT_REFLINK; - else - resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; + resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES; resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp); @@ -901,10 +883,7 @@ xfs_trans_resv_calc( resp->tr_growrtalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES; resp->tr_qm_dqalloc.tr_logres = xfs_calc_qm_dqalloc_reservation(mp); - if (xfs_has_reflink(mp)) - resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK; - else - resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; + resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES; /* @@ -931,6 +910,19 @@ xfs_trans_resv_calc( resp->tr_growrtzero.tr_logres = xfs_calc_growrtzero_reservation(mp); resp->tr_growrtfree.tr_logres = xfs_calc_growrtfree_reservation(mp); - /* Put everything back the way it was. This goes at the end. */ - mp->m_rmap_maxlevels = rmap_maxlevels; + /* + * Add one logcount for BUI items that appear with rmap or reflink, + * one logcount for refcount intent items, and one logcount for rmap + * intent items. + */ + if (xfs_has_reflink(mp) || xfs_has_rmapbt(mp)) + logcount_adj++; + if (xfs_has_reflink(mp)) + logcount_adj++; + if (xfs_has_rmapbt(mp)) + logcount_adj++; + + resp->tr_itruncate.tr_logcount += logcount_adj; + resp->tr_write.tr_logcount += logcount_adj; + resp->tr_qm_dqalloc.tr_logcount += logcount_adj; } diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h index fc4e9b369a3a..fa330e646dc5 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.h +++ b/fs/xfs/libxfs/xfs_trans_resv.h @@ -73,7 +73,6 @@ struct xfs_trans_resv { #define XFS_DEFAULT_LOG_COUNT 1 #define XFS_DEFAULT_PERM_LOG_COUNT 2 #define XFS_ITRUNCATE_LOG_COUNT 2 -#define XFS_ITRUNCATE_LOG_COUNT_REFLINK 8 #define XFS_INACTIVE_LOG_COUNT 2 #define XFS_CREATE_LOG_COUNT 2 #define XFS_CREATE_TMPFILE_LOG_COUNT 2 @@ -83,12 +82,19 @@ struct xfs_trans_resv { #define XFS_LINK_LOG_COUNT 2 #define XFS_RENAME_LOG_COUNT 2 #define XFS_WRITE_LOG_COUNT 2 -#define XFS_WRITE_LOG_COUNT_REFLINK 8 #define XFS_ADDAFORK_LOG_COUNT 2 #define XFS_ATTRINVAL_LOG_COUNT 1 #define XFS_ATTRSET_LOG_COUNT 3 #define XFS_ATTRRM_LOG_COUNT 3 +/* + * Original log operation counts were overestimated in the early days of + * reflink. These are retained here purely for minimum log size calculations + * and must not be used for runtime reservations. + */ +#define XFS_ITRUNCATE_LOG_COUNT_REFLINK 8 +#define XFS_WRITE_LOG_COUNT_REFLINK 8 + void xfs_trans_resv_calc(struct xfs_mount *mp, struct xfs_trans_resv *resp); uint xfs_allocfree_log_count(struct xfs_mount *mp, uint num_ops); From patchwork Wed Apr 27 00:52:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 12828141
Subject: [PATCH 7/9] xfs: reduce transaction reservations with reflink
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:52:31 -0700
Message-ID: <165102075178.3922658.11792444708694506676.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

Before the introduction of deferred refcount operations, reflink would try to cram refcount btree updates into the same transaction as an allocation or a free event.  Mainline XFS has never actually done that, but we never refactored the transaction reservations to reflect that we now do all refcount updates in separate transactions.  Fix this to reduce the transaction reservation size even further, so that between this patch and the previous one, we reduce the tr_write and tr_itruncate sizes by 66%.

Signed-off-by: Darrick J.
Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_log_rlimit.c | 12 ++++ fs/xfs/libxfs/xfs_refcount.c | 9 ++- fs/xfs/libxfs/xfs_trans_resv.c | 130 +++++++++++++++++++++++++++++++++++----- fs/xfs/libxfs/xfs_trans_resv.h | 4 + 4 files changed, 138 insertions(+), 17 deletions(-) diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c index 60fff8c6716f..9975b93a7412 100644 --- a/fs/xfs/libxfs/xfs_log_rlimit.c +++ b/fs/xfs/libxfs/xfs_log_rlimit.c @@ -80,6 +80,18 @@ xfs_log_calc_trans_resv_for_minlogblocks( resv->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; } + /* + * In the early days of reflink, we did not use deferred refcount + * update log items, so log reservations must be recomputed using the + * old calculations. + */ + resv->tr_write.tr_logres = + xfs_calc_write_reservation_minlogsize(mp); + resv->tr_itruncate.tr_logres = + xfs_calc_itruncate_reservation_minlogsize(mp); + resv->tr_qm_dqalloc.tr_logres = + xfs_calc_qm_dqalloc_reservation_minlogsize(mp); + /* Put everything back the way it was. This goes at the end. */ mp->m_rmap_maxlevels = rmap_maxlevels; } diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index a07ebaecba73..e53544d52ee2 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -886,8 +886,13 @@ xfs_refcount_still_have_space( { unsigned long overhead; - overhead = cur->bc_ag.refc.shape_changes * - xfs_allocfree_log_count(cur->bc_mp, 1); + /* + * Worst case estimate: full splits of the free space and rmap btrees + * to handle each of the shape changes to the refcount btree. + */ + overhead = xfs_allocfree_log_count(cur->bc_mp, + cur->bc_ag.refc.shape_changes); + overhead += cur->bc_mp->m_refc_maxlevels; overhead *= cur->bc_mp->m_sb.sb_blocksize; /* diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 60be82cd491b..ab688929d884 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -56,8 +56,7 @@ xfs_calc_buf_res( * Per-extent log reservation for the btree changes involved in freeing or * allocating an extent. In classic XFS there were two trees that will be * modified (bnobt + cntbt). With rmap enabled, there are three trees - * (rmapbt). With reflink, there are four trees (refcountbt). The number of - * blocks reserved is based on the formula: + * (rmapbt). The number of blocks reserved is based on the formula: * * num trees * ((2 blocks/level * max depth) - 1) * @@ -73,12 +72,23 @@ xfs_allocfree_log_count( blocks = num_ops * 2 * (2 * mp->m_alloc_maxlevels - 1); if (xfs_has_rmapbt(mp)) blocks += num_ops * (2 * mp->m_rmap_maxlevels - 1); - if (xfs_has_reflink(mp)) - blocks += num_ops * (2 * mp->m_refc_maxlevels - 1); return blocks; } +/* + * Per-extent log reservation for refcount btree changes. These are never done + * in the same transaction as an allocation or a free, so we compute them + * separately. + */ +static unsigned int +xfs_refcountbt_block_count( + struct xfs_mount *mp, + unsigned int num_ops) +{ + return num_ops * (2 * mp->m_refc_maxlevels - 1); +} + /* * Logging inodes is really tricksy. They are logged in memory format, * which means that what we write into the log doesn't directly translate into @@ -233,6 +243,28 @@ xfs_rtalloc_log_count( * register overflow from temporaries in the calculations. */ +/* + * Compute the log reservation required to handle the refcount update + * transaction. Refcount updates are always done via deferred log items. 
+ * + * This is calculated as: + * Data device refcount updates (t1): + * the agfs of the ags containing the blocks: nr_ops * sector size + * the refcount btrees: nr_ops * 1 trees * (2 * max depth - 1) * block size + */ +static unsigned int +xfs_calc_refcountbt_reservation( + struct xfs_mount *mp, + unsigned int nr_ops) +{ + unsigned int blksz = XFS_FSB_TO_B(mp, 1); + + if (!xfs_has_reflink(mp)) + return 0; + + return xfs_calc_buf_res(nr_ops, mp->m_sb.sb_sectsize) + + xfs_calc_buf_res(xfs_refcountbt_block_count(mp, nr_ops), blksz); +} /* * In a write transaction we can allocate a maximum of 2 @@ -255,12 +287,14 @@ xfs_rtalloc_log_count( * the agfls of the ags containing the blocks: 2 * sector size * the super block free block counter: sector size * the allocation btrees: 2 exts * 2 trees * (2 * max depth - 1) * block size + * And any refcount updates that happen in a separate transaction (t4). */ STATIC uint xfs_calc_write_reservation( - struct xfs_mount *mp) + struct xfs_mount *mp, + bool for_minlogsize) { - unsigned int t1, t2, t3; + unsigned int t1, t2, t3, t4; unsigned int blksz = XFS_FSB_TO_B(mp, 1); t1 = xfs_calc_inode_res(mp, 1) + @@ -282,7 +316,36 @@ xfs_calc_write_reservation( t3 = xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); - return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + /* + * In the early days of reflink, we included enough reservation to log + * two refcountbt splits for each transaction. The codebase runs + * refcountbt updates in separate transactions now, so to compute the + * minimum log size, add the refcountbtree splits back to t1 and t3 and + * do not account them separately as t4. Reflink did not support + * realtime when the reservations were established, so no adjustment to + * t2 is needed. + */ + if (for_minlogsize) { + unsigned int adj = 0; + + if (xfs_has_reflink(mp)) + adj = xfs_calc_buf_res( + xfs_refcountbt_block_count(mp, 2), + blksz); + t1 += adj; + t3 += adj; + return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + } + + t4 = xfs_calc_refcountbt_reservation(mp, 1); + return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3)); +} + +unsigned int +xfs_calc_write_reservation_minlogsize( + struct xfs_mount *mp) +{ + return xfs_calc_write_reservation(mp, true); } /* @@ -304,12 +367,14 @@ xfs_calc_write_reservation( * the realtime summary: 2 exts * 1 block * worst case split in allocation btrees per extent assuming 2 extents: * 2 exts * 2 trees * (2 * max depth - 1) * block size + * And any refcount updates that happen in a separate transaction (t4). */ STATIC uint xfs_calc_itruncate_reservation( - struct xfs_mount *mp) + struct xfs_mount *mp, + bool for_minlogsize) { - unsigned int t1, t2, t3; + unsigned int t1, t2, t3, t4; unsigned int blksz = XFS_FSB_TO_B(mp, 1); t1 = xfs_calc_inode_res(mp, 1) + @@ -326,7 +391,33 @@ xfs_calc_itruncate_reservation( t3 = 0; } - return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + /* + * In the early days of reflink, we included enough reservation to log + * four refcountbt splits in the same transaction as bnobt/cntbt + * updates. The codebase runs refcountbt updates in separate + * transactions now, so to compute the minimum log size, add the + * refcount btree splits back here and do not compute them separately + * as t4. Reflink did not support realtime when the reservations were + * established, so do not adjust t3. 
+ */ + if (for_minlogsize) { + if (xfs_has_reflink(mp)) + t2 += xfs_calc_buf_res( + xfs_refcountbt_block_count(mp, 4), + blksz); + + return XFS_DQUOT_LOGRES(mp) + max3(t1, t2, t3); + } + + t4 = xfs_calc_refcountbt_reservation(mp, 2); + return XFS_DQUOT_LOGRES(mp) + max(t4, max3(t1, t2, t3)); +} + +unsigned int +xfs_calc_itruncate_reservation_minlogsize( + struct xfs_mount *mp) +{ + return xfs_calc_itruncate_reservation(mp, true); } /* @@ -792,13 +883,21 @@ xfs_calc_qm_setqlim_reservation(void) */ STATIC uint xfs_calc_qm_dqalloc_reservation( - struct xfs_mount *mp) + struct xfs_mount *mp, + bool for_minlogsize) { - return xfs_calc_write_reservation(mp) + + return xfs_calc_write_reservation(mp, for_minlogsize) + xfs_calc_buf_res(1, XFS_FSB_TO_B(mp, XFS_DQUOT_CLUSTER_SIZE_FSB) - 1); } +unsigned int +xfs_calc_qm_dqalloc_reservation_minlogsize( + struct xfs_mount *mp) +{ + return xfs_calc_qm_dqalloc_reservation(mp, true); +} + /* * Syncing the incore super block changes to disk. * the super block to reflect the changes: sector size @@ -821,11 +920,11 @@ xfs_trans_resv_calc( * The following transactions are logged in physical format and * require a permanent reservation on space. */ - resp->tr_write.tr_logres = xfs_calc_write_reservation(mp); + resp->tr_write.tr_logres = xfs_calc_write_reservation(mp, false); resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT; resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES; - resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp); + resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp, false); resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT; resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES; @@ -882,7 +981,8 @@ xfs_trans_resv_calc( resp->tr_growrtalloc.tr_logcount = XFS_DEFAULT_PERM_LOG_COUNT; resp->tr_growrtalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES; - resp->tr_qm_dqalloc.tr_logres = xfs_calc_qm_dqalloc_reservation(mp); + resp->tr_qm_dqalloc.tr_logres = xfs_calc_qm_dqalloc_reservation(mp, + false); resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT; resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES; diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h index fa330e646dc5..22b99042127a 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.h +++ b/fs/xfs/libxfs/xfs_trans_resv.h @@ -98,4 +98,8 @@ struct xfs_trans_resv { void xfs_trans_resv_calc(struct xfs_mount *mp, struct xfs_trans_resv *resp); uint xfs_allocfree_log_count(struct xfs_mount *mp, uint num_ops); +unsigned int xfs_calc_itruncate_reservation_minlogsize(struct xfs_mount *mp); +unsigned int xfs_calc_write_reservation_minlogsize(struct xfs_mount *mp); +unsigned int xfs_calc_qm_dqalloc_reservation_minlogsize(struct xfs_mount *mp); + #endif /* __XFS_TRANS_RESV_H__ */ From patchwork Wed Apr 27 00:52:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 12828142
Subject: [PATCH 8/9] xfs: rewrite xfs_reflink_end_cow to use intents
From: "Darrick J. Wong"
To: djwong@kernel.org, david@fromorbit.com
Cc: Dave Chinner , linux-xfs@vger.kernel.org
Date: Tue, 26 Apr 2022 17:52:37 -0700
Message-ID: <165102075737.3922658.2470136533458678121.stgit@magnolia>
In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia>

From: Darrick J. Wong

Currently, the code that performs CoW remapping after a write has this odd behavior where it walks /backwards/ through the data fork to remap extents in reverse order.

Earlier, we rewrote the reflink remap function to use deferred bmap log items instead of trying to cram as much into the first transaction that we could.  Now do the same for the CoW remap code.  There doesn't seem to be any performance impact; we're just making better use of code that we added for the benefit of reflink.

Signed-off-by: Darrick J.
Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/xfs_reflink.c | 88 ++++++++++++++++++++++++++++++++------------------ fs/xfs/xfs_trace.h | 3 +- 2 files changed, 58 insertions(+), 33 deletions(-) diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 960917628a44..e7a7c00d93be 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -586,21 +586,21 @@ xfs_reflink_cancel_cow_range( STATIC int xfs_reflink_end_cow_extent( struct xfs_inode *ip, - xfs_fileoff_t offset_fsb, - xfs_fileoff_t *end_fsb) + xfs_fileoff_t *offset_fsb, + xfs_fileoff_t end_fsb) { - struct xfs_bmbt_irec got, del; struct xfs_iext_cursor icur; + struct xfs_bmbt_irec got, del, data; struct xfs_mount *mp = ip->i_mount; struct xfs_trans *tp; struct xfs_ifork *ifp = XFS_IFORK_PTR(ip, XFS_COW_FORK); - xfs_filblks_t rlen; unsigned int resblks; + int nmaps; int error; /* No COW extents? That's easy! */ if (ifp->if_bytes == 0) { - *end_fsb = offset_fsb; + *offset_fsb = end_fsb; return 0; } @@ -631,42 +631,66 @@ xfs_reflink_end_cow_extent( * left by the time I/O completes for the loser of the race. In that * case we are done. */ - if (!xfs_iext_lookup_extent_before(ip, ifp, end_fsb, &icur, &got) || - got.br_startoff + got.br_blockcount <= offset_fsb) { - *end_fsb = offset_fsb; + if (!xfs_iext_lookup_extent(ip, ifp, *offset_fsb, &icur, &got) || + got.br_startoff >= end_fsb) { + *offset_fsb = end_fsb; goto out_cancel; } /* - * Structure copy @got into @del, then trim @del to the range that we - * were asked to remap. We preserve @got for the eventual CoW fork + * Only remap real extents that contain data. With AIO, speculative + * preallocations can leak into the range we are called upon, and we + * need to skip them. Preserve @got for the eventual CoW fork * deletion; from now on @del represents the mapping that we're * actually remapping. */ + while (!xfs_bmap_is_written_extent(&got)) { + if (!xfs_iext_next_extent(ifp, &icur, &got) || + got.br_startoff >= end_fsb) { + *offset_fsb = end_fsb; + goto out_cancel; + } + } del = got; - xfs_trim_extent(&del, offset_fsb, *end_fsb - offset_fsb); - ASSERT(del.br_blockcount > 0); - - /* - * Only remap real extents that contain data. With AIO, speculative - * preallocations can leak into the range we are called upon, and we - * need to skip them. - */ - if (!xfs_bmap_is_written_extent(&got)) { - *end_fsb = del.br_startoff; - goto out_cancel; - } - - /* Unmap the old blocks in the data fork. */ - rlen = del.br_blockcount; - error = __xfs_bunmapi(tp, ip, del.br_startoff, &rlen, 0, 1); + /* Grab the corresponding mapping in the data fork. */ + nmaps = 1; + error = xfs_bmapi_read(ip, del.br_startoff, del.br_blockcount, &data, + &nmaps, 0); if (error) goto out_cancel; - /* Trim the extent to whatever got unmapped. */ - xfs_trim_extent(&del, del.br_startoff + rlen, del.br_blockcount - rlen); - trace_xfs_reflink_cow_remap(ip, &del); + /* We can only remap the smaller of the two extent sizes. */ + data.br_blockcount = min(data.br_blockcount, del.br_blockcount); + del.br_blockcount = data.br_blockcount; + + trace_xfs_reflink_cow_remap_from(ip, &del); + trace_xfs_reflink_cow_remap_to(ip, &data); + + if (xfs_bmap_is_real_extent(&data)) { + /* + * If the extent we're remapping is backed by storage (written + * or not), unmap the extent and drop its refcount. 
+ */ + xfs_bmap_unmap_extent(tp, ip, &data); + xfs_refcount_decrease_extent(tp, &data); + xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, + -data.br_blockcount); + } else if (data.br_startblock == DELAYSTARTBLOCK) { + int done; + + /* + * If the extent we're remapping is a delalloc reservation, + * we can use the regular bunmapi function to release the + * incore state. Dropping the delalloc reservation takes care + * of the quota reservation for us. + */ + error = xfs_bunmapi(NULL, ip, data.br_startoff, + data.br_blockcount, 0, 1, &done); + if (error) + goto out_cancel; + ASSERT(done); + } /* Free the CoW orphan record. */ xfs_refcount_free_cow_extent(tp, del.br_startblock, del.br_blockcount); @@ -687,7 +711,7 @@ xfs_reflink_end_cow_extent( return error; /* Update the caller about how much progress we made. */ - *end_fsb = del.br_startoff; + *offset_fsb = del.br_startoff + del.br_blockcount; return 0; out_cancel: @@ -715,7 +739,7 @@ xfs_reflink_end_cow( end_fsb = XFS_B_TO_FSB(ip->i_mount, offset + count); /* - * Walk backwards until we're out of the I/O range. The loop function + * Walk forwards until we've remapped the I/O range. The loop function * repeatedly cycles the ILOCK to allocate one transaction per remapped * extent. * @@ -747,7 +771,7 @@ xfs_reflink_end_cow( * blocks will be remapped. */ while (end_fsb > offset_fsb && !error) - error = xfs_reflink_end_cow_extent(ip, offset_fsb, &end_fsb); + error = xfs_reflink_end_cow_extent(ip, &offset_fsb, end_fsb); if (error) trace_xfs_reflink_end_cow_error(ip, error, _RET_IP_); diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index a690987cc5f0..378f9dd1f66f 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -3405,7 +3405,8 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range); DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow); -DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap); +DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_from); +DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_to); DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error); DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error);
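To make the new control flow in patch 8/9 easier to follow, here is a small standalone sketch of the forward-walking remap loop it describes. This is not the kernel code: the types, the toy extent table, and the printf stand-in for "allocate one transaction and remap one extent" are all invented for illustration. Only the way *offset_fsb advances past each remapped extent, and the skipping of unwritten (speculative preallocation) extents, mirrors the behavior the patch describes.

/*
 * Toy model (not kernel code) of the forward-walking CoW remap loop.
 * Names and values are made up; only the control flow follows the patch.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t fileoff_t;

struct toy_extent {
	fileoff_t	startoff;	/* first file offset covered */
	fileoff_t	blockcount;	/* length in blocks */
	bool		written;	/* real, written data? */
};

/* A pretend CoW fork: three staged extents, one of them a preallocation. */
static const struct toy_extent cow_fork[] = {
	{ .startoff =  0, .blockcount = 4, .written = true  },
	{ .startoff =  4, .blockcount = 4, .written = false },	/* skip me */
	{ .startoff =  8, .blockcount = 8, .written = true  },
};

/*
 * Remap one written extent at or after *offset_fsb, then advance
 * *offset_fsb past it.  Returns false when nothing is left to do before
 * end_fsb, which is how the toy caller learns it has finished.
 */
static bool remap_one_extent(fileoff_t *offset_fsb, fileoff_t end_fsb)
{
	for (size_t i = 0; i < sizeof(cow_fork) / sizeof(cow_fork[0]); i++) {
		const struct toy_extent *got = &cow_fork[i];

		if (got->startoff + got->blockcount <= *offset_fsb)
			continue;		/* already behind us */
		if (got->startoff >= end_fsb)
			break;			/* past the I/O range */
		if (!got->written)
			continue;		/* skip speculative prealloc */

		/* "one transaction per remapped extent" would happen here */
		printf("remap [%llu, %llu)\n",
		       (unsigned long long)got->startoff,
		       (unsigned long long)(got->startoff + got->blockcount));
		*offset_fsb = got->startoff + got->blockcount;
		return true;
	}

	*offset_fsb = end_fsb;			/* nothing left to remap */
	return false;
}

int main(void)
{
	fileoff_t offset_fsb = 0, end_fsb = 16;

	/* Forward walk: keep going until the whole I/O range is covered. */
	while (end_fsb > offset_fsb && remap_one_extent(&offset_fsb, end_fsb))
		;
	return 0;
}

Compiled and run, this prints the two written extents in ascending order and silently skips the preallocated one, which is the forward-progress guarantee the rewritten loop relies on.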
Wong" X-Patchwork-Id: 12828143 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34CBAC433F5 for ; Wed, 27 Apr 2022 00:52:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1356525AbiD0Azy (ORCPT ); Tue, 26 Apr 2022 20:55:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54168 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1356518AbiD0Azx (ORCPT ); Tue, 26 Apr 2022 20:55:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D66D17E17 for ; Tue, 26 Apr 2022 17:52:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1901A61A5F for ; Wed, 27 Apr 2022 00:52:44 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 71B38C385A0; Wed, 27 Apr 2022 00:52:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1651020763; bh=GkmlHlBaRUnMRIrImsoc+Uu3ZDK0GTiOXw/DzwuLWLM=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=HRZiBS9FcioQJW7PIveaLEdziI9UqXO0uacwD2u0pu4R2dL2xctzdksdpmBkKcDvi JGiJpehQRsEutLV6Mp4QSlMUG+CwAduXbS8QcYGDnS6FcAdsDL0luC3147YoK6TIBZ 3JikXdFPuEfS29zx2DjTrnd93EesVTaloHF5byBhvkMYMOjVzCLG/zZC9CRf7HtVnb ZZ0+0ui6QjIIGTIE72GrjLnDpJywWVbya0ML8uvF2j8Xrf+P5wLVfEoWi/hnXfLJmq /KMyrQWDnlMGLI5Hjvpcx1ZFpX4j4lwK5I8w/gPKLUV2EGgB6f/rpwvf0r9oLgW5/r 9NpUki4BhVq4A== Subject: [PATCH 9/9] xfs: rename xfs_*alloc*_log_count to _block_count From: "Darrick J. Wong" To: djwong@kernel.org, david@fromorbit.com Cc: linux-xfs@vger.kernel.org Date: Tue, 26 Apr 2022 17:52:43 -0700 Message-ID: <165102076298.3922658.10000784148294313471.stgit@magnolia> In-Reply-To: <165102071223.3922658.5241787533081256670.stgit@magnolia> References: <165102071223.3922658.5241787533081256670.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong These functions return the maximum number of blocks that could be logged in a particular transaction. "log count" is confusing since there's a separate concept of a log (operation) count in the reservation code, so let's change it to "block count" to be less confusing. Signed-off-by: Darrick J. Wong Reviewed-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_refcount.c | 2 +- fs/xfs/libxfs/xfs_trans_resv.c | 38 +++++++++++++++++++------------------- fs/xfs/libxfs/xfs_trans_resv.h | 2 +- 3 files changed, 21 insertions(+), 21 deletions(-) diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c index e53544d52ee2..97e9e6020596 100644 --- a/fs/xfs/libxfs/xfs_refcount.c +++ b/fs/xfs/libxfs/xfs_refcount.c @@ -890,7 +890,7 @@ xfs_refcount_still_have_space( * Worst case estimate: full splits of the free space and rmap btrees * to handle each of the shape changes to the refcount btree. 
*/ - overhead = xfs_allocfree_log_count(cur->bc_mp, + overhead = xfs_allocfree_block_count(cur->bc_mp, cur->bc_ag.refc.shape_changes); overhead += cur->bc_mp->m_refc_maxlevels; overhead *= cur->bc_mp->m_sb.sb_blocksize; diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index ab688929d884..e9913c2c5a24 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -63,7 +63,7 @@ xfs_calc_buf_res( * Keep in mind that max depth is calculated separately for each type of tree. */ uint -xfs_allocfree_log_count( +xfs_allocfree_block_count( struct xfs_mount *mp, uint num_ops) { @@ -146,7 +146,7 @@ xfs_calc_inobt_res( { return xfs_calc_buf_res(M_IGEO(mp)->inobt_maxlevels, XFS_FSB_TO_B(mp, 1)) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)); } @@ -193,7 +193,7 @@ xfs_calc_inode_chunk_res( { uint res, size = 0; - res = xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + res = xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)); if (alloc) { /* icreate tx uses ordered buffers */ @@ -213,7 +213,7 @@ xfs_calc_inode_chunk_res( * extents, as well as the realtime summary block. */ static unsigned int -xfs_rtalloc_log_count( +xfs_rtalloc_block_count( struct xfs_mount *mp, unsigned int num_ops) { @@ -300,21 +300,21 @@ xfs_calc_write_reservation( t1 = xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), blksz) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 2), blksz); if (xfs_has_realtime(mp)) { t2 = xfs_calc_inode_res(mp, 1) + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), blksz) + xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_rtalloc_log_count(mp, 1), blksz) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), blksz); + xfs_calc_buf_res(xfs_rtalloc_block_count(mp, 1), blksz) + + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), blksz); } else { t2 = 0; } t3 = xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 2), blksz); /* * In the early days of reflink, we included enough reservation to log @@ -381,12 +381,12 @@ xfs_calc_itruncate_reservation( xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK) + 1, blksz); t2 = xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), blksz); + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 4), blksz); if (xfs_has_realtime(mp)) { t3 = xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_rtalloc_log_count(mp, 2), blksz) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), blksz); + xfs_calc_buf_res(xfs_rtalloc_block_count(mp, 2), blksz) + + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 2), blksz); } else { t3 = 0; } @@ -441,7 +441,7 @@ xfs_calc_rename_reservation( xfs_calc_buf_res(2 * XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(7, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 3), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 3), XFS_FSB_TO_B(mp, 1)))); } @@ -481,7 +481,7 @@ xfs_calc_link_reservation( xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)))); } @@ -519,7 +519,7 @@ 
xfs_calc_remove_reservation( xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(4, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 2), XFS_FSB_TO_B(mp, 1)))); } @@ -664,7 +664,7 @@ xfs_calc_growdata_reservation( struct xfs_mount *mp) { return xfs_calc_buf_res(3, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)); } @@ -686,7 +686,7 @@ xfs_calc_growrtalloc_reservation( xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), XFS_FSB_TO_B(mp, 1)) + xfs_calc_inode_res(mp, 1) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)); } @@ -762,7 +762,7 @@ xfs_calc_addafork_reservation( xfs_calc_buf_res(1, mp->m_dir_geo->blksize) + xfs_calc_buf_res(XFS_DAENTER_BMAP1B(mp, XFS_DATA_FORK) + 1, XFS_FSB_TO_B(mp, 1)) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 1), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 1), XFS_FSB_TO_B(mp, 1)); } @@ -785,7 +785,7 @@ xfs_calc_attrinval_reservation( xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(9, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 4), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 4), XFS_FSB_TO_B(mp, 1)))); } @@ -852,7 +852,7 @@ xfs_calc_attrrm_reservation( XFS_BM_MAXLEVELS(mp, XFS_ATTR_FORK)) + xfs_calc_buf_res(XFS_BM_MAXLEVELS(mp, XFS_DATA_FORK), 0)), (xfs_calc_buf_res(5, mp->m_sb.sb_sectsize) + - xfs_calc_buf_res(xfs_allocfree_log_count(mp, 2), + xfs_calc_buf_res(xfs_allocfree_block_count(mp, 2), XFS_FSB_TO_B(mp, 1)))); } diff --git a/fs/xfs/libxfs/xfs_trans_resv.h b/fs/xfs/libxfs/xfs_trans_resv.h index 22b99042127a..0554b9d775d2 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.h +++ b/fs/xfs/libxfs/xfs_trans_resv.h @@ -96,7 +96,7 @@ struct xfs_trans_resv { #define XFS_WRITE_LOG_COUNT_REFLINK 8 void xfs_trans_resv_calc(struct xfs_mount *mp, struct xfs_trans_resv *resp); -uint xfs_allocfree_log_count(struct xfs_mount *mp, uint num_ops); +uint xfs_allocfree_block_count(struct xfs_mount *mp, uint num_ops); unsigned int xfs_calc_itruncate_reservation_minlogsize(struct xfs_mount *mp); unsigned int xfs_calc_write_reservation_minlogsize(struct xfs_mount *mp);
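The first hunk of patch 9/9 above only renames the helper used by xfs_refcount_still_have_space(); the arithmetic it feeds is unchanged. As a quick illustration of what that arithmetic computes, here is a hedged, standalone sketch. The geometry constants (btree heights, block size) and the shape_changes value are invented, and toy_allocfree_block_count() is a simplified stand-in rather than the real helper; only the three statements in main() follow the quoted hunk.

/*
 * Standalone sketch of the overhead arithmetic visible in the
 * xfs_refcount_still_have_space() hunk above.  All geometry values are
 * invented; only the shape of the calculation mirrors the quoted lines.
 */
#include <stdio.h>

#define TOY_BLOCKSIZE		4096u	/* assumed fs block size */
#define TOY_ALLOC_MAXLEVELS	5u	/* assumed free-space btree height */
#define TOY_RMAP_MAXLEVELS	5u	/* assumed rmap btree height */
#define TOY_REFC_MAXLEVELS	5u	/* assumed refcount btree height */

/*
 * Stand-in for xfs_allocfree_block_count(): a made-up worst case of full
 * splits of the two free-space btrees plus the rmap btree per operation.
 */
static unsigned int toy_allocfree_block_count(unsigned int num_ops)
{
	return num_ops * (2 * TOY_ALLOC_MAXLEVELS + TOY_RMAP_MAXLEVELS);
}

int main(void)
{
	unsigned int shape_changes = 3;	/* pretend refcount btree reshapes */
	unsigned int overhead;

	/* Mirrors the three statements in the quoted hunk. */
	overhead = toy_allocfree_block_count(shape_changes);
	overhead += TOY_REFC_MAXLEVELS;
	overhead *= TOY_BLOCKSIZE;

	printf("worst-case bytes of btree blocks logged: %u\n", overhead);
	return 0;
}

The rename's point is visible here: the helper's result is a count of blocks, and it only becomes a byte figure after the multiplication by the block size.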
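The xfs_trans_resv.c hunks follow the same pattern: the renamed block-count helper is fed through xfs_calc_buf_res() and the reservation is the largest of several candidate worst cases (the t1/t2/t3 terms in the xfs_calc_write_reservation() hunk). The sketch below shows only that composition; toy_calc_buf_res(), the per-operation block count, and every constant are assumptions made up for the example, not the kernel's actual geometry or formulas.

/*
 * Toy reservation calculator shaped like the xfs_calc_write_reservation()
 * hunk above: several worst-case branches, reservation = max of them.
 * All numbers and the buf_res formula are invented for illustration.
 */
#include <stdio.h>

#define TOY_BLOCKSIZE	4096u	/* assumed fs block size */
#define TOY_SECTSIZE	512u	/* assumed sector size */
#define TOY_BUF_HDR	128u	/* assumed per-buffer log-format overhead */
#define TOY_BMBT_LEVELS	5u	/* assumed bmap btree height */
#define TOY_ALLOCFREE_BLOCKS_PER_OP 15u	/* assumed blocks per alloc/free op */

/* Stand-in for xfs_calc_buf_res(): nbufs buffers, each with log overhead. */
static unsigned int toy_calc_buf_res(unsigned int nbufs, unsigned int size)
{
	return nbufs * (size + TOY_BUF_HDR);
}

/* Stand-in for the renamed xfs_allocfree_block_count(). */
static unsigned int toy_allocfree_block_count(unsigned int num_ops)
{
	return num_ops * TOY_ALLOCFREE_BLOCKS_PER_OP;
}

static unsigned int max3(unsigned int a, unsigned int b, unsigned int c)
{
	unsigned int m = a > b ? a : b;

	return m > c ? m : c;
}

int main(void)
{
	/* Data-fork write path: inode, bmap btree, AG headers, alloc/free. */
	unsigned int t1 = toy_calc_buf_res(TOY_BMBT_LEVELS, TOY_BLOCKSIZE) +
			  toy_calc_buf_res(3, TOY_SECTSIZE) +
			  toy_calc_buf_res(toy_allocfree_block_count(2),
					   TOY_BLOCKSIZE);
	unsigned int t2 = 0;	/* no realtime device in this toy */
	/* Free-space path: AG headers plus alloc/free block counts. */
	unsigned int t3 = toy_calc_buf_res(5, TOY_SECTSIZE) +
			  toy_calc_buf_res(toy_allocfree_block_count(2),
					   TOY_BLOCKSIZE);

	printf("toy write reservation: %u bytes\n", max3(t1, t2, t3));
	return 0;
}

Structuring the reservation as the maximum of independent worst-case branches keeps the figure tight while still covering whichever path a given transaction ends up taking.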