From patchwork Tue Dec 20 00:05:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13077322
Subject: [PATCH 1/4] xfs: don't assert if cmap covers imap after cycling lock
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Mon, 19 Dec 2022 16:05:03 -0800
Message-ID: <167149470312.336919.14005739948269903315.stgit@magnolia>
In-Reply-To: <167149469744.336919.13748690081866673267.stgit@magnolia>
References: <167149469744.336919.13748690081866673267.stgit@magnolia>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

In xfs_reflink_fill_cow_hole, there's a debugging assertion that trips if
(after cycling the ILOCK to get a transaction) the requeried cow mapping
overlaps the start of the area being written.  IOWs, it trips if the
hole in the cow fork that it's supposed to fill has been filled.

This is trivially possible since we cycled ILOCK_EXCL.  If we trip the
assertion, then we know that cmap is a delalloc extent because @found is
false.  Fortunately, the bmapi_write call below will convert the
delalloc extent to a real unwritten cow fork extent, so all we need to
do here is remove the assertion.

It turns out that generic/095 trips this pretty regularly with alwayscow
mode enabled.

Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
---
 fs/xfs/xfs_reflink.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index fe46bce8cae6..5535778a98f9 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -416,8 +416,6 @@ xfs_reflink_fill_cow_hole(
 		goto convert;
 	}
 
-	ASSERT(cmap->br_startoff > imap->br_startoff);
-
 	/* Allocate the entire reservation as unwritten blocks. */
 	nimaps = 1;
 	error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount,

From patchwork Tue Dec 20 00:05:08 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13077323
Subject: [PATCH 2/4] xfs: don't stall background reclaim on inactivation
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Mon, 19 Dec 2022 16:05:08 -0800
Message-ID: <167149470870.336919.10695086693636688760.stgit@magnolia>
In-Reply-To: <167149469744.336919.13748690081866673267.stgit@magnolia>
References: <167149469744.336919.13748690081866673267.stgit@magnolia>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

The online fsck stress tests deadlocked a test VM the other night.  The
deadlock happened because:

1. kswapd tried to prune the sb inode list, xfs found that it needed to
   inactivate an inode and that the queue was long enough that it should
   wait for the worker.  It was holding shrinker_rwsem.

2. The inactivation worker allocated a transaction and then stalled
   trying to obtain the AGI buffer lock.

3. An online repair function called unregister_shrinker while in
   transaction context and holding the AGI lock.  It also tried to grab
   shrinker_rwsem.

#3 shouldn't happen and was easily fixed, but seeing as we designed
background inodegc to avoid stalling reclaim, I feel that #1 shouldn't
be happening either.  Fix xfs_inodegc_want_flush_work to avoid stalling
background reclaim on inode inactivation.

Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues")
Signed-off-by: Darrick J. Wong
---
 fs/xfs/xfs_icache.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f35e2cee5265..24eff2bd4062 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -2000,6 +2000,8 @@ xfs_inodegc_want_queue_work(
  *
  * Note:  If the current thread is running a transaction, we don't ever want to
  * wait for other transactions because that could introduce a deadlock.
+ *
+ * Don't let kswapd background reclamation stall on inactivations.
  */
 static inline bool
 xfs_inodegc_want_flush_work(
@@ -2010,6 +2012,9 @@ xfs_inodegc_want_flush_work(
 	if (current->journal_info)
 		return false;
 
+	if (current_is_kswapd())
+		return false;
+
 	if (shrinker_hits > 0)
 		return true;
 

From patchwork Tue Dec 20 00:05:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13077324
Subject: [PATCH 3/4] xfs: make xfs_iomap_page_ops static
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Mon, 19 Dec 2022 16:05:14 -0800
Message-ID: <167149471429.336919.12382220831144249809.stgit@magnolia>
In-Reply-To: <167149469744.336919.13748690081866673267.stgit@magnolia>
References: <167149469744.336919.13748690081866673267.stgit@magnolia>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Shut up the sparse warnings about this variable that isn't referenced
anywhere else.

Fixes: cd89a0950c40 ("xfs: use iomap_valid method to detect stale cached iomaps")
Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
---
 fs/xfs/xfs_iomap.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 669c1bc5c3a7..fc1946f80a4a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -83,7 +83,7 @@ xfs_iomap_valid(
 	return true;
 }
 
-const struct iomap_page_ops xfs_iomap_page_ops = {
+static const struct iomap_page_ops xfs_iomap_page_ops = {
 	.iomap_valid		= xfs_iomap_valid,
 };
 

From patchwork Tue Dec 20 00:05:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13077325
Subject: [PATCH 4/4] xfs: fix off-by-one error in xfs_btree_space_to_height
From: "Darrick J. Wong"
To: djwong@kernel.org
Cc: linux-xfs@vger.kernel.org
Date: Mon, 19 Dec 2022 16:05:19 -0800
Message-ID: <167149471987.336919.3277522603824048839.stgit@magnolia>
In-Reply-To: <167149469744.336919.13748690081866673267.stgit@magnolia>
References: <167149469744.336919.13748690081866673267.stgit@magnolia>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Lately I've been stress-testing extreme-sized rmap btrees by using the
(new) xfs_db bmap_inflate command to clone bmbt mappings billions of
times and then using xfs_repair to build new rmap and refcount btrees.
This of course is /much/ faster than actually FICLONEing a file billions
of times.

Unfortunately, xfs_repair fails in xfs_btree_bload_compute_geometry with
EOVERFLOW, which indicates that xfs_mount.m_rmap_maxlevels is not
sufficiently large for the test scenario.  For a 1TB filesystem (~67
million AG blocks, 4 AGs) the btheight command reports:

$ xfs_db -c 'btheight -n 4400801200 -w min rmapbt' /dev/sda
rmapbt: worst case per 4096-byte block: 84 records (leaf) / 45 keyptrs (node)
level 0: 4400801200 records, 52390491 blocks
level 1: 52390491 records, 1164234 blocks
level 2: 1164234 records, 25872 blocks
level 3: 25872 records, 575 blocks
level 4: 575 records, 13 blocks
level 5: 13 records, 1 block
6 levels, 53581186 blocks total

The AG is sufficiently large to build this rmap btree.  Unfortunately,
m_rmap_maxlevels is 5.  Augmenting the loop in the space->height
function to report height, node blocks, and blocks remaining produces
this:

ht 1 node_blocks 45 blockleft 67108863
ht 2 node_blocks 2025 blockleft 67108818
ht 3 node_blocks 91125 blockleft 67106793
ht 4 node_blocks 4100625 blockleft 67015668
final height: 5

The goal of this function is to compute the maximum height btree that
can be stored in the given number of ondisk fsblocks.  Starting with the
top level of the tree, each iteration through the loop adds the fanout
factor of the next level down until we run out of blocks.  IOWs, maximum
height is achieved by using the smallest fanout factor that can apply
to that level.

However, the loop setup is not correct.  Top level btree blocks are
allowed to contain fewer than minrecs items, so the computation is
incorrect because the first time through the loop it should be using a
fanout factor of 2.  With this corrected, the above becomes:

ht 1 node_blocks 2 blockleft 67108863
ht 2 node_blocks 90 blockleft 67108861
ht 3 node_blocks 4050 blockleft 67108771
ht 4 node_blocks 182250 blockleft 67104721
ht 5 node_blocks 8201250 blockleft 66922471
final height: 6

Fixes: 9ec691205e7d ("xfs: compute the maximum height of the rmap btree when reflink enabled")
Signed-off-by: Darrick J. Wong
Reviewed-by: Dave Chinner
---
 fs/xfs/libxfs/xfs_btree.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 4c16c8c31fcb..8d11d3f5e529 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -4666,7 +4666,11 @@ xfs_btree_space_to_height(
 	const unsigned int	*limits,
 	unsigned long long	leaf_blocks)
 {
-	unsigned long long	node_blocks = limits[1];
+	/*
+	 * The root btree block can have a fanout between 2 and maxrecs because
+	 * the tree might not be big enough to fill it.
+	 */
+	unsigned long long	node_blocks = 2;
 	unsigned long long	blocks_left = leaf_blocks - 1;
 	unsigned int		height = 1;
 
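
For anyone who wants to replay the two traces from the 4/4 commit message
outside the kernel, here is a rough userspace sketch of the height loop.  It
is not the kernel function: space_to_height() and its root_fanout parameter
are hypothetical names added for illustration, the loop body is assumed from
the description in the commit message, and the limits values (84 leaf
records, 45 node keyptrs) are taken from the btheight output quoted there.
The 67108864-block figure used in the note below is inferred from the first
"blockleft" value in the traces plus one.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Userspace sketch only, not the kernel implementation.
 *
 * limits[0] is the minimum number of records in a leaf block (unused by
 * this loop) and limits[1] is the minimum number of keyptrs in a node
 * block.  root_fanout is a hypothetical knob: pass limits[1] to model
 * the behavior before the patch, or 2 to model the behavior after it.
 */
unsigned int
space_to_height(
	const unsigned int	*limits,
	unsigned long long	leaf_blocks,
	unsigned long long	root_fanout)
{
	unsigned long long	node_blocks = root_fanout;
	unsigned long long	blocks_left;
	unsigned int		height = 1;

	/* A node fanout below 2 would keep the loop from making progress. */
	assert(limits[1] >= 2);

	if (leaf_blocks < 1)
		return 0;
	blocks_left = leaf_blocks - 1;

	/* Add one level at a time until the next level no longer fits. */
	while (node_blocks < blocks_left) {
		printf("ht %u node_blocks %llu blockleft %llu\n",
				height, node_blocks, blocks_left);
		blocks_left -= node_blocks;
		node_blocks *= limits[1];
		height++;
	}

	printf("final height: %u\n", height);
	return height;
}
```

With limits = { 84, 45 } and leaf_blocks = 67108864, passing limits[1] as
root_fanout returns 5 and passing 2 returns 6, which matches the "final
height" lines in the two traces.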