From patchwork Wed Apr 3 05:18:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13614954 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E8EF86AD7 for ; Wed, 3 Apr 2024 05:18:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121507; cv=none; b=WtBxOE6xVi9uBetPhjPy0IwlbClk+GWPssQg0onyFdNhn7xnd12eEbyC5Tmh/MkXdIdaLQC699gU0FFqJnMGql11hRLZOh278OnOE2yRp1MrNDyEQOTalU44sqUuqCNMmxhwY1Llfn5jJeNR/eKwKYN7Gpq/V6TNUZlvJhOPOR8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121507; c=relaxed/simple; bh=hYme5Yaghg5KXbz/JEAjSTp503ad/MXaI1cNYjStvic=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=RenbHR9BqSnV8eRAbPZHR9YcdHErgQu7oBFeHPVf4fW1/SUPjV0Tw5qbIWn3Ta1YJD+7/KWgnwCAlMc0ctfBK69EUDnDGzKE4kGbCkDzZoh/nZ7NMM06WHVBpGvU6oNPQ5qQN4NmXhOJFDzl0Dx2c4TUvBHo/v5VM8bgTmnw0Bg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=m04lx0l9; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="m04lx0l9" Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC96CC433F1; Wed, 3 Apr 2024 05:18:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712121506; bh=hYme5Yaghg5KXbz/JEAjSTp503ad/MXaI1cNYjStvic=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=m04lx0l9137z6S5D2UIMyJojFhA7DFtGQCPxQlKPXQ8vUDa0S2iBAvzVeJP9ly0pv C/3w1nBxA0fcA+nZwwlM7R6prqiLoPUCKvSJzYIb4SAAOy7m5fPujbWHr+PXV2So1V 3ATFlUDTGZoEWX27dHcR090UD3xHZegBYwENfSFATArKpIZZkeLctC4I/lp6x62eXo LPjyiyDBuQ6sT4tzfo9Ie2VpsreGt8o11ik4MOVBsyYYcTV3Kha9HuuntAaj8KnG2l DaomWUfaTF8kTc1OWyWRLopGdEhqiq+i3qVueUuN7fhsyIOxxWZ1qFT+S4ubVKXx2l Nh48wmzOXB6VA== Subject: [PATCH 1/3] xfs: pass xfs_buf lookup flags to xfs_*read_agi From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Tue, 02 Apr 2024 22:18:26 -0700 Message-ID: <171212150618.1535150.1160220960086130932.stgit@frogsfrogsfrogs> In-Reply-To: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> References: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong Allow callers to pass buffer lookup flags to xfs_read_agi and xfs_ialloc_read_agi. This will be used in the next patch to fix a deadlock in the online fsck inode scanner. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_ag.c | 8 ++++---- fs/xfs/libxfs/xfs_ialloc.c | 16 ++++++++++------ fs/xfs/libxfs/xfs_ialloc.h | 5 +++-- fs/xfs/libxfs/xfs_ialloc_btree.c | 4 ++-- fs/xfs/scrub/common.c | 4 ++-- fs/xfs/scrub/fscounters.c | 2 +- fs/xfs/scrub/iscan.c | 2 +- fs/xfs/scrub/repair.c | 6 +++--- fs/xfs/xfs_inode.c | 8 ++++---- fs/xfs/xfs_iwalk.c | 4 ++-- fs/xfs/xfs_log_recover.c | 4 ++-- 11 files changed, 34 insertions(+), 29 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index dc1873f76bffd..09fe9412eab49 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -194,7 +194,7 @@ xfs_initialize_perag_data( pag = xfs_perag_get(mp, index); error = xfs_alloc_read_agf(pag, NULL, 0, NULL); if (!error) - error = xfs_ialloc_read_agi(pag, NULL, NULL); + error = xfs_ialloc_read_agi(pag, NULL, 0, NULL); if (error) { xfs_perag_put(pag); return error; @@ -931,7 +931,7 @@ xfs_ag_shrink_space( int error, err2; ASSERT(pag->pag_agno == mp->m_sb.sb_agcount - 1); - error = xfs_ialloc_read_agi(pag, *tpp, &agibp); + error = xfs_ialloc_read_agi(pag, *tpp, 0, &agibp); if (error) return error; @@ -1062,7 +1062,7 @@ xfs_ag_extend_space( ASSERT(pag->pag_agno == pag->pag_mount->m_sb.sb_agcount - 1); - error = xfs_ialloc_read_agi(pag, tp, &bp); + error = xfs_ialloc_read_agi(pag, tp, 0, &bp); if (error) return error; @@ -1119,7 +1119,7 @@ xfs_ag_get_geometry( int error; /* Lock the AG headers. */ - error = xfs_ialloc_read_agi(pag, NULL, &agi_bp); + error = xfs_ialloc_read_agi(pag, NULL, 0, &agi_bp); if (error) return error; error = xfs_alloc_read_agf(pag, NULL, 0, &agf_bp); diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c index e5ac3e5430c4e..cb37f0007731f 100644 --- a/fs/xfs/libxfs/xfs_ialloc.c +++ b/fs/xfs/libxfs/xfs_ialloc.c @@ -1699,7 +1699,7 @@ xfs_dialloc_good_ag( return false; if (!xfs_perag_initialised_agi(pag)) { - error = xfs_ialloc_read_agi(pag, tp, NULL); + error = xfs_ialloc_read_agi(pag, tp, 0, NULL); if (error) return false; } @@ -1768,7 +1768,7 @@ xfs_dialloc_try_ag( * Then read in the AGI buffer and recheck with the AGI buffer * lock held. */ - error = xfs_ialloc_read_agi(pag, *tpp, &agbp); + error = xfs_ialloc_read_agi(pag, *tpp, 0, &agbp); if (error) return error; @@ -2286,7 +2286,7 @@ xfs_difree( /* * Get the allocation group header. */ - error = xfs_ialloc_read_agi(pag, tp, &agbp); + error = xfs_ialloc_read_agi(pag, tp, 0, &agbp); if (error) { xfs_warn(mp, "%s: xfs_ialloc_read_agi() returned error %d.", __func__, error); @@ -2332,7 +2332,7 @@ xfs_imap_lookup( int error; int i; - error = xfs_ialloc_read_agi(pag, tp, &agbp); + error = xfs_ialloc_read_agi(pag, tp, 0, &agbp); if (error) { xfs_alert(mp, "%s: xfs_ialloc_read_agi() returned error %d, agno %d", @@ -2675,6 +2675,7 @@ int xfs_read_agi( struct xfs_perag *pag, struct xfs_trans *tp, + xfs_buf_flags_t flags, struct xfs_buf **agibpp) { struct xfs_mount *mp = pag->pag_mount; @@ -2684,7 +2685,7 @@ xfs_read_agi( error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, XFS_AG_DADDR(mp, pag->pag_agno, XFS_AGI_DADDR(mp)), - XFS_FSS_TO_BB(mp, 1), 0, agibpp, &xfs_agi_buf_ops); + XFS_FSS_TO_BB(mp, 1), flags, agibpp, &xfs_agi_buf_ops); if (xfs_metadata_is_sick(error)) xfs_ag_mark_sick(pag, XFS_SICK_AG_AGI); if (error) @@ -2704,6 +2705,7 @@ int xfs_ialloc_read_agi( struct xfs_perag *pag, struct xfs_trans *tp, + int flags, struct xfs_buf **agibpp) { struct xfs_buf *agibp; @@ -2712,7 +2714,9 @@ xfs_ialloc_read_agi( trace_xfs_ialloc_read_agi(pag->pag_mount, pag->pag_agno); - error = xfs_read_agi(pag, tp, &agibp); + error = xfs_read_agi(pag, tp, + (flags & XFS_IALLOC_FLAG_TRYLOCK) ? XBF_TRYLOCK : 0, + &agibp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h index f1412183bb44b..b549627e3a615 100644 --- a/fs/xfs/libxfs/xfs_ialloc.h +++ b/fs/xfs/libxfs/xfs_ialloc.h @@ -63,10 +63,11 @@ xfs_ialloc_log_agi( struct xfs_buf *bp, /* allocation group header buffer */ uint32_t fields); /* bitmask of fields to log */ -int xfs_read_agi(struct xfs_perag *pag, struct xfs_trans *tp, +int xfs_read_agi(struct xfs_perag *pag, struct xfs_trans *tp, xfs_buf_flags_t flags, struct xfs_buf **agibpp); int xfs_ialloc_read_agi(struct xfs_perag *pag, struct xfs_trans *tp, - struct xfs_buf **agibpp); + int flags, struct xfs_buf **agibpp); +#define XFS_IALLOC_FLAG_TRYLOCK (1U << 0) /* use trylock for buffer locking */ /* * Lookup a record by ino in the btree given by cur. diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index cc661fca6ff5b..42e9fd47f6c73 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -745,7 +745,7 @@ xfs_finobt_count_blocks( struct xfs_btree_cur *cur; int error; - error = xfs_ialloc_read_agi(pag, tp, &agbp); + error = xfs_ialloc_read_agi(pag, tp, 0, &agbp); if (error) return error; @@ -768,7 +768,7 @@ xfs_finobt_read_blocks( struct xfs_agi *agi; int error; - error = xfs_ialloc_read_agi(pag, tp, &agbp); + error = xfs_ialloc_read_agi(pag, tp, 0, &agbp); if (error) return error; diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c index 47a20cf5205f0..a27d33b6f4641 100644 --- a/fs/xfs/scrub/common.c +++ b/fs/xfs/scrub/common.c @@ -445,7 +445,7 @@ xchk_perag_read_headers( { int error; - error = xfs_ialloc_read_agi(sa->pag, sc->tp, &sa->agi_bp); + error = xfs_ialloc_read_agi(sa->pag, sc->tp, 0, &sa->agi_bp); if (error && want_ag_read_header_failure(sc, XFS_SCRUB_TYPE_AGI)) return error; @@ -827,7 +827,7 @@ xchk_iget_agi( * in the iget cache miss path. */ pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, inum)); - error = xfs_ialloc_read_agi(pag, tp, agi_bpp); + error = xfs_ialloc_read_agi(pag, tp, 0, agi_bpp); xfs_perag_put(pag); if (error) return error; diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c index d310737c88236..da2f6729699dd 100644 --- a/fs/xfs/scrub/fscounters.c +++ b/fs/xfs/scrub/fscounters.c @@ -85,7 +85,7 @@ xchk_fscount_warmup( continue; /* Lock both AG headers. */ - error = xfs_ialloc_read_agi(pag, sc->tp, &agi_bp); + error = xfs_ialloc_read_agi(pag, sc->tp, 0, &agi_bp); if (error) break; error = xfs_alloc_read_agf(pag, sc->tp, 0, &agf_bp); diff --git a/fs/xfs/scrub/iscan.c b/fs/xfs/scrub/iscan.c index ec3478bc505ef..66ba0fbd059e0 100644 --- a/fs/xfs/scrub/iscan.c +++ b/fs/xfs/scrub/iscan.c @@ -281,7 +281,7 @@ xchk_iscan_advance( if (!pag) return -ECANCELED; - ret = xfs_ialloc_read_agi(pag, sc->tp, &agi_bp); + ret = xfs_ialloc_read_agi(pag, sc->tp, 0, &agi_bp); if (ret) goto out_pag; diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c index f43dce771cdd2..443e62f724818 100644 --- a/fs/xfs/scrub/repair.c +++ b/fs/xfs/scrub/repair.c @@ -290,7 +290,7 @@ xrep_calc_ag_resblks( icount = pag->pagi_count; } else { /* Try to get the actual counters from disk. */ - error = xfs_ialloc_read_agi(pag, NULL, &bp); + error = xfs_ialloc_read_agi(pag, NULL, 0, &bp); if (!error) { icount = pag->pagi_count; xfs_buf_relse(bp); @@ -908,7 +908,7 @@ xrep_reinit_pagi( ASSERT(xfs_perag_initialised_agi(pag)); clear_bit(XFS_AGSTATE_AGI_INIT, &pag->pag_opstate); - error = xfs_ialloc_read_agi(pag, sc->tp, &bp); + error = xfs_ialloc_read_agi(pag, sc->tp, 0, &bp); if (error) return error; @@ -934,7 +934,7 @@ xrep_ag_init( ASSERT(!sa->pag); - error = xfs_ialloc_read_agi(pag, sc->tp, &sa->agi_bp); + error = xfs_ialloc_read_agi(pag, sc->tp, 0, &sa->agi_bp); if (error) return error; diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ea48774f6b76d..50a2d35991af4 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -2156,7 +2156,7 @@ xfs_iunlink( pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); /* Get the agi buffer first. It ensures lock ordering on the list. */ - error = xfs_read_agi(pag, tp, &agibp); + error = xfs_read_agi(pag, tp, 0, &agibp); if (error) goto out; @@ -2253,7 +2253,7 @@ xfs_iunlink_remove( trace_xfs_iunlink_remove(ip); /* Get the agi buffer first. It ensures lock ordering on the list. */ - error = xfs_read_agi(pag, tp, &agibp); + error = xfs_read_agi(pag, tp, 0, &agibp); if (error) return error; @@ -3131,7 +3131,7 @@ xfs_rename( pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, inodes[i]->i_ino)); - error = xfs_read_agi(pag, tp, &bp); + error = xfs_read_agi(pag, tp, 0, &bp); xfs_perag_put(pag); if (error) goto out_trans_cancel; @@ -3803,7 +3803,7 @@ xfs_inode_reload_unlinked_bucket( /* Grab the first inode in the list */ pag = xfs_perag_get(mp, agno); - error = xfs_ialloc_read_agi(pag, tp, &agibp); + error = xfs_ialloc_read_agi(pag, tp, 0, &agibp); xfs_perag_put(pag); if (error) return error; diff --git a/fs/xfs/xfs_iwalk.c b/fs/xfs/xfs_iwalk.c index 01b55f03a1026..730c8d48da282 100644 --- a/fs/xfs/xfs_iwalk.c +++ b/fs/xfs/xfs_iwalk.c @@ -268,7 +268,7 @@ xfs_iwalk_ag_start( /* Set up a fresh cursor and empty the inobt cache. */ iwag->nr_recs = 0; - error = xfs_ialloc_read_agi(pag, tp, agi_bpp); + error = xfs_ialloc_read_agi(pag, tp, 0, agi_bpp); if (error) return error; *curpp = xfs_inobt_init_cursor(pag, tp, *agi_bpp); @@ -386,7 +386,7 @@ xfs_iwalk_run_callbacks( } /* ...and recreate the cursor just past where we left off. */ - error = xfs_ialloc_read_agi(iwag->pag, iwag->tp, agi_bpp); + error = xfs_ialloc_read_agi(iwag->pag, iwag->tp, 0, agi_bpp); if (error) return error; *curpp = xfs_inobt_init_cursor(iwag->pag, iwag->tp, *agi_bpp); diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 13f1d2e915405..1b1f0a4cd4947 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2656,7 +2656,7 @@ xlog_recover_clear_agi_bucket( if (error) goto out_error; - error = xfs_read_agi(pag, tp, &agibp); + error = xfs_read_agi(pag, tp, 0, &agibp); if (error) goto out_abort; @@ -2772,7 +2772,7 @@ xlog_recover_iunlink_ag( int bucket; int error; - error = xfs_read_agi(pag, NULL, &agibp); + error = xfs_read_agi(pag, NULL, 0, &agibp); if (error) { /* * AGI is b0rked. Don't process it. From patchwork Wed Apr 3 05:18:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13614955 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E8F686AD7 for ; Wed, 3 Apr 2024 05:18:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121513; cv=none; b=aT7ASZZoiphuEQoJ6hswSgc6OqLNztCMHHD1NNfynTjUBjAtyhUaB1Q8NH0S6zhBMOYygn8/C0Jv+pXREIRUzmPDqjcemdq+cWVAiqmBF74T+dW1bBKprvvVhcOjVRTcKzLK2I0qWi71bu7PfjFmddxsNVZyq4OzZNfvD4+JAs0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121513; c=relaxed/simple; bh=Se2nXJCcT4sawSe+jwQtnM+VYLCUul4VhyRIPFuWVOc=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=hQ3yFEKJoA7diQEJw7pUTZYOWGiIU9aSUD6qIJrDj8/mZuJ2NHqGTIRqmi+nfcWT9rk0H2+zdJ4xgr3nIKGkyoEizR6MGS/clKx64dr1G+57g7b9lvZnaPzz8RbTTIp+TVT6cH4Mems1OUtzZCpQ3UPUhgjOqHaE9GlQB9XQnpM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=WYauV48I; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="WYauV48I" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7D7EEC433F1; Wed, 3 Apr 2024 05:18:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712121512; bh=Se2nXJCcT4sawSe+jwQtnM+VYLCUul4VhyRIPFuWVOc=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=WYauV48IUUriUGNF+3FvTND6lx9Iu/J8v6xhTW4s2yU0Rt3j9wDW7eTJX84zP1Mpp LIf/seP+VYgcN7K2+gWUBlgt4dgmDhYPe4btRYBl/9E3o0rqUxXxjkgKRlUc7nQYXn EcUqsU1Pz+zDmiY9ERSg196/vJfzcSDFO3L8/Qk0Iu67oBhEP5Yu2fuyNLTBeOrP21 /BTBdg4llYK7Bl/ZTB6HLVnFwU4GTtj4ZfL2yr+0dVVH8L2cY9oCwOlW+IjIS3SEpa o/NTr7Pzax9Cj9Xwohuua86hkQvCWx1SJQSk++fPGFWbuNYsLy0xb9LULUn8erTBtl I2m2KkskT8Alw== Subject: [PATCH 2/3] xfs: fix an AGI lock acquisition ordering problem in xrep_dinode_findmode From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Tue, 02 Apr 2024 22:18:31 -0700 Message-ID: <171212151192.1535150.13198476701217286884.stgit@frogsfrogsfrogs> In-Reply-To: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> References: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong While reviewing the next patch which fixes an ABBA deadlock between the AGI and a directory ILOCK, someone asked a question about why we're holding the AGI in the first place. The reason for that is to quiesce the inode structures for that AG while we do a repair. I then realized that the xrep_dinode_findmode invokes xchk_iscan_iter, which walks the inobts (and hence the AGIs) to find all the inodes. This itself is also an ABBA vector, since the damaged inode could be in AG 5, which we hold while we scan AG 0 for directories. 5 -> 0 is not allowed. To address this, modify the iscan to allow trylock of the AGI buffer using the flags argument to xfs_ialloc_read_agi that the previous patch added. Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/inode_repair.c | 1 + fs/xfs/scrub/iscan.c | 36 +++++++++++++++++++++++++++++++++++- fs/xfs/scrub/iscan.h | 15 +++++++++++++++ fs/xfs/scrub/trace.h | 10 ++++++++-- 4 files changed, 59 insertions(+), 3 deletions(-) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index eab380e95ef40..35da0193c919e 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -356,6 +356,7 @@ xrep_dinode_find_mode( * so there's a real possibility that _iscan_iter can return EBUSY. */ xchk_iscan_start(sc, 5000, 100, &ri->ftype_iscan); + xchk_iscan_set_agi_trylock(&ri->ftype_iscan); ri->ftype_iscan.skip_ino = sc->sm->sm_ino; ri->alleged_ftype = XFS_DIR3_FT_UNKNOWN; while ((error = xchk_iscan_iter(&ri->ftype_iscan, &dp)) == 1) { diff --git a/fs/xfs/scrub/iscan.c b/fs/xfs/scrub/iscan.c index 66ba0fbd059e0..736ce7c9de6a8 100644 --- a/fs/xfs/scrub/iscan.c +++ b/fs/xfs/scrub/iscan.c @@ -243,6 +243,40 @@ xchk_iscan_finish( mutex_unlock(&iscan->lock); } +/* + * Grab the AGI to advance the inode scan. Returns 0 if *agi_bpp is now set, + * -ECANCELED if the live scan aborted, -EBUSY if the AGI could not be grabbed, + * or the usual negative errno. + */ +STATIC int +xchk_iscan_read_agi( + struct xchk_iscan *iscan, + struct xfs_perag *pag, + struct xfs_buf **agi_bpp) +{ + struct xfs_scrub *sc = iscan->sc; + unsigned long relax; + int ret; + + if (!xchk_iscan_agi_trylock(iscan)) + return xfs_ialloc_read_agi(pag, sc->tp, 0, agi_bpp); + + relax = msecs_to_jiffies(iscan->iget_retry_delay); + do { + ret = xfs_ialloc_read_agi(pag, sc->tp, XFS_IALLOC_FLAG_TRYLOCK, + agi_bpp); + if (ret != -EAGAIN) + return ret; + if (!iscan->iget_timeout || + time_is_before_jiffies(iscan->__iget_deadline)) + return -EBUSY; + + trace_xchk_iscan_agi_retry_wait(iscan); + } while (!schedule_timeout_killable(relax) && + !xchk_iscan_aborted(iscan)); + return -ECANCELED; +} + /* * Advance ino to the next inode that the inobt thinks is allocated, being * careful to jump to the next AG if we've reached the right end of this AG's @@ -281,7 +315,7 @@ xchk_iscan_advance( if (!pag) return -ECANCELED; - ret = xfs_ialloc_read_agi(pag, sc->tp, 0, &agi_bp); + ret = xchk_iscan_read_agi(iscan, pag, &agi_bp); if (ret) goto out_pag; diff --git a/fs/xfs/scrub/iscan.h b/fs/xfs/scrub/iscan.h index 71f657552dfac..c9da8f7721f66 100644 --- a/fs/xfs/scrub/iscan.h +++ b/fs/xfs/scrub/iscan.h @@ -59,6 +59,9 @@ struct xchk_iscan { /* Set if the scan has been aborted due to some event in the fs. */ #define XCHK_ISCAN_OPSTATE_ABORTED (1) +/* Use trylock to acquire the AGI */ +#define XCHK_ISCAN_OPSTATE_TRYLOCK_AGI (2) + static inline bool xchk_iscan_aborted(const struct xchk_iscan *iscan) { @@ -71,6 +74,18 @@ xchk_iscan_abort(struct xchk_iscan *iscan) set_bit(XCHK_ISCAN_OPSTATE_ABORTED, &iscan->__opstate); } +static inline bool +xchk_iscan_agi_trylock(const struct xchk_iscan *iscan) +{ + return test_bit(XCHK_ISCAN_OPSTATE_TRYLOCK_AGI, &iscan->__opstate); +} + +static inline void +xchk_iscan_set_agi_trylock(struct xchk_iscan *iscan) +{ + set_bit(XCHK_ISCAN_OPSTATE_TRYLOCK_AGI, &iscan->__opstate); +} + void xchk_iscan_start(struct xfs_scrub *sc, unsigned int iget_timeout, unsigned int iget_retry_delay, struct xchk_iscan *iscan); void xchk_iscan_teardown(struct xchk_iscan *iscan); diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 5b294be52c558..b1c7c79760d44 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -1300,7 +1300,7 @@ TRACE_EVENT(xchk_iscan_iget_batch, __entry->unavail) ); -TRACE_EVENT(xchk_iscan_iget_retry_wait, +DECLARE_EVENT_CLASS(xchk_iscan_retry_wait_class, TP_PROTO(struct xchk_iscan *iscan), TP_ARGS(iscan), TP_STRUCT__entry( @@ -1326,7 +1326,13 @@ TRACE_EVENT(xchk_iscan_iget_retry_wait, __entry->remaining, __entry->iget_timeout, __entry->retry_delay) -); +) +#define DEFINE_ISCAN_RETRY_WAIT_EVENT(name) \ +DEFINE_EVENT(xchk_iscan_retry_wait_class, name, \ + TP_PROTO(struct xchk_iscan *iscan), \ + TP_ARGS(iscan)) +DEFINE_ISCAN_RETRY_WAIT_EVENT(xchk_iscan_iget_retry_wait); +DEFINE_ISCAN_RETRY_WAIT_EVENT(xchk_iscan_agi_retry_wait); TRACE_EVENT(xchk_nlinks_collect_dirent, TP_PROTO(struct xfs_mount *mp, struct xfs_inode *dp, From patchwork Wed Apr 3 05:18:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13614956 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 682606AD7 for ; Wed, 3 Apr 2024 05:18:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121518; cv=none; b=S7muOF3ZNk3+YHEyaTPxfvpfLclPrU4PoHyfBTKGKRIHWEHwucQeZjdW6ljNSU29fBX0ULOHnsZSfZWKyCKEiiERlAxjL0lDyZBIvEZ+dghVHWVVPhqZb1M+Dt4lZm4Z7Ln+pzoCQeFdOh3qHOhvYQvIH5hBSfx3z1ACxyAjeTw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712121518; c=relaxed/simple; bh=lRb+TsarW5R/Tn/SO9RjlahlNd0a6gfCZWdcB9EDNEQ=; h=Subject:From:To:Cc:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=RGiB0Cjqs5rIEfd5/S/2OBO9F+zk5H4D3oiWX7IdFJnCIeklS4bvi3XdVhcnGw2gikhpPTaRW+x2fQPtX/xeM9HEAU+UclrIMLjMlQrizVIvyy2FtB+BGG1T58TvZiHK4eVWnRYVynYys41A/vYNMTUL4vyrVKJetpQy/ho4xUc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=RbTdfJf4; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="RbTdfJf4" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3558BC433F1; Wed, 3 Apr 2024 05:18:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712121518; bh=lRb+TsarW5R/Tn/SO9RjlahlNd0a6gfCZWdcB9EDNEQ=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=RbTdfJf4XmYGYpSIJ5PgdB2lhCoxUqPrjO4QP2Y07r3w6wNOKDNU0cKnO3CGakJat O3KkrbNLyTDFhZcDWDU4V9JEXDpSkxfAxSh627T/8vDjYgPXnps/u6Ql5uaquqv5ZZ hEHJY2xyjo0cawX5Q8QJ1bild449fczrtyZpAN4L1oVDVOTmh0xL4bTrkGcpDhaZwi +eDHMbYVSK/wlTxV2uar18K/R6hDwUmLuOlvCtE1hlnCTOGGP880Kz9BKuFy9tLa7w tY+LOpcfQp/D68KqSYJU31TbatH3//CKAa1fvkfi5ukyoKouSZZ1o1vCgP9BzZJQgb cAlBxUNajUhQA== Subject: [PATCH 3/3] xfs: fix potential AGI <-> ILOCK ABBA deadlock in xrep_dinode_findmode_walk_directory From: "Darrick J. Wong" To: djwong@kernel.org Cc: linux-xfs@vger.kernel.org Date: Tue, 02 Apr 2024 22:18:37 -0700 Message-ID: <171212151764.1535150.4553637479403597514.stgit@frogsfrogsfrogs> In-Reply-To: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> References: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong xfs/399 found the following deadlock when fuzzing core.mode = ones: /proc/20506/task/20558/stack : [<0>] xfs_ilock+0xa0/0x240 [xfs] [<0>] xfs_ilock_data_map_shared+0x1b/0x20 [xfs] [<0>] xrep_dinode_findmode_walk_directory+0x69/0xe0 [xfs] [<0>] xrep_dinode_find_mode+0x103/0x2a0 [xfs] [<0>] xrep_dinode_mode+0x7c/0x120 [xfs] [<0>] xrep_dinode_core+0xed/0x2b0 [xfs] [<0>] xrep_dinode_problems+0x10/0x80 [xfs] [<0>] xrep_inode+0x6c/0xc0 [xfs] [<0>] xrep_attempt+0x64/0x1d0 [xfs] [<0>] xfs_scrub_metadata+0x365/0x840 [xfs] [<0>] xfs_scrubv_metadata+0x282/0x430 [xfs] [<0>] xfs_ioc_scrubv_metadata+0x149/0x1a0 [xfs] [<0>] xfs_file_ioctl+0xc68/0x1780 [xfs] /proc/20506/task/20559/stack : [<0>] xfs_buf_lock+0x3b/0x110 [xfs] [<0>] xfs_buf_find_lock+0x66/0x1c0 [xfs] [<0>] xfs_buf_get_map+0x208/0xc00 [xfs] [<0>] xfs_buf_read_map+0x5d/0x2c0 [xfs] [<0>] xfs_trans_read_buf_map+0x1b0/0x4c0 [xfs] [<0>] xfs_read_agi+0xbd/0x190 [xfs] [<0>] xfs_ialloc_read_agi+0x47/0x160 [xfs] [<0>] xfs_imap_lookup+0x69/0x1f0 [xfs] [<0>] xfs_imap+0x1fc/0x3d0 [xfs] [<0>] xfs_iget+0x357/0xd50 [xfs] [<0>] xchk_dir_actor+0x16e/0x330 [xfs] [<0>] xchk_dir_walk_block+0x164/0x1e0 [xfs] [<0>] xchk_dir_walk+0x13a/0x190 [xfs] [<0>] xchk_directory+0x1a2/0x2b0 [xfs] [<0>] xfs_scrub_metadata+0x2f4/0x840 [xfs] [<0>] xfs_scrubv_metadata+0x282/0x430 [xfs] [<0>] xfs_ioc_scrubv_metadata+0x149/0x1a0 [xfs] [<0>] xfs_file_ioctl+0xc68/0x1780 [xfs] Thread 20558 holds an AGI buffer and is trying to grab the ILOCK of the root directory. Thread 20559 holds the root directory ILOCK and is trying to grab the AGI of an inode that is one of the root directory's children. The AGI held by 20558 is the same buffer that 20559 is trying to acquire. In other words, this is an ABBA deadlock. In general, the lock order is ILOCK and then AGI -- rename does this while preparing for an operation involving whiteouts or renaming files out of existence; and unlink does this when moving an inode to the unlinked list. The only place where we do it in the opposite order is on the child during an icreate, but at that point the child is marked INEW and is not visible to other threads. Work around this deadlock by replacing the blocking ilock attempt with a nonblocking loop that aborts after 30 seconds. Relax for a jiffy after a failed lock attempt. Signed-off-by: Darrick J. Wong --- fs/xfs/scrub/inode_repair.c | 49 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 48 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/inode_repair.c b/fs/xfs/scrub/inode_repair.c index 35da0193c919e..097afba3043f8 100644 --- a/fs/xfs/scrub/inode_repair.c +++ b/fs/xfs/scrub/inode_repair.c @@ -282,6 +282,51 @@ xrep_dinode_findmode_dirent( return 0; } +/* Try to lock a directory, or wait a jiffy. */ +static inline int +xrep_dinode_ilock_nowait( + struct xfs_inode *dp, + unsigned int lock_mode) +{ + if (xfs_ilock_nowait(dp, lock_mode)) + return true; + + schedule_timeout_killable(1); + return false; +} + +/* + * Try to lock a directory to look for ftype hints. Since we already hold the + * AGI buffer, we cannot block waiting for the ILOCK because rename can take + * the ILOCK and then try to lock AGIs. + */ +STATIC int +xrep_dinode_trylock_directory( + struct xrep_inode *ri, + struct xfs_inode *dp, + unsigned int *lock_modep) +{ + unsigned long deadline = jiffies + msecs_to_jiffies(30000); + unsigned int lock_mode; + int error = 0; + + do { + if (xchk_should_terminate(ri->sc, &error)) + return error; + + if (xfs_need_iread_extents(&dp->i_df)) + lock_mode = XFS_ILOCK_EXCL; + else + lock_mode = XFS_ILOCK_SHARED; + + if (xrep_dinode_ilock_nowait(dp, lock_mode)) { + *lock_modep = lock_mode; + return 0; + } + } while (!time_is_before_jiffies(deadline)); + return -EBUSY; +} + /* * If this is a directory, walk the dirents looking for any that point to the * scrub target inode. @@ -299,7 +344,9 @@ xrep_dinode_findmode_walk_directory( * Scan the directory to see if there it contains an entry pointing to * the directory that we are repairing. */ - lock_mode = xfs_ilock_data_map_shared(dp); + error = xrep_dinode_trylock_directory(ri, dp, &lock_mode); + if (error) + return error; /* * If this directory is known to be sick, we cannot scan it reliably From patchwork Fri Apr 5 03:27:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 13618430 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B921C1A26E for ; Fri, 5 Apr 2024 03:27:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712287637; cv=none; b=OE+i4tbThfuXMJzo5LY4O1z8JFNnKdRnLQ0i2VO+l7ck4fy4ncroHtUlRgUZT/m/oLp9ulkHG0KDEAsBq/qGMvCxBEIyiqlclpPk3bHaznqco8pKfRcXuJsh6/V1UMs+avYDBM1dCDA9BXJ3OsbCKwjluVAbAAzedXBH50FhIiY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712287637; c=relaxed/simple; bh=7IMd77PAKMPA0KBOYcMbuxe4sMS6yHNmSSkI9Jxkv7A=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Wcvumgatd+thkESHMvW0QB8usOe1IHGBgQwUxu/KI89ATTiaqTXfpaLUbOhhJQmvdCYhNoqQEgSLGjiBfc5rTzuJuYhJvOw/+vBIeLXqkf4YlfYkTR52DCO+8+SgynkjB5b4T5p9uQTKcywXLhBbqgPCXQKkyXuS2lmB+Rim+A0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ToiyGLPv; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ToiyGLPv" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3BB52C433C7; Fri, 5 Apr 2024 03:27:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1712287637; bh=7IMd77PAKMPA0KBOYcMbuxe4sMS6yHNmSSkI9Jxkv7A=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=ToiyGLPvGqVPCveGUDGQqtMeDQ9hJf/AF/vaBz5kHR4BbQygmrJOF52G1kaFvzNRo PpGH9js0j9K4pvhbz30rnvxWZkO/sAxfe+Qtd8zmCcpkAjHXJk1cHcr6lr5gBm0PR2 ea5b53A6XH6K8FY01vJEGCMKP6g7Og8IacgAKs2F6QHS5X/1Il/CaoPBpv8EYCtLMA 3y30uLKk1Ao2NyeO1Av5axeQxckvpG1kMd8ASidkFczZ79LDrd/J00qPq2YZ0X3+G+ br+MFUPjsTM/+SGN3lyKb8q31GQn8Wpd/fxbg+6xpwy0V0M8Fo8nNMZrwQHLYf9T3J CI9Fwdd/3n9oA== Date: Thu, 4 Apr 2024 20:27:16 -0700 From: "Darrick J. Wong" To: Dan Carpenter , Chandan Babu R Cc: linux-xfs@vger.kernel.org, Christoph Hellwig Subject: [PATCH 4/3] xfs: fix error bailout in xrep_abt_build_new_trees Message-ID: <20240405032716.GX6390@frogsfrogsfrogs> References: <171150379369.3216268.2525451022899750269.stgit@frogsfrogsfrogs> <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <171212150033.1535150.8307366470561747407.stgit@frogsfrogsfrogs> From: Darrick J. Wong Dan Carpenter reports: "Commit 4bdfd7d15747 ("xfs: repair free space btrees") from Dec 15, 2023 (linux-next), leads to the following Smatch static checker warning: fs/xfs/scrub/alloc_repair.c:781 xrep_abt_build_new_trees() warn: missing unwind goto?" That's a bug, so let's fix it. Reported-by: Dan Carpenter Fixes: 4bdfd7d15747 ("xfs: repair free space btrees") Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/alloc_repair.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/scrub/alloc_repair.c b/fs/xfs/scrub/alloc_repair.c index d421b253923ed..30295898cc8a6 100644 --- a/fs/xfs/scrub/alloc_repair.c +++ b/fs/xfs/scrub/alloc_repair.c @@ -778,7 +778,7 @@ xrep_abt_build_new_trees( error = xrep_bnobt_sort_records(ra); if (error) - return error; + goto err_levels; /* Load the free space by block number tree. */ ra->array_cur = XFARRAY_CURSOR_INIT;