From patchwork Sat Sep 18 01:29:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5AE8C433F5 for ; Sat, 18 Sep 2021 01:29:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 80CFE60FBF for ; Sat, 18 Sep 2021 01:29:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234132AbhIRBak (ORCPT ); Fri, 17 Sep 2021 21:30:40 -0400 Received: from mail.kernel.org ([198.145.29.99]:36840 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230044AbhIRBai (ORCPT ); Fri, 17 Sep 2021 21:30:38 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id A561A60FBF; Sat, 18 Sep 2021 01:29:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928555; bh=DDnPziIv420IU1gibdQ/HA731K4LAOfAh/vbaoZD50M=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=EYq5TH89PYK7dltUdgOB/7j7KlHyoyIdp2F2afKmmd1cM9yYJAtxWCXvFnmkYS7Ow ZL5FrW1ziJyGvWYRgGACob7rARklBHPwpwlweAHXhUI0Knl0Ne/tX3J/Hi+qYj6JJz JnL6UMmMxRyDL7btznUHMnnMM2CSxDl/PbAA5APyQ7HXueA3jSvd4gr1yeiWAARvZ0 LUaQQJFRvSeErfNmi+/jBm66Oul3k/fFIbMHGYSVp3CUqyL3Soj1yND0vXo9Yb47li OlBniBpK6uyYo4L+nAh7Db1K3sG91Y1Mib+Bu84I1+bUO2448FEDxZ264vcvI9GpYs nsIN45y3jdl1w== Subject: [PATCH 01/14] xfs: remove xfs_btree_cur_t typedef From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:15 -0700 Message-ID: <163192855540.416199.17796390757325316141.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_alloc.c | 12 ++++++------ fs/xfs/libxfs/xfs_bmap.c | 12 ++++++------ fs/xfs/libxfs/xfs_btree.c | 12 ++++++------ fs/xfs/libxfs/xfs_btree.h | 12 ++++++------ 4 files changed, 24 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 95157f5a5a6c..35fb1dd3be95 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -426,8 +426,8 @@ xfs_alloc_fix_len( */ STATIC int /* error code */ xfs_alloc_fixup_trees( - xfs_btree_cur_t *cnt_cur, /* cursor for by-size btree */ - xfs_btree_cur_t *bno_cur, /* cursor for by-block btree */ + struct xfs_btree_cur *cnt_cur, /* cursor for by-size btree */ + struct xfs_btree_cur *bno_cur, /* cursor for by-block btree */ xfs_agblock_t fbno, /* starting block of free extent */ xfs_extlen_t flen, /* length of free extent */ xfs_agblock_t rbno, /* starting block of returned extent */ @@ -1200,8 +1200,8 @@ xfs_alloc_ag_vextent_exact( xfs_alloc_arg_t *args) /* allocation argument structure */ { struct xfs_agf __maybe_unused *agf = args->agbp->b_addr; - xfs_btree_cur_t *bno_cur;/* by block-number btree cursor */ - xfs_btree_cur_t *cnt_cur;/* by count btree cursor */ + struct xfs_btree_cur *bno_cur;/* by block-number btree cursor */ + struct xfs_btree_cur *cnt_cur;/* by count btree cursor */ int error; xfs_agblock_t fbno; /* start block of found extent */ xfs_extlen_t flen; /* length of found extent */ @@ -1658,8 +1658,8 @@ xfs_alloc_ag_vextent_size( xfs_alloc_arg_t *args) /* allocation argument structure */ { struct xfs_agf *agf = args->agbp->b_addr; - xfs_btree_cur_t *bno_cur; /* cursor for bno btree */ - xfs_btree_cur_t *cnt_cur; /* cursor for cnt btree */ + struct xfs_btree_cur *bno_cur; /* cursor for bno btree */ + struct xfs_btree_cur *cnt_cur; /* cursor for cnt btree */ int error; /* error result */ xfs_agblock_t fbno; /* start of found freespace */ xfs_extlen_t flen; /* length of found freespace */ diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index b48230f1a361..499c977cbf56 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -316,7 +316,7 @@ xfs_check_block( */ STATIC void xfs_bmap_check_leaf_extents( - xfs_btree_cur_t *cur, /* btree cursor or null */ + struct xfs_btree_cur *cur, /* btree cursor or null */ xfs_inode_t *ip, /* incore inode pointer */ int whichfork) /* data or attr fork */ { @@ -925,7 +925,7 @@ xfs_bmap_add_attrfork_btree( int *flags) /* inode logging flags */ { struct xfs_btree_block *block = ip->i_df.if_broot; - xfs_btree_cur_t *cur; /* btree cursor */ + struct xfs_btree_cur *cur; /* btree cursor */ int error; /* error return value */ xfs_mount_t *mp; /* file system mount struct */ int stat; /* newroot status */ @@ -968,7 +968,7 @@ xfs_bmap_add_attrfork_extents( struct xfs_inode *ip, /* incore inode pointer */ int *flags) /* inode logging flags */ { - xfs_btree_cur_t *cur; /* bmap btree cursor */ + struct xfs_btree_cur *cur; /* bmap btree cursor */ int error; /* error return value */ if (ip->i_df.if_nextents * sizeof(struct xfs_bmbt_rec) <= @@ -1988,11 +1988,11 @@ xfs_bmap_add_extent_unwritten_real( xfs_inode_t *ip, /* incore inode pointer */ int whichfork, struct xfs_iext_cursor *icur, - xfs_btree_cur_t **curp, /* if *curp is null, not a btree */ + struct xfs_btree_cur **curp, /* if *curp is null, not a btree */ xfs_bmbt_irec_t *new, /* new data to add to file extents */ int *logflagsp) /* inode logging flags */ { - xfs_btree_cur_t *cur; /* btree cursor */ + struct xfs_btree_cur *cur; /* btree cursor */ int error; /* error return value */ int i; /* temp state */ struct xfs_ifork *ifp; /* inode fork pointer */ @@ -5045,7 +5045,7 @@ xfs_bmap_del_extent_real( xfs_inode_t *ip, /* incore inode pointer */ xfs_trans_t *tp, /* current transaction pointer */ struct xfs_iext_cursor *icur, - xfs_btree_cur_t *cur, /* if null, not a btree */ + struct xfs_btree_cur *cur, /* if null, not a btree */ xfs_bmbt_irec_t *del, /* data to remove from extents */ int *logflagsp, /* inode logging flags */ int whichfork, /* data or attr fork */ diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 298395481713..b0cce0932f02 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -388,14 +388,14 @@ xfs_btree_del_cursor( */ int /* error */ xfs_btree_dup_cursor( - xfs_btree_cur_t *cur, /* input cursor */ - xfs_btree_cur_t **ncur) /* output cursor */ + struct xfs_btree_cur *cur, /* input cursor */ + struct xfs_btree_cur **ncur) /* output cursor */ { struct xfs_buf *bp; /* btree block's buffer pointer */ int error; /* error return value */ int i; /* level number of btree block */ xfs_mount_t *mp; /* mount structure for filesystem */ - xfs_btree_cur_t *new; /* new cursor value */ + struct xfs_btree_cur *new; /* new cursor value */ xfs_trans_t *tp; /* transaction pointer, can be NULL */ tp = cur->bc_tp; @@ -691,7 +691,7 @@ xfs_btree_get_block( */ STATIC int /* success=1, failure=0 */ xfs_btree_firstrec( - xfs_btree_cur_t *cur, /* btree cursor */ + struct xfs_btree_cur *cur, /* btree cursor */ int level) /* level to change */ { struct xfs_btree_block *block; /* generic btree block pointer */ @@ -721,7 +721,7 @@ xfs_btree_firstrec( */ STATIC int /* success=1, failure=0 */ xfs_btree_lastrec( - xfs_btree_cur_t *cur, /* btree cursor */ + struct xfs_btree_cur *cur, /* btree cursor */ int level) /* level to change */ { struct xfs_btree_block *block; /* generic btree block pointer */ @@ -985,7 +985,7 @@ xfs_btree_readahead_ptr( */ STATIC void xfs_btree_setbuf( - xfs_btree_cur_t *cur, /* btree cursor */ + struct xfs_btree_cur *cur, /* btree cursor */ int lev, /* level in btree */ struct xfs_buf *bp) /* new buffer to set */ { diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 4eaf8517f850..513ade4a89f8 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -216,7 +216,7 @@ struct xfs_btree_cur_ino { * Btree cursor structure. * This collects all information needed by the btree code in one place. */ -typedef struct xfs_btree_cur +struct xfs_btree_cur { struct xfs_trans *bc_tp; /* transaction we're in, if any */ struct xfs_mount *bc_mp; /* file system mount struct */ @@ -243,7 +243,7 @@ typedef struct xfs_btree_cur struct xfs_btree_cur_ag bc_ag; struct xfs_btree_cur_ino bc_ino; }; -} xfs_btree_cur_t; +}; /* cursor flags */ #define XFS_BTREE_LONG_PTRS (1<<0) /* pointers are 64bits long */ @@ -309,7 +309,7 @@ xfs_btree_check_sptr( */ void xfs_btree_del_cursor( - xfs_btree_cur_t *cur, /* btree cursor */ + struct xfs_btree_cur *cur, /* btree cursor */ int error); /* del because of error */ /* @@ -318,8 +318,8 @@ xfs_btree_del_cursor( */ int /* error */ xfs_btree_dup_cursor( - xfs_btree_cur_t *cur, /* input cursor */ - xfs_btree_cur_t **ncur);/* output cursor */ + struct xfs_btree_cur *cur, /* input cursor */ + struct xfs_btree_cur **ncur);/* output cursor */ /* * Compute first and last byte offsets for the fields given. @@ -527,7 +527,7 @@ struct xfs_ifork *xfs_btree_ifork_ptr(struct xfs_btree_cur *cur); /* Does this cursor point to the last block in the given level? */ static inline bool xfs_btree_islastblock( - xfs_btree_cur_t *cur, + struct xfs_btree_cur *cur, int level) { struct xfs_btree_block *block; From patchwork Sat Sep 18 01:29:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73842C433EF for ; Sat, 18 Sep 2021 01:29:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 598436113A for ; Sat, 18 Sep 2021 01:29:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234182AbhIRBao (ORCPT ); Fri, 17 Sep 2021 21:30:44 -0400 Received: from mail.kernel.org ([198.145.29.99]:36876 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230044AbhIRBan (ORCPT ); Fri, 17 Sep 2021 21:30:43 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 24CA160FBF; Sat, 18 Sep 2021 01:29:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928561; bh=ecWs/yi+xC1fUgYM9NAv5DBhswEr1v4efojwYWVRV+c=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=GfrNF2Wo1mctFr1K5djwDn4pJzcmaAJh10b+Y6L/4HqacTfG63Bq0C9kl3RJkULlH 5hqDwSxffjcIjZKQxLeEgrzV0fOtY+V5g2WK4bwPrz/U/n2L0EwRH5q3i9+WZpDcBM yHV3grI8sCnmLDgiXPIkDqXI42IrJg/+njkxUaivV0rey2V4FX/bQ7mp4lqy4/vWdN xLqTeDQn6GzlYsVSS9xfslZORcql6Lkr3s9REt41ALcc7vO9bS1EbpSI7D3b1evyA5 ZADcIcusPAv/IBKg6+fA1U2oZjnXJmkxNHD4WEkc28OhVrhe0IoeTFfQID7G2RdZ4S JtQmLIeV1CDpA== Subject: [PATCH 02/14] xfs: don't allocate scrub contexts on the stack From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:20 -0700 Message-ID: <163192856086.416199.10504751435007741959.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Convert the on-stack scrub context, btree scrub context, and da btree scrub context into a heap allocation so that we reduce stack usage and gain the ability to handle tall btrees without issue. Specifically, this saves us ~208 bytes for the dabtree scrub, ~464 bytes for the btree scrub, and ~200 bytes for the main scrub context. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/scrub/btree.c | 54 ++++++++++++++++++++++++------------------ fs/xfs/scrub/btree.h | 1 + fs/xfs/scrub/dabtree.c | 62 ++++++++++++++++++++++++++---------------------- fs/xfs/scrub/scrub.c | 60 ++++++++++++++++++++++++++-------------------- 4 files changed, 98 insertions(+), 79 deletions(-) diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c index eccb855dc904..26dcb4691e31 100644 --- a/fs/xfs/scrub/btree.c +++ b/fs/xfs/scrub/btree.c @@ -627,15 +627,8 @@ xchk_btree( const struct xfs_owner_info *oinfo, void *private) { - struct xchk_btree bs = { - .cur = cur, - .scrub_rec = scrub_fn, - .oinfo = oinfo, - .firstrec = true, - .private = private, - .sc = sc, - }; union xfs_btree_ptr ptr; + struct xchk_btree *bs; union xfs_btree_ptr *pp; union xfs_btree_rec *recp; struct xfs_btree_block *block; @@ -646,10 +639,24 @@ xchk_btree( int i; int error = 0; + /* + * Allocate the btree scrub context from the heap, because this + * structure can get rather large. + */ + bs = kmem_zalloc(sizeof(struct xchk_btree), KM_NOFS | KM_MAYFAIL); + if (!bs) + return -ENOMEM; + bs->cur = cur; + bs->scrub_rec = scrub_fn; + bs->oinfo = oinfo; + bs->firstrec = true; + bs->private = private; + bs->sc = sc; + /* Initialize scrub state */ for (i = 0; i < XFS_BTREE_MAXLEVELS; i++) - bs.firstkey[i] = true; - INIT_LIST_HEAD(&bs.to_check); + bs->firstkey[i] = true; + INIT_LIST_HEAD(&bs->to_check); /* Don't try to check a tree with a height we can't handle. */ if (cur->bc_nlevels > XFS_BTREE_MAXLEVELS) { @@ -663,9 +670,9 @@ xchk_btree( */ level = cur->bc_nlevels - 1; cur->bc_ops->init_ptr_from_cur(cur, &ptr); - if (!xchk_btree_ptr_ok(&bs, cur->bc_nlevels, &ptr)) + if (!xchk_btree_ptr_ok(bs, cur->bc_nlevels, &ptr)) goto out; - error = xchk_btree_get_block(&bs, level, &ptr, &block, &bp); + error = xchk_btree_get_block(bs, level, &ptr, &block, &bp); if (error || !block) goto out; @@ -678,7 +685,7 @@ xchk_btree( /* End of leaf, pop back towards the root. */ if (cur->bc_ptrs[level] > be16_to_cpu(block->bb_numrecs)) { - xchk_btree_block_keys(&bs, level, block); + xchk_btree_block_keys(bs, level, block); if (level < cur->bc_nlevels - 1) cur->bc_ptrs[level + 1]++; level++; @@ -686,11 +693,11 @@ xchk_btree( } /* Records in order for scrub? */ - xchk_btree_rec(&bs); + xchk_btree_rec(bs); /* Call out to the record checker. */ recp = xfs_btree_rec_addr(cur, cur->bc_ptrs[0], block); - error = bs.scrub_rec(&bs, recp); + error = bs->scrub_rec(bs, recp); if (error) break; if (xchk_should_terminate(sc, &error) || @@ -703,7 +710,7 @@ xchk_btree( /* End of node, pop back towards the root. */ if (cur->bc_ptrs[level] > be16_to_cpu(block->bb_numrecs)) { - xchk_btree_block_keys(&bs, level, block); + xchk_btree_block_keys(bs, level, block); if (level < cur->bc_nlevels - 1) cur->bc_ptrs[level + 1]++; level++; @@ -711,16 +718,16 @@ xchk_btree( } /* Keys in order for scrub? */ - xchk_btree_key(&bs, level); + xchk_btree_key(bs, level); /* Drill another level deeper. */ pp = xfs_btree_ptr_addr(cur, cur->bc_ptrs[level], block); - if (!xchk_btree_ptr_ok(&bs, level, pp)) { + if (!xchk_btree_ptr_ok(bs, level, pp)) { cur->bc_ptrs[level]++; continue; } level--; - error = xchk_btree_get_block(&bs, level, pp, &block, &bp); + error = xchk_btree_get_block(bs, level, pp, &block, &bp); if (error || !block) goto out; @@ -729,13 +736,14 @@ xchk_btree( out: /* Process deferred owner checks on btree blocks. */ - list_for_each_entry_safe(co, n, &bs.to_check, list) { - if (!error && bs.cur) - error = xchk_btree_check_block_owner(&bs, - co->level, co->daddr); + list_for_each_entry_safe(co, n, &bs->to_check, list) { + if (!error && bs->cur) + error = xchk_btree_check_block_owner(bs, co->level, + co->daddr); list_del(&co->list); kmem_free(co); } + kmem_free(bs); return error; } diff --git a/fs/xfs/scrub/btree.h b/fs/xfs/scrub/btree.h index b7d2fc01fbf9..d5c0b0cbc505 100644 --- a/fs/xfs/scrub/btree.h +++ b/fs/xfs/scrub/btree.h @@ -44,6 +44,7 @@ struct xchk_btree { bool firstkey[XFS_BTREE_MAXLEVELS]; struct list_head to_check; }; + int xchk_btree(struct xfs_scrub *sc, struct xfs_btree_cur *cur, xchk_btree_rec_fn scrub_fn, const struct xfs_owner_info *oinfo, void *private); diff --git a/fs/xfs/scrub/dabtree.c b/fs/xfs/scrub/dabtree.c index 8a52514bc1ff..b962cfbbd92b 100644 --- a/fs/xfs/scrub/dabtree.c +++ b/fs/xfs/scrub/dabtree.c @@ -473,7 +473,7 @@ xchk_da_btree( xchk_da_btree_rec_fn scrub_fn, void *private) { - struct xchk_da_btree ds = {}; + struct xchk_da_btree *ds; struct xfs_mount *mp = sc->mp; struct xfs_da_state_blk *blks; struct xfs_da_node_entry *key; @@ -486,32 +486,35 @@ xchk_da_btree( return 0; /* Set up initial da state. */ - ds.dargs.dp = sc->ip; - ds.dargs.whichfork = whichfork; - ds.dargs.trans = sc->tp; - ds.dargs.op_flags = XFS_DA_OP_OKNOENT; - ds.state = xfs_da_state_alloc(&ds.dargs); - ds.sc = sc; - ds.private = private; + ds = kmem_zalloc(sizeof(struct xchk_da_btree), KM_NOFS | KM_MAYFAIL); + if (!ds) + return -ENOMEM; + ds->dargs.dp = sc->ip; + ds->dargs.whichfork = whichfork; + ds->dargs.trans = sc->tp; + ds->dargs.op_flags = XFS_DA_OP_OKNOENT; + ds->state = xfs_da_state_alloc(&ds->dargs); + ds->sc = sc; + ds->private = private; if (whichfork == XFS_ATTR_FORK) { - ds.dargs.geo = mp->m_attr_geo; - ds.lowest = 0; - ds.highest = 0; + ds->dargs.geo = mp->m_attr_geo; + ds->lowest = 0; + ds->highest = 0; } else { - ds.dargs.geo = mp->m_dir_geo; - ds.lowest = ds.dargs.geo->leafblk; - ds.highest = ds.dargs.geo->freeblk; + ds->dargs.geo = mp->m_dir_geo; + ds->lowest = ds->dargs.geo->leafblk; + ds->highest = ds->dargs.geo->freeblk; } - blkno = ds.lowest; + blkno = ds->lowest; level = 0; /* Find the root of the da tree, if present. */ - blks = ds.state->path.blk; - error = xchk_da_btree_block(&ds, level, blkno); + blks = ds->state->path.blk; + error = xchk_da_btree_block(ds, level, blkno); if (error) goto out_state; /* - * We didn't find a block at ds.lowest, which means that there's + * We didn't find a block at ds->lowest, which means that there's * no LEAF1/LEAFN tree (at least not where it's supposed to be), * so jump out now. */ @@ -523,16 +526,16 @@ xchk_da_btree( /* Handle leaf block. */ if (blks[level].magic != XFS_DA_NODE_MAGIC) { /* End of leaf, pop back towards the root. */ - if (blks[level].index >= ds.maxrecs[level]) { + if (blks[level].index >= ds->maxrecs[level]) { if (level > 0) blks[level - 1].index++; - ds.tree_level++; + ds->tree_level++; level--; continue; } /* Dispatch record scrubbing. */ - error = scrub_fn(&ds, level); + error = scrub_fn(ds, level); if (error) break; if (xchk_should_terminate(sc, &error) || @@ -545,17 +548,17 @@ xchk_da_btree( /* End of node, pop back towards the root. */ - if (blks[level].index >= ds.maxrecs[level]) { + if (blks[level].index >= ds->maxrecs[level]) { if (level > 0) blks[level - 1].index++; - ds.tree_level++; + ds->tree_level++; level--; continue; } /* Hashes in order for scrub? */ - key = xchk_da_btree_node_entry(&ds, level); - error = xchk_da_btree_hash(&ds, level, &key->hashval); + key = xchk_da_btree_node_entry(ds, level); + error = xchk_da_btree_hash(ds, level, &key->hashval); if (error) goto out; @@ -564,11 +567,11 @@ xchk_da_btree( level++; if (level >= XFS_DA_NODE_MAXDEPTH) { /* Too deep! */ - xchk_da_set_corrupt(&ds, level - 1); + xchk_da_set_corrupt(ds, level - 1); break; } - ds.tree_level--; - error = xchk_da_btree_block(&ds, level, blkno); + ds->tree_level--; + error = xchk_da_btree_block(ds, level, blkno); if (error) goto out; if (blks[level].bp == NULL) @@ -587,6 +590,7 @@ xchk_da_btree( } out_state: - xfs_da_state_free(ds.state); + xfs_da_state_free(ds->state); + kmem_free(ds); return error; } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 51e4c61916d2..0569b15526ea 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -461,15 +461,10 @@ xfs_scrub_metadata( struct file *file, struct xfs_scrub_metadata *sm) { - struct xfs_scrub sc = { - .file = file, - .sm = sm, - }; + struct xfs_scrub *sc; struct xfs_mount *mp = XFS_I(file_inode(file))->i_mount; int error = 0; - sc.mp = mp; - BUILD_BUG_ON(sizeof(meta_scrub_ops) != (sizeof(struct xchk_meta_ops) * XFS_SCRUB_TYPE_NR)); @@ -489,59 +484,68 @@ xfs_scrub_metadata( xchk_experimental_warning(mp); - sc.ops = &meta_scrub_ops[sm->sm_type]; - sc.sick_mask = xchk_health_mask_for_scrub_type(sm->sm_type); + sc = kmem_zalloc(sizeof(struct xfs_scrub), KM_NOFS | KM_MAYFAIL); + if (!sc) { + error = -ENOMEM; + goto out; + } + + sc->mp = mp; + sc->file = file; + sc->sm = sm; + sc->ops = &meta_scrub_ops[sm->sm_type]; + sc->sick_mask = xchk_health_mask_for_scrub_type(sm->sm_type); retry_op: /* * When repairs are allowed, prevent freezing or readonly remount while * scrub is running with a real transaction. */ if (sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) { - error = mnt_want_write_file(sc.file); + error = mnt_want_write_file(sc->file); if (error) goto out; } /* Set up for the operation. */ - error = sc.ops->setup(&sc); + error = sc->ops->setup(sc); if (error) goto out_teardown; /* Scrub for errors. */ - error = sc.ops->scrub(&sc); - if (!(sc.flags & XCHK_TRY_HARDER) && error == -EDEADLOCK) { + error = sc->ops->scrub(sc); + if (!(sc->flags & XCHK_TRY_HARDER) && error == -EDEADLOCK) { /* * Scrubbers return -EDEADLOCK to mean 'try harder'. * Tear down everything we hold, then set up again with * preparation for worst-case scenarios. */ - error = xchk_teardown(&sc, 0); + error = xchk_teardown(sc, 0); if (error) goto out; - sc.flags |= XCHK_TRY_HARDER; + sc->flags |= XCHK_TRY_HARDER; goto retry_op; } else if (error || (sm->sm_flags & XFS_SCRUB_OFLAG_INCOMPLETE)) goto out_teardown; - xchk_update_health(&sc); + xchk_update_health(sc); - if ((sc.sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) && - !(sc.flags & XREP_ALREADY_FIXED)) { + if ((sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) && + !(sc->flags & XREP_ALREADY_FIXED)) { bool needs_fix; /* Let debug users force us into the repair routines. */ if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_FORCE_SCRUB_REPAIR)) - sc.sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT; + sc->sm->sm_flags |= XFS_SCRUB_OFLAG_CORRUPT; - needs_fix = (sc.sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT | - XFS_SCRUB_OFLAG_XCORRUPT | - XFS_SCRUB_OFLAG_PREEN)); + needs_fix = (sc->sm->sm_flags & (XFS_SCRUB_OFLAG_CORRUPT | + XFS_SCRUB_OFLAG_XCORRUPT | + XFS_SCRUB_OFLAG_PREEN)); /* * If userspace asked for a repair but it wasn't necessary, * report that back to userspace. */ if (!needs_fix) { - sc.sm->sm_flags |= XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED; + sc->sm->sm_flags |= XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED; goto out_nofix; } @@ -549,26 +553,28 @@ xfs_scrub_metadata( * If it's broken, userspace wants us to fix it, and we haven't * already tried to fix it, then attempt a repair. */ - error = xrep_attempt(&sc); + error = xrep_attempt(sc); if (error == -EAGAIN) { /* * Either the repair function succeeded or it couldn't * get all the resources it needs; either way, we go * back to the beginning and call the scrub function. */ - error = xchk_teardown(&sc, 0); + error = xchk_teardown(sc, 0); if (error) { xrep_failure(mp); - goto out; + goto out_sc; } goto retry_op; } } out_nofix: - xchk_postmortem(&sc); + xchk_postmortem(sc); out_teardown: - error = xchk_teardown(&sc, error); + error = xchk_teardown(sc, error); +out_sc: + kmem_free(sc); out: trace_xchk_done(XFS_I(file_inode(file)), sm, error); if (error == -EFSCORRUPTED || error == -EFSBADCRC) { From patchwork Sat Sep 18 01:29:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503397 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3119C433EF for ; Sat, 18 Sep 2021 01:29:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C8D1A6113A for ; Sat, 18 Sep 2021 01:29:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234430AbhIRBat (ORCPT ); Fri, 17 Sep 2021 21:30:49 -0400 Received: from mail.kernel.org ([198.145.29.99]:36900 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230044AbhIRBat (ORCPT ); Fri, 17 Sep 2021 21:30:49 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9B9B860FBF; Sat, 18 Sep 2021 01:29:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928566; bh=Ojnq2Ke6G6+ibnwYJkseQ1hCQ0Oh4pLqAJ0daN6wf+s=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=RKpubasQsWR7WfC2/wZ3Q2WKD2jHOcDJZg4k87SmE+Hmde7GhMaaXW4zmggUNzDTf eoTpNU4NtROOsXCCI3yGKpKafzatB8RAvWbq4MjQYlzmwZY/Lhzlh20JDUdREKgQVr 6cr9nBIw86aFRt3lbbLz9Le1qIhuLHi5a66Vi32sCQSSbISAq26iIqQLcIgvVLI0tS qgMWRJ8+Xl0CJ2RrLZTVtChFRDglf8KUv0dhETiF8r2ficQvfUzZOtnqI8Y4W6j5eX 5TeolOyPTKIXTkGQszJ59y6Dx3+5ETjjspGhQye8TKypQ0/KL5eyBlJ9zLr9MQzvEa Bgw9lbpVcrvRw== Subject: [PATCH 03/14] xfs: dynamically allocate btree scrub context structure From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:26 -0700 Message-ID: <163192856634.416199.12496831484611764326.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Reorganize struct xchk_btree so that we can dynamically size the context structure to fit the type of btree cursor that we have. This will enable us to use memory more efficiently once we start adding very tall btree types. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/scrub/btree.c | 38 +++++++++++++++++--------------------- fs/xfs/scrub/btree.h | 16 +++++++++++++--- 2 files changed, 30 insertions(+), 24 deletions(-) diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c index 26dcb4691e31..7b7762ae22e5 100644 --- a/fs/xfs/scrub/btree.c +++ b/fs/xfs/scrub/btree.c @@ -141,9 +141,10 @@ xchk_btree_rec( trace_xchk_btree_rec(bs->sc, cur, 0); /* If this isn't the first record, are they in order? */ - if (!bs->firstrec && !cur->bc_ops->recs_inorder(cur, &bs->lastrec, rec)) + if (bs->levels[0].has_lastkey && + !cur->bc_ops->recs_inorder(cur, &bs->lastrec, rec)) xchk_btree_set_corrupt(bs->sc, cur, 0); - bs->firstrec = false; + bs->levels[0].has_lastkey = true; memcpy(&bs->lastrec, rec, cur->bc_ops->rec_len); if (cur->bc_nlevels == 1) @@ -188,11 +189,11 @@ xchk_btree_key( trace_xchk_btree_key(bs->sc, cur, level); /* If this isn't the first key, are they in order? */ - if (!bs->firstkey[level] && - !cur->bc_ops->keys_inorder(cur, &bs->lastkey[level], key)) + if (bs->levels[level].has_lastkey && + !cur->bc_ops->keys_inorder(cur, &bs->levels[level].lastkey, key)) xchk_btree_set_corrupt(bs->sc, cur, level); - bs->firstkey[level] = false; - memcpy(&bs->lastkey[level], key, cur->bc_ops->key_len); + bs->levels[level].has_lastkey = true; + memcpy(&bs->levels[level].lastkey, key, cur->bc_ops->key_len); if (level + 1 >= cur->bc_nlevels) return; @@ -632,38 +633,33 @@ xchk_btree( union xfs_btree_ptr *pp; union xfs_btree_rec *recp; struct xfs_btree_block *block; - int level; struct xfs_buf *bp; struct check_owner *co; struct check_owner *n; - int i; + size_t cur_sz; + int level; int error = 0; /* * Allocate the btree scrub context from the heap, because this - * structure can get rather large. + * structure can get rather large. Don't let a caller feed us a + * totally absurd size. */ - bs = kmem_zalloc(sizeof(struct xchk_btree), KM_NOFS | KM_MAYFAIL); + cur_sz = xchk_btree_sizeof(cur->bc_nlevels); + if (cur_sz > PAGE_SIZE) { + xchk_btree_set_corrupt(sc, cur, 0); + return 0; + } + bs = kmem_zalloc(cur_sz, KM_NOFS | KM_MAYFAIL); if (!bs) return -ENOMEM; bs->cur = cur; bs->scrub_rec = scrub_fn; bs->oinfo = oinfo; - bs->firstrec = true; bs->private = private; bs->sc = sc; - - /* Initialize scrub state */ - for (i = 0; i < XFS_BTREE_MAXLEVELS; i++) - bs->firstkey[i] = true; INIT_LIST_HEAD(&bs->to_check); - /* Don't try to check a tree with a height we can't handle. */ - if (cur->bc_nlevels > XFS_BTREE_MAXLEVELS) { - xchk_btree_set_corrupt(sc, cur, 0); - goto out; - } - /* * Load the root of the btree. The helper function absorbs * error codes for us. diff --git a/fs/xfs/scrub/btree.h b/fs/xfs/scrub/btree.h index d5c0b0cbc505..7f8c54d8020e 100644 --- a/fs/xfs/scrub/btree.h +++ b/fs/xfs/scrub/btree.h @@ -29,6 +29,11 @@ typedef int (*xchk_btree_rec_fn)( struct xchk_btree *bs, const union xfs_btree_rec *rec); +struct xchk_btree_levels { + union xfs_btree_key lastkey; + bool has_lastkey; +}; + struct xchk_btree { /* caller-provided scrub state */ struct xfs_scrub *sc; @@ -39,12 +44,17 @@ struct xchk_btree { /* internal scrub state */ union xfs_btree_rec lastrec; - bool firstrec; - union xfs_btree_key lastkey[XFS_BTREE_MAXLEVELS]; - bool firstkey[XFS_BTREE_MAXLEVELS]; struct list_head to_check; + struct xchk_btree_levels levels[]; }; +static inline size_t +xchk_btree_sizeof(unsigned int levels) +{ + return sizeof(struct xchk_btree) + + (levels * sizeof(struct xchk_btree_levels)); +} + int xchk_btree(struct xfs_scrub *sc, struct xfs_btree_cur *cur, xchk_btree_rec_fn scrub_fn, const struct xfs_owner_info *oinfo, void *private); From patchwork Sat Sep 18 01:29:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503399 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50480C433F5 for ; Sat, 18 Sep 2021 01:29:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 347286112E for ; Sat, 18 Sep 2021 01:29:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230044AbhIRBaz (ORCPT ); Fri, 17 Sep 2021 21:30:55 -0400 Received: from mail.kernel.org ([198.145.29.99]:36928 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234491AbhIRBay (ORCPT ); Fri, 17 Sep 2021 21:30:54 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 16A2960FBF; Sat, 18 Sep 2021 01:29:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928572; bh=ikFCyOytKcPdRQFStzh9UtWulVY1CfifxapE/vtYPo0=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=PzTHQMzQpva1rLO70Fh1IzIhjjoD7PKORhYDC29Fx97zySNIrqzHVs4770whuIyE7 3459k/3t7fpTV6AsxY/Nf1lrK3h0TcFN2ZriDEqRjgI3LTtwNakM5xKJO0xRkZJ6ke 6ebuHJ2T1pLtYQeVB3kdqkWjm5/I3qsd/KMbEtOSkKXERfDgtIb9BTs8n8a9RJ+vam y5iF+TefNOiyP8SLxEN9bfpggH+Lks3iKDCtIb05pMHv5pmlBzpJmACvKCA/cmZaEm i9yA5AsgHTOuw5pW2IqGtpDf/jLCixaxy4G2KqbR+FdZqHP1bM/s/QTSYV7R24FQpe +owgoJIs1Ryfg== Subject: [PATCH 04/14] xfs: stricter btree height checking when looking for errors From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:31 -0700 Message-ID: <163192857182.416199.9310383108874723785.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Since each btree type has its own precomputed maxlevels variable now, use them instead of the generic XFS_BTREE_MAXLEVELS to check the level of each per-AG btree. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/scrub/agheader.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/fs/xfs/scrub/agheader.c b/fs/xfs/scrub/agheader.c index ae3c9f6e2c69..a2c3af77b6c2 100644 --- a/fs/xfs/scrub/agheader.c +++ b/fs/xfs/scrub/agheader.c @@ -555,11 +555,11 @@ xchk_agf( xchk_block_set_corrupt(sc, sc->sa.agf_bp); level = be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNO]); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > mp->m_ag_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agf_bp); level = be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNT]); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > mp->m_ag_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agf_bp); if (xfs_has_rmapbt(mp)) { @@ -568,7 +568,7 @@ xchk_agf( xchk_block_set_corrupt(sc, sc->sa.agf_bp); level = be32_to_cpu(agf->agf_levels[XFS_BTNUM_RMAP]); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > mp->m_rmap_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agf_bp); } @@ -578,7 +578,7 @@ xchk_agf( xchk_block_set_corrupt(sc, sc->sa.agf_bp); level = be32_to_cpu(agf->agf_refcount_level); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > mp->m_refc_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agf_bp); } @@ -850,6 +850,7 @@ xchk_agi( struct xfs_mount *mp = sc->mp; struct xfs_agi *agi; struct xfs_perag *pag; + struct xfs_ino_geometry *igeo = M_IGEO(sc->mp); xfs_agnumber_t agno = sc->sm->sm_agno; xfs_agblock_t agbno; xfs_agblock_t eoag; @@ -880,7 +881,7 @@ xchk_agi( xchk_block_set_corrupt(sc, sc->sa.agi_bp); level = be32_to_cpu(agi->agi_level); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > igeo->inobt_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agi_bp); if (xfs_has_finobt(mp)) { @@ -889,7 +890,7 @@ xchk_agi( xchk_block_set_corrupt(sc, sc->sa.agi_bp); level = be32_to_cpu(agi->agi_free_level); - if (level <= 0 || level > XFS_BTREE_MAXLEVELS) + if (level <= 0 || level > igeo->inobt_maxlevels) xchk_block_set_corrupt(sc, sc->sa.agi_bp); } From patchwork Sat Sep 18 01:29:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503401 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4F23C433F5 for ; Sat, 18 Sep 2021 01:29:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AAF7D6112E for ; Sat, 18 Sep 2021 01:29:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234490AbhIRBbA (ORCPT ); Fri, 17 Sep 2021 21:31:00 -0400 Received: from mail.kernel.org ([198.145.29.99]:36968 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234491AbhIRBbA (ORCPT ); Fri, 17 Sep 2021 21:31:00 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 8B4BF6112E; Sat, 18 Sep 2021 01:29:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928577; bh=U/9Qu3oukMyZK286hOUMfC8/pnJuPu4at7I3d6qFlSU=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=nNSoZ7o0AZS9HNti3ZPuIe/FzBvob83H0CRXANwyWzq6cBmb8J+bZc1oThuLyuwIF EoH68dPPx4d72G7fH8NOBXxSflUUswdobTd0MFxgOKa8r+dSE9Kg+dNZ2XHNPc5kCT f8SPgMOOSSBZ2zPRQ6JRVN59u+1v1wqFkyH0iEA3FcttVlREoQtIrt/zdiWrwKRYFO QA4Sar4HyudtpI+CXPmdKwo8wkDui7dDV65s7Q74AEK7E7m5Fx1An/lOdnVZS7vkOm QmVIRKSHn8ngtR7tIZPjoFyPABcZBFOXkubFkTLb+dgnezwxVp6Kyk4cHSnu7nkBpa tcrJfLwP2VFpg== Subject: [PATCH 05/14] xfs: stricter btree height checking when scanning for btree roots From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:37 -0700 Message-ID: <163192857728.416199.11679791890386351921.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong When we're scanning for btree roots to rebuild the AG headers, make sure that the proposed tree does not exceed the maximum height for that btree type (and not just XFS_BTREE_MAXLEVELS). Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/scrub/agheader_repair.c | 8 +++++++- fs/xfs/scrub/repair.h | 3 +++ 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/fs/xfs/scrub/agheader_repair.c b/fs/xfs/scrub/agheader_repair.c index 0f8deee66f15..05c27149b65d 100644 --- a/fs/xfs/scrub/agheader_repair.c +++ b/fs/xfs/scrub/agheader_repair.c @@ -122,7 +122,7 @@ xrep_check_btree_root( xfs_agnumber_t agno = sc->sm->sm_agno; return xfs_verify_agbno(mp, agno, fab->root) && - fab->height <= XFS_BTREE_MAXLEVELS; + fab->height <= fab->maxlevels; } /* @@ -339,18 +339,22 @@ xrep_agf( [XREP_AGF_BNOBT] = { .rmap_owner = XFS_RMAP_OWN_AG, .buf_ops = &xfs_bnobt_buf_ops, + .maxlevels = sc->mp->m_ag_maxlevels, }, [XREP_AGF_CNTBT] = { .rmap_owner = XFS_RMAP_OWN_AG, .buf_ops = &xfs_cntbt_buf_ops, + .maxlevels = sc->mp->m_ag_maxlevels, }, [XREP_AGF_RMAPBT] = { .rmap_owner = XFS_RMAP_OWN_AG, .buf_ops = &xfs_rmapbt_buf_ops, + .maxlevels = sc->mp->m_rmap_maxlevels, }, [XREP_AGF_REFCOUNTBT] = { .rmap_owner = XFS_RMAP_OWN_REFC, .buf_ops = &xfs_refcountbt_buf_ops, + .maxlevels = sc->mp->m_refc_maxlevels, }, [XREP_AGF_END] = { .buf_ops = NULL, @@ -881,10 +885,12 @@ xrep_agi( [XREP_AGI_INOBT] = { .rmap_owner = XFS_RMAP_OWN_INOBT, .buf_ops = &xfs_inobt_buf_ops, + .maxlevels = M_IGEO(sc->mp)->inobt_maxlevels, }, [XREP_AGI_FINOBT] = { .rmap_owner = XFS_RMAP_OWN_INOBT, .buf_ops = &xfs_finobt_buf_ops, + .maxlevels = M_IGEO(sc->mp)->inobt_maxlevels, }, [XREP_AGI_END] = { .buf_ops = NULL diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index 3bb152d52a07..840f74ec431c 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -44,6 +44,9 @@ struct xrep_find_ag_btree { /* in: buffer ops */ const struct xfs_buf_ops *buf_ops; + /* in: maximum btree height */ + unsigned int maxlevels; + /* out: the highest btree block found and the tree height */ xfs_agblock_t root; unsigned int height; From patchwork Sat Sep 18 01:29:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 526ADC433EF for ; Sat, 18 Sep 2021 01:29:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 300DA61152 for ; Sat, 18 Sep 2021 01:29:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234491AbhIRBbG (ORCPT ); Fri, 17 Sep 2021 21:31:06 -0400 Received: from mail.kernel.org ([198.145.29.99]:37016 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234492AbhIRBbF (ORCPT ); Fri, 17 Sep 2021 21:31:05 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 06F5F6112E; Sat, 18 Sep 2021 01:29:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928583; bh=LyGKZh25qDkbXOV4fTNukRG2/PAIjNqQ5GVg2Fv0EQQ=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=faaz58oSiXizO7S+nLMVQwxWDz4qZZ5MA5pNRZP0t8FNrc6WLiU1kF0NtlKD3TEr7 +RHWnQQyK2MeXGZjSjKrUDWvF8Sn0/4ury+ilgw+ByFFiH+qaWLG56liGCEcssel0A Zz2FybErG4j9sg92djCi5dhGrTKPt6E8lYOzT/8ej5AN+GCzqHD3om9bMOIQtWHuwU z9YW7qkVHafmdfB6gnYNfrT4n6nl+pBF8rZQwaWgeTxNGy9MbE0fzXUf1u2TUYFw7n GLHTwxCXFTinh0Tract90nqG4f9iGdS6BmKL0FYZ+Y3GvMdDO7hLeJ6OzpLnHJ98dm E0Rv6+zM9/SLA== Subject: [PATCH 06/14] xfs: check that bc_nlevels never overflows From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:42 -0700 Message-ID: <163192858276.416199.6204001049315596078.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Warn if we ever bump nlevels higher than the allowed maximum cursor height. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_btree.c | 2 ++ fs/xfs/libxfs/xfs_btree_staging.c | 2 ++ 2 files changed, 4 insertions(+) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index b0cce0932f02..bc4e49f0456a 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -2933,6 +2933,7 @@ xfs_btree_new_iroot( be16_add_cpu(&block->bb_level, 1); xfs_btree_set_numrecs(block, 1); cur->bc_nlevels++; + ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); cur->bc_ptrs[level + 1] = 1; kp = xfs_btree_key_addr(cur, 1, block); @@ -3096,6 +3097,7 @@ xfs_btree_new_root( xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); cur->bc_ptrs[cur->bc_nlevels] = nptr; cur->bc_nlevels++; + ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); *stat = 1; return 0; error0: diff --git a/fs/xfs/libxfs/xfs_btree_staging.c b/fs/xfs/libxfs/xfs_btree_staging.c index ac9e80152b5c..26143297bb7b 100644 --- a/fs/xfs/libxfs/xfs_btree_staging.c +++ b/fs/xfs/libxfs/xfs_btree_staging.c @@ -703,6 +703,7 @@ xfs_btree_bload_compute_geometry( * block-based btree level. */ cur->bc_nlevels++; + ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); xfs_btree_bload_level_geometry(cur, bbl, level, nr_this_level, &avg_per_block, &level_blocks, &dontcare64); @@ -718,6 +719,7 @@ xfs_btree_bload_compute_geometry( /* Otherwise, we need another level of btree. */ cur->bc_nlevels++; + ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); } nr_blocks += level_blocks; From patchwork Sat Sep 18 01:29:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A8ECC433EF for ; Sat, 18 Sep 2021 01:29:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0944C6113A for ; Sat, 18 Sep 2021 01:29:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234492AbhIRBbM (ORCPT ); Fri, 17 Sep 2021 21:31:12 -0400 Received: from mail.kernel.org ([198.145.29.99]:37044 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234646AbhIRBbL (ORCPT ); Fri, 17 Sep 2021 21:31:11 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 7FECB60FBF; Sat, 18 Sep 2021 01:29:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928588; bh=zd3M5HOute6nrLOUUprQ73DPUf2vsBnL+zgDapQVWng=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=d1bAEaSphtLemOkJDH4w9reD1gKS/fRICS+4Y0IWnqxVLnrXVgQPlTUpDbTCn6iTt 89dtWtfW7SllCI1S/3cKz0YyCRWWAQ+nNl0OcbOMTUofHB/noBEQaNC61PyyYauQe4 BgaLUTy+d7P9/LW0m+SWVYFjjGDwfI/cxP33y+pQC2/60OKEChN2jYjJkIKk1w/vLx LcJZv5b4T3eMTaDDUqFirlMl5f7C/4Q/P+c0TiXog6ToIk6sB9GI/tlC7caFXxm2Om FDydt6ETx4rh3kfqN2VxOIz5f99zENW1Zk10c388K5/MoRH3aZNmqT8GZoTlBvlCiC NkhvDkKB3jLeA== Subject: [PATCH 07/14] xfs: support dynamic btree cursor heights From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:48 -0700 Message-ID: <163192858823.416199.17720760425094444911.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Split out the btree level information into a separate struct and put it at the end of the cursor structure as a VLA. The realtime rmap btree (which is rooted in an inode) will require the ability to support many more levels than a per-AG btree cursor, which means that we're going to create two btree cursor caches to conserve memory for the more common case. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_alloc.c | 6 +- fs/xfs/libxfs/xfs_bmap.c | 10 +-- fs/xfs/libxfs/xfs_btree.c | 154 +++++++++++++++++++++++---------------------- fs/xfs/libxfs/xfs_btree.h | 28 ++++++-- fs/xfs/scrub/bitmap.c | 16 ++--- fs/xfs/scrub/bmap.c | 2 - fs/xfs/scrub/btree.c | 40 ++++++------ fs/xfs/scrub/trace.c | 7 +- fs/xfs/scrub/trace.h | 10 +-- fs/xfs/xfs_super.c | 2 - fs/xfs/xfs_trace.h | 2 - 11 files changed, 147 insertions(+), 130 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 35fb1dd3be95..55c5adc9b54e 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -488,8 +488,8 @@ xfs_alloc_fixup_trees( struct xfs_btree_block *bnoblock; struct xfs_btree_block *cntblock; - bnoblock = XFS_BUF_TO_BLOCK(bno_cur->bc_bufs[0]); - cntblock = XFS_BUF_TO_BLOCK(cnt_cur->bc_bufs[0]); + bnoblock = XFS_BUF_TO_BLOCK(bno_cur->bc_levels[0].bp); + cntblock = XFS_BUF_TO_BLOCK(cnt_cur->bc_levels[0].bp); if (XFS_IS_CORRUPT(mp, bnoblock->bb_numrecs != @@ -1512,7 +1512,7 @@ xfs_alloc_ag_vextent_lastblock( * than minlen. */ if (*len || args->alignment > 1) { - acur->cnt->bc_ptrs[0] = 1; + acur->cnt->bc_levels[0].ptr = 1; do { error = xfs_alloc_get_rec(acur->cnt, bno, len, &i); if (error) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 499c977cbf56..644b956301b6 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -240,10 +240,10 @@ xfs_bmap_get_bp( return NULL; for (i = 0; i < XFS_BTREE_MAXLEVELS; i++) { - if (!cur->bc_bufs[i]) + if (!cur->bc_levels[i].bp) break; - if (xfs_buf_daddr(cur->bc_bufs[i]) == bno) - return cur->bc_bufs[i]; + if (xfs_buf_daddr(cur->bc_levels[i].bp) == bno) + return cur->bc_levels[i].bp; } /* Chase down all the log items to see if the bp is there */ @@ -629,8 +629,8 @@ xfs_bmap_btree_to_extents( ip->i_nblocks--; xfs_trans_mod_dquot_byino(tp, ip, XFS_TRANS_DQ_BCOUNT, -1L); xfs_trans_binval(tp, cbp); - if (cur->bc_bufs[0] == cbp) - cur->bc_bufs[0] = NULL; + if (cur->bc_levels[0].bp == cbp) + cur->bc_levels[0].bp = NULL; xfs_iroot_realloc(ip, -1, whichfork); ASSERT(ifp->if_broot == NULL); ifp->if_format = XFS_DINODE_FMT_EXTENTS; diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index bc4e49f0456a..93fb50516bc2 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -367,8 +367,8 @@ xfs_btree_del_cursor( * way we won't have initialized all the entries down to 0. */ for (i = 0; i < cur->bc_nlevels; i++) { - if (cur->bc_bufs[i]) - xfs_trans_brelse(cur->bc_tp, cur->bc_bufs[i]); + if (cur->bc_levels[i].bp) + xfs_trans_brelse(cur->bc_tp, cur->bc_levels[i].bp); else if (!error) break; } @@ -415,9 +415,9 @@ xfs_btree_dup_cursor( * For each level current, re-get the buffer and copy the ptr value. */ for (i = 0; i < new->bc_nlevels; i++) { - new->bc_ptrs[i] = cur->bc_ptrs[i]; - new->bc_ra[i] = cur->bc_ra[i]; - bp = cur->bc_bufs[i]; + new->bc_levels[i].ptr = cur->bc_levels[i].ptr; + new->bc_levels[i].ra = cur->bc_levels[i].ra; + bp = cur->bc_levels[i].bp; if (bp) { error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, xfs_buf_daddr(bp), mp->m_bsize, @@ -429,7 +429,7 @@ xfs_btree_dup_cursor( return error; } } - new->bc_bufs[i] = bp; + new->bc_levels[i].bp = bp; } *ncur = new; return 0; @@ -681,7 +681,7 @@ xfs_btree_get_block( return xfs_btree_get_iroot(cur); } - *bpp = cur->bc_bufs[level]; + *bpp = cur->bc_levels[level].bp; return XFS_BUF_TO_BLOCK(*bpp); } @@ -711,7 +711,7 @@ xfs_btree_firstrec( /* * Set the ptr value to 1, that's the first record/key. */ - cur->bc_ptrs[level] = 1; + cur->bc_levels[level].ptr = 1; return 1; } @@ -741,7 +741,7 @@ xfs_btree_lastrec( /* * Set the ptr value to numrecs, that's the last record/key. */ - cur->bc_ptrs[level] = be16_to_cpu(block->bb_numrecs); + cur->bc_levels[level].ptr = be16_to_cpu(block->bb_numrecs); return 1; } @@ -922,11 +922,11 @@ xfs_btree_readahead( (lev == cur->bc_nlevels - 1)) return 0; - if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev]) + if ((cur->bc_levels[lev].ra | lr) == cur->bc_levels[lev].ra) return 0; - cur->bc_ra[lev] |= lr; - block = XFS_BUF_TO_BLOCK(cur->bc_bufs[lev]); + cur->bc_levels[lev].ra |= lr; + block = XFS_BUF_TO_BLOCK(cur->bc_levels[lev].bp); if (cur->bc_flags & XFS_BTREE_LONG_PTRS) return xfs_btree_readahead_lblock(cur, lr, block); @@ -991,22 +991,22 @@ xfs_btree_setbuf( { struct xfs_btree_block *b; /* btree block */ - if (cur->bc_bufs[lev]) - xfs_trans_brelse(cur->bc_tp, cur->bc_bufs[lev]); - cur->bc_bufs[lev] = bp; - cur->bc_ra[lev] = 0; + if (cur->bc_levels[lev].bp) + xfs_trans_brelse(cur->bc_tp, cur->bc_levels[lev].bp); + cur->bc_levels[lev].bp = bp; + cur->bc_levels[lev].ra = 0; b = XFS_BUF_TO_BLOCK(bp); if (cur->bc_flags & XFS_BTREE_LONG_PTRS) { if (b->bb_u.l.bb_leftsib == cpu_to_be64(NULLFSBLOCK)) - cur->bc_ra[lev] |= XFS_BTCUR_LEFTRA; + cur->bc_levels[lev].ra |= XFS_BTCUR_LEFTRA; if (b->bb_u.l.bb_rightsib == cpu_to_be64(NULLFSBLOCK)) - cur->bc_ra[lev] |= XFS_BTCUR_RIGHTRA; + cur->bc_levels[lev].ra |= XFS_BTCUR_RIGHTRA; } else { if (b->bb_u.s.bb_leftsib == cpu_to_be32(NULLAGBLOCK)) - cur->bc_ra[lev] |= XFS_BTCUR_LEFTRA; + cur->bc_levels[lev].ra |= XFS_BTCUR_LEFTRA; if (b->bb_u.s.bb_rightsib == cpu_to_be32(NULLAGBLOCK)) - cur->bc_ra[lev] |= XFS_BTCUR_RIGHTRA; + cur->bc_levels[lev].ra |= XFS_BTCUR_RIGHTRA; } } @@ -1548,7 +1548,7 @@ xfs_btree_increment( #endif /* We're done if we remain in the block after the increment. */ - if (++cur->bc_ptrs[level] <= xfs_btree_get_numrecs(block)) + if (++cur->bc_levels[level].ptr <= xfs_btree_get_numrecs(block)) goto out1; /* Fail if we just went off the right edge of the tree. */ @@ -1571,7 +1571,7 @@ xfs_btree_increment( goto error0; #endif - if (++cur->bc_ptrs[lev] <= xfs_btree_get_numrecs(block)) + if (++cur->bc_levels[lev].ptr <= xfs_btree_get_numrecs(block)) break; /* Read-ahead the right block for the next loop. */ @@ -1598,14 +1598,14 @@ xfs_btree_increment( for (block = xfs_btree_get_block(cur, lev, &bp); lev > level; ) { union xfs_btree_ptr *ptrp; - ptrp = xfs_btree_ptr_addr(cur, cur->bc_ptrs[lev], block); + ptrp = xfs_btree_ptr_addr(cur, cur->bc_levels[lev].ptr, block); --lev; error = xfs_btree_read_buf_block(cur, ptrp, 0, &block, &bp); if (error) goto error0; xfs_btree_setbuf(cur, lev, bp); - cur->bc_ptrs[lev] = 1; + cur->bc_levels[lev].ptr = 1; } out1: *stat = 1; @@ -1641,7 +1641,7 @@ xfs_btree_decrement( xfs_btree_readahead(cur, level, XFS_BTCUR_LEFTRA); /* We're done if we remain in the block after the decrement. */ - if (--cur->bc_ptrs[level] > 0) + if (--cur->bc_levels[level].ptr > 0) goto out1; /* Get a pointer to the btree block. */ @@ -1665,7 +1665,7 @@ xfs_btree_decrement( * Stop when we don't go off the left edge of a block. */ for (lev = level + 1; lev < cur->bc_nlevels; lev++) { - if (--cur->bc_ptrs[lev] > 0) + if (--cur->bc_levels[lev].ptr > 0) break; /* Read-ahead the left block for the next loop. */ xfs_btree_readahead(cur, lev, XFS_BTCUR_LEFTRA); @@ -1691,13 +1691,13 @@ xfs_btree_decrement( for (block = xfs_btree_get_block(cur, lev, &bp); lev > level; ) { union xfs_btree_ptr *ptrp; - ptrp = xfs_btree_ptr_addr(cur, cur->bc_ptrs[lev], block); + ptrp = xfs_btree_ptr_addr(cur, cur->bc_levels[lev].ptr, block); --lev; error = xfs_btree_read_buf_block(cur, ptrp, 0, &block, &bp); if (error) goto error0; xfs_btree_setbuf(cur, lev, bp); - cur->bc_ptrs[lev] = xfs_btree_get_numrecs(block); + cur->bc_levels[lev].ptr = xfs_btree_get_numrecs(block); } out1: *stat = 1; @@ -1735,7 +1735,7 @@ xfs_btree_lookup_get_block( * * Otherwise throw it away and get a new one. */ - bp = cur->bc_bufs[level]; + bp = cur->bc_levels[level].bp; error = xfs_btree_ptr_to_daddr(cur, pp, &daddr); if (error) return error; @@ -1864,7 +1864,7 @@ xfs_btree_lookup( return -EFSCORRUPTED; } - cur->bc_ptrs[0] = dir != XFS_LOOKUP_LE; + cur->bc_levels[0].ptr = dir != XFS_LOOKUP_LE; *stat = 0; return 0; } @@ -1916,7 +1916,7 @@ xfs_btree_lookup( if (error) goto error0; - cur->bc_ptrs[level] = keyno; + cur->bc_levels[level].ptr = keyno; } } @@ -1933,7 +1933,7 @@ xfs_btree_lookup( !xfs_btree_ptr_is_null(cur, &ptr)) { int i; - cur->bc_ptrs[0] = keyno; + cur->bc_levels[0].ptr = keyno; error = xfs_btree_increment(cur, 0, &i); if (error) goto error0; @@ -1944,7 +1944,7 @@ xfs_btree_lookup( } } else if (dir == XFS_LOOKUP_LE && diff > 0) keyno--; - cur->bc_ptrs[0] = keyno; + cur->bc_levels[0].ptr = keyno; /* Return if we succeeded or not. */ if (keyno == 0 || keyno > xfs_btree_get_numrecs(block)) @@ -2104,7 +2104,7 @@ __xfs_btree_updkeys( if (error) return error; #endif - ptr = cur->bc_ptrs[level]; + ptr = cur->bc_levels[level].ptr; nlkey = xfs_btree_key_addr(cur, ptr, block); nhkey = xfs_btree_high_key_addr(cur, ptr, block); if (!force_all && @@ -2171,7 +2171,7 @@ xfs_btree_update_keys( if (error) return error; #endif - ptr = cur->bc_ptrs[level]; + ptr = cur->bc_levels[level].ptr; kp = xfs_btree_key_addr(cur, ptr, block); xfs_btree_copy_keys(cur, kp, &key, 1); xfs_btree_log_keys(cur, bp, ptr, ptr); @@ -2205,7 +2205,7 @@ xfs_btree_update( goto error0; #endif /* Get the address of the rec to be updated. */ - ptr = cur->bc_ptrs[0]; + ptr = cur->bc_levels[0].ptr; rp = xfs_btree_rec_addr(cur, ptr, block); /* Fill in the new contents and log them. */ @@ -2280,7 +2280,7 @@ xfs_btree_lshift( * If the cursor entry is the one that would be moved, don't * do it... it's too complicated. */ - if (cur->bc_ptrs[level] <= 1) + if (cur->bc_levels[level].ptr <= 1) goto out0; /* Set up the left neighbor as "left". */ @@ -2414,7 +2414,7 @@ xfs_btree_lshift( goto error0; /* Slide the cursor value left one. */ - cur->bc_ptrs[level]--; + cur->bc_levels[level].ptr--; *stat = 1; return 0; @@ -2476,7 +2476,7 @@ xfs_btree_rshift( * do it... it's too complicated. */ lrecs = xfs_btree_get_numrecs(left); - if (cur->bc_ptrs[level] >= lrecs) + if (cur->bc_levels[level].ptr >= lrecs) goto out0; /* Set up the right neighbor as "right". */ @@ -2664,7 +2664,7 @@ __xfs_btree_split( */ lrecs = xfs_btree_get_numrecs(left); rrecs = lrecs / 2; - if ((lrecs & 1) && cur->bc_ptrs[level] <= rrecs + 1) + if ((lrecs & 1) && cur->bc_levels[level].ptr <= rrecs + 1) rrecs++; src_index = (lrecs - rrecs + 1); @@ -2760,9 +2760,9 @@ __xfs_btree_split( * If it's just pointing past the last entry in left, then we'll * insert there, so don't change anything in that case. */ - if (cur->bc_ptrs[level] > lrecs + 1) { + if (cur->bc_levels[level].ptr > lrecs + 1) { xfs_btree_setbuf(cur, level, rbp); - cur->bc_ptrs[level] -= lrecs; + cur->bc_levels[level].ptr -= lrecs; } /* * If there are more levels, we'll need another cursor which refers @@ -2772,7 +2772,7 @@ __xfs_btree_split( error = xfs_btree_dup_cursor(cur, curp); if (error) goto error0; - (*curp)->bc_ptrs[level + 1]++; + (*curp)->bc_levels[level + 1].ptr++; } *ptrp = rptr; *stat = 1; @@ -2934,7 +2934,7 @@ xfs_btree_new_iroot( xfs_btree_set_numrecs(block, 1); cur->bc_nlevels++; ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); - cur->bc_ptrs[level + 1] = 1; + cur->bc_levels[level + 1].ptr = 1; kp = xfs_btree_key_addr(cur, 1, block); ckp = xfs_btree_key_addr(cur, 1, cblock); @@ -3095,7 +3095,7 @@ xfs_btree_new_root( /* Fix up the cursor. */ xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); - cur->bc_ptrs[cur->bc_nlevels] = nptr; + cur->bc_levels[cur->bc_nlevels].ptr = nptr; cur->bc_nlevels++; ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); *stat = 1; @@ -3154,7 +3154,7 @@ xfs_btree_make_block_unfull( return error; if (*stat) { - *oindex = *index = cur->bc_ptrs[level]; + *oindex = *index = cur->bc_levels[level].ptr; return 0; } @@ -3169,7 +3169,7 @@ xfs_btree_make_block_unfull( return error; - *index = cur->bc_ptrs[level]; + *index = cur->bc_levels[level].ptr; return 0; } @@ -3216,7 +3216,7 @@ xfs_btree_insrec( } /* If we're off the left edge, return failure. */ - ptr = cur->bc_ptrs[level]; + ptr = cur->bc_levels[level].ptr; if (ptr == 0) { *stat = 0; return 0; @@ -3559,7 +3559,7 @@ xfs_btree_kill_iroot( if (error) return error; - cur->bc_bufs[level - 1] = NULL; + cur->bc_levels[level - 1].bp = NULL; be16_add_cpu(&block->bb_level, -1); xfs_trans_log_inode(cur->bc_tp, ip, XFS_ILOG_CORE | xfs_ilog_fbroot(cur->bc_ino.whichfork)); @@ -3592,8 +3592,8 @@ xfs_btree_kill_root( if (error) return error; - cur->bc_bufs[level] = NULL; - cur->bc_ra[level] = 0; + cur->bc_levels[level].bp = NULL; + cur->bc_levels[level].ra = 0; cur->bc_nlevels--; return 0; @@ -3652,7 +3652,7 @@ xfs_btree_delrec( tcur = NULL; /* Get the index of the entry being deleted, check for nothing there. */ - ptr = cur->bc_ptrs[level]; + ptr = cur->bc_levels[level].ptr; if (ptr == 0) { *stat = 0; return 0; @@ -3962,7 +3962,7 @@ xfs_btree_delrec( xfs_btree_del_cursor(tcur, XFS_BTREE_NOERROR); tcur = NULL; if (level == 0) - cur->bc_ptrs[0]++; + cur->bc_levels[0].ptr++; *stat = 1; return 0; @@ -4099,9 +4099,9 @@ xfs_btree_delrec( * cursor to the left block, and fix up the index. */ if (bp != lbp) { - cur->bc_bufs[level] = lbp; - cur->bc_ptrs[level] += lrecs; - cur->bc_ra[level] = 0; + cur->bc_levels[level].bp = lbp; + cur->bc_levels[level].ptr += lrecs; + cur->bc_levels[level].ra = 0; } /* * If we joined with the right neighbor and there's a level above @@ -4121,11 +4121,11 @@ xfs_btree_delrec( * We can't use decrement because it would change the next level up. */ if (level > 0) - cur->bc_ptrs[level]--; + cur->bc_levels[level].ptr--; /* * We combined blocks, so we have to update the parent keys if the - * btree supports overlapped intervals. However, bc_ptrs[level + 1] + * btree supports overlapped intervals. However, bc_levels[level + 1].ptr * points to the old block so that the caller knows which record to * delete. Therefore, the caller must be savvy enough to call updkeys * for us if we return stat == 2. The other exit points from this @@ -4184,7 +4184,7 @@ xfs_btree_delete( if (i == 0) { for (level = 1; level < cur->bc_nlevels; level++) { - if (cur->bc_ptrs[level] == 0) { + if (cur->bc_levels[level].ptr == 0) { error = xfs_btree_decrement(cur, level, &i); if (error) goto error0; @@ -4215,7 +4215,7 @@ xfs_btree_get_rec( int error; /* error return value */ #endif - ptr = cur->bc_ptrs[0]; + ptr = cur->bc_levels[0].ptr; block = xfs_btree_get_block(cur, 0, &bp); #ifdef DEBUG @@ -4663,23 +4663,23 @@ xfs_btree_overlapped_query_range( if (error) goto out; #endif - cur->bc_ptrs[level] = 1; + cur->bc_levels[level].ptr = 1; while (level < cur->bc_nlevels) { block = xfs_btree_get_block(cur, level, &bp); /* End of node, pop back towards the root. */ - if (cur->bc_ptrs[level] > be16_to_cpu(block->bb_numrecs)) { + if (cur->bc_levels[level].ptr > be16_to_cpu(block->bb_numrecs)) { pop_up: if (level < cur->bc_nlevels - 1) - cur->bc_ptrs[level + 1]++; + cur->bc_levels[level + 1].ptr++; level++; continue; } if (level == 0) { /* Handle a leaf node. */ - recp = xfs_btree_rec_addr(cur, cur->bc_ptrs[0], block); + recp = xfs_btree_rec_addr(cur, cur->bc_levels[0].ptr, block); cur->bc_ops->init_high_key_from_rec(&rec_hkey, recp); ldiff = cur->bc_ops->diff_two_keys(cur, &rec_hkey, @@ -4702,14 +4702,14 @@ xfs_btree_overlapped_query_range( /* Record is larger than high key; pop. */ goto pop_up; } - cur->bc_ptrs[level]++; + cur->bc_levels[level].ptr++; continue; } /* Handle an internal node. */ - lkp = xfs_btree_key_addr(cur, cur->bc_ptrs[level], block); - hkp = xfs_btree_high_key_addr(cur, cur->bc_ptrs[level], block); - pp = xfs_btree_ptr_addr(cur, cur->bc_ptrs[level], block); + lkp = xfs_btree_key_addr(cur, cur->bc_levels[level].ptr, block); + hkp = xfs_btree_high_key_addr(cur, cur->bc_levels[level].ptr, block); + pp = xfs_btree_ptr_addr(cur, cur->bc_levels[level].ptr, block); ldiff = cur->bc_ops->diff_two_keys(cur, hkp, low_key); hdiff = cur->bc_ops->diff_two_keys(cur, high_key, lkp); @@ -4732,13 +4732,13 @@ xfs_btree_overlapped_query_range( if (error) goto out; #endif - cur->bc_ptrs[level] = 1; + cur->bc_levels[level].ptr = 1; continue; } else if (hdiff < 0) { /* The low key is larger than the upper range; pop. */ goto pop_up; } - cur->bc_ptrs[level]++; + cur->bc_levels[level].ptr++; } out: @@ -4749,13 +4749,13 @@ xfs_btree_overlapped_query_range( * with a zero-results range query, so release the buffers if we * failed to return any results. */ - if (cur->bc_bufs[0] == NULL) { + if (cur->bc_levels[0].bp == NULL) { for (i = 0; i < cur->bc_nlevels; i++) { - if (cur->bc_bufs[i]) { - xfs_trans_brelse(cur->bc_tp, cur->bc_bufs[i]); - cur->bc_bufs[i] = NULL; - cur->bc_ptrs[i] = 0; - cur->bc_ra[i] = 0; + if (cur->bc_levels[i].bp) { + xfs_trans_brelse(cur->bc_tp, cur->bc_levels[i].bp); + cur->bc_levels[i].bp = NULL; + cur->bc_levels[i].ptr = 0; + cur->bc_levels[i].ra = 0; } } } @@ -4917,7 +4917,7 @@ xfs_btree_has_more_records( block = xfs_btree_get_block(cur, 0, &bp); /* There are still records in this block. */ - if (cur->bc_ptrs[0] < xfs_btree_get_numrecs(block)) + if (cur->bc_levels[0].ptr < xfs_btree_get_numrecs(block)) return true; /* There are more record blocks. */ diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 513ade4a89f8..827c44bf24dc 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -212,6 +212,19 @@ struct xfs_btree_cur_ino { #define XFS_BTCUR_BMBT_INVALID_OWNER (1 << 1) }; +struct xfs_btree_level { + /* buffer pointer */ + struct xfs_buf *bp; + + /* key/record number */ + unsigned int ptr; + + /* readahead info */ +#define XFS_BTCUR_LEFTRA 1 /* left sibling has been read-ahead */ +#define XFS_BTCUR_RIGHTRA 2 /* right sibling has been read-ahead */ + uint8_t ra; +}; + /* * Btree cursor structure. * This collects all information needed by the btree code in one place. @@ -223,11 +236,6 @@ struct xfs_btree_cur const struct xfs_btree_ops *bc_ops; uint bc_flags; /* btree features - below */ union xfs_btree_irec bc_rec; /* current insert/search record value */ - struct xfs_buf *bc_bufs[XFS_BTREE_MAXLEVELS]; /* buf ptr per level */ - int bc_ptrs[XFS_BTREE_MAXLEVELS]; /* key/record # */ - uint8_t bc_ra[XFS_BTREE_MAXLEVELS]; /* readahead bits */ -#define XFS_BTCUR_LEFTRA 1 /* left sibling has been read-ahead */ -#define XFS_BTCUR_RIGHTRA 2 /* right sibling has been read-ahead */ uint8_t bc_nlevels; /* number of levels in the tree */ uint8_t bc_blocklog; /* log2(blocksize) of btree blocks */ xfs_btnum_t bc_btnum; /* identifies which btree type */ @@ -243,8 +251,17 @@ struct xfs_btree_cur struct xfs_btree_cur_ag bc_ag; struct xfs_btree_cur_ino bc_ino; }; + + /* Must be at the end of the struct! */ + struct xfs_btree_level bc_levels[]; }; +static inline size_t xfs_btree_cur_sizeof(unsigned int nlevels) +{ + return sizeof(struct xfs_btree_cur) + + sizeof(struct xfs_btree_level) * (nlevels); +} + /* cursor flags */ #define XFS_BTREE_LONG_PTRS (1<<0) /* pointers are 64bits long */ #define XFS_BTREE_ROOT_IN_INODE (1<<1) /* root may be variable size */ @@ -258,7 +275,6 @@ struct xfs_btree_cur */ #define XFS_BTREE_STAGING (1<<5) - #define XFS_BTREE_NOERROR 0 #define XFS_BTREE_ERROR 1 diff --git a/fs/xfs/scrub/bitmap.c b/fs/xfs/scrub/bitmap.c index d6d24c866bc4..b8b8e871e3b7 100644 --- a/fs/xfs/scrub/bitmap.c +++ b/fs/xfs/scrub/bitmap.c @@ -222,20 +222,20 @@ xbitmap_disunion( * 1 2 3 * * Pretend for this example that each leaf block has 100 btree records. For - * the first btree record, we'll observe that bc_ptrs[0] == 1, so we record - * that we saw block 1. Then we observe that bc_ptrs[1] == 1, so we record + * the first btree record, we'll observe that bc_levels[0].ptr == 1, so we record + * that we saw block 1. Then we observe that bc_levels[1].ptr == 1, so we record * block 4. The list is [1, 4]. * - * For the second btree record, we see that bc_ptrs[0] == 2, so we exit the + * For the second btree record, we see that bc_levels[0].ptr == 2, so we exit the * loop. The list remains [1, 4]. * * For the 101st btree record, we've moved onto leaf block 2. Now - * bc_ptrs[0] == 1 again, so we record that we saw block 2. We see that - * bc_ptrs[1] == 2, so we exit the loop. The list is now [1, 4, 2]. + * bc_levels[0].ptr == 1 again, so we record that we saw block 2. We see that + * bc_levels[1].ptr == 2, so we exit the loop. The list is now [1, 4, 2]. * - * For the 102nd record, bc_ptrs[0] == 2, so we continue. + * For the 102nd record, bc_levels[0].ptr == 2, so we continue. * - * For the 201st record, we've moved on to leaf block 3. bc_ptrs[0] == 1, so + * For the 201st record, we've moved on to leaf block 3. bc_levels[0].ptr == 1, so * we add 3 to the list. Now it is [1, 4, 2, 3]. * * For the 300th record we just exit, with the list being [1, 4, 2, 3]. @@ -256,7 +256,7 @@ xbitmap_set_btcur_path( int i; int error; - for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) { + for (i = 0; i < cur->bc_nlevels && cur->bc_levels[i].ptr == 1; i++) { xfs_btree_get_block(cur, i, &bp); if (!bp) continue; diff --git a/fs/xfs/scrub/bmap.c b/fs/xfs/scrub/bmap.c index 017da9ceaee9..a4cbbc346f60 100644 --- a/fs/xfs/scrub/bmap.c +++ b/fs/xfs/scrub/bmap.c @@ -402,7 +402,7 @@ xchk_bmapbt_rec( * the root since the verifiers don't do that. */ if (xfs_has_crc(bs->cur->bc_mp) && - bs->cur->bc_ptrs[0] == 1) { + bs->cur->bc_levels[0].ptr == 1) { for (i = 0; i < bs->cur->bc_nlevels - 1; i++) { block = xfs_btree_get_block(bs->cur, i, &bp); owner = be64_to_cpu(block->bb_u.l.bb_owner); diff --git a/fs/xfs/scrub/btree.c b/fs/xfs/scrub/btree.c index 7b7762ae22e5..5a453ce151ed 100644 --- a/fs/xfs/scrub/btree.c +++ b/fs/xfs/scrub/btree.c @@ -136,7 +136,7 @@ xchk_btree_rec( struct xfs_buf *bp; block = xfs_btree_get_block(cur, 0, &bp); - rec = xfs_btree_rec_addr(cur, cur->bc_ptrs[0], block); + rec = xfs_btree_rec_addr(cur, cur->bc_levels[0].ptr, block); trace_xchk_btree_rec(bs->sc, cur, 0); @@ -153,7 +153,7 @@ xchk_btree_rec( /* Is this at least as large as the parent low key? */ cur->bc_ops->init_key_from_rec(&key, rec); keyblock = xfs_btree_get_block(cur, 1, &bp); - keyp = xfs_btree_key_addr(cur, cur->bc_ptrs[1], keyblock); + keyp = xfs_btree_key_addr(cur, cur->bc_levels[1].ptr, keyblock); if (cur->bc_ops->diff_two_keys(cur, &key, keyp) < 0) xchk_btree_set_corrupt(bs->sc, cur, 1); @@ -162,7 +162,7 @@ xchk_btree_rec( /* Is this no larger than the parent high key? */ cur->bc_ops->init_high_key_from_rec(&hkey, rec); - keyp = xfs_btree_high_key_addr(cur, cur->bc_ptrs[1], keyblock); + keyp = xfs_btree_high_key_addr(cur, cur->bc_levels[1].ptr, keyblock); if (cur->bc_ops->diff_two_keys(cur, keyp, &hkey) < 0) xchk_btree_set_corrupt(bs->sc, cur, 1); } @@ -184,7 +184,7 @@ xchk_btree_key( struct xfs_buf *bp; block = xfs_btree_get_block(cur, level, &bp); - key = xfs_btree_key_addr(cur, cur->bc_ptrs[level], block); + key = xfs_btree_key_addr(cur, cur->bc_levels[level].ptr, block); trace_xchk_btree_key(bs->sc, cur, level); @@ -200,7 +200,7 @@ xchk_btree_key( /* Is this at least as large as the parent low key? */ keyblock = xfs_btree_get_block(cur, level + 1, &bp); - keyp = xfs_btree_key_addr(cur, cur->bc_ptrs[level + 1], keyblock); + keyp = xfs_btree_key_addr(cur, cur->bc_levels[level + 1].ptr, keyblock); if (cur->bc_ops->diff_two_keys(cur, key, keyp) < 0) xchk_btree_set_corrupt(bs->sc, cur, level); @@ -208,8 +208,8 @@ xchk_btree_key( return; /* Is this no larger than the parent high key? */ - key = xfs_btree_high_key_addr(cur, cur->bc_ptrs[level], block); - keyp = xfs_btree_high_key_addr(cur, cur->bc_ptrs[level + 1], keyblock); + key = xfs_btree_high_key_addr(cur, cur->bc_levels[level].ptr, block); + keyp = xfs_btree_high_key_addr(cur, cur->bc_levels[level + 1].ptr, keyblock); if (cur->bc_ops->diff_two_keys(cur, keyp, key) < 0) xchk_btree_set_corrupt(bs->sc, cur, level); } @@ -292,7 +292,7 @@ xchk_btree_block_check_sibling( /* Compare upper level pointer to sibling pointer. */ pblock = xfs_btree_get_block(ncur, level + 1, &pbp); - pp = xfs_btree_ptr_addr(ncur, ncur->bc_ptrs[level + 1], pblock); + pp = xfs_btree_ptr_addr(ncur, ncur->bc_levels[level + 1].ptr, pblock); if (!xchk_btree_ptr_ok(bs, level + 1, pp)) goto out; if (pbp) @@ -597,7 +597,7 @@ xchk_btree_block_keys( /* Obtain the parent's copy of the keys for this block. */ parent_block = xfs_btree_get_block(cur, level + 1, &bp); - parent_keys = xfs_btree_key_addr(cur, cur->bc_ptrs[level + 1], + parent_keys = xfs_btree_key_addr(cur, cur->bc_levels[level + 1].ptr, parent_block); if (cur->bc_ops->diff_two_keys(cur, &block_keys, parent_keys) != 0) @@ -608,7 +608,7 @@ xchk_btree_block_keys( /* Get high keys */ high_bk = xfs_btree_high_key_from_key(cur, &block_keys); - high_pk = xfs_btree_high_key_addr(cur, cur->bc_ptrs[level + 1], + high_pk = xfs_btree_high_key_addr(cur, cur->bc_levels[level + 1].ptr, parent_block); if (cur->bc_ops->diff_two_keys(cur, high_bk, high_pk) != 0) @@ -672,18 +672,18 @@ xchk_btree( if (error || !block) goto out; - cur->bc_ptrs[level] = 1; + cur->bc_levels[level].ptr = 1; while (level < cur->bc_nlevels) { block = xfs_btree_get_block(cur, level, &bp); if (level == 0) { /* End of leaf, pop back towards the root. */ - if (cur->bc_ptrs[level] > + if (cur->bc_levels[level].ptr > be16_to_cpu(block->bb_numrecs)) { xchk_btree_block_keys(bs, level, block); if (level < cur->bc_nlevels - 1) - cur->bc_ptrs[level + 1]++; + cur->bc_levels[level + 1].ptr++; level++; continue; } @@ -692,7 +692,7 @@ xchk_btree( xchk_btree_rec(bs); /* Call out to the record checker. */ - recp = xfs_btree_rec_addr(cur, cur->bc_ptrs[0], block); + recp = xfs_btree_rec_addr(cur, cur->bc_levels[0].ptr, block); error = bs->scrub_rec(bs, recp); if (error) break; @@ -700,15 +700,15 @@ xchk_btree( (sc->sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT)) break; - cur->bc_ptrs[level]++; + cur->bc_levels[level].ptr++; continue; } /* End of node, pop back towards the root. */ - if (cur->bc_ptrs[level] > be16_to_cpu(block->bb_numrecs)) { + if (cur->bc_levels[level].ptr > be16_to_cpu(block->bb_numrecs)) { xchk_btree_block_keys(bs, level, block); if (level < cur->bc_nlevels - 1) - cur->bc_ptrs[level + 1]++; + cur->bc_levels[level + 1].ptr++; level++; continue; } @@ -717,9 +717,9 @@ xchk_btree( xchk_btree_key(bs, level); /* Drill another level deeper. */ - pp = xfs_btree_ptr_addr(cur, cur->bc_ptrs[level], block); + pp = xfs_btree_ptr_addr(cur, cur->bc_levels[level].ptr, block); if (!xchk_btree_ptr_ok(bs, level, pp)) { - cur->bc_ptrs[level]++; + cur->bc_levels[level].ptr++; continue; } level--; @@ -727,7 +727,7 @@ xchk_btree( if (error || !block) goto out; - cur->bc_ptrs[level] = 1; + cur->bc_levels[level].ptr = 1; } out: diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c index c0ef53fe6611..816dfc8e5a80 100644 --- a/fs/xfs/scrub/trace.c +++ b/fs/xfs/scrub/trace.c @@ -21,10 +21,11 @@ xchk_btree_cur_fsbno( struct xfs_btree_cur *cur, int level) { - if (level < cur->bc_nlevels && cur->bc_bufs[level]) + if (level < cur->bc_nlevels && cur->bc_levels[level].bp) return XFS_DADDR_TO_FSB(cur->bc_mp, - xfs_buf_daddr(cur->bc_bufs[level])); - if (level == cur->bc_nlevels - 1 && cur->bc_flags & XFS_BTREE_LONG_PTRS) + xfs_buf_daddr(cur->bc_levels[level].bp)); + else if (level == cur->bc_nlevels - 1 && + cur->bc_flags & XFS_BTREE_LONG_PTRS) return XFS_INO_TO_FSB(cur->bc_mp, cur->bc_ino.ip->i_ino); if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS)) return XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_ag.pag->pag_agno, 0); diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index a7bbb84f91a7..93ece6df02e3 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -348,7 +348,7 @@ TRACE_EVENT(xchk_btree_op_error, __entry->level = level; __entry->agno = XFS_FSB_TO_AGNO(cur->bc_mp, fsbno); __entry->bno = XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno); - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; __entry->error = error; __entry->ret_ip = ret_ip; ), @@ -389,7 +389,7 @@ TRACE_EVENT(xchk_ifork_btree_op_error, __entry->type = sc->sm->sm_type; __entry->btnum = cur->bc_btnum; __entry->level = level; - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; __entry->agno = XFS_FSB_TO_AGNO(cur->bc_mp, fsbno); __entry->bno = XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno); __entry->error = error; @@ -431,7 +431,7 @@ TRACE_EVENT(xchk_btree_error, __entry->level = level; __entry->agno = XFS_FSB_TO_AGNO(cur->bc_mp, fsbno); __entry->bno = XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno); - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; __entry->ret_ip = ret_ip; ), TP_printk("dev %d:%d type %s btree %s level %d ptr %d agno 0x%x agbno 0x%x ret_ip %pS", @@ -471,7 +471,7 @@ TRACE_EVENT(xchk_ifork_btree_error, __entry->level = level; __entry->agno = XFS_FSB_TO_AGNO(cur->bc_mp, fsbno); __entry->bno = XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno); - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; __entry->ret_ip = ret_ip; ), TP_printk("dev %d:%d ino 0x%llx fork %s type %s btree %s level %d ptr %d agno 0x%x agbno 0x%x ret_ip %pS", @@ -511,7 +511,7 @@ DECLARE_EVENT_CLASS(xchk_sbtree_class, __entry->bno = XFS_FSB_TO_AGBNO(cur->bc_mp, fsbno); __entry->level = level; __entry->nlevels = cur->bc_nlevels; - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; ), TP_printk("dev %d:%d type %s btree %s agno 0x%x agbno 0x%x level %d nlevels %d ptr %d", MAJOR(__entry->dev), MINOR(__entry->dev), diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index c4e0cd1c1c8c..30bae0657343 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1966,7 +1966,7 @@ xfs_init_zones(void) goto out_destroy_log_ticket_zone; xfs_btree_cur_zone = kmem_cache_create("xfs_btree_cur", - sizeof(struct xfs_btree_cur), + xfs_btree_cur_sizeof(XFS_BTREE_MAXLEVELS), 0, 0, NULL); if (!xfs_btree_cur_zone) goto out_destroy_bmap_free_item_zone; diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h index 1033a95fbf8e..4a8076ef8cb4 100644 --- a/fs/xfs/xfs_trace.h +++ b/fs/xfs/xfs_trace.h @@ -2476,7 +2476,7 @@ DECLARE_EVENT_CLASS(xfs_btree_cur_class, __entry->btnum = cur->bc_btnum; __entry->level = level; __entry->nlevels = cur->bc_nlevels; - __entry->ptr = cur->bc_ptrs[level]; + __entry->ptr = cur->bc_levels[level].ptr; __entry->daddr = bp ? xfs_buf_daddr(bp) : -1; ), TP_printk("dev %d:%d btree %s level %d/%d ptr %d daddr 0x%llx", From patchwork Sat Sep 18 01:29:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FC65C433EF for ; Sat, 18 Sep 2021 01:29:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EEA666112E for ; Sat, 18 Sep 2021 01:29:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234646AbhIRBbR (ORCPT ); Fri, 17 Sep 2021 21:31:17 -0400 Received: from mail.kernel.org ([198.145.29.99]:37074 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234515AbhIRBbQ (ORCPT ); Fri, 17 Sep 2021 21:31:16 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 02B236112E; Sat, 18 Sep 2021 01:29:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928594; bh=fkB3r0nwh0o0hQ+wKBbOJfoQHo5ix0vIfTkh7uugFNU=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=RSJiRSPDsjo7/+i7bRwq2SjAoGuYKJtJleHiU5zxgD5263ipxS+JWKy8kga6tBloB fjJGvcsJUsmU2hpNC86rOZXXbbvPRRhSxHFS+YJ57Yf79q3yh+HrxJG4Pg4e4cib2+ /wPdGvwwaat66X4ycdztBJmk4zkKYrR1InEsLJofQ1Zb05jXoCvt7NF9We8vZ1cObJ mV1zZaUaVQ5BwNRFT0997M3RqiMxXTljc0haCatlrUkY/bI4UxbAfLwdagQ+/Ytbs5 FeFbDQDRXXp7mEtoE02BdBuXoCKyr157vrkQ4V4I9IPO6iYeG1cCW7rnksIw0OMton F/mo3lRCMup1Q== Subject: [PATCH 08/14] xfs: refactor btree cursor allocation function From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:53 -0700 Message-ID: <163192859374.416199.6462613685926781959.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Refactor btree allocation to a common helper. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_alloc_btree.c | 7 +------ fs/xfs/libxfs/xfs_bmap_btree.c | 7 +------ fs/xfs/libxfs/xfs_btree.c | 18 ++++++++++++++++++ fs/xfs/libxfs/xfs_btree.h | 2 ++ fs/xfs/libxfs/xfs_ialloc_btree.c | 7 +------ fs/xfs/libxfs/xfs_refcount_btree.c | 6 +----- fs/xfs/libxfs/xfs_rmap_btree.c | 6 +----- 7 files changed, 25 insertions(+), 28 deletions(-) diff --git a/fs/xfs/libxfs/xfs_alloc_btree.c b/fs/xfs/libxfs/xfs_alloc_btree.c index 6746fd735550..c644b11132f6 100644 --- a/fs/xfs/libxfs/xfs_alloc_btree.c +++ b/fs/xfs/libxfs/xfs_alloc_btree.c @@ -477,12 +477,7 @@ xfs_allocbt_init_common( ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT); - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); - - cur->bc_tp = tp; - cur->bc_mp = mp; - cur->bc_btnum = btnum; - cur->bc_blocklog = mp->m_sb.sb_blocklog; + cur = xfs_btree_alloc_cursor(mp, tp, btnum); cur->bc_ag.abt.active = false; if (btnum == XFS_BTNUM_CNT) { diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c index 72444b8b38a6..a06987e36db5 100644 --- a/fs/xfs/libxfs/xfs_bmap_btree.c +++ b/fs/xfs/libxfs/xfs_bmap_btree.c @@ -552,13 +552,8 @@ xfs_bmbt_init_cursor( struct xfs_btree_cur *cur; ASSERT(whichfork != XFS_COW_FORK); - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); - - cur->bc_tp = tp; - cur->bc_mp = mp; + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_BMAP); cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1; - cur->bc_btnum = XFS_BTNUM_BMAP; - cur->bc_blocklog = mp->m_sb.sb_blocklog; cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_bmbt_2); cur->bc_ops = &xfs_bmbt_ops; diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 93fb50516bc2..70785004414e 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -4926,3 +4926,21 @@ xfs_btree_has_more_records( else return block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK); } + +/* Allocate a new btree cursor of the appropriate size. */ +struct xfs_btree_cur * +xfs_btree_alloc_cursor( + struct xfs_mount *mp, + struct xfs_trans *tp, + xfs_btnum_t btnum) +{ + struct xfs_btree_cur *cur; + + cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); + cur->bc_tp = tp; + cur->bc_mp = mp; + cur->bc_btnum = btnum; + cur->bc_blocklog = mp->m_sb.sb_blocklog; + + return cur; +} diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 827c44bf24dc..6540c4957c36 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -573,5 +573,7 @@ void xfs_btree_copy_ptrs(struct xfs_btree_cur *cur, void xfs_btree_copy_keys(struct xfs_btree_cur *cur, union xfs_btree_key *dst_key, const union xfs_btree_key *src_key, int numkeys); +struct xfs_btree_cur *xfs_btree_alloc_cursor(struct xfs_mount *mp, + struct xfs_trans *tp, xfs_btnum_t btnum); #endif /* __XFS_BTREE_H__ */ diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c index 27190840c5d8..c8fea6a464d5 100644 --- a/fs/xfs/libxfs/xfs_ialloc_btree.c +++ b/fs/xfs/libxfs/xfs_ialloc_btree.c @@ -432,10 +432,7 @@ xfs_inobt_init_common( { struct xfs_btree_cur *cur; - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); - cur->bc_tp = tp; - cur->bc_mp = mp; - cur->bc_btnum = btnum; + cur = xfs_btree_alloc_cursor(mp, tp, btnum); if (btnum == XFS_BTNUM_INO) { cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_ibt_2); cur->bc_ops = &xfs_inobt_ops; @@ -444,8 +441,6 @@ xfs_inobt_init_common( cur->bc_ops = &xfs_finobt_ops; } - cur->bc_blocklog = mp->m_sb.sb_blocklog; - if (xfs_has_crc(mp)) cur->bc_flags |= XFS_BTREE_CRC_BLOCKS; diff --git a/fs/xfs/libxfs/xfs_refcount_btree.c b/fs/xfs/libxfs/xfs_refcount_btree.c index 1ef9b99962ab..48c45e31d897 100644 --- a/fs/xfs/libxfs/xfs_refcount_btree.c +++ b/fs/xfs/libxfs/xfs_refcount_btree.c @@ -322,11 +322,7 @@ xfs_refcountbt_init_common( ASSERT(pag->pag_agno < mp->m_sb.sb_agcount); - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); - cur->bc_tp = tp; - cur->bc_mp = mp; - cur->bc_btnum = XFS_BTNUM_REFC; - cur->bc_blocklog = mp->m_sb.sb_blocklog; + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_REFC); cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_refcbt_2); cur->bc_flags |= XFS_BTREE_CRC_BLOCKS; diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c index b7dbbfb3aeed..f3c4d0965cc9 100644 --- a/fs/xfs/libxfs/xfs_rmap_btree.c +++ b/fs/xfs/libxfs/xfs_rmap_btree.c @@ -451,13 +451,9 @@ xfs_rmapbt_init_common( { struct xfs_btree_cur *cur; - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); - cur->bc_tp = tp; - cur->bc_mp = mp; /* Overlapping btree; 2 keys per pointer. */ - cur->bc_btnum = XFS_BTNUM_RMAP; + cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_RMAP); cur->bc_flags = XFS_BTREE_CRC_BLOCKS | XFS_BTREE_OVERLAPPING; - cur->bc_blocklog = mp->m_sb.sb_blocklog; cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_rmap_2); cur->bc_ops = &xfs_rmapbt_ops; From patchwork Sat Sep 18 01:29:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DABA2C433F5 for ; Sat, 18 Sep 2021 01:30:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C551061152 for ; Sat, 18 Sep 2021 01:30:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234774AbhIRBbW (ORCPT ); Fri, 17 Sep 2021 21:31:22 -0400 Received: from mail.kernel.org ([198.145.29.99]:37106 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234515AbhIRBbW (ORCPT ); Fri, 17 Sep 2021 21:31:22 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 74E8760FBF; Sat, 18 Sep 2021 01:29:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928599; bh=hF1uzI4Sf8Onq+nZo2VRS0Bn9y/+ew9SsFSjtG+Spsc=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=LW8Jkn+KVk/Bc1EBlJ/bshFa2v6Oa79rzConSkKuCe6PM/nf0qzzYijgnCm/izivC CYUC+6MoS/bA6fQmlwuyTREm+X2zhRNmxDXJWYmg0ehBx5t3zxuUplIvEZgyahMBtZ Zyi0SpuFhJxSa2nCqMZb99CHjcQcr7/P66ukKtzY0W/brdDGwuV4H0D+f/VoaB4sCi iSwQ6aHQ3O4bzFGM6cMttEewHLEeaoXyrH9tqWSU+hiI3+BrPybwDkecQO4AS+pmKM ATtkn11Mwr1mphGIdd4maOTqZmsBRlzxRKyrhdRONJ8Q+OAM7h3djArO7CR15YR1+5 MNBYQ1aSauJeQ== Subject: [PATCH 09/14] xfs: fix maxlevels comparisons in the btree staging code From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:29:59 -0700 Message-ID: <163192859919.416199.9790046292707106095.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong The btree geometry computation function has an off-by-one error in that it does not allow maximally tall btrees (nlevels == XFS_BTREE_MAXLEVELS). This can result in repairs failing unnecessarily on very fragmented filesystems. Subsequent patches to remove MAXLEVELS usage in favor of the per-btree type computations will make this a much more likely occurrence. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_btree_staging.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree_staging.c b/fs/xfs/libxfs/xfs_btree_staging.c index 26143297bb7b..cc56efc2b90a 100644 --- a/fs/xfs/libxfs/xfs_btree_staging.c +++ b/fs/xfs/libxfs/xfs_btree_staging.c @@ -662,7 +662,7 @@ xfs_btree_bload_compute_geometry( xfs_btree_bload_ensure_slack(cur, &bbl->node_slack, 1); bbl->nr_records = nr_this_level = nr_records; - for (cur->bc_nlevels = 1; cur->bc_nlevels < XFS_BTREE_MAXLEVELS;) { + for (cur->bc_nlevels = 1; cur->bc_nlevels <= XFS_BTREE_MAXLEVELS;) { uint64_t level_blocks; uint64_t dontcare64; unsigned int level = cur->bc_nlevels - 1; @@ -726,7 +726,7 @@ xfs_btree_bload_compute_geometry( nr_this_level = level_blocks; } - if (cur->bc_nlevels == XFS_BTREE_MAXLEVELS) + if (cur->bc_nlevels > XFS_BTREE_MAXLEVELS) return -EOVERFLOW; bbl->btree_height = cur->bc_nlevels; From patchwork Sat Sep 18 01:30:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B760DC433F5 for ; Sat, 18 Sep 2021 01:30:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 979256113A for ; Sat, 18 Sep 2021 01:30:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234786AbhIRBb2 (ORCPT ); Fri, 17 Sep 2021 21:31:28 -0400 Received: from mail.kernel.org ([198.145.29.99]:37188 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234515AbhIRBb1 (ORCPT ); Fri, 17 Sep 2021 21:31:27 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id F3BDA6112E; Sat, 18 Sep 2021 01:30:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928605; bh=1Su6Y3wVdkndP0RCdgw2dB+4wUDdCS9/JLiGewn21Dk=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=ZnrvPQqsQuSsJd3Zop5igI5whOUziteVsuYVAgXJ/UCYAxFQAH7XwkpjrHpbRmjky s0qOvletHfr8Zghh9ohiysuXxYz8QwS/H5IaiSXWDdwf8aq3Qv4VCMOtPzQ+k5PHg5 IFnEcqxe2zJE5LT+MntXLWBiiP/budjLfeWqxjXOL4087jjSEfg86t56kVC+vANnLo HwBNOyBj5G4l6uTSvM89Dw19Cr39RR06H0qGRjoaJjWAC1HPZ4RfsaCK188h9EeLXz l0EzDErKpBHUpZrHOWI7AcQSruNH5UFWOE/ZwN51EC0B8Tzs2zDDGtMv1P7q6FQv0s N4uhXYDrtq4Gg== Subject: [PATCH 10/14] xfs: encode the max btree height in the cursor From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:30:04 -0700 Message-ID: <163192860467.416199.3157992669504614921.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Encode the maximum btree height in the cursor, since we're soon going to allow smaller cursors for AG btrees and larger cursors for file btrees. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R Reviewed-by: Christoph Hellwig --- fs/xfs/libxfs/xfs_bmap.c | 2 +- fs/xfs/libxfs/xfs_btree.c | 5 +++-- fs/xfs/libxfs/xfs_btree.h | 3 ++- fs/xfs/libxfs/xfs_btree_staging.c | 10 +++++----- 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 644b956301b6..2ae5bf9a74e7 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -239,7 +239,7 @@ xfs_bmap_get_bp( if (!cur) return NULL; - for (i = 0; i < XFS_BTREE_MAXLEVELS; i++) { + for (i = 0; i < cur->bc_maxlevels; i++) { if (!cur->bc_levels[i].bp) break; if (xfs_buf_daddr(cur->bc_levels[i].bp) == bno) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 70785004414e..2486ba22c01d 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -2933,7 +2933,7 @@ xfs_btree_new_iroot( be16_add_cpu(&block->bb_level, 1); xfs_btree_set_numrecs(block, 1); cur->bc_nlevels++; - ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); + ASSERT(cur->bc_nlevels <= cur->bc_maxlevels); cur->bc_levels[level + 1].ptr = 1; kp = xfs_btree_key_addr(cur, 1, block); @@ -3097,7 +3097,7 @@ xfs_btree_new_root( xfs_btree_setbuf(cur, cur->bc_nlevels, nbp); cur->bc_levels[cur->bc_nlevels].ptr = nptr; cur->bc_nlevels++; - ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); + ASSERT(cur->bc_nlevels <= cur->bc_maxlevels); *stat = 1; return 0; error0: @@ -4941,6 +4941,7 @@ xfs_btree_alloc_cursor( cur->bc_mp = mp; cur->bc_btnum = btnum; cur->bc_blocklog = mp->m_sb.sb_blocklog; + cur->bc_maxlevels = XFS_BTREE_MAXLEVELS; return cur; } diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 6540c4957c36..6075918efa0c 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -235,9 +235,10 @@ struct xfs_btree_cur struct xfs_mount *bc_mp; /* file system mount struct */ const struct xfs_btree_ops *bc_ops; uint bc_flags; /* btree features - below */ - union xfs_btree_irec bc_rec; /* current insert/search record value */ + uint8_t bc_maxlevels; /* maximum levels for this btree type */ uint8_t bc_nlevels; /* number of levels in the tree */ uint8_t bc_blocklog; /* log2(blocksize) of btree blocks */ + union xfs_btree_irec bc_rec; /* current insert/search record value */ xfs_btnum_t bc_btnum; /* identifies which btree type */ int bc_statoff; /* offset of btre stats array */ diff --git a/fs/xfs/libxfs/xfs_btree_staging.c b/fs/xfs/libxfs/xfs_btree_staging.c index cc56efc2b90a..dd75e208b543 100644 --- a/fs/xfs/libxfs/xfs_btree_staging.c +++ b/fs/xfs/libxfs/xfs_btree_staging.c @@ -657,12 +657,12 @@ xfs_btree_bload_compute_geometry( * checking levels 0 and 1 here, so set bc_nlevels such that the btree * code doesn't interpret either as the root level. */ - cur->bc_nlevels = XFS_BTREE_MAXLEVELS - 1; + cur->bc_nlevels = cur->bc_maxlevels - 1; xfs_btree_bload_ensure_slack(cur, &bbl->leaf_slack, 0); xfs_btree_bload_ensure_slack(cur, &bbl->node_slack, 1); bbl->nr_records = nr_this_level = nr_records; - for (cur->bc_nlevels = 1; cur->bc_nlevels <= XFS_BTREE_MAXLEVELS;) { + for (cur->bc_nlevels = 1; cur->bc_nlevels <= cur->bc_maxlevels;) { uint64_t level_blocks; uint64_t dontcare64; unsigned int level = cur->bc_nlevels - 1; @@ -703,7 +703,7 @@ xfs_btree_bload_compute_geometry( * block-based btree level. */ cur->bc_nlevels++; - ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); + ASSERT(cur->bc_nlevels <= cur->bc_maxlevels); xfs_btree_bload_level_geometry(cur, bbl, level, nr_this_level, &avg_per_block, &level_blocks, &dontcare64); @@ -719,14 +719,14 @@ xfs_btree_bload_compute_geometry( /* Otherwise, we need another level of btree. */ cur->bc_nlevels++; - ASSERT(cur->bc_nlevels <= XFS_BTREE_MAXLEVELS); + ASSERT(cur->bc_nlevels <= cur->bc_maxlevels); } nr_blocks += level_blocks; nr_this_level = level_blocks; } - if (cur->bc_nlevels > XFS_BTREE_MAXLEVELS) + if (cur->bc_nlevels > cur->bc_maxlevels) return -EOVERFLOW; bbl->btree_height = cur->bc_nlevels; From patchwork Sat Sep 18 01:30:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7E05AC433EF for ; Sat, 18 Sep 2021 01:30:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 65BD561241 for ; Sat, 18 Sep 2021 01:30:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234834AbhIRBbe (ORCPT ); Fri, 17 Sep 2021 21:31:34 -0400 Received: from mail.kernel.org ([198.145.29.99]:37238 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234515AbhIRBbd (ORCPT ); Fri, 17 Sep 2021 21:31:33 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 7090860FBF; Sat, 18 Sep 2021 01:30:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928610; bh=tEjUh9ULbjJdmr7bMXbCxBIC8pwaFKG1ZFjtgK7B12s=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=dRCZCTZOe12ApZQ7FRWG6gVtvFmVOSSIHb1efVU8BExmq1VQ7gP1vSCQV6kpM2s8O 3J/u6S9ukYyMyYLI4bMnUwLY1NTthiDgCv1uEdxTGLeV1y16dYO4FvUd92x4xUri+j OnIRqbweiTD2UW6mfU1fJacXoKc1YKWSPtjjAz3zfZNJrklStjwjmE2bEzGKHvYPcd 2ppv8dJfifYuBlThsyAv7+CMygQhtnTqceX2OfWAf7tlJjqadEnSY2ATYYuyjw/pHR RqDwt3/wDQfeE48vh8GSa7epOdE6oiDWLLZI1kh/x4HBc0gzXBzAUumenTTPh50tFD yllk66TODC6yA== Subject: [PATCH 11/14] xfs: dynamically allocate cursors based on maxlevels From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:30:10 -0700 Message-ID: <163192861018.416199.11733078081556457241.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Replace the statically-sized btree cursor zone with dynamically sized allocations so that we can reduce the memory overhead for per-AG bt cursors while handling very tall btrees for rt metadata. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/libxfs/xfs_btree.c | 40 ++++++++++++++++++++++++++++++++-------- fs/xfs/libxfs/xfs_btree.h | 2 -- fs/xfs/xfs_super.c | 11 +---------- 3 files changed, 33 insertions(+), 20 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 2486ba22c01d..f9516828a847 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -23,11 +23,6 @@ #include "xfs_btree_staging.h" #include "xfs_ag.h" -/* - * Cursor allocation zone. - */ -kmem_zone_t *xfs_btree_cur_zone; - /* * Btree magic numbers. */ @@ -379,7 +374,7 @@ xfs_btree_del_cursor( kmem_free(cur->bc_ops); if (!(cur->bc_flags & XFS_BTREE_LONG_PTRS) && cur->bc_ag.pag) xfs_perag_put(cur->bc_ag.pag); - kmem_cache_free(xfs_btree_cur_zone, cur); + kmem_free(cur); } /* @@ -4927,6 +4922,32 @@ xfs_btree_has_more_records( return block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK); } +/* Compute the maximum allowed height for a given btree type. */ +static unsigned int +xfs_btree_maxlevels( + struct xfs_mount *mp, + xfs_btnum_t btnum) +{ + switch (btnum) { + case XFS_BTNUM_BNO: + case XFS_BTNUM_CNT: + return mp->m_ag_maxlevels; + case XFS_BTNUM_BMAP: + return max(mp->m_bm_maxlevels[XFS_DATA_FORK], + mp->m_bm_maxlevels[XFS_ATTR_FORK]); + case XFS_BTNUM_INO: + case XFS_BTNUM_FINO: + return M_IGEO(mp)->inobt_maxlevels; + case XFS_BTNUM_RMAP: + return mp->m_rmap_maxlevels; + case XFS_BTNUM_REFC: + return mp->m_refc_maxlevels; + default: + ASSERT(0); + return XFS_BTREE_MAXLEVELS; + } +} + /* Allocate a new btree cursor of the appropriate size. */ struct xfs_btree_cur * xfs_btree_alloc_cursor( @@ -4935,13 +4956,16 @@ xfs_btree_alloc_cursor( xfs_btnum_t btnum) { struct xfs_btree_cur *cur; + unsigned int maxlevels = xfs_btree_maxlevels(mp, btnum); - cur = kmem_cache_zalloc(xfs_btree_cur_zone, GFP_NOFS | __GFP_NOFAIL); + ASSERT(maxlevels <= XFS_BTREE_MAXLEVELS); + + cur = kmem_zalloc(xfs_btree_cur_sizeof(maxlevels), KM_NOFS); cur->bc_tp = tp; cur->bc_mp = mp; cur->bc_btnum = btnum; cur->bc_blocklog = mp->m_sb.sb_blocklog; - cur->bc_maxlevels = XFS_BTREE_MAXLEVELS; + cur->bc_maxlevels = maxlevels; return cur; } diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 6075918efa0c..ae83fbf58c18 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -13,8 +13,6 @@ struct xfs_trans; struct xfs_ifork; struct xfs_perag; -extern kmem_zone_t *xfs_btree_cur_zone; - /* * Generic key, ptr and record wrapper structures. * diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c index 30bae0657343..25a548bbb0b2 100644 --- a/fs/xfs/xfs_super.c +++ b/fs/xfs/xfs_super.c @@ -1965,17 +1965,11 @@ xfs_init_zones(void) if (!xfs_bmap_free_item_zone) goto out_destroy_log_ticket_zone; - xfs_btree_cur_zone = kmem_cache_create("xfs_btree_cur", - xfs_btree_cur_sizeof(XFS_BTREE_MAXLEVELS), - 0, 0, NULL); - if (!xfs_btree_cur_zone) - goto out_destroy_bmap_free_item_zone; - xfs_da_state_zone = kmem_cache_create("xfs_da_state", sizeof(struct xfs_da_state), 0, 0, NULL); if (!xfs_da_state_zone) - goto out_destroy_btree_cur_zone; + goto out_destroy_bmap_free_item_zone; xfs_ifork_zone = kmem_cache_create("xfs_ifork", sizeof(struct xfs_ifork), @@ -2105,8 +2099,6 @@ xfs_init_zones(void) kmem_cache_destroy(xfs_ifork_zone); out_destroy_da_state_zone: kmem_cache_destroy(xfs_da_state_zone); - out_destroy_btree_cur_zone: - kmem_cache_destroy(xfs_btree_cur_zone); out_destroy_bmap_free_item_zone: kmem_cache_destroy(xfs_bmap_free_item_zone); out_destroy_log_ticket_zone: @@ -2138,7 +2130,6 @@ xfs_destroy_zones(void) kmem_cache_destroy(xfs_trans_zone); kmem_cache_destroy(xfs_ifork_zone); kmem_cache_destroy(xfs_da_state_zone); - kmem_cache_destroy(xfs_btree_cur_zone); kmem_cache_destroy(xfs_bmap_free_item_zone); kmem_cache_destroy(xfs_log_ticket_zone); } From patchwork Sat Sep 18 01:30:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D4ABC433EF for ; Sat, 18 Sep 2021 01:30:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 02A3060FBF for ; Sat, 18 Sep 2021 01:30:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234515AbhIRBbl (ORCPT ); Fri, 17 Sep 2021 21:31:41 -0400 Received: from mail.kernel.org ([198.145.29.99]:37260 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234976AbhIRBbi (ORCPT ); Fri, 17 Sep 2021 21:31:38 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id E458E6112E; Sat, 18 Sep 2021 01:30:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928616; bh=Cxj1cGQ5vwazVeHYmrjlibKlT7ggUBVTm3vRlZTryeo=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=MhgMoZZtoHS62mHuWeu6Fv2qg8KHAPzz0YhsuIGmUr+GLcDflTBIo6T657NEevBrq V/wSoZtDqUup7L9EkUPT0lE4FyFRwA4wN0hKi6dMoT9UShl+DWnZ3N04abmICYbQEd s+BvhNvclwi1NZpBESAFNNeoIAlfEhI1s2ZX7o7DgZxfGORtiX34JRhcNkUISMv92Q OaadiPpbqxQKxF688wu2tLUmWcLIZsPY2SC6V6opULFHOpnzjWKNCesxmq0/VZfwTS 0MZ1D8+3IthccIkD912c5QNkoxIUp+t3U2HXC+ms5ZfvOzpqrzz+FqWNYYKjSy6Ar2 mh3bRkRBjbrxA== Subject: [PATCH 12/14] xfs: compute actual maximum btree height for critical reservation calculation From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:30:15 -0700 Message-ID: <163192861564.416199.12921575958749918045.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Compute the actual maximum btree height when deciding if per-AG block reservation is critically low. This only affects the sanity check condition, since we /generally/ will trigger on the 10% threshold. This is a long-winded way of saying that we're removing one more usage of XFS_BTREE_MAXLEVELS. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/libxfs/xfs_ag_resv.c | 4 +++- fs/xfs/libxfs/xfs_btree.c | 19 +++++++++++++++---- fs/xfs/libxfs/xfs_btree.h | 1 + 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c index 2aa2b3484c28..931481fbdd72 100644 --- a/fs/xfs/libxfs/xfs_ag_resv.c +++ b/fs/xfs/libxfs/xfs_ag_resv.c @@ -72,6 +72,7 @@ xfs_ag_resv_critical( { xfs_extlen_t avail; xfs_extlen_t orig; + xfs_extlen_t btree_maxlevels; switch (type) { case XFS_AG_RESV_METADATA: @@ -91,7 +92,8 @@ xfs_ag_resv_critical( trace_xfs_ag_resv_critical(pag, type, avail); /* Critically low if less than 10% or max btree height remains. */ - return XFS_TEST_ERROR(avail < orig / 10 || avail < XFS_BTREE_MAXLEVELS, + btree_maxlevels = xfs_btree_maxlevels(pag->pag_mount, XFS_BTNUM_MAX); + return XFS_TEST_ERROR(avail < orig / 10 || avail < btree_maxlevels, pag->pag_mount, XFS_ERRTAG_AG_RESV_CRITICAL); } diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index f9516828a847..6cf49f7e1299 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -4922,12 +4922,17 @@ xfs_btree_has_more_records( return block->bb_u.s.bb_rightsib != cpu_to_be32(NULLAGBLOCK); } -/* Compute the maximum allowed height for a given btree type. */ -static unsigned int +/* + * Compute the maximum allowed height for a given btree type. If XFS_BTNUM_MAX + * is passed in, the maximum allowed height for all btree types is returned. + */ +unsigned int xfs_btree_maxlevels( struct xfs_mount *mp, xfs_btnum_t btnum) { + unsigned int ret; + switch (btnum) { case XFS_BTNUM_BNO: case XFS_BTNUM_CNT: @@ -4943,9 +4948,15 @@ xfs_btree_maxlevels( case XFS_BTNUM_REFC: return mp->m_refc_maxlevels; default: - ASSERT(0); - return XFS_BTREE_MAXLEVELS; + break; } + + ret = mp->m_ag_maxlevels; + ret = max(ret, mp->m_bm_maxlevels[XFS_DATA_FORK]); + ret = max(ret, mp->m_bm_maxlevels[XFS_ATTR_FORK]); + ret = max(ret, M_IGEO(mp)->inobt_maxlevels); + ret = max(ret, mp->m_rmap_maxlevels); + return max(ret, mp->m_refc_maxlevels); } /* Allocate a new btree cursor of the appropriate size. */ diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index ae83fbf58c18..106760c540c7 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -574,5 +574,6 @@ void xfs_btree_copy_keys(struct xfs_btree_cur *cur, const union xfs_btree_key *src_key, int numkeys); struct xfs_btree_cur *xfs_btree_alloc_cursor(struct xfs_mount *mp, struct xfs_trans *tp, xfs_btnum_t btnum); +unsigned int xfs_btree_maxlevels(struct xfs_mount *mp, xfs_btnum_t btnum); #endif /* __XFS_BTREE_H__ */ From patchwork Sat Sep 18 01:30:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0D9DBC433F5 for ; Sat, 18 Sep 2021 01:30:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D738E61241 for ; Sat, 18 Sep 2021 01:30:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234976AbhIRBbo (ORCPT ); Fri, 17 Sep 2021 21:31:44 -0400 Received: from mail.kernel.org ([198.145.29.99]:37284 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234986AbhIRBbo (ORCPT ); Fri, 17 Sep 2021 21:31:44 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 63BFC6112E; Sat, 18 Sep 2021 01:30:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928621; bh=wyJB9VQeHVYy8UU10UPpc7ar+rymniwbbZCURJpfOvU=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=X1Uk8mrsb7pnrzd9SZAQTZSlBTV+QOdDyEEKgnz6wQa0Q610ABuRKmSZR4bi3MAEu LyOJEDpzErXxNqMsHVkMR1kC/bbh0QFfM1zsoh8qAO9trlyAe2+X6NINohNY5hb8lR GBs4Ubp2sEAMlqqyhRH3D2oV/IxLs8ZRulXUHGviS+znCzBahME79ZVVN4IVn+Xnq5 GKWikn3oqfOj+CzH5D7h15qF927rbaPBZ6dVUFBTP/YU67H5yc29g77tN/tMUG5MDr 0LbqcYf5jnBPXD6ifMZUNs5rO3H0/g96Y4DKWk+j6VyqxAaJxX2okMvRwdSEmPVJm2 k6+yidwV4ltiw== Subject: [PATCH 13/14] xfs: compute the maximum height of the rmap btree when reflink enabled From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:30:21 -0700 Message-ID: <163192862112.416199.3937220618088469929.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Instead of assuming that the hardcoded XFS_BTREE_MAXLEVELS value is big enough to handle the maximally tall rmap btree when all blocks are in use and maximally shared, let's compute the maximum height assuming the rmapbt consumes as many blocks as possible. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/libxfs/xfs_btree.c | 34 +++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_btree.h | 2 ++ fs/xfs/libxfs/xfs_rmap_btree.c | 40 ++++++++++++++++++++------------------- fs/xfs/libxfs/xfs_rmap_btree.h | 2 +- fs/xfs/libxfs/xfs_trans_resv.c | 12 ++++++++++++ fs/xfs/libxfs/xfs_trans_space.h | 7 +++++++ fs/xfs/xfs_mount.c | 2 +- 7 files changed, 78 insertions(+), 21 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 6cf49f7e1299..005bc42cf0bd 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -4526,6 +4526,40 @@ xfs_btree_compute_maxlevels( return level; } +/* + * Compute the maximum height of a btree that is allowed to consume up to the + * given number of blocks. + */ +unsigned int +xfs_btree_compute_maxlevels_size( + unsigned long long max_btblocks, + unsigned int leaf_mnr) +{ + unsigned long long leaf_blocks = leaf_mnr; + unsigned long long blocks_left; + unsigned int maxlevels; + + if (max_btblocks < 1) + return 0; + + /* + * The loop increments maxlevels as long as there would be enough + * blocks left in the reservation to handle each node block at the + * current level pointing to the minimum possible number of leaf blocks + * at the next level down. We start the loop assuming a single-level + * btree consuming one block. + */ + maxlevels = 1; + blocks_left = max_btblocks - 1; + while (leaf_blocks < blocks_left) { + maxlevels++; + blocks_left -= leaf_blocks; + leaf_blocks *= leaf_mnr; + } + + return maxlevels; +} + /* * Query a regular btree for all records overlapping a given interval. * Start with a LE lookup of the key of low_rec and return all records diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index 106760c540c7..d256d869f0af 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -476,6 +476,8 @@ xfs_failaddr_t xfs_btree_lblock_verify(struct xfs_buf *bp, unsigned int max_recs); uint xfs_btree_compute_maxlevels(uint *limits, unsigned long len); +unsigned int xfs_btree_compute_maxlevels_size(unsigned long long max_btblocks, + unsigned int leaf_mnr); unsigned long long xfs_btree_calc_size(uint *limits, unsigned long long len); /* diff --git a/fs/xfs/libxfs/xfs_rmap_btree.c b/fs/xfs/libxfs/xfs_rmap_btree.c index f3c4d0965cc9..85caeb14e4db 100644 --- a/fs/xfs/libxfs/xfs_rmap_btree.c +++ b/fs/xfs/libxfs/xfs_rmap_btree.c @@ -535,30 +535,32 @@ xfs_rmapbt_maxrecs( } /* Compute the maximum height of an rmap btree. */ -void +unsigned int xfs_rmapbt_compute_maxlevels( - struct xfs_mount *mp) + struct xfs_mount *mp) { + if (!xfs_has_reflink(mp)) { + /* + * If there's no block sharing, compute the maximum rmapbt + * height assuming one rmap record per AG block. + */ + return xfs_btree_compute_maxlevels(mp->m_rmap_mnr, + mp->m_sb.sb_agblocks); + } + /* - * On a non-reflink filesystem, the maximum number of rmap - * records is the number of blocks in the AG, hence the max - * rmapbt height is log_$maxrecs($agblocks). However, with - * reflink each AG block can have up to 2^32 (per the refcount - * record format) owners, which means that theoretically we - * could face up to 2^64 rmap records. + * Compute the asymptotic maxlevels for an rmapbt on a reflink fs. * - * That effectively means that the max rmapbt height must be - * XFS_BTREE_MAXLEVELS. "Fortunately" we'll run out of AG - * blocks to feed the rmapbt long before the rmapbt reaches - * maximum height. The reflink code uses ag_resv_critical to - * disallow reflinking when less than 10% of the per-AG metadata - * block reservation since the fallback is a regular file copy. + * On a reflink filesystem, each AG block can have up to 2^32 (per the + * refcount record format) owners, which means that theoretically we + * could face up to 2^64 rmap records. However, we're likely to run + * out of blocks in the AG long before that happens, which means that + * we must compute the max height based on what the btree will look + * like if it consumes almost all the blocks in the AG due to maximal + * sharing factor. */ - if (xfs_has_reflink(mp)) - mp->m_rmap_maxlevels = XFS_BTREE_MAXLEVELS; - else - mp->m_rmap_maxlevels = xfs_btree_compute_maxlevels( - mp->m_rmap_mnr, mp->m_sb.sb_agblocks); + return xfs_btree_compute_maxlevels_size(mp->m_sb.sb_agblocks, + mp->m_rmap_mnr[1]); } /* Calculate the refcount btree size for some records. */ diff --git a/fs/xfs/libxfs/xfs_rmap_btree.h b/fs/xfs/libxfs/xfs_rmap_btree.h index f2eee6572af4..5aaecf755abd 100644 --- a/fs/xfs/libxfs/xfs_rmap_btree.h +++ b/fs/xfs/libxfs/xfs_rmap_btree.h @@ -49,7 +49,7 @@ struct xfs_btree_cur *xfs_rmapbt_stage_cursor(struct xfs_mount *mp, void xfs_rmapbt_commit_staged_btree(struct xfs_btree_cur *cur, struct xfs_trans *tp, struct xfs_buf *agbp); int xfs_rmapbt_maxrecs(int blocklen, int leaf); -extern void xfs_rmapbt_compute_maxlevels(struct xfs_mount *mp); +unsigned int xfs_rmapbt_compute_maxlevels(struct xfs_mount *mp); extern xfs_extlen_t xfs_rmapbt_calc_size(struct xfs_mount *mp, unsigned long long len); diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 5e300daa2559..679f10e08f31 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -814,6 +814,15 @@ xfs_trans_resv_calc( struct xfs_mount *mp, struct xfs_trans_resv *resp) { + unsigned int rmap_maxlevels = mp->m_rmap_maxlevels; + + /* + * In the early days of rmap+reflink, we hardcoded the rmap maxlevels + * to 9 even if the AG size was smaller. + */ + if (xfs_has_rmapbt(mp) && xfs_has_reflink(mp)) + mp->m_rmap_maxlevels = XFS_OLD_REFLINK_RMAP_MAXLEVELS; + /* * The following transactions are logged in physical format and * require a permanent reservation on space. @@ -916,4 +925,7 @@ xfs_trans_resv_calc( resp->tr_clearagi.tr_logres = xfs_calc_clear_agi_bucket_reservation(mp); resp->tr_growrtzero.tr_logres = xfs_calc_growrtzero_reservation(mp); resp->tr_growrtfree.tr_logres = xfs_calc_growrtfree_reservation(mp); + + /* Put everything back the way it was. This goes at the end. */ + mp->m_rmap_maxlevels = rmap_maxlevels; } diff --git a/fs/xfs/libxfs/xfs_trans_space.h b/fs/xfs/libxfs/xfs_trans_space.h index 50332be34388..440c9c390b86 100644 --- a/fs/xfs/libxfs/xfs_trans_space.h +++ b/fs/xfs/libxfs/xfs_trans_space.h @@ -17,6 +17,13 @@ /* Adding one rmap could split every level up to the top of the tree. */ #define XFS_RMAPADD_SPACE_RES(mp) ((mp)->m_rmap_maxlevels) +/* + * Note that we historically set m_rmap_maxlevels to 9 when reflink was + * enabled, so we must preserve this behavior to avoid changing the transaction + * space reservations. + */ +#define XFS_OLD_REFLINK_RMAP_MAXLEVELS (9) + /* Blocks we might need to add "b" rmaps to a tree. */ #define XFS_NRMAPADD_SPACE_RES(mp, b)\ (((b + XFS_MAX_CONTIG_RMAPS_PER_BLOCK(mp) - 1) / \ diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c index 06dac09eddbd..e600a0b781c8 100644 --- a/fs/xfs/xfs_mount.c +++ b/fs/xfs/xfs_mount.c @@ -635,7 +635,7 @@ xfs_mountfs( xfs_bmap_compute_maxlevels(mp, XFS_DATA_FORK); xfs_bmap_compute_maxlevels(mp, XFS_ATTR_FORK); xfs_mount_setup_inode_geom(mp); - xfs_rmapbt_compute_maxlevels(mp); + mp->m_rmap_maxlevels = xfs_rmapbt_compute_maxlevels(mp); xfs_refcountbt_compute_maxlevels(mp); /* From patchwork Sat Sep 18 01:30:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 12503419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.4 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FBDCC433F5 for ; Sat, 18 Sep 2021 01:30:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EB56760FBF for ; Sat, 18 Sep 2021 01:30:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234986AbhIRBbv (ORCPT ); Fri, 17 Sep 2021 21:31:51 -0400 Received: from mail.kernel.org ([198.145.29.99]:37320 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235079AbhIRBbt (ORCPT ); Fri, 17 Sep 2021 21:31:49 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id D5AE160FBF; Sat, 18 Sep 2021 01:30:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1631928626; bh=oGnMn2zETXGz0ot92YxDBa9OabwlA3aTzg0IRZ3D9+s=; h=Subject:From:To:Cc:Date:In-Reply-To:References:From; b=QGKR3GX5NSHoVUIpgu3+W+uUd4YZX3ujU2jz7EwGxOz8yrdzMvBGnmRi8qU6Kbej9 UmdOLmBCcpJbkMphERJtjOzfgOIpq2uYb0zAq8vOgAU9aocKM76q8DVskQ1+6A1yZB MfVaS+REFAZrFy+y2fDJ7ySXu7DGQ9Y2Wu1RriILpAcdv2uCBt/jiAmWGSCunNi9rL QX2tyVpkg25qWcJLDwGcnDu9QwT67rhtRov7VcKLLJidM5wwAc7dAR7PXHVS5x7KGl RBGkh8adazhutRnn2FkefSZWnQDfmRQycWfsLtBj+i7R6msao33qildgpKbcbr1oom w/0GptSW31ZqA== Subject: [PATCH 14/14] xfs: kill XFS_BTREE_MAXLEVELS From: "Darrick J. Wong" To: djwong@kernel.org, chandan.babu@oracle.com, chandanrlinux@gmail.com Cc: linux-xfs@vger.kernel.org Date: Fri, 17 Sep 2021 18:30:26 -0700 Message-ID: <163192862659.416199.7915871246193862758.stgit@magnolia> In-Reply-To: <163192854958.416199.3396890438240296942.stgit@magnolia> References: <163192854958.416199.3396890438240296942.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J. Wong Nobody uses this symbol anymore, so kill it. Signed-off-by: Darrick J. Wong Reviewed-by: Chandan Babu R --- fs/xfs/libxfs/xfs_btree.c | 2 -- fs/xfs/libxfs/xfs_btree.h | 2 -- 2 files changed, 4 deletions(-) diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c index 005bc42cf0bd..a7c866332911 100644 --- a/fs/xfs/libxfs/xfs_btree.c +++ b/fs/xfs/libxfs/xfs_btree.c @@ -5003,8 +5003,6 @@ xfs_btree_alloc_cursor( struct xfs_btree_cur *cur; unsigned int maxlevels = xfs_btree_maxlevels(mp, btnum); - ASSERT(maxlevels <= XFS_BTREE_MAXLEVELS); - cur = kmem_zalloc(xfs_btree_cur_sizeof(maxlevels), KM_NOFS); cur->bc_tp = tp; cur->bc_mp = mp; diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h index d256d869f0af..91154dd63472 100644 --- a/fs/xfs/libxfs/xfs_btree.h +++ b/fs/xfs/libxfs/xfs_btree.h @@ -90,8 +90,6 @@ uint32_t xfs_btree_magic(int crc, xfs_btnum_t btnum); #define XFS_BTREE_STATS_ADD(cur, stat, val) \ XFS_STATS_ADD_OFF((cur)->bc_mp, (cur)->bc_statoff + __XBTS_ ## stat, val) -#define XFS_BTREE_MAXLEVELS 9 /* max of all btrees */ - struct xfs_btree_ops { /* size of the key and record structures */ size_t key_len;