From patchwork Sat May 9 16:31:47 2020
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11538401
Subject: [PATCH 1/9] xfs_repair: port the online repair newbt structure
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Date: Sat, 09 May 2020 09:31:47 -0700
Message-ID: <158904190713.984305.3298591047333841655.stgit@magnolia>
In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia>
References: <158904190079.984305.707785748675261111.stgit@magnolia>
User-Agent: StGit/0.19
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Port the new btree staging context and related block reservation helper
code from the kernel to repair.  We'll use this in subsequent patches to
implement btree bulk loading.

Signed-off-by: Darrick J. Wong
---
 include/libxfs.h         |    1
 libxfs/libxfs_api_defs.h |    2
 repair/Makefile          |    4 -
 repair/bload.c           |  276 ++++++++++++++++++++++++++++++++++++++++++++++
 repair/bload.h           |   79 +++++++++++++
 repair/xfs_repair.c      |   17 +++
 6 files changed, 377 insertions(+), 2 deletions(-)
 create mode 100644 repair/bload.c
 create mode 100644 repair/bload.h

diff --git a/include/libxfs.h b/include/libxfs.h
index 12447835..b9370139 100644
--- a/include/libxfs.h
+++ b/include/libxfs.h
@@ -76,6 +76,7 @@ struct iomap;
 #include "xfs_rmap.h"
 #include "xfs_refcount_btree.h"
 #include "xfs_refcount.h"
+#include "xfs_btree_staging.h"

 #ifndef ARRAY_SIZE
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index be06c763..61047f8f 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -27,12 +27,14 @@
 #define xfs_alloc_fix_freelist		libxfs_alloc_fix_freelist
 #define xfs_alloc_min_freelist		libxfs_alloc_min_freelist
 #define xfs_alloc_read_agf		libxfs_alloc_read_agf
+#define xfs_alloc_vextent		libxfs_alloc_vextent

 #define xfs_attr_get			libxfs_attr_get
 #define xfs_attr_leaf_newentsize	libxfs_attr_leaf_newentsize
 #define xfs_attr_namecheck		libxfs_attr_namecheck
 #define xfs_attr_set			libxfs_attr_set

+#define __xfs_bmap_add_free		__libxfs_bmap_add_free
 #define xfs_bmapi_read			libxfs_bmapi_read
 #define xfs_bmapi_write			libxfs_bmapi_write
 #define xfs_bmap_last_offset		libxfs_bmap_last_offset
diff --git a/repair/Makefile b/repair/Makefile
index 0964499a..8cc1ee68 100644
--- a/repair/Makefile
+++ b/repair/Makefile
@@ -9,11 +9,11 @@
 LSRCFILES = README

 LTCOMMAND = xfs_repair

-HFILES = agheader.h attr_repair.h avl.h bmap.h btree.h \
+HFILES = agheader.h attr_repair.h avl.h bload.h bmap.h btree.h \
	da_util.h dinode.h dir2.h err_protos.h globals.h incore.h protos.h \
	rt.h progress.h scan.h versions.h prefetch.h rmap.h slab.h threads.h

-CFILES = agheader.c attr_repair.c avl.c bmap.c btree.c \
+CFILES = agheader.c attr_repair.c avl.c bload.c bmap.c btree.c \
	da_util.c dino_chunks.c dinode.c dir2.c globals.c incore.c \
	incore_bmc.c init.c incore_ext.c incore_ino.c phase1.c \
	phase2.c phase3.c phase4.c phase5.c phase6.c phase7.c \
diff --git a/repair/bload.c b/repair/bload.c
new file mode 100644
index 00000000..ab05815c
--- /dev/null
+++ b/repair/bload.c
@@ -0,0 +1,276 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#include <libxfs.h>
+#include "bload.h"
+
+#define trace_xrep_newbt_claim_block(...)	((void) 0)
+#define trace_xrep_newbt_reserve_space(...)	((void) 0)
+#define trace_xrep_newbt_unreserve_space(...)	((void) 0)
+
+int bload_leaf_slack = -1;
+int bload_node_slack = -1;
+
+/* Ported routines from fs/xfs/scrub/repair.c */
+
+/*
+ * Roll a transaction, keeping the AG headers locked and reinitializing
+ * the btree cursors.
+ */
+int
+xrep_roll_ag_trans(
+	struct repair_ctx	*sc)
+{
+	int			error;
+
+	/* Keep the AG header buffers locked so we can keep going. */
+	if (sc->agi_bp)
+		libxfs_trans_bhold(sc->tp, sc->agi_bp);
+	if (sc->agf_bp)
+		libxfs_trans_bhold(sc->tp, sc->agf_bp);
+	if (sc->agfl_bp)
+		libxfs_trans_bhold(sc->tp, sc->agfl_bp);
+
+	/*
+	 * Roll the transaction.  We still own the buffer and the buffer lock
+	 * regardless of whether or not the roll succeeds.  If the roll fails,
+	 * the buffers will be released during teardown on our way out of the
+	 * kernel.  If it succeeds, we join them to the new transaction and
+	 * move on.
+	 */
+	error = -libxfs_trans_roll(&sc->tp);
+	if (error)
+		return error;
+
+	/* Join AG headers to the new transaction. */
+	if (sc->agi_bp)
+		libxfs_trans_bjoin(sc->tp, sc->agi_bp);
+	if (sc->agf_bp)
+		libxfs_trans_bjoin(sc->tp, sc->agf_bp);
+	if (sc->agfl_bp)
+		libxfs_trans_bjoin(sc->tp, sc->agfl_bp);
+
+	return 0;
+}
+
+/* Initialize accounting resources for staging a new AG btree. */
+void
+xrep_newbt_init_ag(
+	struct xrep_newbt		*xnr,
+	struct repair_ctx		*sc,
+	const struct xfs_owner_info	*oinfo,
+	xfs_fsblock_t			alloc_hint,
+	enum xfs_ag_resv_type		resv)
+{
+	memset(xnr, 0, sizeof(struct xrep_newbt));
+	xnr->sc = sc;
+	xnr->oinfo = *oinfo; /* structure copy */
+	xnr->alloc_hint = alloc_hint;
+	xnr->resv = resv;
+	INIT_LIST_HEAD(&xnr->reservations);
+}
+
+/* Initialize accounting resources for staging a new inode fork btree. */
+void
+xrep_newbt_init_inode(
+	struct xrep_newbt		*xnr,
+	struct repair_ctx		*sc,
+	int				whichfork,
+	const struct xfs_owner_info	*oinfo)
+{
+	memset(xnr, 0, sizeof(struct xrep_newbt));
+	xnr->sc = sc;
+	xnr->oinfo = *oinfo; /* structure copy */
+	xnr->alloc_hint = XFS_INO_TO_FSB(sc->mp, sc->ip->i_ino);
+	xnr->resv = XFS_AG_RESV_NONE;
+	xnr->ifake.if_fork = kmem_zone_zalloc(xfs_ifork_zone, 0);
+	xnr->ifake.if_fork_size = XFS_IFORK_SIZE(sc->ip, whichfork);
+	INIT_LIST_HEAD(&xnr->reservations);
+}
+
+/*
+ * Initialize accounting resources for staging a new btree.  Callers are
+ * expected to add their own reservations (and clean them up) manually.
+ */
+void
+xrep_newbt_init_bare(
+	struct xrep_newbt	*xnr,
+	struct repair_ctx	*sc)
+{
+	xrep_newbt_init_ag(xnr, sc, &XFS_RMAP_OINFO_ANY_OWNER, NULLFSBLOCK,
+			XFS_AG_RESV_NONE);
+}
+
+/* Add a space reservation manually. */
+int
+xrep_newbt_add_reservation(
+	struct xrep_newbt	*xnr,
+	xfs_fsblock_t		fsbno,
+	xfs_extlen_t		len,
+	void			*priv)
+{
+	struct xrep_newbt_resv	*resv;
+
+	resv = kmem_alloc(sizeof(struct xrep_newbt_resv), KM_MAYFAIL);
+	if (!resv)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&resv->list);
+	resv->fsbno = fsbno;
+	resv->len = len;
+	resv->used = 0;
+	resv->priv = priv;
+	list_add_tail(&resv->list, &xnr->reservations);
+	return 0;
+}
+
+/* Reserve disk space for our new btree. */
+int
+xrep_newbt_reserve_space(
+	struct xrep_newbt	*xnr,
+	uint64_t		nr_blocks)
+{
+	struct repair_ctx	*sc = xnr->sc;
+	xfs_alloctype_t		type;
+	xfs_fsblock_t		alloc_hint = xnr->alloc_hint;
+	int			error = 0;
+
+	type = sc->ip ? XFS_ALLOCTYPE_START_BNO : XFS_ALLOCTYPE_NEAR_BNO;
+
+	while (nr_blocks > 0 && !error) {
+		struct xfs_alloc_arg	args = {
+			.tp		= sc->tp,
+			.mp		= sc->mp,
+			.type		= type,
+			.fsbno		= alloc_hint,
+			.oinfo		= xnr->oinfo,
+			.minlen		= 1,
+			.maxlen		= nr_blocks,
+			.prod		= 1,
+			.resv		= xnr->resv,
+		};
+
+		error = -libxfs_alloc_vextent(&args);
+		if (error)
+			return error;
+		if (args.fsbno == NULLFSBLOCK)
+			return -ENOSPC;
+
+		trace_xrep_newbt_reserve_space(sc->mp,
+				XFS_FSB_TO_AGNO(sc->mp, args.fsbno),
+				XFS_FSB_TO_AGBNO(sc->mp, args.fsbno),
+				args.len, xnr->oinfo.oi_owner);
+
+		/* We don't have real EFIs here so skip that. */
+
+		error = xrep_newbt_add_reservation(xnr, args.fsbno, args.len,
+				NULL);
+		if (error)
+			break;
+
+		nr_blocks -= args.len;
+		alloc_hint = args.fsbno + args.len - 1;
+
+		if (sc->ip)
+			error = -libxfs_trans_roll_inode(&sc->tp, sc->ip);
+		else
+			error = xrep_roll_ag_trans(sc);
+	}
+
+	return error;
+}
+
+/* Free all the accounting info and disk space we reserved for a new btree. */
+void
+xrep_newbt_destroy(
+	struct xrep_newbt	*xnr,
+	int			error)
+{
+	struct repair_ctx	*sc = xnr->sc;
+	struct xrep_newbt_resv	*resv, *n;
+
+	if (error)
+		goto junkit;
+
+	list_for_each_entry_safe(resv, n, &xnr->reservations, list) {
+		/* We don't have EFIs here so skip the EFD. */
+
+		/* Free every block we didn't use. */
+		resv->fsbno += resv->used;
+		resv->len -= resv->used;
+		resv->used = 0;
+
+		if (resv->len > 0) {
+			trace_xrep_newbt_unreserve_space(sc->mp,
+					XFS_FSB_TO_AGNO(sc->mp, resv->fsbno),
+					XFS_FSB_TO_AGBNO(sc->mp, resv->fsbno),
+					resv->len, xnr->oinfo.oi_owner);
+
+			__libxfs_bmap_add_free(sc->tp, resv->fsbno, resv->len,
+					&xnr->oinfo, true);
+		}
+
+		list_del(&resv->list);
+		kmem_free(resv);
+	}
+
+junkit:
+	list_for_each_entry_safe(resv, n, &xnr->reservations, list) {
+		list_del(&resv->list);
+		kmem_free(resv);
+	}
+
+	if (sc->ip) {
+		kmem_cache_free(xfs_ifork_zone, xnr->ifake.if_fork);
+		xnr->ifake.if_fork = NULL;
+	}
+}
+
+/* Feed one of the reserved btree blocks to the bulk loader. */
+int
+xrep_newbt_claim_block(
+	struct xfs_btree_cur	*cur,
+	struct xrep_newbt	*xnr,
+	union xfs_btree_ptr	*ptr)
+{
+	struct xrep_newbt_resv	*resv;
+	xfs_fsblock_t		fsb;
+
+	/*
+	 * If last_resv doesn't have a block for us, move forward until we find
+	 * one that does (or run out of reservations).
+	 */
+	if (xnr->last_resv == NULL) {
+		list_for_each_entry(resv, &xnr->reservations, list) {
+			if (resv->used < resv->len) {
+				xnr->last_resv = resv;
+				break;
+			}
+		}
+		if (xnr->last_resv == NULL)
+			return -ENOSPC;
+	} else if (xnr->last_resv->used == xnr->last_resv->len) {
+		if (xnr->last_resv->list.next == &xnr->reservations)
+			return -ENOSPC;
+		xnr->last_resv = list_entry(xnr->last_resv->list.next,
+				struct xrep_newbt_resv, list);
+	}
+
+	/* Nab the block. */
+	fsb = xnr->last_resv->fsbno + xnr->last_resv->used;
+	xnr->last_resv->used++;
+
+	trace_xrep_newbt_claim_block(cur->bc_mp,
+			XFS_FSB_TO_AGNO(cur->bc_mp, fsb),
+			XFS_FSB_TO_AGBNO(cur->bc_mp, fsb),
+			xnr->oinfo.oi_owner);
+
+	if (cur->bc_flags & XFS_BTREE_LONG_PTRS)
+		ptr->l = cpu_to_be64(fsb);
+	else
+		ptr->s = cpu_to_be32(XFS_FSB_TO_AGBNO(cur->bc_mp, fsb));
+	return 0;
+}
diff --git a/repair/bload.h b/repair/bload.h
new file mode 100644
index 00000000..ba5f6d0b
--- /dev/null
+++ b/repair/bload.h
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020 Oracle.  All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#ifndef __XFS_REPAIR_BLOAD_H__
+#define __XFS_REPAIR_BLOAD_H__
+
+extern int bload_leaf_slack;
+extern int bload_node_slack;
+
+struct repair_ctx {
+	struct xfs_mount	*mp;
+	struct xfs_inode	*ip;
+	struct xfs_trans	*tp;
+
+	struct xfs_buf		*agi_bp;
+	struct xfs_buf		*agf_bp;
+	struct xfs_buf		*agfl_bp;
+};
+
+struct xrep_newbt_resv {
+	/* Link to list of extents that we've reserved. */
+	struct list_head	list;
+
+	void			*priv;
+
+	/* FSB of the block we reserved. */
+	xfs_fsblock_t		fsbno;
+
+	/* Length of the reservation. */
+	xfs_extlen_t		len;
+
+	/* How much of this reservation we've used. */
+	xfs_extlen_t		used;
+};
+
+struct xrep_newbt {
+	struct repair_ctx	*sc;
+
+	/* List of extents that we've reserved. */
+	struct list_head	reservations;
+
+	/* Fake root for new btree. */
+	union {
+		struct xbtree_afakeroot	afake;
+		struct xbtree_ifakeroot	ifake;
+	};
+
+	/* rmap owner of these blocks */
+	struct xfs_owner_info	oinfo;
+
+	/* The last reservation we allocated from. */
+	struct xrep_newbt_resv	*last_resv;
+
+	/* Allocation hint */
+	xfs_fsblock_t		alloc_hint;
+
+	/* per-ag reservation type */
+	enum xfs_ag_resv_type	resv;
+};
+
+#define for_each_xrep_newbt_reservation(xnr, resv, n)	\
+	list_for_each_entry_safe((resv), (n), &(xnr)->reservations, list)
+
+void xrep_newbt_init_bare(struct xrep_newbt *xba, struct repair_ctx *sc);
+void xrep_newbt_init_ag(struct xrep_newbt *xba, struct repair_ctx *sc,
+		const struct xfs_owner_info *oinfo, xfs_fsblock_t alloc_hint,
+		enum xfs_ag_resv_type resv);
+void xrep_newbt_init_inode(struct xrep_newbt *xba, struct repair_ctx *sc,
+		int whichfork, const struct xfs_owner_info *oinfo);
+int xrep_newbt_add_reservation(struct xrep_newbt *xba, xfs_fsblock_t fsbno,
+		xfs_extlen_t len, void *priv);
+int xrep_newbt_reserve_space(struct xrep_newbt *xba, uint64_t nr_blocks);
+void xrep_newbt_destroy(struct xrep_newbt *xba, int error);
+int xrep_newbt_claim_block(struct xfs_btree_cur *cur, struct xrep_newbt *xba,
+		union xfs_btree_ptr *ptr);
+
+#endif /* __XFS_REPAIR_BLOAD_H__ */
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index 9d72fa8e..8fbd3649 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -24,6 +24,7 @@
 #include "rmap.h"
 #include "libfrog/fsgeom.h"
 #include "libfrog/platform.h"
+#include "bload.h"

 /*
  * option tables for getsubopt calls
@@ -39,6 +40,8 @@ enum o_opt_nums {
	AG_STRIDE,
	FORCE_GEO,
	PHASE2_THREADS,
+	BLOAD_LEAF_SLACK,
+	BLOAD_NODE_SLACK,
	O_MAX_OPTS,
 };

@@ -49,6 +52,8 @@ static char *o_opts[] = {
	[AG_STRIDE]		= "ag_stride",
	[FORCE_GEO]		= "force_geometry",
	[PHASE2_THREADS]	= "phase2_threads",
+	[BLOAD_LEAF_SLACK]	= "debug_bload_leaf_slack",
+	[BLOAD_NODE_SLACK]	= "debug_bload_node_slack",
	[O_MAX_OPTS]		= NULL,
 };

@@ -260,6 +265,18 @@ process_args(int argc, char **argv)
	_("-o phase2_threads requires a parameter\n"));
				phase2_threads = (int)strtol(val, NULL, 0);
				break;
+			case BLOAD_LEAF_SLACK:
+				if (!val)
+					do_abort(
+		_("-o debug_bload_leaf_slack requires a parameter\n"));
+				bload_leaf_slack = (int)strtol(val, NULL, 0);
+				break;
+			case BLOAD_NODE_SLACK:
+				if (!val)
+					do_abort(
+		_("-o debug_bload_node_slack requires a parameter\n"));
+				bload_node_slack = (int)strtol(val, NULL, 0);
+				break;
			default:
				unknown('o', val);
				break;

From patchwork Sat May 9 16:31:53 2020
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 11538411
Subject: [PATCH 2/9] xfs_repair: unindent phase 5 function
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Date: Sat, 09 May 2020 09:31:53 -0700
Message-ID: <158904191346.984305.10394364390153692151.stgit@magnolia>
In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia>
References: <158904190079.984305.707785748675261111.stgit@magnolia>

From: Darrick J. Wong

Remove the unnecessary indent in phase5_func.  No functional changes.

Signed-off-by: Darrick J.
Wong
---
 repair/phase5.c |  309 +++++++++++++++++++++++++++----------------------------
 1 file changed, 154 insertions(+), 155 deletions(-)

diff --git a/repair/phase5.c b/repair/phase5.c
index 17b57448..f3be15de 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -2316,201 +2316,200 @@ phase5_func(
	if (verbose)
		do_log(_("        - agno = %d\n"), agno);

-	{
-		/*
-		 * build up incore bno and bcnt extent btrees
-		 */
-		num_extents = mk_incore_fstree(mp, agno);
+	/*
+	 * build up incore bno and bcnt extent btrees
+	 */
+	num_extents = mk_incore_fstree(mp, agno);
 #ifdef XR_BLD_FREE_TRACE
-		fprintf(stderr, "# of bno extents is %d\n",
-				count_bno_extents(agno));
+	fprintf(stderr, "# of bno extents is %d\n",
+			count_bno_extents(agno));
 #endif
-		if (num_extents == 0) {
-			/*
-			 * XXX - what we probably should do here is pick an
-			 * inode for a regular file in the allocation group
-			 * that has space allocated and shoot it by traversing
-			 * the bmap list and putting all its extents on the
-			 * incore freespace trees, clearing the inode,
-			 * and clearing the in-use bit in the incore inode
-			 * tree.  Then try mk_incore_fstree() again.
-			 */
-			do_error(_("unable to rebuild AG %u.  "
-				"Not enough free space in on-disk AG.\n"),
-				agno);
-		}
-
+	if (num_extents == 0) {
		/*
-		 * ok, now set up the btree cursors for the
-		 * on-disk btrees (includs pre-allocating all
-		 * required blocks for the trees themselves)
+		 * XXX - what we probably should do here is pick an
+		 * inode for a regular file in the allocation group
+		 * that has space allocated and shoot it by traversing
+		 * the bmap list and putting all its extents on the
+		 * incore freespace trees, clearing the inode,
+		 * and clearing the in-use bit in the incore inode
+		 * tree.  Then try mk_incore_fstree() again.
		 */
-		init_ino_cursor(mp, agno, &ino_btree_curs, &num_inos,
-				&num_free_inos, 0);
+		do_error(_("unable to rebuild AG %u.  "
+			"Not enough free space in on-disk AG.\n"),
+			agno);
+	}

-		if (xfs_sb_version_hasfinobt(&mp->m_sb))
-			init_ino_cursor(mp, agno, &fino_btree_curs,
-					&finobt_num_inos, &finobt_num_free_inos,
-					1);
+	/*
+	 * ok, now set up the btree cursors for the
+	 * on-disk btrees (includs pre-allocating all
+	 * required blocks for the trees themselves)
+	 */
+	init_ino_cursor(mp, agno, &ino_btree_curs, &num_inos,
+			&num_free_inos, 0);

-		sb_icount_ag[agno] += num_inos;
-		sb_ifree_ag[agno] += num_free_inos;
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		init_ino_cursor(mp, agno, &fino_btree_curs,
+				&finobt_num_inos, &finobt_num_free_inos,
+				1);

-		/*
-		 * Set up the btree cursors for the on-disk rmap btrees,
-		 * which includes pre-allocating all required blocks.
-		 */
-		init_rmapbt_cursor(mp, agno, &rmap_btree_curs);
+	sb_icount_ag[agno] += num_inos;
+	sb_ifree_ag[agno] += num_free_inos;

-		/*
-		 * Set up the btree cursors for the on-disk refcount btrees,
-		 * which includes pre-allocating all required blocks.
-		 */
-		init_refc_cursor(mp, agno, &refcnt_btree_curs);
+	/*
+	 * Set up the btree cursors for the on-disk rmap btrees,
+	 * which includes pre-allocating all required blocks.
+	 */
+	init_rmapbt_cursor(mp, agno, &rmap_btree_curs);

-		num_extents = count_bno_extents_blocks(agno, &num_freeblocks);
+	/*
+	 * Set up the btree cursors for the on-disk refcount btrees,
+	 * which includes pre-allocating all required blocks.
+	 */
+	init_refc_cursor(mp, agno, &refcnt_btree_curs);
+
+	num_extents = count_bno_extents_blocks(agno, &num_freeblocks);
+	/*
+	 * lose two blocks per AG -- the space tree roots
+	 * are counted as allocated since the space trees
+	 * always have roots
+	 */
+	sb_fdblocks_ag[agno] += num_freeblocks - 2;
+
+	if (num_extents == 0) {
		/*
-		 * lose two blocks per AG -- the space tree roots
-		 * are counted as allocated since the space trees
-		 * always have roots
+		 * XXX - what we probably should do here is pick an
+		 * inode for a regular file in the allocation group
+		 * that has space allocated and shoot it by traversing
+		 * the bmap list and putting all its extents on the
+		 * incore freespace trees, clearing the inode,
+		 * and clearing the in-use bit in the incore inode
+		 * tree.  Then try mk_incore_fstree() again.
		 */
-		sb_fdblocks_ag[agno] += num_freeblocks - 2;
-
-		if (num_extents == 0) {
-			/*
-			 * XXX - what we probably should do here is pick an
-			 * inode for a regular file in the allocation group
-			 * that has space allocated and shoot it by traversing
-			 * the bmap list and putting all its extents on the
-			 * incore freespace trees, clearing the inode,
-			 * and clearing the in-use bit in the incore inode
-			 * tree.  Then try mk_incore_fstree() again.
-			 */
-			do_error(
-			_("unable to rebuild AG %u.  No free space.\n"), agno);
-		}
+		do_error(
+		_("unable to rebuild AG %u.  No free space.\n"), agno);
+	}

 #ifdef XR_BLD_FREE_TRACE
-		fprintf(stderr, "# of bno extents is %d\n", num_extents);
+	fprintf(stderr, "# of bno extents is %d\n", num_extents);
 #endif

-		/*
-		 * track blocks that we might really lose
-		 */
-		extra_blocks = calculate_freespace_cursor(mp, agno,
-					&num_extents, &bno_btree_curs);
+	/*
+	 * track blocks that we might really lose
+	 */
+	extra_blocks = calculate_freespace_cursor(mp, agno,
+				&num_extents, &bno_btree_curs);

-		/*
-		 * freespace btrees live in the "free space" but
-		 * the filesystem treats AGFL blocks as allocated
-		 * since they aren't described by the freespace trees
-		 */
+	/*
+	 * freespace btrees live in the "free space" but
+	 * the filesystem treats AGFL blocks as allocated
+	 * since they aren't described by the freespace trees
+	 */

-		/*
-		 * see if we can fit all the extra blocks into the AGFL
-		 */
-		extra_blocks = (extra_blocks - libxfs_agfl_size(mp) > 0)
-				? extra_blocks - libxfs_agfl_size(mp)
-				: 0;
+	/*
+	 * see if we can fit all the extra blocks into the AGFL
+	 */
+	extra_blocks = (extra_blocks - libxfs_agfl_size(mp) > 0)
+			? extra_blocks - libxfs_agfl_size(mp)
+			: 0;

-		if (extra_blocks > 0)
-			sb_fdblocks_ag[agno] -= extra_blocks;
+	if (extra_blocks > 0)
+		sb_fdblocks_ag[agno] -= extra_blocks;

-		bcnt_btree_curs = bno_btree_curs;
+	bcnt_btree_curs = bno_btree_curs;

-		bno_btree_curs.owner = XFS_RMAP_OWN_AG;
-		bcnt_btree_curs.owner = XFS_RMAP_OWN_AG;
-		setup_cursor(mp, agno, &bno_btree_curs);
-		setup_cursor(mp, agno, &bcnt_btree_curs);
+	bno_btree_curs.owner = XFS_RMAP_OWN_AG;
+	bcnt_btree_curs.owner = XFS_RMAP_OWN_AG;
+	setup_cursor(mp, agno, &bno_btree_curs);
+	setup_cursor(mp, agno, &bcnt_btree_curs);

 #ifdef XR_BLD_FREE_TRACE
-		fprintf(stderr, "# of bno extents is %d\n",
-				count_bno_extents(agno));
-		fprintf(stderr, "# of bcnt extents is %d\n",
-				count_bcnt_extents(agno));
+	fprintf(stderr, "# of bno extents is %d\n",
+			count_bno_extents(agno));
+	fprintf(stderr, "# of bcnt extents is %d\n",
+			count_bcnt_extents(agno));
 #endif

-		/*
-		 * now rebuild the freespace trees
-		 */
-		freeblks1 = build_freespace_tree(mp, agno,
-					&bno_btree_curs, XFS_BTNUM_BNO);
+	/*
+	 * now rebuild the freespace trees
+	 */
+	freeblks1 = build_freespace_tree(mp, agno,
+				&bno_btree_curs, XFS_BTNUM_BNO);
 #ifdef XR_BLD_FREE_TRACE
-		fprintf(stderr, "# of free blocks == %d\n", freeblks1);
+	fprintf(stderr, "# of free blocks == %d\n", freeblks1);
 #endif
-		write_cursor(&bno_btree_curs);
+	write_cursor(&bno_btree_curs);

 #ifdef DEBUG
-		freeblks2 = build_freespace_tree(mp, agno,
-					&bcnt_btree_curs, XFS_BTNUM_CNT);
+	freeblks2 = build_freespace_tree(mp, agno,
+				&bcnt_btree_curs, XFS_BTNUM_CNT);
 #else
-		(void) build_freespace_tree(mp, agno,
-					&bcnt_btree_curs, XFS_BTNUM_CNT);
+	(void) build_freespace_tree(mp, agno,
+				&bcnt_btree_curs, XFS_BTNUM_CNT);
 #endif
-		write_cursor(&bcnt_btree_curs);
+	write_cursor(&bcnt_btree_curs);

-		ASSERT(freeblks1 == freeblks2);
+	ASSERT(freeblks1 == freeblks2);

-		if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
-			build_rmap_tree(mp, agno, &rmap_btree_curs);
-			write_cursor(&rmap_btree_curs);
-			sb_fdblocks_ag[agno] += (rmap_btree_curs.num_tot_blocks -
-					rmap_btree_curs.num_free_blocks) - 1;
-		}
+	if (xfs_sb_version_hasrmapbt(&mp->m_sb)) {
+		build_rmap_tree(mp, agno, &rmap_btree_curs);
+		write_cursor(&rmap_btree_curs);
+		sb_fdblocks_ag[agno] += (rmap_btree_curs.num_tot_blocks -
+				rmap_btree_curs.num_free_blocks) - 1;
+	}

-		if (xfs_sb_version_hasreflink(&mp->m_sb)) {
-			build_refcount_tree(mp, agno, &refcnt_btree_curs);
-			write_cursor(&refcnt_btree_curs);
-		}
+	if (xfs_sb_version_hasreflink(&mp->m_sb)) {
+		build_refcount_tree(mp, agno, &refcnt_btree_curs);
+		write_cursor(&refcnt_btree_curs);
+	}

-		/*
-		 * set up agf and agfl
-		 */
-		build_agf_agfl(mp, agno, &bno_btree_curs,
-				&bcnt_btree_curs, freeblks1, extra_blocks,
-				&rmap_btree_curs, &refcnt_btree_curs, lost_fsb);
-		/*
-		 * build inode allocation tree.
-		 */
-		build_ino_tree(mp, agno, &ino_btree_curs, XFS_BTNUM_INO,
-				&agi_stat);
-		write_cursor(&ino_btree_curs);
+	/*
+	 * set up agf and agfl
+	 */
+	build_agf_agfl(mp, agno, &bno_btree_curs,
+			&bcnt_btree_curs, freeblks1, extra_blocks,
+			&rmap_btree_curs, &refcnt_btree_curs, lost_fsb);
+	/*
+	 * build inode allocation tree.
+	 */
+	build_ino_tree(mp, agno, &ino_btree_curs, XFS_BTNUM_INO,
+			&agi_stat);
+	write_cursor(&ino_btree_curs);

-		/*
-		 * build free inode tree
-		 */
-		if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
-			build_ino_tree(mp, agno, &fino_btree_curs,
-					XFS_BTNUM_FINO, NULL);
-			write_cursor(&fino_btree_curs);
-		}
+	/*
+	 * build free inode tree
+	 */
+	if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+		build_ino_tree(mp, agno, &fino_btree_curs,
+				XFS_BTNUM_FINO, NULL);
+		write_cursor(&fino_btree_curs);
+	}

-		/* build the agi */
-		build_agi(mp, agno, &ino_btree_curs, &fino_btree_curs,
-			  &agi_stat);
+	/* build the agi */
+	build_agi(mp, agno, &ino_btree_curs, &fino_btree_curs,
+		  &agi_stat);

-		/*
-		 * tear down cursors
-		 */
-		finish_cursor(&bno_btree_curs);
-		finish_cursor(&ino_btree_curs);
-		if (xfs_sb_version_hasrmapbt(&mp->m_sb))
-			finish_cursor(&rmap_btree_curs);
-		if (xfs_sb_version_hasreflink(&mp->m_sb))
-			finish_cursor(&refcnt_btree_curs);
-		if (xfs_sb_version_hasfinobt(&mp->m_sb))
-			finish_cursor(&fino_btree_curs);
-		finish_cursor(&bcnt_btree_curs);
+	/*
+	 * tear down cursors
+	 */
+	finish_cursor(&bno_btree_curs);
+	finish_cursor(&ino_btree_curs);
+	if (xfs_sb_version_hasrmapbt(&mp->m_sb))
+		finish_cursor(&rmap_btree_curs);
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		finish_cursor(&refcnt_btree_curs);
+	if (xfs_sb_version_hasfinobt(&mp->m_sb))
+		finish_cursor(&fino_btree_curs);
+	finish_cursor(&bcnt_btree_curs);
+
+	/*
+	 * release the incore per-AG bno/bcnt trees so
+	 * the extent nodes can be recycled
+	 */
+	release_agbno_extent_tree(agno);
+	release_agbcnt_extent_tree(agno);

-		/*
-		 * release the incore per-AG bno/bcnt trees so
-		 * the extent nodes can be recycled
-		 */
-		release_agbno_extent_tree(agno);
-		release_agbcnt_extent_tree(agno);
-	}
	PROG_RPT_INC(prog_rpt_done[agno], 1);
 }

From patchwork Sat May 9 16:31:59 2020
X-Patchwork-Submitter: "Darrick J.
Wong"
X-Patchwork-Id: 11538403
Subject: [PATCH 3/9] xfs_repair: create a new class of btree rebuild cursors
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Date: Sat, 09 May 2020 09:31:59 -0700
Message-ID: <158904191982.984305.12997847094211521747.stgit@magnolia>
In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia>
References: <158904190079.984305.707785748675261111.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J.
Wong

Create some new support structures and functions to assist phase5 in
using the btree bulk loader to reconstruct metadata btrees.  This is
the first step in removing the open-coded rebuilding code.

Signed-off-by: Darrick J. Wong
---
 repair/phase5.c |  240 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 219 insertions(+), 21 deletions(-)

diff --git a/repair/phase5.c b/repair/phase5.c
index f3be15de..7eb24519 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -18,6 +18,7 @@
 #include "progress.h"
 #include "slab.h"
 #include "rmap.h"
+#include "bload.h"
 
 /*
  * we maintain the current slice (path from root to leaf)
@@ -65,6 +66,23 @@ typedef struct bt_status {
         uint64_t                owner;          /* owner */
 } bt_status_t;
 
+/* Context for rebuilding a per-AG btree. */
+struct bt_rebuild {
+        /* Fake root for staging and space preallocations. */
+        struct xrep_newbt       newbt;
+
+        /* Geometry of the new btree. */
+        struct xfs_btree_bload  bload;
+
+        /* Staging btree cursor for the new tree. */
+        struct xfs_btree_cur    *cur;
+
+        /* Tree-specific data. */
+        union {
+                struct xfs_slab_cursor  *slab_cursor;
+        };
+};
+
 /*
  * extra metadata for the agi
  */
@@ -306,6 +324,157 @@ _("error - not enough free space in filesystem\n"));
 #endif
 }
 
+/*
+ * Estimate proper slack values for a btree that's being reloaded.
+ *
+ * Under most circumstances, we'll take whatever default loading value the
+ * btree bulk loading code calculates for us.  However, there are some
+ * exceptions to this rule:
+ *
+ * (1) If someone turned one of the debug knobs.
+ * (2) The AG has less than ~9% space free.
+ *
+ * Note that we actually use 3/32 for the comparison to avoid division.
+ */
+static void
+estimate_ag_bload_slack(
+        struct repair_ctx       *sc,
+        struct xfs_btree_bload  *bload,
+        unsigned int            free)
+{
+        /*
+         * The global values are set to -1 (i.e. take the bload defaults)
+         * unless someone has set them otherwise, so we just pull the values
+         * here.
+         */
+        bload->leaf_slack = bload_leaf_slack;
+        bload->node_slack = bload_node_slack;
+
+        /* No further changes if there's more than 3/32ths space left. */
+        if (free >= ((sc->mp->m_sb.sb_agblocks * 3) >> 5))
+                return;
+
+        /* We're low on space; load the btrees as tightly as possible. */
+        if (bload->leaf_slack < 0)
+                bload->leaf_slack = 0;
+        if (bload->node_slack < 0)
+                bload->node_slack = 0;
+}
+
+/* Initialize a btree rebuild context. */
+static void
+init_rebuild(
+        struct repair_ctx               *sc,
+        const struct xfs_owner_info     *oinfo,
+        xfs_agblock_t                   free_space,
+        struct bt_rebuild               *btr)
+{
+        memset(btr, 0, sizeof(struct bt_rebuild));
+
+        xrep_newbt_init_bare(&btr->newbt, sc);
+        btr->newbt.oinfo = *oinfo; /* struct copy */
+        estimate_ag_bload_slack(sc, &btr->bload, free_space);
+}
+
+/* Reserve blocks for the new btree. */
+static void
+setup_rebuild(
+        struct xfs_mount        *mp,
+        xfs_agnumber_t          agno,
+        struct bt_rebuild       *btr,
+        uint32_t                nr_blocks)
+{
+        struct extent_tree_node *ext_ptr;
+        struct extent_tree_node *bno_ext_ptr;
+        uint32_t                blocks_allocated = 0;
+        int                     error;
+
+        /*
+         * grab the smallest extent and use it up, then get the
+         * next smallest.  This mimics the init_*_cursor code.
+         */
+        ext_ptr = findfirst_bcnt_extent(agno);
+
+        /*
+         * set up the free block array
+         */
+        while (blocks_allocated < nr_blocks) {
+                uint64_t        len;
+                xfs_agblock_t   new_start;
+                xfs_extlen_t    new_len;
+
+                if (!ext_ptr)
+                        do_error(
+_("error - not enough free space in filesystem\n"));
+
+                /* Use up the extent we've got.
+                 */
+                len = min(ext_ptr->ex_blockcount,
+                                btr->bload.nr_blocks - blocks_allocated);
+                error = xrep_newbt_add_reservation(&btr->newbt,
+                                XFS_AGB_TO_FSB(mp, agno,
+                                        ext_ptr->ex_startblock),
+                                len, NULL);
+                if (error)
+                        do_error(_("could not set up btree reservation: %s\n"),
+                                strerror(-error));
+                blocks_allocated += len;
+
+                error = rmap_add_ag_rec(mp, agno, ext_ptr->ex_startblock, len,
+                                btr->newbt.oinfo.oi_owner);
+                if (error)
+                        do_error(_("could not set up btree rmaps: %s\n"),
+                                strerror(-error));
+
+                /* Figure out if we're putting anything back. */
+                new_start = ext_ptr->ex_startblock + len;
+                new_len = ext_ptr->ex_blockcount - len;
+
+                /* Delete the used-up extent from both extent trees. */
+#ifdef XR_BLD_FREE_TRACE
+                fprintf(stderr, "releasing extent: %u [%u %u]\n",
+                        agno, ext_ptr->ex_startblock, ext_ptr->ex_blockcount);
+#endif
+                bno_ext_ptr = find_bno_extent(agno, ext_ptr->ex_startblock);
+                ASSERT(bno_ext_ptr != NULL);
+                get_bno_extent(agno, bno_ext_ptr);
+                release_extent_tree_node(bno_ext_ptr);
+
+                ext_ptr = get_bcnt_extent(agno, ext_ptr->ex_startblock,
+                                ext_ptr->ex_blockcount);
+                ASSERT(ext_ptr != NULL);
+                release_extent_tree_node(ext_ptr);
+
+                /*
+                 * If we only used part of this last extent, then we need only
+                 * to reinsert the extent in the extent trees and we're done.
+                 */
+                if (new_len > 0) {
+                        add_bno_extent(agno, new_start, new_len);
+                        add_bcnt_extent(agno, new_start, new_len);
+                        break;
+                }
+
+                /* Otherwise, find the next biggest extent. */
+                ext_ptr = findfirst_bcnt_extent(agno);
+        }
+#ifdef XR_BLD_FREE_TRACE
+        fprintf(stderr, "blocks_allocated = %d\n",
+                blocks_allocated);
+#endif
+}
+
+/* Feed one of the new btree blocks to the bulk loader. */
+static int
+rebuild_alloc_block(
+        struct xfs_btree_cur    *cur,
+        union xfs_btree_ptr     *ptr,
+        void                    *priv)
+{
+        struct bt_rebuild       *btr = priv;
+
+        return xrep_newbt_claim_block(cur, &btr->newbt, ptr);
+}
+
 static void
 write_cursor(bt_status_t *curs)
 {
@@ -336,6 +505,34 @@ finish_cursor(bt_status_t *curs)
         free(curs->btree_blocks);
 }
 
+static void
+finish_rebuild(
+        struct xfs_mount        *mp,
+        struct bt_rebuild       *btr)
+{
+        struct xrep_newbt_resv  *resv, *n;
+
+        for_each_xrep_newbt_reservation(&btr->newbt, resv, n) {
+                xfs_agnumber_t  agno;
+                xfs_agblock_t   bno;
+                xfs_extlen_t    len;
+
+                if (resv->used >= resv->len)
+                        continue;
+
+                /* XXX: Shouldn't this go on the AGFL? */
+                /* Put back everything we didn't use. */
+                bno = XFS_FSB_TO_AGBNO(mp, resv->fsbno + resv->used);
+                agno = XFS_FSB_TO_AGNO(mp, resv->fsbno + resv->used);
+                len = resv->len - resv->used;
+
+                add_bno_extent(agno, bno, len);
+                add_bcnt_extent(agno, bno, len);
+        }
+
+        xrep_newbt_destroy(&btr->newbt, 0);
+}
+
 /*
  * We need to leave some free records in the tree for the corner case of
  * setting up the AGFL.
This may require allocation of blocks, and as @@ -2290,28 +2487,29 @@ keep_fsinos(xfs_mount_t *mp) static void phase5_func( - xfs_mount_t *mp, - xfs_agnumber_t agno, - struct xfs_slab *lost_fsb) + struct xfs_mount *mp, + xfs_agnumber_t agno, + struct xfs_slab *lost_fsb) { - uint64_t num_inos; - uint64_t num_free_inos; - uint64_t finobt_num_inos; - uint64_t finobt_num_free_inos; - bt_status_t bno_btree_curs; - bt_status_t bcnt_btree_curs; - bt_status_t ino_btree_curs; - bt_status_t fino_btree_curs; - bt_status_t rmap_btree_curs; - bt_status_t refcnt_btree_curs; - int extra_blocks = 0; - uint num_freeblocks; - xfs_extlen_t freeblks1; + struct repair_ctx sc = { .mp = mp, }; + struct agi_stat agi_stat = {0,}; + uint64_t num_inos; + uint64_t num_free_inos; + uint64_t finobt_num_inos; + uint64_t finobt_num_free_inos; + bt_status_t bno_btree_curs; + bt_status_t bcnt_btree_curs; + bt_status_t ino_btree_curs; + bt_status_t fino_btree_curs; + bt_status_t rmap_btree_curs; + bt_status_t refcnt_btree_curs; + int extra_blocks = 0; + uint num_freeblocks; + xfs_extlen_t freeblks1; #ifdef DEBUG - xfs_extlen_t freeblks2; + xfs_extlen_t freeblks2; #endif - xfs_agblock_t num_extents; - struct agi_stat agi_stat = {0,}; + xfs_agblock_t num_extents; if (verbose) do_log(_(" - agno = %d\n"), agno); @@ -2533,8 +2731,8 @@ inject_lost_blocks( if (error) goto out_cancel; - error = -libxfs_free_extent(tp, *fsb, 1, &XFS_RMAP_OINFO_AG, - XFS_AG_RESV_NONE); + error = -libxfs_free_extent(tp, *fsb, 1, + &XFS_RMAP_OINFO_ANY_OWNER, XFS_AG_RESV_NONE); if (error) goto out_cancel; From patchwork Sat May 9 16:32:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong"
X-Patchwork-Id: 11538405
Subject: [PATCH 4/9] xfs_repair: rebuild free space btrees with bulk loader
From: "Darrick J. Wong"
To: sandeen@sandeen.net, darrick.wong@oracle.com
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Date: Sat, 09 May 2020 09:32:08 -0700
Message-ID: <158904192820.984305.12654411837854594801.stgit@magnolia>
In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia>
References: <158904190079.984305.707785748675261111.stgit@magnolia>
User-Agent: StGit/0.19
MIME-Version: 1.0
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J.
Wong

Use the btree bulk loading functions to rebuild the free space btrees
and drop the open-coded implementation.

Signed-off-by: Darrick J. Wong
---
 libxfs/libxfs_api_defs.h |    3 
 repair/phase5.c          |  858 +++++++++++++---------------------------------
 2 files changed, 247 insertions(+), 614 deletions(-)

diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index 61047f8f..bace739c 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -24,6 +24,7 @@
 #define xfs_alloc_ag_max_usable         libxfs_alloc_ag_max_usable
 #define xfs_allocbt_maxrecs             libxfs_allocbt_maxrecs
+#define xfs_allocbt_stage_cursor        libxfs_allocbt_stage_cursor
 #define xfs_alloc_fix_freelist          libxfs_alloc_fix_freelist
 #define xfs_alloc_min_freelist          libxfs_alloc_min_freelist
 #define xfs_alloc_read_agf              libxfs_alloc_read_agf
@@ -41,6 +42,8 @@
 #define xfs_bmbt_maxrecs                libxfs_bmbt_maxrecs
 #define xfs_bmdr_maxrecs                libxfs_bmdr_maxrecs
+#define xfs_btree_bload                 libxfs_btree_bload
+#define xfs_btree_bload_compute_geometry libxfs_btree_bload_compute_geometry
 #define xfs_btree_del_cursor            libxfs_btree_del_cursor
 #define xfs_btree_init_block            libxfs_btree_init_block
 #define xfs_buf_delwri_submit           libxfs_buf_delwri_submit

diff --git a/repair/phase5.c b/repair/phase5.c
index 7eb24519..94e4610c 100644
--- a/repair/phase5.c
+++ b/repair/phase5.c
@@ -80,6 +80,10 @@ struct bt_rebuild {
         /* Tree-specific data.
*/ union { struct xfs_slab_cursor *slab_cursor; + struct { + struct extent_tree_node *bno_rec; + xfs_agblock_t *freeblks; + }; }; }; @@ -97,7 +101,10 @@ static uint64_t *sb_ifree_ag; /* free inodes per ag */ static uint64_t *sb_fdblocks_ag; /* free data blocks per ag */ static int -mk_incore_fstree(xfs_mount_t *mp, xfs_agnumber_t agno) +mk_incore_fstree( + struct xfs_mount *mp, + xfs_agnumber_t agno, + unsigned int *num_freeblocks) { int in_extent; int num_extents; @@ -109,6 +116,8 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_agnumber_t agno) xfs_extlen_t blen; int bstate; + *num_freeblocks = 0; + /* * scan the bitmap for the ag looking for continuous * extents of free blocks. At this point, we know @@ -164,6 +173,7 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_agnumber_t agno) #endif add_bno_extent(agno, extent_start, extent_len); add_bcnt_extent(agno, extent_start, extent_len); + *num_freeblocks += extent_len; } } } @@ -177,6 +187,7 @@ mk_incore_fstree(xfs_mount_t *mp, xfs_agnumber_t agno) #endif add_bno_extent(agno, extent_start, extent_len); add_bcnt_extent(agno, extent_start, extent_len); + *num_freeblocks += extent_len; } return(num_extents); @@ -465,7 +476,7 @@ _("error - not enough free space in filesystem\n")); /* Feed one of the new btree blocks to the bulk loader. */ static int -rebuild_alloc_block( +rebuild_claim_block( struct xfs_btree_cur *cur, union xfs_btree_ptr *ptr, void *priv) @@ -505,313 +516,32 @@ finish_cursor(bt_status_t *curs) free(curs->btree_blocks); } +/* + * Scoop up leftovers from a rebuild cursor for later freeing, then free the + * rebuild context. + */ static void finish_rebuild( struct xfs_mount *mp, - struct bt_rebuild *btr) + struct bt_rebuild *btr, + struct xfs_slab *lost_fsb) { struct xrep_newbt_resv *resv, *n; for_each_xrep_newbt_reservation(&btr->newbt, resv, n) { - xfs_agnumber_t agno; - xfs_agblock_t bno; - xfs_extlen_t len; - - if (resv->used >= resv->len) - continue; - - /* XXX: Shouldn't this go on the AGFL? 
*/ - /* Put back everything we didn't use. */ - bno = XFS_FSB_TO_AGBNO(mp, resv->fsbno + resv->used); - agno = XFS_FSB_TO_AGNO(mp, resv->fsbno + resv->used); - len = resv->len - resv->used; - - add_bno_extent(agno, bno, len); - add_bcnt_extent(agno, bno, len); - } - - xrep_newbt_destroy(&btr->newbt, 0); -} - -/* - * We need to leave some free records in the tree for the corner case of - * setting up the AGFL. This may require allocation of blocks, and as - * such can require insertion of new records into the tree (e.g. moving - * a record in the by-count tree when a long extent is shortened). If we - * pack the records into the leaves with no slack space, this requires a - * leaf split to occur and a block to be allocated from the free list. - * If we don't have any blocks on the free list (because we are setting - * it up!), then we fail, and the filesystem will fail with the same - * failure at runtime. Hence leave a couple of records slack space in - * each block to allow immediate modification of the tree without - * requiring splits to be done. - * - * XXX(hch): any reason we don't just look at mp->m_alloc_mxr? - */ -#define XR_ALLOC_BLOCK_MAXRECS(mp, level) \ - (libxfs_allocbt_maxrecs((mp), (mp)->m_sb.sb_blocksize, (level) == 0) - 2) - -/* - * this calculates a freespace cursor for an ag. - * btree_curs is an in/out. returns the number of - * blocks that will show up in the AGFL. 
- */ -static int -calculate_freespace_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, - xfs_agblock_t *extents, bt_status_t *btree_curs) -{ - xfs_extlen_t blocks_needed; /* a running count */ - xfs_extlen_t blocks_allocated_pt; /* per tree */ - xfs_extlen_t blocks_allocated_total; /* for both trees */ - xfs_agblock_t num_extents; - int i; - int extents_used; - int extra_blocks; - bt_stat_level_t *lptr; - bt_stat_level_t *p_lptr; - extent_tree_node_t *ext_ptr; - int level; - - num_extents = *extents; - extents_used = 0; - - ASSERT(num_extents != 0); - - lptr = &btree_curs->level[0]; - btree_curs->init = 1; + while (resv->used < resv->len) { + xfs_fsblock_t fsb = resv->fsbno + resv->used; + int error; - /* - * figure out how much space we need for the leaf level - * of the tree and set up the cursor for the leaf level - * (note that the same code is duplicated further down) - */ - lptr->num_blocks = howmany(num_extents, XR_ALLOC_BLOCK_MAXRECS(mp, 0)); - lptr->num_recs_pb = num_extents / lptr->num_blocks; - lptr->modulo = num_extents % lptr->num_blocks; - lptr->num_recs_tot = num_extents; - level = 1; - -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "%s 0 %d %d %d %d\n", __func__, - lptr->num_blocks, - lptr->num_recs_pb, - lptr->modulo, - lptr->num_recs_tot); -#endif - /* - * if we need more levels, set them up. 
# of records - * per level is the # of blocks in the level below it - */ - if (lptr->num_blocks > 1) { - for (; btree_curs->level[level - 1].num_blocks > 1 - && level < XFS_BTREE_MAXLEVELS; - level++) { - lptr = &btree_curs->level[level]; - p_lptr = &btree_curs->level[level - 1]; - lptr->num_blocks = howmany(p_lptr->num_blocks, - XR_ALLOC_BLOCK_MAXRECS(mp, level)); - lptr->modulo = p_lptr->num_blocks - % lptr->num_blocks; - lptr->num_recs_pb = p_lptr->num_blocks - / lptr->num_blocks; - lptr->num_recs_tot = p_lptr->num_blocks; -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "%s %d %d %d %d %d\n", __func__, - level, - lptr->num_blocks, - lptr->num_recs_pb, - lptr->modulo, - lptr->num_recs_tot); -#endif - } - } - - ASSERT(lptr->num_blocks == 1); - btree_curs->num_levels = level; - - /* - * ok, now we have a hypothetical cursor that - * will work for both the bno and bcnt trees. - * now figure out if using up blocks to set up the - * trees will perturb the shape of the freespace tree. - * if so, we've over-allocated. the freespace trees - * as they will be *after* accounting for the free space - * we've used up will need fewer blocks to to represent - * than we've allocated. We can use the AGFL to hold - * xfs_agfl_size (sector/struct xfs_agfl) blocks but that's it. - * Thus we limit things to xfs_agfl_size/2 for each of the 2 btrees. - * if the number of extra blocks is more than that, - * we'll have to be called again. 
- */ - for (blocks_needed = 0, i = 0; i < level; i++) { - blocks_needed += btree_curs->level[i].num_blocks; - } - - /* - * record the # of blocks we've allocated - */ - blocks_allocated_pt = blocks_needed; - blocks_needed *= 2; - blocks_allocated_total = blocks_needed; - - /* - * figure out how many free extents will be used up by - * our space allocation - */ - if ((ext_ptr = findfirst_bcnt_extent(agno)) == NULL) - do_error(_("can't rebuild fs trees -- not enough free space " - "on ag %u\n"), agno); - - while (ext_ptr != NULL && blocks_needed > 0) { - if (ext_ptr->ex_blockcount <= blocks_needed) { - blocks_needed -= ext_ptr->ex_blockcount; - extents_used++; - } else { - blocks_needed = 0; - } - - ext_ptr = findnext_bcnt_extent(agno, ext_ptr); - -#ifdef XR_BLD_FREE_TRACE - if (ext_ptr != NULL) { - fprintf(stderr, "got next extent [%u %u]\n", - ext_ptr->ex_startblock, ext_ptr->ex_blockcount); - } else { - fprintf(stderr, "out of extents\n"); - } -#endif - } - if (blocks_needed > 0) - do_error(_("ag %u - not enough free space to build freespace " - "btrees\n"), agno); - - ASSERT(num_extents >= extents_used); - - num_extents -= extents_used; - - /* - * see if the number of leaf blocks will change as a result - * of the number of extents changing - */ - if (howmany(num_extents, XR_ALLOC_BLOCK_MAXRECS(mp, 0)) - != btree_curs->level[0].num_blocks) { - /* - * yes -- recalculate the cursor. If the number of - * excess (overallocated) blocks is < xfs_agfl_size/2, we're ok. - * we can put those into the AGFL. we don't try - * and get things to converge exactly (reach a - * state with zero excess blocks) because there - * exist pathological cases which will never - * converge. first, check for the zero-case. - */ - if (num_extents == 0) { - /* - * ok, we've used up all the free blocks - * trying to lay out the leaf level. 
go - * to a one block (empty) btree and put the - * already allocated blocks into the AGFL - */ - if (btree_curs->level[0].num_blocks != 1) { - /* - * we really needed more blocks because - * the old tree had more than one level. - * this is bad. - */ - do_warn(_("not enough free blocks left to " - "describe all free blocks in AG " - "%u\n"), agno); - } -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, - "ag %u -- no free extents, alloc'ed %d\n", - agno, blocks_allocated_pt); -#endif - lptr->num_blocks = 1; - lptr->modulo = 0; - lptr->num_recs_pb = 0; - lptr->num_recs_tot = 0; - - btree_curs->num_levels = 1; - - /* - * don't reset the allocation stats, assume - * they're all extra blocks - * don't forget to return the total block count - * not the per-tree block count. these are the - * extras that will go into the AGFL. subtract - * two for the root blocks. - */ - btree_curs->num_tot_blocks = blocks_allocated_pt; - btree_curs->num_free_blocks = blocks_allocated_pt; - - *extents = 0; - - return(blocks_allocated_total - 2); - } - - lptr = &btree_curs->level[0]; - lptr->num_blocks = howmany(num_extents, - XR_ALLOC_BLOCK_MAXRECS(mp, 0)); - lptr->num_recs_pb = num_extents / lptr->num_blocks; - lptr->modulo = num_extents % lptr->num_blocks; - lptr->num_recs_tot = num_extents; - level = 1; - - /* - * if we need more levels, set them up - */ - if (lptr->num_blocks > 1) { - for (level = 1; btree_curs->level[level-1].num_blocks - > 1 && level < XFS_BTREE_MAXLEVELS; - level++) { - lptr = &btree_curs->level[level]; - p_lptr = &btree_curs->level[level-1]; - lptr->num_blocks = howmany(p_lptr->num_blocks, - XR_ALLOC_BLOCK_MAXRECS(mp, level)); - lptr->modulo = p_lptr->num_blocks - % lptr->num_blocks; - lptr->num_recs_pb = p_lptr->num_blocks - / lptr->num_blocks; - lptr->num_recs_tot = p_lptr->num_blocks; - } - } - ASSERT(lptr->num_blocks == 1); - btree_curs->num_levels = level; - - /* - * now figure out the number of excess blocks - */ - for (blocks_needed = 0, i = 0; i < level; i++) { 
- blocks_needed += btree_curs->level[i].num_blocks; - } - blocks_needed *= 2; - - ASSERT(blocks_allocated_total >= blocks_needed); - extra_blocks = blocks_allocated_total - blocks_needed; - } else { - if (extents_used > 0) { - /* - * reset the leaf level geometry to account - * for consumed extents. we can leave the - * rest of the cursor alone since the number - * of leaf blocks hasn't changed. - */ - lptr = &btree_curs->level[0]; - - lptr->num_recs_pb = num_extents / lptr->num_blocks; - lptr->modulo = num_extents % lptr->num_blocks; - lptr->num_recs_tot = num_extents; + error = slab_add(lost_fsb, &fsb); + if (error) + do_error( +_("Insufficient memory saving lost blocks.\n")); + resv->used++; } - - extra_blocks = 0; } - btree_curs->num_tot_blocks = blocks_allocated_pt; - btree_curs->num_free_blocks = blocks_allocated_pt; - - *extents = num_extents; - - return(extra_blocks); + xrep_newbt_destroy(&btr->newbt, 0); } /* Map btnum to buffer ops for the types that need it. */ @@ -838,268 +568,202 @@ btnum_to_ops( } } +/* + * Free Space Btrees + * + * We need to leave some free records in the tree for the corner case of + * setting up the AGFL. This may require allocation of blocks, and as + * such can require insertion of new records into the tree (e.g. moving + * a record in the by-count tree when a long extent is shortened). If we + * pack the records into the leaves with no slack space, this requires a + * leaf split to occur and a block to be allocated from the free list. + * If we don't have any blocks on the free list (because we are setting + * it up!), then we fail, and the filesystem will fail with the same + * failure at runtime. Hence leave a couple of records slack space in + * each block to allow immediate modification of the tree without + * requiring splits to be done. 
+ */ + static void -prop_freespace_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, - bt_status_t *btree_curs, xfs_agblock_t startblock, - xfs_extlen_t blockcount, int level, xfs_btnum_t btnum) +init_freespace_cursors( + struct repair_ctx *sc, + xfs_agnumber_t agno, + unsigned int free_space, + unsigned int *nr_extents, + int *extra_blocks, + struct bt_rebuild *btr_bno, + struct bt_rebuild *btr_cnt) { - struct xfs_btree_block *bt_hdr; - xfs_alloc_key_t *bt_key; - xfs_alloc_ptr_t *bt_ptr; - xfs_agblock_t agbno; - bt_stat_level_t *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(btnum); + unsigned int bno_blocks; + unsigned int cnt_blocks; int error; - ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT); + init_rebuild(sc, &XFS_RMAP_OINFO_AG, free_space, btr_bno); + init_rebuild(sc, &XFS_RMAP_OINFO_AG, free_space, btr_cnt); - level++; - - if (level >= btree_curs->num_levels) - return; - - lptr = &btree_curs->level[level]; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - - if (be16_to_cpu(bt_hdr->bb_numrecs) == 0) { - /* - * only happens once when initializing the - * left-hand side of the tree. 
- */ - prop_freespace_cursor(mp, agno, btree_curs, startblock, - blockcount, level, btnum); - } + btr_bno->cur = libxfs_allocbt_stage_cursor(sc->mp, + &btr_bno->newbt.afake, agno, XFS_BTNUM_BNO); + btr_cnt->cur = libxfs_allocbt_stage_cursor(sc->mp, + &btr_cnt->newbt.afake, agno, XFS_BTNUM_CNT); - if (be16_to_cpu(bt_hdr->bb_numrecs) == - lptr->num_recs_pb + (lptr->modulo > 0)) { - /* - * write out current prev block, grab us a new block, - * and set the rightsib pointer of current block - */ -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, " %d ", lptr->prev_agbno); -#endif - if (lptr->prev_agbno != NULLAGBLOCK) { - ASSERT(lptr->prev_buf_p != NULL); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_agbno = lptr->agbno;; - lptr->prev_buf_p = lptr->buf_p; - agbno = get_next_blockaddr(agno, level, btree_curs); + /* + * Now we need to allocate blocks for the free space btrees using the + * free space records we're about to put in them. Every record we use + * can change the shape of the free space trees, so we recompute the + * btree shape until we stop needing /more/ blocks. If we have any + * left over we'll stash them in the AGFL when we're done. + */ + do { + unsigned int num_freeblocks; - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(agbno); + bno_blocks = btr_bno->bload.nr_blocks; + cnt_blocks = btr_cnt->bload.nr_blocks; - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); + /* Compute how many bnobt blocks we'll need. 
 */
+                error = -libxfs_btree_bload_compute_geometry(btr_bno->cur,
+                                &btr_bno->bload, *nr_extents);
                 if (error)
                         do_error(
-        _("Cannot grab free space btree buffer, err=%d"),
-                                error);
-                lptr->agbno = agbno;
+_("Unable to compute free space by block btree geometry, error %d.\n"), -error);
-                if (lptr->modulo)
-                        lptr->modulo--;
-
-                /*
-                 * initialize block header
-                 */
-                lptr->buf_p->b_ops = ops;
-                bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p);
-                memset(bt_hdr, 0, mp->m_sb.sb_blocksize);
-                libxfs_btree_init_block(mp, lptr->buf_p, btnum, level,
-                                0, agno);
+                /* Compute how many cntbt blocks we'll need. */
+                error = -libxfs_btree_bload_compute_geometry(btr_cnt->cur,
+                                &btr_cnt->bload, *nr_extents);
+                if (error)
+                        do_error(
+_("Unable to compute free space by length btree geometry, error %d.\n"), -error);
-                bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno);
+                /* We don't need any more blocks, so we're done. */
+                if (bno_blocks >= btr_bno->bload.nr_blocks &&
+                    cnt_blocks >= btr_cnt->bload.nr_blocks)
+                        break;
-                /*
-                 * propagate extent record for first extent in new block up
-                 */
-                prop_freespace_cursor(mp, agno, btree_curs, startblock,
-                                blockcount, level, btnum);
-        }
-        /*
-         * add extent info to current block
-         */
-        be16_add_cpu(&bt_hdr->bb_numrecs, 1);
+                /* Allocate however many more blocks we need this time. */
+                if (bno_blocks < btr_bno->bload.nr_blocks)
+                        setup_rebuild(sc->mp, agno, btr_bno,
+                                        btr_bno->bload.nr_blocks - bno_blocks);
+                if (cnt_blocks < btr_cnt->bload.nr_blocks)
+                        setup_rebuild(sc->mp, agno, btr_cnt,
+                                        btr_cnt->bload.nr_blocks - cnt_blocks);
-        bt_key = XFS_ALLOC_KEY_ADDR(mp, bt_hdr,
-                be16_to_cpu(bt_hdr->bb_numrecs));
-        bt_ptr = XFS_ALLOC_PTR_ADDR(mp, bt_hdr,
-                be16_to_cpu(bt_hdr->bb_numrecs),
-                mp->m_alloc_mxr[1]);
+                /* Ok, now how many free space records do we have?
*/ + *nr_extents = count_bno_extents_blocks(agno, &num_freeblocks); + } while (1); - bt_key->ar_startblock = cpu_to_be32(startblock); - bt_key->ar_blockcount = cpu_to_be32(blockcount); - *bt_ptr = cpu_to_be32(btree_curs->level[level-1].agbno); + *extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) + + (cnt_blocks - btr_cnt->bload.nr_blocks); } -/* - * rebuilds a freespace tree given a cursor and type - * of tree to build (bno or bcnt). returns the number of free blocks - * represented by the tree. - */ -static xfs_extlen_t -build_freespace_tree(xfs_mount_t *mp, xfs_agnumber_t agno, - bt_status_t *btree_curs, xfs_btnum_t btnum) +static void +get_freesp_data( + struct xfs_btree_cur *cur, + struct extent_tree_node *bno_rec, + xfs_agblock_t *freeblks) { - xfs_agnumber_t i; - xfs_agblock_t j; - struct xfs_btree_block *bt_hdr; - xfs_alloc_rec_t *bt_rec; - int level; - xfs_agblock_t agbno; - extent_tree_node_t *ext_ptr; - bt_stat_level_t *lptr; - xfs_extlen_t freeblks; - const struct xfs_buf_ops *ops = btnum_to_ops(btnum); - int error; + struct xfs_alloc_rec_incore *arec = &cur->bc_rec.a; - ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT); - -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "in build_freespace_tree, agno = %d\n", agno); -#endif - level = btree_curs->num_levels; - freeblks = 0; + arec->ar_startblock = bno_rec->ex_startblock; + arec->ar_blockcount = bno_rec->ex_blockcount; + if (freeblks) + *freeblks += bno_rec->ex_blockcount; +} - ASSERT(level > 0); +/* Grab one bnobt record. 
*/ +static int +get_bnobt_record( + struct xfs_btree_cur *cur, + void *priv) +{ + struct bt_rebuild *btr = priv; - /* - * initialize the first block on each btree level - */ - for (i = 0; i < level; i++) { - lptr = &btree_curs->level[i]; + get_freesp_data(cur, btr->bno_rec, btr->freeblks); + btr->bno_rec = findnext_bno_extent(btr->bno_rec); + return 0; +} - agbno = get_next_blockaddr(agno, i, btree_curs); - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error( - _("Cannot grab free space btree buffer, err=%d"), - error); +/* Rebuild a free space by block number btree. */ +static void +build_bnobt( + struct repair_ctx *sc, + xfs_agnumber_t agno, + struct bt_rebuild *btr_bno, + xfs_agblock_t *freeblks) +{ + int error; - if (i == btree_curs->num_levels - 1) - btree_curs->root = agbno; + *freeblks = 0; + btr_bno->bload.get_record = get_bnobt_record; + btr_bno->bload.claim_block = rebuild_claim_block; + btr_bno->bno_rec = findfirst_bno_extent(agno); + btr_bno->freeblks = freeblks; - lptr->agbno = agbno; - lptr->prev_agbno = NULLAGBLOCK; - lptr->prev_buf_p = NULL; - /* - * initialize block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, btnum, i, 0, agno); - } - /* - * run along leaf, setting up records. as we have to switch - * blocks, call the prop_freespace_cursor routine to set up the new - * pointers for the parent. that can recurse up to the root - * if required. set the sibling pointers for leaf level here. 
- */ - if (btnum == XFS_BTNUM_BNO) - ext_ptr = findfirst_bno_extent(agno); - else - ext_ptr = findfirst_bcnt_extent(agno); + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); + if (error) + do_error( +_("Insufficient memory to construct bnobt rebuild transaction.\n")); -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "bft, agno = %d, start = %u, count = %u\n", - agno, ext_ptr->ex_startblock, ext_ptr->ex_blockcount); -#endif + /* Add all observed bnobt records. */ + error = -libxfs_btree_bload(btr_bno->cur, &btr_bno->bload, btr_bno); + if (error) + do_error( +_("Error %d while creating bnobt btree for AG %u.\n"), error, agno); - lptr = &btree_curs->level[0]; + /* Since we're not writing the AGF yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr_bno->cur, 0); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing bnobt btree for AG %u.\n"), error, agno); + sc->tp = NULL; +} - for (i = 0; i < btree_curs->level[0].num_blocks; i++) { - /* - * block initialization, lay in block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, btnum, 0, 0, agno); +/* Grab one cntbt record. */ +static int +get_cntbt_record( + struct xfs_btree_cur *cur, + void *priv) +{ + struct bt_rebuild *btr = priv; - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - bt_hdr->bb_numrecs = cpu_to_be16(lptr->num_recs_pb + - (lptr->modulo > 0)); -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "bft, bb_numrecs = %d\n", - be16_to_cpu(bt_hdr->bb_numrecs)); -#endif + get_freesp_data(cur, btr->bno_rec, btr->freeblks); + btr->bno_rec = findnext_bcnt_extent(cur->bc_ag.agno, btr->bno_rec); + return 0; +} - if (lptr->modulo > 0) - lptr->modulo--; +/* Rebuild a freespace by count btree. 
*/ +static void +build_cntbt( + struct repair_ctx *sc, + xfs_agnumber_t agno, + struct bt_rebuild *btr_cnt, + xfs_agblock_t *freeblks) +{ + int error; - /* - * initialize values in the path up to the root if - * this is a multi-level btree - */ - if (btree_curs->num_levels > 1) - prop_freespace_cursor(mp, agno, btree_curs, - ext_ptr->ex_startblock, - ext_ptr->ex_blockcount, - 0, btnum); - - bt_rec = (xfs_alloc_rec_t *) - ((char *)bt_hdr + XFS_ALLOC_BLOCK_LEN(mp)); - for (j = 0; j < be16_to_cpu(bt_hdr->bb_numrecs); j++) { - ASSERT(ext_ptr != NULL); - bt_rec[j].ar_startblock = cpu_to_be32( - ext_ptr->ex_startblock); - bt_rec[j].ar_blockcount = cpu_to_be32( - ext_ptr->ex_blockcount); - freeblks += ext_ptr->ex_blockcount; - if (btnum == XFS_BTNUM_BNO) - ext_ptr = findnext_bno_extent(ext_ptr); - else - ext_ptr = findnext_bcnt_extent(agno, ext_ptr); -#if 0 -#ifdef XR_BLD_FREE_TRACE - if (ext_ptr == NULL) - fprintf(stderr, "null extent pointer, j = %d\n", - j); - else - fprintf(stderr, - "bft, agno = %d, start = %u, count = %u\n", - agno, ext_ptr->ex_startblock, - ext_ptr->ex_blockcount); -#endif -#endif - } + *freeblks = 0; + btr_cnt->bload.get_record = get_cntbt_record; + btr_cnt->bload.claim_block = rebuild_claim_block; + btr_cnt->bno_rec = findfirst_bcnt_extent(agno); + btr_cnt->freeblks = freeblks; - if (ext_ptr != NULL) { - /* - * get next leaf level block - */ - if (lptr->prev_buf_p != NULL) { -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, " writing fst agbno %u\n", - lptr->prev_agbno); -#endif - ASSERT(lptr->prev_agbno != NULLAGBLOCK); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_buf_p = lptr->buf_p; - lptr->prev_agbno = lptr->agbno; - lptr->agbno = get_next_blockaddr(agno, 0, btree_curs); - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(lptr->agbno); + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); + if (error) + do_error( +_("Insufficient memory to construct cntbt rebuild transaction.\n")); - error = 
-libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, lptr->agbno), - XFS_FSB_TO_BB(mp, 1), - &lptr->buf_p); - if (error) - do_error( - _("Cannot grab free space btree buffer, err=%d"), - error); - } - } + /* Add all observed cntbt records. */ + error = -libxfs_btree_bload(btr_cnt->cur, &btr_cnt->bload, btr_cnt); + if (error) + do_error( +_("Error %d while creating cntbt btree for AG %u.\n"), error, agno); - return(freeblks); + /* Since we're not writing the AGF yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr_cnt->cur, 0); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing cntbt btree for AG %u.\n"), error, agno); + sc->tp = NULL; } /* @@ -2233,6 +1897,27 @@ _("Insufficient memory to construct refcount cursor.")); free_slab_cursor(&refc_cur); } +/* Fill the AGFL with any leftover bnobt rebuilder blocks. */ +static void +fill_agfl( + struct bt_rebuild *btr, + __be32 *agfl_bnos, + int *i) +{ + struct xrep_newbt_resv *resv, *n; + struct xfs_mount *mp = btr->newbt.sc->mp; + + for_each_xrep_newbt_reservation(&btr->newbt, resv, n) { + xfs_agblock_t bno; + + bno = XFS_FSB_TO_AGBNO(mp, resv->fsbno + resv->used); + while (resv->used < resv->len && (*i) < libxfs_agfl_size(mp)) { + agfl_bnos[(*i)++] = cpu_to_be32(bno++); + resv->used++; + } + } +} + /* * build both the agf and the agfl for an agno given both * btree cursors. 
@@ -2243,8 +1928,8 @@ static void build_agf_agfl( struct xfs_mount *mp, xfs_agnumber_t agno, - struct bt_status *bno_bt, - struct bt_status *bcnt_bt, + struct bt_rebuild *btr_bno, + struct bt_rebuild *btr_cnt, xfs_extlen_t freeblks, /* # free blocks in tree */ int lostblocks, /* # blocks that will be lost */ struct bt_status *rmap_bt, @@ -2256,7 +1941,6 @@ build_agf_agfl( int i; struct xfs_agfl *agfl; struct xfs_agf *agf; - xfs_fsblock_t fsb; __be32 *freelist; int error; @@ -2288,10 +1972,14 @@ build_agf_agfl( agf->agf_length = cpu_to_be32(mp->m_sb.sb_dblocks - (xfs_rfsblock_t) mp->m_sb.sb_agblocks * agno); - agf->agf_roots[XFS_BTNUM_BNO] = cpu_to_be32(bno_bt->root); - agf->agf_levels[XFS_BTNUM_BNO] = cpu_to_be32(bno_bt->num_levels); - agf->agf_roots[XFS_BTNUM_CNT] = cpu_to_be32(bcnt_bt->root); - agf->agf_levels[XFS_BTNUM_CNT] = cpu_to_be32(bcnt_bt->num_levels); + agf->agf_roots[XFS_BTNUM_BNO] = + cpu_to_be32(btr_bno->newbt.afake.af_root); + agf->agf_levels[XFS_BTNUM_BNO] = + cpu_to_be32(btr_bno->newbt.afake.af_levels); + agf->agf_roots[XFS_BTNUM_CNT] = + cpu_to_be32(btr_cnt->newbt.afake.af_root); + agf->agf_levels[XFS_BTNUM_CNT] = + cpu_to_be32(btr_cnt->newbt.afake.af_levels); agf->agf_roots[XFS_BTNUM_RMAP] = cpu_to_be32(rmap_bt->root); agf->agf_levels[XFS_BTNUM_RMAP] = cpu_to_be32(rmap_bt->num_levels); agf->agf_freeblks = cpu_to_be32(freeblks); @@ -2311,9 +1999,8 @@ build_agf_agfl( * Don't count the root blocks as they are already * accounted for. 
*/ - blks = (bno_bt->num_tot_blocks - bno_bt->num_free_blocks) + - (bcnt_bt->num_tot_blocks - bcnt_bt->num_free_blocks) - - 2; + blks = btr_bno->newbt.afake.af_blocks + + btr_cnt->newbt.afake.af_blocks - 2; if (xfs_sb_version_hasrmapbt(&mp->m_sb)) blks += rmap_bt->num_tot_blocks - rmap_bt->num_free_blocks - 1; agf->agf_btreeblks = cpu_to_be32(blks); @@ -2357,49 +2044,14 @@ build_agf_agfl( } freelist = xfs_buf_to_agfl_bno(agfl_buf); + i = 0; - /* - * do we have left-over blocks in the btree cursors that should - * be used to fill the AGFL? - */ - if (bno_bt->num_free_blocks > 0 || bcnt_bt->num_free_blocks > 0) { - /* - * yes, now grab as many blocks as we can - */ - i = 0; - while (bno_bt->num_free_blocks > 0 && i < libxfs_agfl_size(mp)) - { - freelist[i] = cpu_to_be32( - get_next_blockaddr(agno, 0, bno_bt)); - i++; - } - - while (bcnt_bt->num_free_blocks > 0 && i < libxfs_agfl_size(mp)) - { - freelist[i] = cpu_to_be32( - get_next_blockaddr(agno, 0, bcnt_bt)); - i++; - } - /* - * now throw the rest of the blocks away and complain - */ - while (bno_bt->num_free_blocks > 0) { - fsb = XFS_AGB_TO_FSB(mp, agno, - get_next_blockaddr(agno, 0, bno_bt)); - error = slab_add(lost_fsb, &fsb); - if (error) - do_error( -_("Insufficient memory saving lost blocks.\n")); - } - while (bcnt_bt->num_free_blocks > 0) { - fsb = XFS_AGB_TO_FSB(mp, agno, - get_next_blockaddr(agno, 0, bcnt_bt)); - error = slab_add(lost_fsb, &fsb); - if (error) - do_error( -_("Insufficient memory saving lost blocks.\n")); - } + /* Fill the AGFL with leftover blocks or save them for later. */ + fill_agfl(btr_bno, freelist, &i); + fill_agfl(btr_cnt, freelist, &i); + /* Set the AGF counters for the AGFL. 
*/ + if (i > 0) { agf->agf_flfirst = 0; agf->agf_fllast = cpu_to_be32(i - 1); agf->agf_flcount = cpu_to_be32(i); @@ -2497,8 +2149,8 @@ phase5_func( uint64_t num_free_inos; uint64_t finobt_num_inos; uint64_t finobt_num_free_inos; - bt_status_t bno_btree_curs; - bt_status_t bcnt_btree_curs; + struct bt_rebuild btr_bno; + struct bt_rebuild btr_cnt; bt_status_t ino_btree_curs; bt_status_t fino_btree_curs; bt_status_t rmap_btree_curs; @@ -2506,9 +2158,7 @@ phase5_func( int extra_blocks = 0; uint num_freeblocks; xfs_extlen_t freeblks1; -#ifdef DEBUG xfs_extlen_t freeblks2; -#endif xfs_agblock_t num_extents; if (verbose) @@ -2517,7 +2167,7 @@ phase5_func( /* * build up incore bno and bcnt extent btrees */ - num_extents = mk_incore_fstree(mp, agno); + num_extents = mk_incore_fstree(mp, agno, &num_freeblocks); #ifdef XR_BLD_FREE_TRACE fprintf(stderr, "# of bno extents is %d\n", @@ -2596,8 +2246,8 @@ phase5_func( /* * track blocks that we might really lose */ - extra_blocks = calculate_freespace_cursor(mp, agno, - &num_extents, &bno_btree_curs); + init_freespace_cursors(&sc, agno, num_freeblocks, &num_extents, + &extra_blocks, &btr_bno, &btr_cnt); /* * freespace btrees live in the "free space" but @@ -2615,13 +2265,6 @@ phase5_func( if (extra_blocks > 0) sb_fdblocks_ag[agno] -= extra_blocks; - bcnt_btree_curs = bno_btree_curs; - - bno_btree_curs.owner = XFS_RMAP_OWN_AG; - bcnt_btree_curs.owner = XFS_RMAP_OWN_AG; - setup_cursor(mp, agno, &bno_btree_curs); - setup_cursor(mp, agno, &bcnt_btree_curs); - #ifdef XR_BLD_FREE_TRACE fprintf(stderr, "# of bno extents is %d\n", count_bno_extents(agno)); @@ -2629,25 +2272,13 @@ phase5_func( count_bcnt_extents(agno)); #endif - /* - * now rebuild the freespace trees - */ - freeblks1 = build_freespace_tree(mp, agno, - &bno_btree_curs, XFS_BTNUM_BNO); + /* Rebuild the freespace btrees. 
*/ + build_bnobt(&sc, agno, &btr_bno, &freeblks1); + build_cntbt(&sc, agno, &btr_cnt, &freeblks2); + #ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "# of free blocks == %d\n", freeblks1); -#endif - write_cursor(&bno_btree_curs); - -#ifdef DEBUG - freeblks2 = build_freespace_tree(mp, agno, - &bcnt_btree_curs, XFS_BTNUM_CNT); -#else - (void) build_freespace_tree(mp, agno, - &bcnt_btree_curs, XFS_BTNUM_CNT); + fprintf(stderr, "# of free blocks == %d/%d\n", freeblks1, freeblks2); #endif - write_cursor(&bcnt_btree_curs); - ASSERT(freeblks1 == freeblks2); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) { @@ -2665,9 +2296,9 @@ phase5_func( /* * set up agf and agfl */ - build_agf_agfl(mp, agno, &bno_btree_curs, - &bcnt_btree_curs, freeblks1, extra_blocks, + build_agf_agfl(mp, agno, &btr_bno, &btr_cnt, freeblks1, extra_blocks, &rmap_btree_curs, &refcnt_btree_curs, lost_fsb); + /* * build inode allocation tree. */ @@ -2691,15 +2322,14 @@ phase5_func( /* * tear down cursors */ - finish_cursor(&bno_btree_curs); - finish_cursor(&ino_btree_curs); + finish_rebuild(mp, &btr_bno, lost_fsb); + finish_rebuild(mp, &btr_cnt, lost_fsb); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) finish_cursor(&rmap_btree_curs); if (xfs_sb_version_hasreflink(&mp->m_sb)) finish_cursor(&refcnt_btree_curs); if (xfs_sb_version_hasfinobt(&mp->m_sb)) finish_cursor(&fino_btree_curs); - finish_cursor(&bcnt_btree_curs); /* * release the incore per-AG bno/bcnt trees so From patchwork Sat May 9 16:32:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 11538429 Subject: [PATCH 5/9] xfs_repair: rebuild inode btrees with bulk loader From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com Date: Sat, 09 May 2020 09:32:14 -0700 Message-ID: <158904193466.984305.10132714783018885545.stgit@magnolia> In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia> References: <158904190079.984305.707785748675261111.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J.
Wong Use the btree bulk loading functions to rebuild the inode btrees and drop the open-coded implementation. Signed-off-by: Darrick J. Wong --- libxfs/libxfs_api_defs.h | 1 repair/phase5.c | 615 ++++++++++++++++------------------------------ 2 files changed, 221 insertions(+), 395 deletions(-) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index bace739c..5d0868c2 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -115,6 +115,7 @@ #define xfs_init_local_fork libxfs_init_local_fork #define xfs_inobt_maxrecs libxfs_inobt_maxrecs +#define xfs_inobt_stage_cursor libxfs_inobt_stage_cursor #define xfs_inode_from_disk libxfs_inode_from_disk #define xfs_inode_to_disk libxfs_inode_to_disk #define xfs_inode_validate_cowextsize libxfs_inode_validate_cowextsize diff --git a/repair/phase5.c b/repair/phase5.c index 94e4610c..22be0fa2 100644 --- a/repair/phase5.c +++ b/repair/phase5.c @@ -84,6 +84,10 @@ struct bt_rebuild { struct extent_tree_node *bno_rec; xfs_agblock_t *freeblks; }; + struct { + struct ino_tree_node *ino_rec; + struct agi_stat *agi_stat; + }; }; }; @@ -766,48 +770,38 @@ _("Error %d while writing cntbt btree for AG %u.\n"), error, agno); sc->tp = NULL; } -/* - * XXX(hch): any reason we don't just look at mp->m_inobt_mxr? - */ -#define XR_INOBT_BLOCK_MAXRECS(mp, level) \ - libxfs_inobt_maxrecs((mp), (mp)->m_sb.sb_blocksize, \ - (level) == 0) +/* Inode Btrees */ -/* - * we don't have to worry here about how chewing up free extents - * may perturb things because inode tree building happens before - * freespace tree building. - */ +/* Initialize both inode btree cursors as needed. 
*/ static void -init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, - uint64_t *num_inos, uint64_t *num_free_inos, int finobt) +init_ino_cursors( + struct repair_ctx *sc, + xfs_agnumber_t agno, + unsigned int free_space, + uint64_t *num_inos, + uint64_t *num_free_inos, + struct bt_rebuild *btr_ino, + struct bt_rebuild *btr_fino) { - uint64_t ninos; - uint64_t nfinos; - int rec_nfinos; - int rec_ninos; - ino_tree_node_t *ino_rec; - int num_recs; - int level; - bt_stat_level_t *lptr; - bt_stat_level_t *p_lptr; - xfs_extlen_t blocks_allocated; - int i; + struct ino_tree_node *ino_rec; + unsigned int ino_recs = 0; + unsigned int fino_recs = 0; + bool finobt; + int error; - *num_inos = *num_free_inos = 0; - ninos = nfinos = 0; + finobt = xfs_sb_version_hasfinobt(&sc->mp->m_sb); + init_rebuild(sc, &XFS_RMAP_OINFO_INOBT, free_space, btr_ino); - lptr = &btree_curs->level[0]; - btree_curs->init = 1; - btree_curs->owner = XFS_RMAP_OWN_INOBT; + /* Compute inode statistics. 
*/ + *num_free_inos = 0; + *num_inos = 0; + for (ino_rec = findfirst_inode_rec(agno); + ino_rec != NULL; + ino_rec = next_ino_rec(ino_rec)) { + unsigned int rec_ninos = 0; + unsigned int rec_nfinos = 0; + int i; - /* - * build up statistics - */ - ino_rec = findfirst_inode_rec(agno); - for (num_recs = 0; ino_rec != NULL; ino_rec = next_ino_rec(ino_rec)) { - rec_ninos = 0; - rec_nfinos = 0; for (i = 0; i < XFS_INODES_PER_CHUNK; i++) { ASSERT(is_inode_confirmed(ino_rec, i)); /* @@ -821,174 +815,218 @@ init_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, rec_ninos++; } - /* - * finobt only considers records with free inodes - */ - if (finobt && !rec_nfinos) - continue; + *num_free_inos += rec_nfinos; + *num_inos += rec_ninos; + ino_recs++; - nfinos += rec_nfinos; - ninos += rec_ninos; - num_recs++; + /* finobt only considers records with free inodes */ + if (rec_nfinos) + fino_recs++; } - if (num_recs == 0) { - /* - * easy corner-case -- no inode records - */ - lptr->num_blocks = 1; - lptr->modulo = 0; - lptr->num_recs_pb = 0; - lptr->num_recs_tot = 0; + btr_ino->cur = libxfs_inobt_stage_cursor(sc->mp, &btr_ino->newbt.afake, + agno, XFS_BTNUM_INO); - btree_curs->num_levels = 1; - btree_curs->num_tot_blocks = btree_curs->num_free_blocks = 1; + /* Compute how many inobt blocks we'll need. 
*/ + error = -libxfs_btree_bload_compute_geometry(btr_ino->cur, + &btr_ino->bload, ino_recs); + if (error) + do_error( +_("Unable to compute inode btree geometry, error %d.\n"), error); - setup_cursor(mp, agno, btree_curs); + setup_rebuild(sc->mp, agno, btr_ino, btr_ino->bload.nr_blocks); + if (!finobt) return; - } - blocks_allocated = lptr->num_blocks = howmany(num_recs, - XR_INOBT_BLOCK_MAXRECS(mp, 0)); + init_rebuild(sc, &XFS_RMAP_OINFO_INOBT, free_space, btr_fino); + btr_fino->cur = libxfs_inobt_stage_cursor(sc->mp, + &btr_fino->newbt.afake, agno, XFS_BTNUM_FINO); - lptr->modulo = num_recs % lptr->num_blocks; - lptr->num_recs_pb = num_recs / lptr->num_blocks; - lptr->num_recs_tot = num_recs; - level = 1; + /* Compute how many finobt blocks we'll need. */ + error = -libxfs_btree_bload_compute_geometry(btr_fino->cur, + &btr_fino->bload, fino_recs); + if (error) + do_error( +_("Unable to compute free inode btree geometry, error %d.\n"), error); - if (lptr->num_blocks > 1) { - for (; btree_curs->level[level-1].num_blocks > 1 - && level < XFS_BTREE_MAXLEVELS; - level++) { - lptr = &btree_curs->level[level]; - p_lptr = &btree_curs->level[level - 1]; - lptr->num_blocks = howmany(p_lptr->num_blocks, - XR_INOBT_BLOCK_MAXRECS(mp, level)); - lptr->modulo = p_lptr->num_blocks % lptr->num_blocks; - lptr->num_recs_pb = p_lptr->num_blocks - / lptr->num_blocks; - lptr->num_recs_tot = p_lptr->num_blocks; + setup_rebuild(sc->mp, agno, btr_fino, btr_fino->bload.nr_blocks); +} - blocks_allocated += lptr->num_blocks; +/* Copy one incore inode record into the inobt cursor. 
*/ +static void +get_inode_data( + struct xfs_btree_cur *cur, + struct ino_tree_node *ino_rec, + struct agi_stat *agi_stat) +{ + struct xfs_inobt_rec_incore *irec = &cur->bc_rec.i; + int inocnt = 0; + int finocnt = 0; + int k; + + irec->ir_startino = ino_rec->ino_startnum; + irec->ir_free = ino_rec->ir_free; + + for (k = 0; k < sizeof(xfs_inofree_t) * NBBY; k++) { + ASSERT(is_inode_confirmed(ino_rec, k)); + + if (is_inode_sparse(ino_rec, k)) + continue; + if (is_inode_free(ino_rec, k)) + finocnt++; + inocnt++; + } + + irec->ir_count = inocnt; + irec->ir_freecount = finocnt; + + if (xfs_sb_version_hassparseinodes(&cur->bc_mp->m_sb)) { + uint64_t sparse; + int spmask; + uint16_t holemask; + + /* + * Convert the 64-bit in-core sparse inode state to the + * 16-bit on-disk holemask. + */ + holemask = 0; + spmask = (1 << XFS_INODES_PER_HOLEMASK_BIT) - 1; + sparse = ino_rec->ir_sparse; + for (k = 0; k < XFS_INOBT_HOLEMASK_BITS; k++) { + if (sparse & spmask) { + ASSERT((sparse & spmask) == spmask); + holemask |= (1 << k); + } else + ASSERT((sparse & spmask) == 0); + sparse >>= XFS_INODES_PER_HOLEMASK_BIT; } + + irec->ir_holemask = holemask; + } else { + irec->ir_holemask = 0; } - ASSERT(lptr->num_blocks == 1); - btree_curs->num_levels = level; - btree_curs->num_tot_blocks = btree_curs->num_free_blocks - = blocks_allocated; + if (!agi_stat) + return; - setup_cursor(mp, agno, btree_curs); + if (agi_stat->first_agino == NULLAGINO) + agi_stat->first_agino = ino_rec->ino_startnum; + agi_stat->freecount += finocnt; + agi_stat->count += inocnt; +} - *num_inos = ninos; - *num_free_inos = nfinos; +/* Grab one inobt record. */ +static int +get_inobt_record( + struct xfs_btree_cur *cur, + void *priv) +{ + struct bt_rebuild *rebuild = priv; - return; + get_inode_data(cur, rebuild->ino_rec, rebuild->agi_stat); + rebuild->ino_rec = next_ino_rec(rebuild->ino_rec); + return 0; } +/* Rebuild an inobt btree. 
*/ static void -prop_ino_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, - xfs_btnum_t btnum, xfs_agino_t startino, int level) +build_inobt( + struct repair_ctx *sc, + xfs_agnumber_t agno, + struct bt_rebuild *btr_ino, + struct agi_stat *agi_stat) { - struct xfs_btree_block *bt_hdr; - xfs_inobt_key_t *bt_key; - xfs_inobt_ptr_t *bt_ptr; - xfs_agblock_t agbno; - bt_stat_level_t *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(btnum); int error; - level++; - - if (level >= btree_curs->num_levels) - return; + btr_ino->bload.get_record = get_inobt_record; + btr_ino->bload.claim_block = rebuild_claim_block; + agi_stat->count = agi_stat->freecount = 0; + agi_stat->first_agino = NULLAGINO; + btr_ino->agi_stat = agi_stat; + btr_ino->ino_rec = findfirst_inode_rec(agno); - lptr = &btree_curs->level[level]; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - - if (be16_to_cpu(bt_hdr->bb_numrecs) == 0) { - /* - * this only happens once to initialize the - * first path up the left side of the tree - * where the agbno's are already set up - */ - prop_ino_cursor(mp, agno, btree_curs, btnum, startino, level); - } + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); + if (error) + do_error( +_("Insufficient memory to construct inobt rebuild transaction.\n")); - if (be16_to_cpu(bt_hdr->bb_numrecs) == - lptr->num_recs_pb + (lptr->modulo > 0)) { - /* - * write out current prev block, grab us a new block, - * and set the rightsib pointer of current block - */ -#ifdef XR_BLD_INO_TRACE - fprintf(stderr, " ino prop agbno %d ", lptr->prev_agbno); -#endif - if (lptr->prev_agbno != NULLAGBLOCK) { - ASSERT(lptr->prev_buf_p != NULL); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_agbno = lptr->agbno;; - lptr->prev_buf_p = lptr->buf_p; - agbno = get_next_blockaddr(agno, level, btree_curs); + /* Add all observed inobt records. 
*/ + error = -libxfs_btree_bload(btr_ino->cur, &btr_ino->bload, btr_ino); + if (error) + do_error( +_("Error %d while creating inobt btree for AG %u.\n"), error, agno); - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(agbno); + /* Since we're not writing the AGI yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr_ino->cur, 0); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing inobt btree for AG %u.\n"), error, agno); + sc->tp = NULL; +} - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab inode btree buffer, err=%d"), - error); - lptr->agbno = agbno; +/* Grab one finobt record. */ +static int +get_finobt_record( + struct xfs_btree_cur *cur, + void *priv) +{ + struct bt_rebuild *rebuild = priv; - if (lptr->modulo) - lptr->modulo--; + get_inode_data(cur, rebuild->ino_rec, NULL); + rebuild->ino_rec = next_free_ino_rec(rebuild->ino_rec); + return 0; +} - /* - * initialize block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, btnum, - level, 0, agno); +/* Rebuild a finobt btree. 
*/ +static void +build_finobt( + struct repair_ctx *sc, + xfs_agnumber_t agno, + struct bt_rebuild *btr_fino) +{ + int error; - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); + btr_fino->bload.get_record = get_finobt_record; + btr_fino->bload.claim_block = rebuild_claim_block; + btr_fino->ino_rec = findfirst_free_inode_rec(agno); - /* - * propagate extent record for first extent in new block up - */ - prop_ino_cursor(mp, agno, btree_curs, btnum, startino, level); - } - /* - * add inode info to current block - */ - be16_add_cpu(&bt_hdr->bb_numrecs, 1); + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); + if (error) + do_error( +_("Insufficient memory to construct finobt rebuild transaction.\n")); - bt_key = XFS_INOBT_KEY_ADDR(mp, bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs)); - bt_ptr = XFS_INOBT_PTR_ADDR(mp, bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs), - M_IGEO(mp)->inobt_mxr[1]); + /* Add all observed finobt records. */ + error = -libxfs_btree_bload(btr_fino->cur, &btr_fino->bload, btr_fino); + if (error) + do_error( +_("Error %d while creating finobt btree for AG %u.\n"), error, agno); - bt_key->ir_startino = cpu_to_be32(startino); - *bt_ptr = cpu_to_be32(btree_curs->level[level-1].agbno); + /* Since we're not writing the AGI yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr_fino->cur, 0); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing finobt btree for AG %u.\n"), error, agno); + sc->tp = NULL; } /* * XXX: yet more code that can be shared with mkfs, growfs. 
*/ static void -build_agi(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, - bt_status_t *finobt_curs, struct agi_stat *agi_stat) +build_agi( + struct xfs_mount *mp, + xfs_agnumber_t agno, + struct bt_rebuild *btr_ino, + struct bt_rebuild *btr_fino, + struct agi_stat *agi_stat) { - xfs_buf_t *agi_buf; - xfs_agi_t *agi; - int i; - int error; + struct xfs_buf *agi_buf; + struct xfs_agi *agi; + int i; + int error; error = -libxfs_buf_get(mp->m_dev, XFS_AG_DADDR(mp, agno, XFS_AGI_DADDR(mp)), @@ -1009,8 +1047,8 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, agi->agi_length = cpu_to_be32(mp->m_sb.sb_dblocks - (xfs_rfsblock_t) mp->m_sb.sb_agblocks * agno); agi->agi_count = cpu_to_be32(agi_stat->count); - agi->agi_root = cpu_to_be32(btree_curs->root); - agi->agi_level = cpu_to_be32(btree_curs->num_levels); + agi->agi_root = cpu_to_be32(btr_ino->newbt.afake.af_root); + agi->agi_level = cpu_to_be32(btr_ino->newbt.afake.af_levels); agi->agi_freecount = cpu_to_be32(agi_stat->freecount); agi->agi_newino = cpu_to_be32(agi_stat->first_agino); agi->agi_dirino = cpu_to_be32(NULLAGINO); @@ -1022,203 +1060,16 @@ build_agi(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *btree_curs, platform_uuid_copy(&agi->agi_uuid, &mp->m_sb.sb_meta_uuid); if (xfs_sb_version_hasfinobt(&mp->m_sb)) { - agi->agi_free_root = cpu_to_be32(finobt_curs->root); - agi->agi_free_level = cpu_to_be32(finobt_curs->num_levels); + agi->agi_free_root = + cpu_to_be32(btr_fino->newbt.afake.af_root); + agi->agi_free_level = + cpu_to_be32(btr_fino->newbt.afake.af_levels); } libxfs_buf_mark_dirty(agi_buf); libxfs_buf_relse(agi_buf); } -/* - * rebuilds an inode tree given a cursor. 
We're lazy here and call - * the routine that builds the agi - */ -static void -build_ino_tree(xfs_mount_t *mp, xfs_agnumber_t agno, - bt_status_t *btree_curs, xfs_btnum_t btnum, - struct agi_stat *agi_stat) -{ - xfs_agnumber_t i; - xfs_agblock_t j; - xfs_agblock_t agbno; - xfs_agino_t first_agino; - struct xfs_btree_block *bt_hdr; - xfs_inobt_rec_t *bt_rec; - ino_tree_node_t *ino_rec; - bt_stat_level_t *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(btnum); - xfs_agino_t count = 0; - xfs_agino_t freecount = 0; - int inocnt; - uint8_t finocnt; - int k; - int level = btree_curs->num_levels; - int spmask; - uint64_t sparse; - uint16_t holemask; - int error; - - ASSERT(btnum == XFS_BTNUM_INO || btnum == XFS_BTNUM_FINO); - - for (i = 0; i < level; i++) { - lptr = &btree_curs->level[i]; - - agbno = get_next_blockaddr(agno, i, btree_curs); - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab inode btree buffer, err=%d"), - error); - - if (i == btree_curs->num_levels - 1) - btree_curs->root = agbno; - - lptr->agbno = agbno; - lptr->prev_agbno = NULLAGBLOCK; - lptr->prev_buf_p = NULL; - /* - * initialize block header - */ - - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, btnum, i, 0, agno); - } - - /* - * run along leaf, setting up records. as we have to switch - * blocks, call the prop_ino_cursor routine to set up the new - * pointers for the parent. that can recurse up to the root - * if required. set the sibling pointers for leaf level here. 
- */ - if (btnum == XFS_BTNUM_FINO) - ino_rec = findfirst_free_inode_rec(agno); - else - ino_rec = findfirst_inode_rec(agno); - - if (ino_rec != NULL) - first_agino = ino_rec->ino_startnum; - else - first_agino = NULLAGINO; - - lptr = &btree_curs->level[0]; - - for (i = 0; i < lptr->num_blocks; i++) { - /* - * block initialization, lay in block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, btnum, 0, 0, agno); - - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - bt_hdr->bb_numrecs = cpu_to_be16(lptr->num_recs_pb + - (lptr->modulo > 0)); - - if (lptr->modulo > 0) - lptr->modulo--; - - if (lptr->num_recs_pb > 0) - prop_ino_cursor(mp, agno, btree_curs, btnum, - ino_rec->ino_startnum, 0); - - bt_rec = (xfs_inobt_rec_t *) - ((char *)bt_hdr + XFS_INOBT_BLOCK_LEN(mp)); - for (j = 0; j < be16_to_cpu(bt_hdr->bb_numrecs); j++) { - ASSERT(ino_rec != NULL); - bt_rec[j].ir_startino = - cpu_to_be32(ino_rec->ino_startnum); - bt_rec[j].ir_free = cpu_to_be64(ino_rec->ir_free); - - inocnt = finocnt = 0; - for (k = 0; k < sizeof(xfs_inofree_t)*NBBY; k++) { - ASSERT(is_inode_confirmed(ino_rec, k)); - - if (is_inode_sparse(ino_rec, k)) - continue; - if (is_inode_free(ino_rec, k)) - finocnt++; - inocnt++; - } - - /* - * Set the freecount and check whether we need to update - * the sparse format fields. Otherwise, skip to the next - * record. - */ - inorec_set_freecount(mp, &bt_rec[j], finocnt); - if (!xfs_sb_version_hassparseinodes(&mp->m_sb)) - goto nextrec; - - /* - * Convert the 64-bit in-core sparse inode state to the - * 16-bit on-disk holemask. 
- */ - holemask = 0; - spmask = (1 << XFS_INODES_PER_HOLEMASK_BIT) - 1; - sparse = ino_rec->ir_sparse; - for (k = 0; k < XFS_INOBT_HOLEMASK_BITS; k++) { - if (sparse & spmask) { - ASSERT((sparse & spmask) == spmask); - holemask |= (1 << k); - } else - ASSERT((sparse & spmask) == 0); - sparse >>= XFS_INODES_PER_HOLEMASK_BIT; - } - - bt_rec[j].ir_u.sp.ir_count = inocnt; - bt_rec[j].ir_u.sp.ir_holemask = cpu_to_be16(holemask); - -nextrec: - freecount += finocnt; - count += inocnt; - - if (btnum == XFS_BTNUM_FINO) - ino_rec = next_free_ino_rec(ino_rec); - else - ino_rec = next_ino_rec(ino_rec); - } - - if (ino_rec != NULL) { - /* - * get next leaf level block - */ - if (lptr->prev_buf_p != NULL) { -#ifdef XR_BLD_INO_TRACE - fprintf(stderr, "writing inobt agbno %u\n", - lptr->prev_agbno); -#endif - ASSERT(lptr->prev_agbno != NULLAGBLOCK); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_buf_p = lptr->buf_p; - lptr->prev_agbno = lptr->agbno; - lptr->agbno = get_next_blockaddr(agno, 0, btree_curs); - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(lptr->agbno); - - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, lptr->agbno), - XFS_FSB_TO_BB(mp, 1), - &lptr->buf_p); - if (error) - do_error( - _("Cannot grab inode btree buffer, err=%d"), - error); - } - } - - if (agi_stat) { - agi_stat->first_agino = first_agino; - agi_stat->count = count; - agi_stat->freecount = freecount; - } -} - /* rebuild the rmap tree */ /* @@ -2145,14 +1996,10 @@ phase5_func( { struct repair_ctx sc = { .mp = mp, }; struct agi_stat agi_stat = {0,}; - uint64_t num_inos; - uint64_t num_free_inos; - uint64_t finobt_num_inos; - uint64_t finobt_num_free_inos; struct bt_rebuild btr_bno; struct bt_rebuild btr_cnt; - bt_status_t ino_btree_curs; - bt_status_t fino_btree_curs; + struct bt_rebuild btr_ino; + struct bt_rebuild btr_fino; bt_status_t rmap_btree_curs; bt_status_t refcnt_btree_curs; int extra_blocks = 0; @@ -2189,21 +2036,8 @@ phase5_func( 
agno); } - /* - * ok, now set up the btree cursors for the - * on-disk btrees (includs pre-allocating all - * required blocks for the trees themselves) - */ - init_ino_cursor(mp, agno, &ino_btree_curs, &num_inos, - &num_free_inos, 0); - - if (xfs_sb_version_hasfinobt(&mp->m_sb)) - init_ino_cursor(mp, agno, &fino_btree_curs, - &finobt_num_inos, &finobt_num_free_inos, - 1); - - sb_icount_ag[agno] += num_inos; - sb_ifree_ag[agno] += num_free_inos; + init_ino_cursors(&sc, agno, num_freeblocks, &sb_icount_ag[agno], + &sb_ifree_ag[agno], &btr_ino, &btr_fino); /* * Set up the btree cursors for the on-disk rmap btrees, @@ -2300,36 +2134,27 @@ phase5_func( &rmap_btree_curs, &refcnt_btree_curs, lost_fsb); /* - * build inode allocation tree. + * build inode allocation trees. */ - build_ino_tree(mp, agno, &ino_btree_curs, XFS_BTNUM_INO, - &agi_stat); - write_cursor(&ino_btree_curs); - - /* - * build free inode tree - */ - if (xfs_sb_version_hasfinobt(&mp->m_sb)) { - build_ino_tree(mp, agno, &fino_btree_curs, - XFS_BTNUM_FINO, NULL); - write_cursor(&fino_btree_curs); - } + build_inobt(&sc, agno, &btr_ino, &agi_stat); + if (xfs_sb_version_hasfinobt(&mp->m_sb)) + build_finobt(&sc, agno, &btr_fino); /* build the agi */ - build_agi(mp, agno, &ino_btree_curs, &fino_btree_curs, - &agi_stat); + build_agi(mp, agno, &btr_ino, &btr_fino, &agi_stat); /* * tear down cursors */ finish_rebuild(mp, &btr_bno, lost_fsb); finish_rebuild(mp, &btr_cnt, lost_fsb); + finish_rebuild(mp, &btr_ino, lost_fsb); + if (xfs_sb_version_hasfinobt(&mp->m_sb)) + finish_rebuild(mp, &btr_fino, lost_fsb); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) finish_cursor(&rmap_btree_curs); if (xfs_sb_version_hasreflink(&mp->m_sb)) finish_cursor(&refcnt_btree_curs); - if (xfs_sb_version_hasfinobt(&mp->m_sb)) - finish_cursor(&fino_btree_curs); /* * release the incore per-AG bno/bcnt trees so From patchwork Sat May 9 16:32:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11538413 Subject: [PATCH 6/9] xfs_repair: rebuild reverse mapping btrees with bulk loader From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com Date: Sat, 09 May 2020 09:32:21 -0700 Message-ID: <158904194111.984305.6132229160225755064.stgit@magnolia> In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia> References: <158904190079.984305.707785748675261111.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List:
linux-xfs@vger.kernel.org From: Darrick J. Wong Use the btree bulk loading functions to rebuild the reverse mapping btrees and drop the open-coded implementation. Signed-off-by: Darrick J. Wong --- libxfs/libxfs_api_defs.h | 1 repair/phase5.c | 428 ++++++++-------------------------------------- 2 files changed, 72 insertions(+), 357 deletions(-) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index 5d0868c2..0026ca45 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -142,6 +142,7 @@ #define xfs_rmapbt_calc_reserves libxfs_rmapbt_calc_reserves #define xfs_rmapbt_init_cursor libxfs_rmapbt_init_cursor #define xfs_rmapbt_maxrecs libxfs_rmapbt_maxrecs +#define xfs_rmapbt_stage_cursor libxfs_rmapbt_stage_cursor #define xfs_rmap_compare libxfs_rmap_compare #define xfs_rmap_get_rec libxfs_rmap_get_rec #define xfs_rmap_irec_offset_pack libxfs_rmap_irec_offset_pack diff --git a/repair/phase5.c b/repair/phase5.c index 22be0fa2..9c43100f 100644 --- a/repair/phase5.c +++ b/repair/phase5.c @@ -1072,373 +1072,79 @@ build_agi( /* rebuild the rmap tree */ -/* - * we don't have to worry here about how chewing up free extents - * may perturb things because rmap tree building happens before - * freespace tree building. - */ +/* Set up the rmap rebuild parameters. 
*/ static void init_rmapbt_cursor( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct bt_status *btree_curs) -{ - size_t num_recs; - int level; - struct bt_stat_level *lptr; - struct bt_stat_level *p_lptr; - xfs_extlen_t blocks_allocated; - int maxrecs; - - if (!xfs_sb_version_hasrmapbt(&mp->m_sb)) { - memset(btree_curs, 0, sizeof(struct bt_status)); - return; - } - - lptr = &btree_curs->level[0]; - btree_curs->init = 1; - btree_curs->owner = XFS_RMAP_OWN_AG; - - /* - * build up statistics - */ - num_recs = rmap_record_count(mp, agno); - if (num_recs == 0) { - /* - * easy corner-case -- no rmap records - */ - lptr->num_blocks = 1; - lptr->modulo = 0; - lptr->num_recs_pb = 0; - lptr->num_recs_tot = 0; - - btree_curs->num_levels = 1; - btree_curs->num_tot_blocks = btree_curs->num_free_blocks = 1; - - setup_cursor(mp, agno, btree_curs); - - return; - } - - /* - * Leave enough slack in the rmapbt that we can insert the - * metadata AG entries without too many splits. - */ - maxrecs = mp->m_rmap_mxr[0]; - if (num_recs > maxrecs) - maxrecs -= 10; - blocks_allocated = lptr->num_blocks = howmany(num_recs, maxrecs); - - lptr->modulo = num_recs % lptr->num_blocks; - lptr->num_recs_pb = num_recs / lptr->num_blocks; - lptr->num_recs_tot = num_recs; - level = 1; - - if (lptr->num_blocks > 1) { - for (; btree_curs->level[level-1].num_blocks > 1 - && level < XFS_BTREE_MAXLEVELS; - level++) { - lptr = &btree_curs->level[level]; - p_lptr = &btree_curs->level[level - 1]; - lptr->num_blocks = howmany(p_lptr->num_blocks, - mp->m_rmap_mxr[1]); - lptr->modulo = p_lptr->num_blocks % lptr->num_blocks; - lptr->num_recs_pb = p_lptr->num_blocks - / lptr->num_blocks; - lptr->num_recs_tot = p_lptr->num_blocks; - - blocks_allocated += lptr->num_blocks; - } - } - ASSERT(lptr->num_blocks == 1); - btree_curs->num_levels = level; - - btree_curs->num_tot_blocks = btree_curs->num_free_blocks - = blocks_allocated; - - setup_cursor(mp, agno, btree_curs); -} - -static void -prop_rmap_cursor( - 
struct xfs_mount *mp, + struct repair_ctx *sc, xfs_agnumber_t agno, - struct bt_status *btree_curs, - struct xfs_rmap_irec *rm_rec, - int level) + unsigned int free_space, + struct bt_rebuild *btr) { - struct xfs_btree_block *bt_hdr; - struct xfs_rmap_key *bt_key; - xfs_rmap_ptr_t *bt_ptr; - xfs_agblock_t agbno; - struct bt_stat_level *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(XFS_BTNUM_RMAP); int error; - level++; - - if (level >= btree_curs->num_levels) - return; - - lptr = &btree_curs->level[level]; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - - if (be16_to_cpu(bt_hdr->bb_numrecs) == 0) { - /* - * this only happens once to initialize the - * first path up the left side of the tree - * where the agbno's are already set up - */ - prop_rmap_cursor(mp, agno, btree_curs, rm_rec, level); - } - - if (be16_to_cpu(bt_hdr->bb_numrecs) == - lptr->num_recs_pb + (lptr->modulo > 0)) { - /* - * write out current prev block, grab us a new block, - * and set the rightsib pointer of current block - */ -#ifdef XR_BLD_INO_TRACE - fprintf(stderr, " rmap prop agbno %d ", lptr->prev_agbno); -#endif - if (lptr->prev_agbno != NULLAGBLOCK) { - ASSERT(lptr->prev_buf_p != NULL); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_agbno = lptr->agbno; - lptr->prev_buf_p = lptr->buf_p; - agbno = get_next_blockaddr(agno, level, btree_curs); - - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(agbno); - - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab rmapbt buffer, err=%d"), - error); - lptr->agbno = agbno; - - if (lptr->modulo) - lptr->modulo--; - - /* - * initialize block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_RMAP, - level, 0, agno); - - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - - /* - 
* propagate extent record for first extent in new block up - */ - prop_rmap_cursor(mp, agno, btree_curs, rm_rec, level); - } - /* - * add rmap info to current block - */ - be16_add_cpu(&bt_hdr->bb_numrecs, 1); + init_rebuild(sc, &XFS_RMAP_OINFO_AG, free_space, btr); + btr->cur = libxfs_rmapbt_stage_cursor(sc->mp, &btr->newbt.afake, agno); - bt_key = XFS_RMAP_KEY_ADDR(bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs)); - bt_ptr = XFS_RMAP_PTR_ADDR(bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs), - mp->m_rmap_mxr[1]); - - bt_key->rm_startblock = cpu_to_be32(rm_rec->rm_startblock); - bt_key->rm_owner = cpu_to_be64(rm_rec->rm_owner); - bt_key->rm_offset = cpu_to_be64(rm_rec->rm_offset); + /* Compute how many blocks we'll need. */ + error = -libxfs_btree_bload_compute_geometry(btr->cur, &btr->bload, + rmap_record_count(sc->mp, agno)); + if (error) + do_error( +_("Unable to compute rmap btree geometry, error %d.\n"), error); - *bt_ptr = cpu_to_be32(btree_curs->level[level-1].agbno); + setup_rebuild(sc->mp, agno, btr, btr->bload.nr_blocks); } -static void -prop_rmap_highkey( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct bt_status *btree_curs, - struct xfs_rmap_irec *rm_highkey) +/* Grab one rmap record. 
*/ +static int +get_rmapbt_record( + struct xfs_btree_cur *cur, + void *priv) { - struct xfs_btree_block *bt_hdr; - struct xfs_rmap_key *bt_key; - struct bt_stat_level *lptr; - struct xfs_rmap_irec key = {0}; - struct xfs_rmap_irec high_key; - int level; - int i; - int numrecs; + struct xfs_rmap_irec *rec; + struct bt_rebuild *btr = priv; - high_key = *rm_highkey; - for (level = 1; level < btree_curs->num_levels; level++) { - lptr = &btree_curs->level[level]; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - numrecs = be16_to_cpu(bt_hdr->bb_numrecs); - bt_key = XFS_RMAP_HIGH_KEY_ADDR(bt_hdr, numrecs); - - bt_key->rm_startblock = cpu_to_be32(high_key.rm_startblock); - bt_key->rm_owner = cpu_to_be64(high_key.rm_owner); - bt_key->rm_offset = cpu_to_be64( - libxfs_rmap_irec_offset_pack(&high_key)); - - for (i = 1; i <= numrecs; i++) { - bt_key = XFS_RMAP_HIGH_KEY_ADDR(bt_hdr, i); - key.rm_startblock = be32_to_cpu(bt_key->rm_startblock); - key.rm_owner = be64_to_cpu(bt_key->rm_owner); - key.rm_offset = be64_to_cpu(bt_key->rm_offset); - if (rmap_diffkeys(&key, &high_key) > 0) - high_key = key; - } - } + rec = pop_slab_cursor(btr->slab_cursor); + memcpy(&cur->bc_rec.r, rec, sizeof(struct xfs_rmap_irec)); + return 0; } -/* - * rebuilds a rmap btree given a cursor. - */ +/* Rebuild a rmap btree. 
*/ static void build_rmap_tree( - struct xfs_mount *mp, + struct repair_ctx *sc, xfs_agnumber_t agno, - struct bt_status *btree_curs) + struct bt_rebuild *btr) { - xfs_agnumber_t i; - xfs_agblock_t j; - xfs_agblock_t agbno; - struct xfs_btree_block *bt_hdr; - struct xfs_rmap_irec *rm_rec; - struct xfs_slab_cursor *rmap_cur; - struct xfs_rmap_rec *bt_rec; - struct xfs_rmap_irec highest_key = {0}; - struct xfs_rmap_irec hi_key = {0}; - struct bt_stat_level *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(XFS_BTNUM_RMAP); - int numrecs; - int level = btree_curs->num_levels; int error; - highest_key.rm_flags = 0; - for (i = 0; i < level; i++) { - lptr = &btree_curs->level[i]; - - agbno = get_next_blockaddr(agno, i, btree_curs); - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab rmapbt buffer, err=%d"), - error); - - if (i == btree_curs->num_levels - 1) - btree_curs->root = agbno; + btr->bload.get_record = get_rmapbt_record; + btr->bload.claim_block = rebuild_claim_block; - lptr->agbno = agbno; - lptr->prev_agbno = NULLAGBLOCK; - lptr->prev_buf_p = NULL; - /* - * initialize block header - */ - - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_RMAP, - i, 0, agno); - } - - /* - * run along leaf, setting up records. as we have to switch - * blocks, call the prop_rmap_cursor routine to set up the new - * pointers for the parent. that can recurse up to the root - * if required. set the sibling pointers for leaf level here. 
- */ - error = rmap_init_cursor(agno, &rmap_cur); + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); if (error) do_error( -_("Insufficient memory to construct reverse-map cursor.")); - rm_rec = pop_slab_cursor(rmap_cur); - lptr = &btree_curs->level[0]; - - for (i = 0; i < lptr->num_blocks; i++) { - numrecs = lptr->num_recs_pb + (lptr->modulo > 0); - ASSERT(rm_rec != NULL || numrecs == 0); - - /* - * block initialization, lay in block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_RMAP, - 0, 0, agno); +_("Insufficient memory to construct rmap rebuild transaction.\n")); - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - bt_hdr->bb_numrecs = cpu_to_be16(numrecs); - - if (lptr->modulo > 0) - lptr->modulo--; - - if (lptr->num_recs_pb > 0) { - ASSERT(rm_rec != NULL); - prop_rmap_cursor(mp, agno, btree_curs, rm_rec, 0); - } - - bt_rec = (struct xfs_rmap_rec *) - ((char *)bt_hdr + XFS_RMAP_BLOCK_LEN); - highest_key.rm_startblock = 0; - highest_key.rm_owner = 0; - highest_key.rm_offset = 0; - for (j = 0; j < be16_to_cpu(bt_hdr->bb_numrecs); j++) { - ASSERT(rm_rec != NULL); - bt_rec[j].rm_startblock = - cpu_to_be32(rm_rec->rm_startblock); - bt_rec[j].rm_blockcount = - cpu_to_be32(rm_rec->rm_blockcount); - bt_rec[j].rm_owner = cpu_to_be64(rm_rec->rm_owner); - bt_rec[j].rm_offset = cpu_to_be64( - libxfs_rmap_irec_offset_pack(rm_rec)); - rmap_high_key_from_rec(rm_rec, &hi_key); - if (rmap_diffkeys(&hi_key, &highest_key) > 0) - highest_key = hi_key; - - rm_rec = pop_slab_cursor(rmap_cur); - } - - /* Now go set the parent key */ - prop_rmap_highkey(mp, agno, btree_curs, &highest_key); + error = rmap_init_cursor(agno, &btr->slab_cursor); + if (error) + do_error( +_("Insufficient memory to construct rmap cursor.\n")); - if (rm_rec != NULL) { - /* - * get next leaf level block - */ - if (lptr->prev_buf_p != NULL) { -#ifdef XR_BLD_RL_TRACE 
- fprintf(stderr, "writing rmapbt agbno %u\n", - lptr->prev_agbno); -#endif - ASSERT(lptr->prev_agbno != NULLAGBLOCK); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_buf_p = lptr->buf_p; - lptr->prev_agbno = lptr->agbno; - lptr->agbno = get_next_blockaddr(agno, 0, btree_curs); - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(lptr->agbno); + /* Add all observed rmap records. */ + error = -libxfs_btree_bload(btr->cur, &btr->bload, btr); + if (error) + do_error( +_("Error %d while creating rmap btree for AG %u.\n"), error, agno); - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, lptr->agbno), - XFS_FSB_TO_BB(mp, 1), - &lptr->buf_p); - if (error) - do_error( - _("Cannot grab rmapbt buffer, err=%d"), - error); - } - } - free_slab_cursor(&rmap_cur); + /* Since we're not writing the AGF yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr->cur, 0); + free_slab_cursor(&btr->slab_cursor); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing rmap btree for AG %u.\n"), error, agno); + sc->tp = NULL; } /* rebuild the refcount tree */ @@ -1783,7 +1489,7 @@ build_agf_agfl( struct bt_rebuild *btr_cnt, xfs_extlen_t freeblks, /* # free blocks in tree */ int lostblocks, /* # blocks that will be lost */ - struct bt_status *rmap_bt, + struct bt_rebuild *btr_rmap, struct bt_status *refcnt_bt, struct xfs_slab *lost_fsb) { @@ -1831,11 +1537,17 @@ build_agf_agfl( cpu_to_be32(btr_cnt->newbt.afake.af_root); agf->agf_levels[XFS_BTNUM_CNT] = cpu_to_be32(btr_cnt->newbt.afake.af_levels); - agf->agf_roots[XFS_BTNUM_RMAP] = cpu_to_be32(rmap_bt->root); - agf->agf_levels[XFS_BTNUM_RMAP] = cpu_to_be32(rmap_bt->num_levels); agf->agf_freeblks = cpu_to_be32(freeblks); - agf->agf_rmap_blocks = cpu_to_be32(rmap_bt->num_tot_blocks - - rmap_bt->num_free_blocks); + + if (xfs_sb_version_hasrmapbt(&mp->m_sb)) { + agf->agf_roots[XFS_BTNUM_RMAP] = + cpu_to_be32(btr_rmap->newbt.afake.af_root); + 
agf->agf_levels[XFS_BTNUM_RMAP] = + cpu_to_be32(btr_rmap->newbt.afake.af_levels); + agf->agf_rmap_blocks = + cpu_to_be32(btr_rmap->newbt.afake.af_blocks); + } + agf->agf_refcount_root = cpu_to_be32(refcnt_bt->root); agf->agf_refcount_level = cpu_to_be32(refcnt_bt->num_levels); agf->agf_refcount_blocks = cpu_to_be32(refcnt_bt->num_tot_blocks - @@ -1853,7 +1565,7 @@ build_agf_agfl( blks = btr_bno->newbt.afake.af_blocks + btr_cnt->newbt.afake.af_blocks - 2; if (xfs_sb_version_hasrmapbt(&mp->m_sb)) - blks += rmap_bt->num_tot_blocks - rmap_bt->num_free_blocks - 1; + blks += btr_rmap->newbt.afake.af_blocks - 1; agf->agf_btreeblks = cpu_to_be32(blks); #ifdef XR_BLD_FREE_TRACE fprintf(stderr, "agf->agf_btreeblks = %u\n", @@ -1900,6 +1612,8 @@ build_agf_agfl( /* Fill the AGFL with leftover blocks or save them for later. */ fill_agfl(btr_bno, freelist, &i); fill_agfl(btr_cnt, freelist, &i); + if (xfs_sb_version_hasrmapbt(&mp->m_sb)) + fill_agfl(btr_rmap, freelist, &i); /* Set the AGF counters for the AGFL. */ if (i > 0) { @@ -2000,7 +1714,7 @@ phase5_func( struct bt_rebuild btr_cnt; struct bt_rebuild btr_ino; struct bt_rebuild btr_fino; - bt_status_t rmap_btree_curs; + struct bt_rebuild btr_rmap; bt_status_t refcnt_btree_curs; int extra_blocks = 0; uint num_freeblocks; @@ -2040,10 +1754,12 @@ phase5_func( &sb_ifree_ag[agno], &btr_ino, &btr_fino); /* - * Set up the btree cursors for the on-disk rmap btrees, - * which includes pre-allocating all required blocks. + * Set up the btree cursors for the on-disk rmap btrees, which includes + * pre-allocating all required blocks. If rmap is disabled then + * it's zeroed.
*/ - init_rmapbt_cursor(mp, agno, &rmap_btree_curs); + if (xfs_sb_version_hasrmapbt(&mp->m_sb)) + init_rmapbt_cursor(&sc, agno, num_freeblocks, &btr_rmap); /* * Set up the btree cursors for the on-disk refcount btrees, @@ -2116,10 +1832,8 @@ phase5_func( ASSERT(freeblks1 == freeblks2); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) { - build_rmap_tree(mp, agno, &rmap_btree_curs); - write_cursor(&rmap_btree_curs); - sb_fdblocks_ag[agno] += (rmap_btree_curs.num_tot_blocks - - rmap_btree_curs.num_free_blocks) - 1; + build_rmap_tree(&sc, agno, &btr_rmap); + sb_fdblocks_ag[agno] += btr_rmap.newbt.afake.af_blocks - 1; } if (xfs_sb_version_hasreflink(&mp->m_sb)) { @@ -2131,7 +1845,7 @@ phase5_func( * set up agf and agfl */ build_agf_agfl(mp, agno, &btr_bno, &btr_cnt, freeblks1, extra_blocks, - &rmap_btree_curs, &refcnt_btree_curs, lost_fsb); + &btr_rmap, &refcnt_btree_curs, lost_fsb); /* * build inode allocation trees. @@ -2152,7 +1866,7 @@ phase5_func( if (xfs_sb_version_hasfinobt(&mp->m_sb)) finish_rebuild(mp, &btr_fino, lost_fsb); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) - finish_cursor(&rmap_btree_curs); + finish_rebuild(mp, &btr_rmap, lost_fsb); if (xfs_sb_version_hasreflink(&mp->m_sb)) finish_cursor(&refcnt_btree_curs); From patchwork Sat May 9 16:32:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 11538409 Subject: [PATCH 7/9] xfs_repair: rebuild refcount btrees with bulk loader From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com Date: Sat, 09 May 2020 09:32:27 -0700 Message-ID: <158904194756.984305.3705661929019376198.stgit@magnolia> In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia> References: <158904190079.984305.707785748675261111.stgit@magnolia> User-Agent: StGit/0.19 MIME-Version: 1.0 Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org From: Darrick J.
Wong Use the btree bulk loading functions to rebuild the refcount btrees and drop the open-coded implementation. Signed-off-by: Darrick J. Wong --- libxfs/libxfs_api_defs.h | 1 repair/phase5.c | 356 ++++++++-------------------------------------- 2 files changed, 66 insertions(+), 291 deletions(-) diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h index 0026ca45..1a7cdbf9 100644 --- a/libxfs/libxfs_api_defs.h +++ b/libxfs/libxfs_api_defs.h @@ -135,6 +135,7 @@ #define xfs_refcountbt_calc_reserves libxfs_refcountbt_calc_reserves #define xfs_refcountbt_init_cursor libxfs_refcountbt_init_cursor #define xfs_refcountbt_maxrecs libxfs_refcountbt_maxrecs +#define xfs_refcountbt_stage_cursor libxfs_refcountbt_stage_cursor #define xfs_refcount_get_rec libxfs_refcount_get_rec #define xfs_refcount_lookup_le libxfs_refcount_lookup_le diff --git a/repair/phase5.c b/repair/phase5.c index 9c43100f..6efc0613 100644 --- a/repair/phase5.c +++ b/repair/phase5.c @@ -1149,309 +1149,80 @@ _("Error %d while writing rmap btree for AG %u.\n"), error, agno); /* rebuild the refcount tree */ -/* - * we don't have to worry here about how chewing up free extents - * may perturb things because reflink tree building happens before - * freespace tree building. - */ +/* Set up the refcount rebuild parameters. 
*/ static void init_refc_cursor( - struct xfs_mount *mp, + struct repair_ctx *sc, xfs_agnumber_t agno, - struct bt_status *btree_curs) + unsigned int free_space, + struct bt_rebuild *btr) { - size_t num_recs; - int level; - struct bt_stat_level *lptr; - struct bt_stat_level *p_lptr; - xfs_extlen_t blocks_allocated; - - if (!xfs_sb_version_hasreflink(&mp->m_sb)) { - memset(btree_curs, 0, sizeof(struct bt_status)); - return; - } - - lptr = &btree_curs->level[0]; - btree_curs->init = 1; - btree_curs->owner = XFS_RMAP_OWN_REFC; - - /* - * build up statistics - */ - num_recs = refcount_record_count(mp, agno); - if (num_recs == 0) { - /* - * easy corner-case -- no refcount records - */ - lptr->num_blocks = 1; - lptr->modulo = 0; - lptr->num_recs_pb = 0; - lptr->num_recs_tot = 0; - - btree_curs->num_levels = 1; - btree_curs->num_tot_blocks = btree_curs->num_free_blocks = 1; - - setup_cursor(mp, agno, btree_curs); - - return; - } + int error; - blocks_allocated = lptr->num_blocks = howmany(num_recs, - mp->m_refc_mxr[0]); - - lptr->modulo = num_recs % lptr->num_blocks; - lptr->num_recs_pb = num_recs / lptr->num_blocks; - lptr->num_recs_tot = num_recs; - level = 1; - - if (lptr->num_blocks > 1) { - for (; btree_curs->level[level-1].num_blocks > 1 - && level < XFS_BTREE_MAXLEVELS; - level++) { - lptr = &btree_curs->level[level]; - p_lptr = &btree_curs->level[level - 1]; - lptr->num_blocks = howmany(p_lptr->num_blocks, - mp->m_refc_mxr[1]); - lptr->modulo = p_lptr->num_blocks % lptr->num_blocks; - lptr->num_recs_pb = p_lptr->num_blocks - / lptr->num_blocks; - lptr->num_recs_tot = p_lptr->num_blocks; - - blocks_allocated += lptr->num_blocks; - } - } - ASSERT(lptr->num_blocks == 1); - btree_curs->num_levels = level; + init_rebuild(sc, &XFS_RMAP_OINFO_REFC, free_space, btr); + btr->cur = libxfs_refcountbt_stage_cursor(sc->mp, &btr->newbt.afake, + agno); - btree_curs->num_tot_blocks = btree_curs->num_free_blocks - = blocks_allocated; + /* Compute how many blocks we'll need. 
*/ + error = -libxfs_btree_bload_compute_geometry(btr->cur, &btr->bload, + refcount_record_count(sc->mp, agno)); + if (error) + do_error( +_("Unable to compute refcount btree geometry, error %d.\n"), error); - setup_cursor(mp, agno, btree_curs); + setup_rebuild(sc->mp, agno, btr, btr->bload.nr_blocks); } -static void -prop_refc_cursor( - struct xfs_mount *mp, - xfs_agnumber_t agno, - struct bt_status *btree_curs, - xfs_agblock_t startbno, - int level) +/* Grab one refcount record. */ +static int +get_refcountbt_record( + struct xfs_btree_cur *cur, + void *priv) { - struct xfs_btree_block *bt_hdr; - struct xfs_refcount_key *bt_key; - xfs_refcount_ptr_t *bt_ptr; - xfs_agblock_t agbno; - struct bt_stat_level *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(XFS_BTNUM_REFC); - int error; - - level++; - - if (level >= btree_curs->num_levels) - return; - - lptr = &btree_curs->level[level]; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - - if (be16_to_cpu(bt_hdr->bb_numrecs) == 0) { - /* - * this only happens once to initialize the - * first path up the left side of the tree - * where the agbno's are already set up - */ - prop_refc_cursor(mp, agno, btree_curs, startbno, level); - } - - if (be16_to_cpu(bt_hdr->bb_numrecs) == - lptr->num_recs_pb + (lptr->modulo > 0)) { - /* - * write out current prev block, grab us a new block, - * and set the rightsib pointer of current block - */ -#ifdef XR_BLD_INO_TRACE - fprintf(stderr, " ino prop agbno %d ", lptr->prev_agbno); -#endif - if (lptr->prev_agbno != NULLAGBLOCK) { - ASSERT(lptr->prev_buf_p != NULL); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_agbno = lptr->agbno; - lptr->prev_buf_p = lptr->buf_p; - agbno = get_next_blockaddr(agno, level, btree_curs); - - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(agbno); - - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab refcountbt buffer, 
err=%d"), - error); - lptr->agbno = agbno; - - if (lptr->modulo) - lptr->modulo--; - - /* - * initialize block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_REFC, - level, 0, agno); - - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - - /* - * propagate extent record for first extent in new block up - */ - prop_refc_cursor(mp, agno, btree_curs, startbno, level); - } - /* - * add inode info to current block - */ - be16_add_cpu(&bt_hdr->bb_numrecs, 1); - - bt_key = XFS_REFCOUNT_KEY_ADDR(bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs)); - bt_ptr = XFS_REFCOUNT_PTR_ADDR(bt_hdr, - be16_to_cpu(bt_hdr->bb_numrecs), - mp->m_refc_mxr[1]); + struct xfs_refcount_irec *rec; + struct bt_rebuild *btr = priv; - bt_key->rc_startblock = cpu_to_be32(startbno); - *bt_ptr = cpu_to_be32(btree_curs->level[level-1].agbno); + rec = pop_slab_cursor(btr->slab_cursor); + memcpy(&cur->bc_rec.rc, rec, sizeof(struct xfs_refcount_irec)); + return 0; } -/* - * rebuilds a refcount btree given a cursor. - */ +/* Rebuild a refcount btree. 
*/ static void build_refcount_tree( - struct xfs_mount *mp, + struct repair_ctx *sc, xfs_agnumber_t agno, - struct bt_status *btree_curs) + struct bt_rebuild *btr) { - xfs_agnumber_t i; - xfs_agblock_t j; - xfs_agblock_t agbno; - struct xfs_btree_block *bt_hdr; - struct xfs_refcount_irec *refc_rec; - struct xfs_slab_cursor *refc_cur; - struct xfs_refcount_rec *bt_rec; - struct bt_stat_level *lptr; - const struct xfs_buf_ops *ops = btnum_to_ops(XFS_BTNUM_REFC); - int numrecs; - int level = btree_curs->num_levels; int error; - for (i = 0; i < level; i++) { - lptr = &btree_curs->level[i]; - - agbno = get_next_blockaddr(agno, i, btree_curs); - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, agbno), - XFS_FSB_TO_BB(mp, 1), &lptr->buf_p); - if (error) - do_error(_("Cannot grab refcountbt buffer, err=%d"), - error); - - if (i == btree_curs->num_levels - 1) - btree_curs->root = agbno; - - lptr->agbno = agbno; - lptr->prev_agbno = NULLAGBLOCK; - lptr->prev_buf_p = NULL; - /* - * initialize block header - */ - - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_REFC, - i, 0, agno); - } + btr->bload.get_record = get_refcountbt_record; + btr->bload.claim_block = rebuild_claim_block; - /* - * run along leaf, setting up records. as we have to switch - * blocks, call the prop_refc_cursor routine to set up the new - * pointers for the parent. that can recurse up to the root - * if required. set the sibling pointers for leaf level here. 
- */ - error = init_refcount_cursor(agno, &refc_cur); + error = -libxfs_trans_alloc_empty(sc->mp, &sc->tp); if (error) do_error( -_("Insufficient memory to construct refcount cursor.")); - refc_rec = pop_slab_cursor(refc_cur); - lptr = &btree_curs->level[0]; +_("Insufficient memory to construct refcount rebuild transaction.\n")); - for (i = 0; i < lptr->num_blocks; i++) { - numrecs = lptr->num_recs_pb + (lptr->modulo > 0); - ASSERT(refc_rec != NULL || numrecs == 0); + error = init_refcount_cursor(agno, &btr->slab_cursor); + if (error) + do_error( +_("Insufficient memory to construct refcount cursor.\n")); - /* - * block initialization, lay in block header - */ - lptr->buf_p->b_ops = ops; - bt_hdr = XFS_BUF_TO_BLOCK(lptr->buf_p); - memset(bt_hdr, 0, mp->m_sb.sb_blocksize); - libxfs_btree_init_block(mp, lptr->buf_p, XFS_BTNUM_REFC, - 0, 0, agno); - - bt_hdr->bb_u.s.bb_leftsib = cpu_to_be32(lptr->prev_agbno); - bt_hdr->bb_numrecs = cpu_to_be16(numrecs); - - if (lptr->modulo > 0) - lptr->modulo--; - - if (lptr->num_recs_pb > 0) - prop_refc_cursor(mp, agno, btree_curs, - refc_rec->rc_startblock, 0); - - bt_rec = (struct xfs_refcount_rec *) - ((char *)bt_hdr + XFS_REFCOUNT_BLOCK_LEN); - for (j = 0; j < be16_to_cpu(bt_hdr->bb_numrecs); j++) { - ASSERT(refc_rec != NULL); - bt_rec[j].rc_startblock = - cpu_to_be32(refc_rec->rc_startblock); - bt_rec[j].rc_blockcount = - cpu_to_be32(refc_rec->rc_blockcount); - bt_rec[j].rc_refcount = cpu_to_be32(refc_rec->rc_refcount); - - refc_rec = pop_slab_cursor(refc_cur); - } + /* Add all observed refcount records. 
*/ + error = -libxfs_btree_bload(btr->cur, &btr->bload, btr); + if (error) + do_error( +_("Error %d while creating refcount btree for AG %u.\n"), error, agno); - if (refc_rec != NULL) { - /* - * get next leaf level block - */ - if (lptr->prev_buf_p != NULL) { -#ifdef XR_BLD_RL_TRACE - fprintf(stderr, "writing refcntbt agbno %u\n", - lptr->prev_agbno); -#endif - ASSERT(lptr->prev_agbno != NULLAGBLOCK); - libxfs_buf_mark_dirty(lptr->prev_buf_p); - libxfs_buf_relse(lptr->prev_buf_p); - } - lptr->prev_buf_p = lptr->buf_p; - lptr->prev_agbno = lptr->agbno; - lptr->agbno = get_next_blockaddr(agno, 0, btree_curs); - bt_hdr->bb_u.s.bb_rightsib = cpu_to_be32(lptr->agbno); - - error = -libxfs_buf_get(mp->m_dev, - XFS_AGB_TO_DADDR(mp, agno, lptr->agbno), - XFS_FSB_TO_BB(mp, 1), - &lptr->buf_p); - if (error) - do_error( - _("Cannot grab refcountbt buffer, err=%d"), - error); - } - } - free_slab_cursor(&refc_cur); + /* Since we're not writing the AGF yet, no need to commit the cursor */ + libxfs_btree_del_cursor(btr->cur, 0); + free_slab_cursor(&btr->slab_cursor); + error = -libxfs_trans_commit(sc->tp); + if (error) + do_error( +_("Error %d while writing refcount btree for AG %u.\n"), error, agno); + sc->tp = NULL; } /* Fill the AGFL with any leftover bnobt rebuilder blocks. 
*/ @@ -1490,7 +1261,7 @@ build_agf_agfl( xfs_extlen_t freeblks, /* # free blocks in tree */ int lostblocks, /* # blocks that will be lost */ struct bt_rebuild *btr_rmap, - struct bt_status *refcnt_bt, + struct bt_rebuild *btr_refc, struct xfs_slab *lost_fsb) { struct extent_tree_node *ext_ptr; @@ -1548,10 +1319,14 @@ build_agf_agfl( cpu_to_be32(btr_rmap->newbt.afake.af_blocks); } - agf->agf_refcount_root = cpu_to_be32(refcnt_bt->root); - agf->agf_refcount_level = cpu_to_be32(refcnt_bt->num_levels); - agf->agf_refcount_blocks = cpu_to_be32(refcnt_bt->num_tot_blocks - - refcnt_bt->num_free_blocks); + if (xfs_sb_version_hasreflink(&mp->m_sb)) { + agf->agf_refcount_root = + cpu_to_be32(btr_refc->newbt.afake.af_root); + agf->agf_refcount_level = + cpu_to_be32(btr_refc->newbt.afake.af_levels); + agf->agf_refcount_blocks = + cpu_to_be32(btr_refc->newbt.afake.af_blocks); + } /* * Count and record the number of btree blocks consumed if required. @@ -1715,7 +1490,7 @@ phase5_func( struct bt_rebuild btr_ino; struct bt_rebuild btr_fino; struct bt_rebuild btr_rmap; - bt_status_t refcnt_btree_curs; + struct bt_rebuild btr_refc; int extra_blocks = 0; uint num_freeblocks; xfs_extlen_t freeblks1; @@ -1765,7 +1540,8 @@ phase5_func( * Set up the btree cursors for the on-disk refcount btrees, * which includes pre-allocating all required blocks. 
*/ - init_refc_cursor(mp, agno, &refcnt_btree_curs); + if (xfs_sb_version_hasreflink(&mp->m_sb)) + init_refc_cursor(&sc, agno, num_freeblocks, &btr_refc); num_extents = count_bno_extents_blocks(agno, &num_freeblocks); /* @@ -1836,16 +1612,14 @@ phase5_func( sb_fdblocks_ag[agno] += btr_rmap.newbt.afake.af_blocks - 1; } - if (xfs_sb_version_hasreflink(&mp->m_sb)) { - build_refcount_tree(mp, agno, &refcnt_btree_curs); - write_cursor(&refcnt_btree_curs); - } + if (xfs_sb_version_hasreflink(&mp->m_sb)) + build_refcount_tree(&sc, agno, &btr_refc); /* * set up agf and agfl */ build_agf_agfl(mp, agno, &btr_bno, &btr_cnt, freeblks1, extra_blocks, - &btr_rmap, &refcnt_btree_curs, lost_fsb); + &btr_rmap, &btr_refc, lost_fsb); /* * build inode allocation trees. @@ -1868,7 +1642,7 @@ phase5_func( if (xfs_sb_version_hasrmapbt(&mp->m_sb)) finish_rebuild(mp, &btr_rmap, lost_fsb); if (xfs_sb_version_hasreflink(&mp->m_sb)) - finish_cursor(&refcnt_btree_curs); + finish_rebuild(mp, &btr_refc, lost_fsb); /* * release the incore per-AG bno/bcnt trees so From patchwork Sat May 9 16:32:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Darrick J. 
Wong" X-Patchwork-Id: 11538417
Subject: [PATCH 8/9] xfs_repair: remove old btree rebuild support code From: "Darrick J. Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com Date: Sat, 09 May 2020 09:32:33 -0700 Message-ID: <158904195394.984305.11106338686100685725.stgit@magnolia> In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia> References: <158904190079.984305.707785748675261111.stgit@magnolia> From: Darrick J.
Wong This code isn't needed anymore, so get rid of it. Signed-off-by: Darrick J. Wong --- repair/phase5.c | 242 ------------------------------------------------------- 1 file changed, 242 deletions(-) diff --git a/repair/phase5.c b/repair/phase5.c index 6efc0613..9b064a1b 100644 --- a/repair/phase5.c +++ b/repair/phase5.c @@ -20,52 +20,6 @@ #include "rmap.h" #include "bload.h" -/* - * we maintain the current slice (path from root to leaf) - * of the btree incore. when we need a new block, we ask - * the block allocator for the address of a block on that - * level, map the block in, and set up the appropriate - * pointers (child, silbing, etc.) and keys that should - * point to the new block. - */ -typedef struct bt_stat_level { - /* - * set in setup_cursor routine and maintained in the tree-building - * routines - */ - xfs_buf_t *buf_p; /* 2 buffer pointers to ... */ - xfs_buf_t *prev_buf_p; - xfs_agblock_t agbno; /* current block being filled */ - xfs_agblock_t prev_agbno; /* previous block */ - /* - * set in calculate/init cursor routines for each btree level - */ - int num_recs_tot; /* # tree recs in level */ - int num_blocks; /* # tree blocks in level */ - int num_recs_pb; /* num_recs_tot / num_blocks */ - int modulo; /* num_recs_tot % num_blocks */ -} bt_stat_level_t; - -typedef struct bt_status { - int init; /* cursor set up once? */ - int num_levels; /* # of levels in btree */ - xfs_extlen_t num_tot_blocks; /* # blocks alloc'ed for tree */ - xfs_extlen_t num_free_blocks;/* # blocks currently unused */ - - xfs_agblock_t root; /* root block */ - /* - * list of blocks to be used to set up this tree - * and pointer to the first unused block on the list - */ - xfs_agblock_t *btree_blocks; /* block list */ - xfs_agblock_t *free_btree_blocks; /* first unused block */ - /* - * per-level status info - */ - bt_stat_level_t level[XFS_BTREE_MAXLEVELS]; - uint64_t owner; /* owner */ -} bt_status_t; - /* Context for rebuilding a per-AG btree. 
*/ struct bt_rebuild { /* Fake root for staging and space preallocations. */ @@ -197,148 +151,6 @@ mk_incore_fstree( return(num_extents); } -static xfs_agblock_t -get_next_blockaddr(xfs_agnumber_t agno, int level, bt_status_t *curs) -{ - ASSERT(curs->free_btree_blocks < curs->btree_blocks + - curs->num_tot_blocks); - ASSERT(curs->num_free_blocks > 0); - - curs->num_free_blocks--; - return(*curs->free_btree_blocks++); -} - -/* - * set up the dynamically allocated block allocation data in the btree - * cursor that depends on the info in the static portion of the cursor. - * allocates space from the incore bno/bcnt extent trees and sets up - * the first path up the left side of the tree. Also sets up the - * cursor pointer to the btree root. called by init_freespace_cursor() - * and init_ino_cursor() - */ -static void -setup_cursor(xfs_mount_t *mp, xfs_agnumber_t agno, bt_status_t *curs) -{ - int j; - unsigned int u; - xfs_extlen_t big_extent_len; - xfs_agblock_t big_extent_start; - extent_tree_node_t *ext_ptr; - extent_tree_node_t *bno_ext_ptr; - xfs_extlen_t blocks_allocated; - xfs_agblock_t *agb_ptr; - int error; - - /* - * get the number of blocks we need to allocate, then - * set up block number array, set the free block pointer - * to the first block in the array, and null the array - */ - big_extent_len = curs->num_tot_blocks; - blocks_allocated = 0; - - ASSERT(big_extent_len > 0); - - if ((curs->btree_blocks = malloc(sizeof(xfs_agblock_t) - * big_extent_len)) == NULL) - do_error(_("could not set up btree block array\n")); - - agb_ptr = curs->free_btree_blocks = curs->btree_blocks; - - for (j = 0; j < curs->num_free_blocks; j++, agb_ptr++) - *agb_ptr = NULLAGBLOCK; - - /* - * grab the smallest extent and use it up, then get the - * next smallest. This mimics the init_*_cursor code. 
- */ - ext_ptr = findfirst_bcnt_extent(agno); - - agb_ptr = curs->btree_blocks; - - /* - * set up the free block array - */ - while (blocks_allocated < big_extent_len) { - if (!ext_ptr) - do_error( -_("error - not enough free space in filesystem\n")); - /* - * use up the extent we've got - */ - for (u = 0; u < ext_ptr->ex_blockcount && - blocks_allocated < big_extent_len; u++) { - ASSERT(agb_ptr < curs->btree_blocks - + curs->num_tot_blocks); - *agb_ptr++ = ext_ptr->ex_startblock + u; - blocks_allocated++; - } - - error = rmap_add_ag_rec(mp, agno, ext_ptr->ex_startblock, u, - curs->owner); - if (error) - do_error(_("could not set up btree rmaps: %s\n"), - strerror(-error)); - - /* - * if we only used part of this last extent, then we - * need only to reset the extent in the extent - * trees and we're done - */ - if (u < ext_ptr->ex_blockcount) { - big_extent_start = ext_ptr->ex_startblock + u; - big_extent_len = ext_ptr->ex_blockcount - u; - - ASSERT(big_extent_len > 0); - - bno_ext_ptr = find_bno_extent(agno, - ext_ptr->ex_startblock); - ASSERT(bno_ext_ptr != NULL); - get_bno_extent(agno, bno_ext_ptr); - release_extent_tree_node(bno_ext_ptr); - - ext_ptr = get_bcnt_extent(agno, ext_ptr->ex_startblock, - ext_ptr->ex_blockcount); - release_extent_tree_node(ext_ptr); -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "releasing extent: %u [%u %u]\n", - agno, ext_ptr->ex_startblock, - ext_ptr->ex_blockcount); - fprintf(stderr, "blocks_allocated = %d\n", - blocks_allocated); -#endif - - add_bno_extent(agno, big_extent_start, big_extent_len); - add_bcnt_extent(agno, big_extent_start, big_extent_len); - - return; - } - /* - * delete the used-up extent from both extent trees and - * find next biggest extent - */ -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "releasing extent: %u [%u %u]\n", - agno, ext_ptr->ex_startblock, ext_ptr->ex_blockcount); -#endif - bno_ext_ptr = find_bno_extent(agno, ext_ptr->ex_startblock); - ASSERT(bno_ext_ptr != NULL); - get_bno_extent(agno, 
bno_ext_ptr); - release_extent_tree_node(bno_ext_ptr); - - ext_ptr = get_bcnt_extent(agno, ext_ptr->ex_startblock, - ext_ptr->ex_blockcount); - ASSERT(ext_ptr != NULL); - release_extent_tree_node(ext_ptr); - - ext_ptr = findfirst_bcnt_extent(agno); - } -#ifdef XR_BLD_FREE_TRACE - fprintf(stderr, "blocks_allocated = %d\n", - blocks_allocated); -#endif -} - /* * Estimate proper slack values for a btree that's being reloaded. * @@ -490,36 +302,6 @@ rebuild_claim_block( return xrep_newbt_claim_block(cur, &btr->newbt, ptr); } -static void -write_cursor(bt_status_t *curs) -{ - int i; - - for (i = 0; i < curs->num_levels; i++) { -#if defined(XR_BLD_FREE_TRACE) || defined(XR_BLD_INO_TRACE) - fprintf(stderr, "writing bt block %u\n", curs->level[i].agbno); -#endif - if (curs->level[i].prev_buf_p != NULL) { - ASSERT(curs->level[i].prev_agbno != NULLAGBLOCK); -#if defined(XR_BLD_FREE_TRACE) || defined(XR_BLD_INO_TRACE) - fprintf(stderr, "writing bt prev block %u\n", - curs->level[i].prev_agbno); -#endif - libxfs_buf_mark_dirty(curs->level[i].prev_buf_p); - libxfs_buf_relse(curs->level[i].prev_buf_p); - } - libxfs_buf_mark_dirty(curs->level[i].buf_p); - libxfs_buf_relse(curs->level[i].buf_p); - } -} - -static void -finish_cursor(bt_status_t *curs) -{ - ASSERT(curs->num_free_blocks == 0); - free(curs->btree_blocks); -} - /* * Scoop up leftovers from a rebuild cursor for later freeing, then free the * rebuild context. @@ -548,30 +330,6 @@ _("Insufficient memory saving lost blocks.\n")); xrep_newbt_destroy(&btr->newbt, 0); } -/* Map btnum to buffer ops for the types that need it. 
*/ -static const struct xfs_buf_ops * -btnum_to_ops( - xfs_btnum_t btnum) -{ - switch (btnum) { - case XFS_BTNUM_BNO: - return &xfs_bnobt_buf_ops; - case XFS_BTNUM_CNT: - return &xfs_cntbt_buf_ops; - case XFS_BTNUM_INO: - return &xfs_inobt_buf_ops; - case XFS_BTNUM_FINO: - return &xfs_finobt_buf_ops; - case XFS_BTNUM_RMAP: - return &xfs_rmapbt_buf_ops; - case XFS_BTNUM_REFC: - return &xfs_refcountbt_buf_ops; - default: - ASSERT(0); - return NULL; - } -} - /* * Free Space Btrees * From patchwork Sat May 9 16:32:40 2020 X-Patchwork-Submitter: "Darrick J. Wong" X-Patchwork-Id: 11538425
Subject: [PATCH 9/9] xfs_repair: track blocks lost during btree construction via extents From: "Darrick J.
Wong" To: sandeen@sandeen.net, darrick.wong@oracle.com Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com Date: Sat, 09 May 2020 09:32:40 -0700 Message-ID: <158904196027.984305.4802064994885970727.stgit@magnolia> In-Reply-To: <158904190079.984305.707785748675261111.stgit@magnolia> References: <158904190079.984305.707785748675261111.stgit@magnolia> From: Darrick J. Wong Use extent records (not just raw fsbs) to track blocks that were lost during btree construction. This is more efficient because adjacent lost blocks can be recorded as a single slab entry and later freed one extent per transaction instead of one block per transaction. Signed-off-by: Darrick J.
Wong --- repair/phase5.c | 61 ++++++++++++++++++++++++++++++++----------------------- 1 file changed, 35 insertions(+), 26 deletions(-) diff --git a/repair/phase5.c b/repair/phase5.c index 9b064a1b..f8693528 100644 --- a/repair/phase5.c +++ b/repair/phase5.c @@ -45,6 +45,12 @@ struct bt_rebuild { }; }; +struct lost_fsb { + xfs_fsblock_t fsbno; + xfs_extlen_t len; +}; + + /* * extra metadata for the agi */ @@ -310,21 +316,24 @@ static void finish_rebuild( struct xfs_mount *mp, struct bt_rebuild *btr, - struct xfs_slab *lost_fsb) + struct xfs_slab *lost_fsbs) { struct xrep_newbt_resv *resv, *n; for_each_xrep_newbt_reservation(&btr->newbt, resv, n) { - while (resv->used < resv->len) { - xfs_fsblock_t fsb = resv->fsbno + resv->used; - int error; + struct lost_fsb lost; + int error; + + if (resv->used == resv->len) + continue; - error = slab_add(lost_fsb, &fsb); - if (error) - do_error( + lost.fsbno = resv->fsbno + resv->used; + lost.len = resv->len - resv->used; + error = slab_add(lost_fsbs, &lost); + if (error) + do_error( _("Insufficient memory saving lost blocks.\n")); - resv->used++; - } + resv->used = resv->len; } xrep_newbt_destroy(&btr->newbt, 0); @@ -1020,7 +1029,7 @@ build_agf_agfl( int lostblocks, /* # blocks that will be lost */ struct bt_rebuild *btr_rmap, struct bt_rebuild *btr_refc, - struct xfs_slab *lost_fsb) + struct xfs_slab *lost_fsbs) { struct extent_tree_node *ext_ptr; struct xfs_buf *agf_buf, *agfl_buf; @@ -1239,7 +1248,7 @@ static void phase5_func( struct xfs_mount *mp, xfs_agnumber_t agno, - struct xfs_slab *lost_fsb) + struct xfs_slab *lost_fsbs) { struct repair_ctx sc = { .mp = mp, }; struct agi_stat agi_stat = {0,}; @@ -1377,7 +1386,7 @@ phase5_func( * set up agf and agfl */ build_agf_agfl(mp, agno, &btr_bno, &btr_cnt, freeblks1, extra_blocks, - &btr_rmap, &btr_refc, lost_fsb); + &btr_rmap, &btr_refc, lost_fsbs); /* * build inode allocation trees. 
@@ -1392,15 +1401,15 @@ phase5_func( /* * tear down cursors */ - finish_rebuild(mp, &btr_bno, lost_fsb); - finish_rebuild(mp, &btr_cnt, lost_fsb); - finish_rebuild(mp, &btr_ino, lost_fsb); + finish_rebuild(mp, &btr_bno, lost_fsbs); + finish_rebuild(mp, &btr_cnt, lost_fsbs); + finish_rebuild(mp, &btr_ino, lost_fsbs); if (xfs_sb_version_hasfinobt(&mp->m_sb)) - finish_rebuild(mp, &btr_fino, lost_fsb); + finish_rebuild(mp, &btr_fino, lost_fsbs); if (xfs_sb_version_hasrmapbt(&mp->m_sb)) - finish_rebuild(mp, &btr_rmap, lost_fsb); + finish_rebuild(mp, &btr_rmap, lost_fsbs); if (xfs_sb_version_hasreflink(&mp->m_sb)) - finish_rebuild(mp, &btr_refc, lost_fsb); + finish_rebuild(mp, &btr_refc, lost_fsbs); /* * release the incore per-AG bno/bcnt trees so @@ -1420,19 +1429,19 @@ inject_lost_blocks( { struct xfs_trans *tp = NULL; struct xfs_slab_cursor *cur = NULL; - xfs_fsblock_t *fsb; + struct lost_fsb *lost; int error; error = init_slab_cursor(lost_fsbs, NULL, &cur); if (error) return error; - while ((fsb = pop_slab_cursor(cur)) != NULL) { + while ((lost = pop_slab_cursor(cur)) != NULL) { error = -libxfs_trans_alloc_rollable(mp, 16, &tp); if (error) goto out_cancel; - error = -libxfs_free_extent(tp, *fsb, 1, + error = -libxfs_free_extent(tp, lost->fsbno, lost->len, &XFS_RMAP_OINFO_ANY_OWNER, XFS_AG_RESV_NONE); if (error) goto out_cancel; @@ -1453,7 +1462,7 @@ inject_lost_blocks( void phase5(xfs_mount_t *mp) { - struct xfs_slab *lost_fsb; + struct xfs_slab *lost_fsbs; xfs_agnumber_t agno; int error; @@ -1496,12 +1505,12 @@ phase5(xfs_mount_t *mp) if (sb_fdblocks_ag == NULL) do_error(_("cannot alloc sb_fdblocks_ag buffers\n")); - error = init_slab(&lost_fsb, sizeof(xfs_fsblock_t)); + error = init_slab(&lost_fsbs, sizeof(struct lost_fsb)); if (error) do_error(_("cannot alloc lost block slab\n")); for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) - phase5_func(mp, agno, lost_fsb); + phase5_func(mp, agno, lost_fsbs); print_final_rpt(); @@ -1544,10 +1553,10 @@ _("unable to add AG 
%u reverse-mapping data to btree.\n"), agno); * Put blocks that were unnecessarily reserved for btree * reconstruction back into the filesystem free space data. */ - error = inject_lost_blocks(mp, lost_fsb); + error = inject_lost_blocks(mp, lost_fsbs); if (error) do_error(_("Unable to reinsert lost blocks into filesystem.\n")); - free_slab(&lost_fsb); + free_slab(&lost_fsbs); bad_ino_btree = 0;