From patchwork Tue Jul 30 00:57:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13746059
Date: Mon, 29 Jul 2024 17:57:56 -0700
Subject: [PATCH 1/5] xfs_scrub: remove ALP_* flags namespace
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: Christoph Hellwig, linux-xfs@vger.kernel.org
Message-ID: <172229845558.1345742.15712672377284673875.stgit@frogsfrogsfrogs>
In-Reply-To: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
References: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

In preparation to move all the repair code to repair.[ch], remove the
ALP_* flags namespace since it mostly overlaps with XRM_*. Rename the
clunky "COMPLAIN_IF_UNFIXED" flag to "FINAL_WARNING", because that's
what it really means.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 scrub/phase3.c |    2 +-
 scrub/phase4.c |    2 +-
 scrub/phase5.c |    2 +-
 scrub/phase7.c |    2 +-
 scrub/repair.c |    4 ++--
 scrub/repair.h |   16 ++++++++++++----
 scrub/scrub.c  |   10 +++++-----
 scrub/scrub.h  |   10 ----------
 8 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/scrub/phase3.c b/scrub/phase3.c
index 4235c228c..9a26b9203 100644
--- a/scrub/phase3.c
+++ b/scrub/phase3.c
@@ -88,7 +88,7 @@ try_inode_repair(
 		return 0;
 
 	ret = action_list_process(ictx->ctx, fd, alist,
-			ALP_REPAIR_ONLY | ALP_NOPROGRESS);
+			XRM_REPAIR_ONLY | XRM_NOPROGRESS);
 	if (ret)
 		return ret;
 
diff --git a/scrub/phase4.c b/scrub/phase4.c
index 8807f147a..d42e67637 100644
--- a/scrub/phase4.c
+++ b/scrub/phase4.c
@@ -54,7 +54,7 @@ repair_ag(
 	} while (unfixed > 0);
 
 	/* Try once more, but this time complain if we can't fix things. */
-	flags |= ALP_COMPLAIN_IF_UNFIXED;
+	flags |= XRM_FINAL_WARNING;
 	ret = action_list_process(ctx, -1, alist, flags);
 	if (ret)
 		*aborted = true;
diff --git a/scrub/phase5.c b/scrub/phase5.c
index b4c635d34..940e434c3 100644
--- a/scrub/phase5.c
+++ b/scrub/phase5.c
@@ -422,7 +422,7 @@ fs_scan_worker(
 	}
 
 	ret = action_list_process(ctx, ctx->mnt.fd, &item->alist,
-			ALP_COMPLAIN_IF_UNFIXED | ALP_NOPROGRESS);
+			XRM_FINAL_WARNING | XRM_NOPROGRESS);
 	if (ret) {
 		str_liberror(ctx, ret, _("repairing fs scan metadata"));
 		*item->abortedp = true;
diff --git a/scrub/phase7.c b/scrub/phase7.c
index 93a074f11..820a68f99 100644
--- a/scrub/phase7.c
+++ b/scrub/phase7.c
@@ -122,7 +122,7 @@ phase7_func(
 	if (error)
 		return error;
 	error = action_list_process(ctx, -1, &alist,
-			ALP_COMPLAIN_IF_UNFIXED | ALP_NOPROGRESS);
+			XRM_FINAL_WARNING | XRM_NOPROGRESS);
 	if (error)
 		return error;
 
diff --git a/scrub/repair.c b/scrub/repair.c
index 9ade805e1..61d62ab6b 100644
--- a/scrub/repair.c
+++ b/scrub/repair.c
@@ -274,7 +274,7 @@ action_list_process(
 		fix = xfs_repair_metadata(ctx, xfdp, aitem, repair_flags);
 		switch (fix) {
 		case CHECK_DONE:
-			if (!(repair_flags & ALP_NOPROGRESS))
+			if (!(repair_flags & XRM_NOPROGRESS))
 				progress_add(1);
 			alist->nr--;
 			list_del(&aitem->list);
@@ -316,7 +316,7 @@ action_list_process_or_defer(
 	int				ret;
 
 	ret = action_list_process(ctx, -1, alist,
-			ALP_REPAIR_ONLY | ALP_NOPROGRESS);
+			XRM_REPAIR_ONLY | XRM_NOPROGRESS);
 	if (ret)
 		return ret;
 
diff --git a/scrub/repair.h b/scrub/repair.h
index aa3ea1361..6b6f64691 100644
--- a/scrub/repair.h
+++ b/scrub/repair.h
@@ -32,10 +32,18 @@ void action_list_find_mustfix(struct action_list *actions,
 		unsigned long long *broken_primaries,
 		unsigned long long *broken_secondaries);
 
-/* Passed through to xfs_repair_metadata() */
-#define ALP_REPAIR_ONLY		(XRM_REPAIR_ONLY)
-#define ALP_COMPLAIN_IF_UNFIXED	(XRM_COMPLAIN_IF_UNFIXED)
-#define ALP_NOPROGRESS		(1U << 31)
+/*
+ * Only ask the kernel to repair this object if the kernel directly told us it
+ * was corrupt. Objects that are only flagged as having cross-referencing
+ * errors or flagged as eligible for optimization are left for later.
+ */
+#define XRM_REPAIR_ONLY		(1U << 0)
+
+/* This is the last repair attempt; complain if still broken even after fix. */
+#define XRM_FINAL_WARNING	(1U << 1)
+
+/* Don't call progress_add after repairing an item. */
+#define XRM_NOPROGRESS		(1U << 2)
 
 int action_list_process(struct scrub_ctx *ctx, int fd,
 		struct action_list *alist, unsigned int repair_flags);
diff --git a/scrub/scrub.c b/scrub/scrub.c
index 7cb94af3d..f4b152a1c 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -743,7 +743,7 @@ _("Filesystem is shut down, aborting."));
 		 * could fix this, it's at least worth trying the scan
 		 * again to see if another repair fixed it.
 		 */
-		if (!(repair_flags & XRM_COMPLAIN_IF_UNFIXED))
+		if (!(repair_flags & XRM_FINAL_WARNING))
 			return CHECK_RETRY;
 		fallthrough;
 	case EINVAL:
@@ -773,13 +773,13 @@ _("Read-only filesystem; cannot make changes."));
 		 * to requeue the repair for later and don't say a
 		 * thing. Otherwise, print error and bail out.
 		 */
-		if (!(repair_flags & XRM_COMPLAIN_IF_UNFIXED))
+		if (!(repair_flags & XRM_FINAL_WARNING))
 			return CHECK_RETRY;
 		str_liberror(ctx, error, descr_render(&dsc));
 		return CHECK_DONE;
 	}
 
-	if (repair_flags & XRM_COMPLAIN_IF_UNFIXED)
+	if (repair_flags & XRM_FINAL_WARNING)
 		scrub_warn_incomplete_scrub(ctx, &dsc, &meta);
 	if (needs_repair(&meta)) {
 		/*
@@ -787,7 +787,7 @@ _("Read-only filesystem; cannot make changes."));
 		 * just requeue this and try again later. Otherwise we
 		 * log the error loudly and don't try again.
 		 */
-		if (!(repair_flags & XRM_COMPLAIN_IF_UNFIXED))
+		if (!(repair_flags & XRM_FINAL_WARNING))
 			return CHECK_RETRY;
 		str_corrupt(ctx, descr_render(&dsc),
 _("Repair unsuccessful; offline repair required."));
@@ -799,7 +799,7 @@ _("Repair unsuccessful; offline repair required."));
 		 * caller to run xfs_repair; otherwise, we'll keep trying to
 		 * reverify the cross-referencing as repairs progress.
 		 */
-		if (repair_flags & XRM_COMPLAIN_IF_UNFIXED) {
+		if (repair_flags & XRM_FINAL_WARNING) {
 			str_info(ctx, descr_render(&dsc),
  _("Seems correct but cross-referencing failed; offline repair recommended."));
 		} else {
diff --git a/scrub/scrub.h b/scrub/scrub.h
index cb33ddb46..5359548b0 100644
--- a/scrub/scrub.h
+++ b/scrub/scrub.h
@@ -54,16 +54,6 @@ struct action_item {
 	__u32			agno;
 };
 
-/*
- * Only ask the kernel to repair this object if the kernel directly told us it
- * was corrupt. Objects that are only flagged as having cross-referencing
- * errors or flagged as eligible for optimization are left for later.
- */
-#define XRM_REPAIR_ONLY		(1U << 0)
-
-/* Complain if still broken even after fix. */
-#define XRM_COMPLAIN_IF_UNFIXED	(1U << 1)
-
 enum check_outcome xfs_repair_metadata(struct scrub_ctx *ctx,
 		struct xfs_fd *xfdp, struct action_item *aitem,
 		unsigned int repair_flags);

From patchwork Tue Jul 30 00:58:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13746060
Date: Mon, 29 Jul 2024 17:58:12 -0700
Subject: [PATCH 2/5] xfs_scrub: move repair functions to repair.c
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: Christoph Hellwig, linux-xfs@vger.kernel.org
Message-ID: <172229845571.1345742.12588291319317974254.stgit@frogsfrogsfrogs>
In-Reply-To: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
References: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Move all the repair functions to repair.c.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 scrub/phase1.c        |    2 
 scrub/repair.c        |  169 +++++++++++++++++++++++++++++++++++++++++
 scrub/scrub.c         |  204 +------------------------------------------------
 scrub/scrub.h         |    6 -
 scrub/scrub_private.h |   55 +++++++++++++
 5 files changed, 230 insertions(+), 206 deletions(-)
 create mode 100644 scrub/scrub_private.h

diff --git a/scrub/phase1.c b/scrub/phase1.c
index 96138e03e..81b0918a1 100644
--- a/scrub/phase1.c
+++ b/scrub/phase1.c
@@ -210,7 +210,7 @@ _("Kernel metadata scrubbing facility is not available."));
 	}
 
 	/* Do we need kernel-assisted metadata repair? */
-	if (ctx->mode != SCRUB_MODE_DRY_RUN && !xfs_can_repair(ctx)) {
+	if (ctx->mode != SCRUB_MODE_DRY_RUN && !can_repair(ctx)) {
 		str_error(ctx, ctx->mntpoint,
 _("Kernel metadata repair facility is not available. Use -n to scrub."));
 		return ECANCELED;
diff --git a/scrub/repair.c b/scrub/repair.c
index 61d62ab6b..54bd09575 100644
--- a/scrub/repair.c
+++ b/scrub/repair.c
@@ -10,11 +10,180 @@
 #include
 #include "list.h"
 #include "libfrog/paths.h"
+#include "libfrog/fsgeom.h"
+#include "libfrog/scrub.h"
 #include "xfs_scrub.h"
 #include "common.h"
 #include "scrub.h"
 #include "progress.h"
 #include "repair.h"
+#include "descr.h"
+#include "scrub_private.h"
+
+/* General repair routines. */
+
+/* Repair some metadata. */
+static enum check_outcome
+xfs_repair_metadata(
+	struct scrub_ctx		*ctx,
+	struct xfs_fd			*xfdp,
+	struct action_item		*aitem,
+	unsigned int			repair_flags)
+{
+	struct xfs_scrub_metadata	meta = { 0 };
+	struct xfs_scrub_metadata	oldm;
+	DEFINE_DESCR(dsc, ctx, format_scrub_descr);
+	int				error;
+
+	assert(aitem->type < XFS_SCRUB_TYPE_NR);
+	assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
+	meta.sm_type = aitem->type;
+	meta.sm_flags = aitem->flags | XFS_SCRUB_IFLAG_REPAIR;
+	if (use_force_rebuild)
+		meta.sm_flags |= XFS_SCRUB_IFLAG_FORCE_REBUILD;
+	switch (xfrog_scrubbers[aitem->type].group) {
+	case XFROG_SCRUB_GROUP_AGHEADER:
+	case XFROG_SCRUB_GROUP_PERAG:
+		meta.sm_agno = aitem->agno;
+		break;
+	case XFROG_SCRUB_GROUP_INODE:
+		meta.sm_ino = aitem->ino;
+		meta.sm_gen = aitem->gen;
+		break;
+	default:
+		break;
+	}
+
+	if (!is_corrupt(&meta) && (repair_flags & XRM_REPAIR_ONLY))
+		return CHECK_RETRY;
+
+	memcpy(&oldm, &meta, sizeof(oldm));
+	descr_set(&dsc, &oldm);
+
+	if (needs_repair(&meta))
+		str_info(ctx, descr_render(&dsc), _("Attempting repair."));
+	else if (debug || verbose)
+		str_info(ctx, descr_render(&dsc),
+				_("Attempting optimization."));
+
+	error = -xfrog_scrub_metadata(xfdp, &meta);
+	switch (error) {
+	case 0:
+		/* No operational errors encountered. */
+		break;
+	case EDEADLOCK:
+	case EBUSY:
+		/* Filesystem is busy, try again later. */
+		if (debug || verbose)
+			str_info(ctx, descr_render(&dsc),
+_("Filesystem is busy, deferring repair."));
+		return CHECK_RETRY;
+	case ESHUTDOWN:
+		/* Filesystem is already shut down, abort. */
+		str_error(ctx, descr_render(&dsc),
+_("Filesystem is shut down, aborting."));
+		return CHECK_ABORT;
+	case ENOTTY:
+	case EOPNOTSUPP:
+		/*
+		 * If the kernel cannot perform the optimization that we
+		 * requested; or we forced a repair but the kernel doesn't know
+		 * how to perform the repair, don't requeue the request. Mark
+		 * it done and move on.
+		 */
+		if (is_unoptimized(&oldm) ||
+		    debug_tweak_on("XFS_SCRUB_FORCE_REPAIR"))
+			return CHECK_DONE;
+		/*
+		 * If we're in no-complain mode, requeue the check for
+		 * later. It's possible that an error in another
+		 * component caused us to flag an error in this
+		 * component. Even if the kernel didn't think it
+		 * could fix this, it's at least worth trying the scan
+		 * again to see if another repair fixed it.
+		 */
+		if (!(repair_flags & XRM_FINAL_WARNING))
+			return CHECK_RETRY;
+		fallthrough;
+	case EINVAL:
+		/* Kernel doesn't know how to repair this? */
+		str_corrupt(ctx, descr_render(&dsc),
+_("Don't know how to fix; offline repair required."));
+		return CHECK_DONE;
+	case EROFS:
+		/* Read-only filesystem, can't fix. */
+		if (verbose || debug || needs_repair(&oldm))
+			str_error(ctx, descr_render(&dsc),
+_("Read-only filesystem; cannot make changes."));
+		return CHECK_ABORT;
+	case ENOENT:
+		/* Metadata not present, just skip it. */
+		return CHECK_DONE;
+	case ENOMEM:
+	case ENOSPC:
+		/* Don't care if preen fails due to low resources. */
+		if (is_unoptimized(&oldm) && !needs_repair(&oldm))
+			return CHECK_DONE;
+		fallthrough;
+	default:
+		/*
+		 * Operational error. If the caller doesn't want us
+		 * to complain about repair failures, tell the caller
+		 * to requeue the repair for later and don't say a
+		 * thing. Otherwise, print error and bail out.
+		 */
+		if (!(repair_flags & XRM_FINAL_WARNING))
+			return CHECK_RETRY;
+		str_liberror(ctx, error, descr_render(&dsc));
+		return CHECK_DONE;
+	}
+
+	if (repair_flags & XRM_FINAL_WARNING)
+		scrub_warn_incomplete_scrub(ctx, &dsc, &meta);
+	if (needs_repair(&meta)) {
+		/*
+		 * Still broken; if we've been told not to complain then we
+		 * just requeue this and try again later. Otherwise we
+		 * log the error loudly and don't try again.
+		 */
+		if (!(repair_flags & XRM_FINAL_WARNING))
+			return CHECK_RETRY;
+		str_corrupt(ctx, descr_render(&dsc),
+_("Repair unsuccessful; offline repair required."));
+	} else if (xref_failed(&meta)) {
+		/*
+		 * This metadata object itself looks ok, but we still noticed
+		 * inconsistencies when comparing it with the other filesystem
+		 * metadata. If we're in "final warning" mode, advise the
+		 * caller to run xfs_repair; otherwise, we'll keep trying to
+		 * reverify the cross-referencing as repairs progress.
+		 */
+		if (repair_flags & XRM_FINAL_WARNING) {
+			str_info(ctx, descr_render(&dsc),
+ _("Seems correct but cross-referencing failed; offline repair recommended."));
+		} else {
+			if (verbose)
+				str_info(ctx, descr_render(&dsc),
+ _("Seems correct but cross-referencing failed; will keep checking."));
+			return CHECK_RETRY;
+		}
+	} else {
+		/* Clean operation, no corruption detected. */
+		if (is_corrupt(&oldm))
+			record_repair(ctx, descr_render(&dsc),
+					_("Repairs successful."));
+		else if (xref_disagrees(&oldm))
+			record_repair(ctx, descr_render(&dsc),
+ _("Repairs successful after discrepancy in cross-referencing."));
+		else if (xref_failed(&oldm))
+			record_repair(ctx, descr_render(&dsc),
+ _("Repairs successful after cross-referencing failure."));
+		else
+			record_preen(ctx, descr_render(&dsc),
+					_("Optimization successful."));
+	}
+	return CHECK_DONE;
+}
 
 /*
  * Prioritize action items in order of how long we can wait.
diff --git a/scrub/scrub.c b/scrub/scrub.c
index f4b152a1c..595839130 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -20,11 +20,12 @@
 #include "scrub.h"
 #include "repair.h"
 #include "descr.h"
+#include "scrub_private.h"
 
 /* Online scrub and repair wrappers. */
 
 /* Format a scrub description. */
-static int
+int
 format_scrub_descr(
 	struct scrub_ctx		*ctx,
 	char				*buf,
@@ -52,46 +53,8 @@ format_scrub_descr(
 	return -1;
 }
 
-/* Predicates for scrub flag state. */
-
-static inline bool is_corrupt(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT;
-}
-
-static inline bool is_unoptimized(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_PREEN;
-}
-
-static inline bool xref_failed(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_XFAIL;
-}
-
-static inline bool xref_disagrees(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_XCORRUPT;
-}
-
-static inline bool is_incomplete(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_INCOMPLETE;
-}
-
-static inline bool is_suspicious(struct xfs_scrub_metadata *sm)
-{
-	return sm->sm_flags & XFS_SCRUB_OFLAG_WARNING;
-}
-
-/* Should we fix it? */
-static inline bool needs_repair(struct xfs_scrub_metadata *sm)
-{
-	return is_corrupt(sm) || xref_disagrees(sm);
-}
-
 /* Warn about strange circumstances after scrub. */
-static inline void
+void
 scrub_warn_incomplete_scrub(
 	struct scrub_ctx		*ctx,
 	struct descr			*dsc,
@@ -647,7 +610,7 @@ can_scrub_parent(
 }
 
 bool
-xfs_can_repair(
+can_repair(
 	struct scrub_ctx	*ctx)
 {
 	return __scrub_test(ctx, XFS_SCRUB_TYPE_PROBE, XFS_SCRUB_IFLAG_REPAIR);
@@ -660,162 +623,3 @@ can_force_rebuild(
 	return __scrub_test(ctx, XFS_SCRUB_TYPE_PROBE,
 			XFS_SCRUB_IFLAG_REPAIR | XFS_SCRUB_IFLAG_FORCE_REBUILD);
 }
-
-/* General repair routines. */
-
-/* Repair some metadata. */
-enum check_outcome
-xfs_repair_metadata(
-	struct scrub_ctx		*ctx,
-	struct xfs_fd			*xfdp,
-	struct action_item		*aitem,
-	unsigned int			repair_flags)
-{
-	struct xfs_scrub_metadata	meta = { 0 };
-	struct xfs_scrub_metadata	oldm;
-	DEFINE_DESCR(dsc, ctx, format_scrub_descr);
-	int				error;
-
-	assert(aitem->type < XFS_SCRUB_TYPE_NR);
-	assert(!debug_tweak_on("XFS_SCRUB_NO_KERNEL"));
-	meta.sm_type = aitem->type;
-	meta.sm_flags = aitem->flags | XFS_SCRUB_IFLAG_REPAIR;
-	if (use_force_rebuild)
-		meta.sm_flags |= XFS_SCRUB_IFLAG_FORCE_REBUILD;
-	switch (xfrog_scrubbers[aitem->type].group) {
-	case XFROG_SCRUB_GROUP_AGHEADER:
-	case XFROG_SCRUB_GROUP_PERAG:
-		meta.sm_agno = aitem->agno;
-		break;
-	case XFROG_SCRUB_GROUP_INODE:
-		meta.sm_ino = aitem->ino;
-		meta.sm_gen = aitem->gen;
-		break;
-	default:
-		break;
-	}
-
-	if (!is_corrupt(&meta) && (repair_flags & XRM_REPAIR_ONLY))
-		return CHECK_RETRY;
-
-	memcpy(&oldm, &meta, sizeof(oldm));
-	descr_set(&dsc, &oldm);
-
-	if (needs_repair(&meta))
-		str_info(ctx, descr_render(&dsc), _("Attempting repair."));
-	else if (debug || verbose)
-		str_info(ctx, descr_render(&dsc),
-				_("Attempting optimization."));
-
-	error = -xfrog_scrub_metadata(xfdp, &meta);
-	switch (error) {
-	case 0:
-		/* No operational errors encountered. */
-		break;
-	case EDEADLOCK:
-	case EBUSY:
-		/* Filesystem is busy, try again later. */
-		if (debug || verbose)
-			str_info(ctx, descr_render(&dsc),
-_("Filesystem is busy, deferring repair."));
-		return CHECK_RETRY;
-	case ESHUTDOWN:
-		/* Filesystem is already shut down, abort. */
-		str_error(ctx, descr_render(&dsc),
-_("Filesystem is shut down, aborting."));
-		return CHECK_ABORT;
-	case ENOTTY:
-	case EOPNOTSUPP:
-		/*
-		 * If the kernel cannot perform the optimization that we
-		 * requested; or we forced a repair but the kernel doesn't know
-		 * how to perform the repair, don't requeue the request. Mark
-		 * it done and move on.
-		 */
-		if (is_unoptimized(&oldm) ||
-		    debug_tweak_on("XFS_SCRUB_FORCE_REPAIR"))
-			return CHECK_DONE;
-		/*
-		 * If we're in no-complain mode, requeue the check for
-		 * later. It's possible that an error in another
-		 * component caused us to flag an error in this
-		 * component. Even if the kernel didn't think it
-		 * could fix this, it's at least worth trying the scan
-		 * again to see if another repair fixed it.
-		 */
-		if (!(repair_flags & XRM_FINAL_WARNING))
-			return CHECK_RETRY;
-		fallthrough;
-	case EINVAL:
-		/* Kernel doesn't know how to repair this? */
-		str_corrupt(ctx, descr_render(&dsc),
-_("Don't know how to fix; offline repair required."));
-		return CHECK_DONE;
-	case EROFS:
-		/* Read-only filesystem, can't fix. */
-		if (verbose || debug || needs_repair(&oldm))
-			str_error(ctx, descr_render(&dsc),
-_("Read-only filesystem; cannot make changes."));
-		return CHECK_ABORT;
-	case ENOENT:
-		/* Metadata not present, just skip it. */
-		return CHECK_DONE;
-	case ENOMEM:
-	case ENOSPC:
-		/* Don't care if preen fails due to low resources. */
-		if (is_unoptimized(&oldm) && !needs_repair(&oldm))
-			return CHECK_DONE;
-		fallthrough;
-	default:
-		/*
-		 * Operational error. If the caller doesn't want us
-		 * to complain about repair failures, tell the caller
-		 * to requeue the repair for later and don't say a
-		 * thing. Otherwise, print error and bail out.
-		 */
-		if (!(repair_flags & XRM_FINAL_WARNING))
-			return CHECK_RETRY;
-		str_liberror(ctx, error, descr_render(&dsc));
-		return CHECK_DONE;
-	}
-
-	if (repair_flags & XRM_FINAL_WARNING)
-		scrub_warn_incomplete_scrub(ctx, &dsc, &meta);
-	if (needs_repair(&meta)) {
-		/*
-		 * Still broken; if we've been told not to complain then we
-		 * just requeue this and try again later. Otherwise we
-		 * log the error loudly and don't try again.
-		 */
-		if (!(repair_flags & XRM_FINAL_WARNING))
-			return CHECK_RETRY;
-		str_corrupt(ctx, descr_render(&dsc),
-_("Repair unsuccessful; offline repair required."));
-	} else if (xref_failed(&meta)) {
-		/*
-		 * This metadata object itself looks ok, but we still noticed
-		 * inconsistencies when comparing it with the other filesystem
-		 * metadata. If we're in "final warning" mode, advise the
-		 * caller to run xfs_repair; otherwise, we'll keep trying to
-		 * reverify the cross-referencing as repairs progress.
-		 */
-		if (repair_flags & XRM_FINAL_WARNING) {
-			str_info(ctx, descr_render(&dsc),
- _("Seems correct but cross-referencing failed; offline repair recommended."));
-		} else {
-			if (verbose)
-				str_info(ctx, descr_render(&dsc),
- _("Seems correct but cross-referencing failed; will keep checking."));
-			return CHECK_RETRY;
-		}
-	} else {
-		/* Clean operation, no corruption detected. */
-		if (needs_repair(&oldm))
-			record_repair(ctx, descr_render(&dsc),
-					_("Repairs successful."));
-		else
-			record_preen(ctx, descr_render(&dsc),
-					_("Optimization successful."));
-	}
-	return CHECK_DONE;
-}
diff --git a/scrub/scrub.h b/scrub/scrub.h
index 5359548b0..133445e8d 100644
--- a/scrub/scrub.h
+++ b/scrub/scrub.h
@@ -38,7 +38,7 @@ bool can_scrub_dir(struct scrub_ctx *ctx);
 bool can_scrub_attr(struct scrub_ctx *ctx);
 bool can_scrub_symlink(struct scrub_ctx *ctx);
 bool can_scrub_parent(struct scrub_ctx *ctx);
-bool xfs_can_repair(struct scrub_ctx *ctx);
+bool can_repair(struct scrub_ctx *ctx);
 bool can_force_rebuild(struct scrub_ctx *ctx);
 
 int scrub_file(struct scrub_ctx *ctx, int fd, const struct xfs_bulkstat *bstat,
@@ -54,8 +54,4 @@ struct action_item {
 	__u32			agno;
 };
 
-enum check_outcome xfs_repair_metadata(struct scrub_ctx *ctx,
-		struct xfs_fd *xfdp, struct action_item *aitem,
-		unsigned int repair_flags);
-
 #endif /* XFS_SCRUB_SCRUB_H_ */
diff --git a/scrub/scrub_private.h b/scrub/scrub_private.h
new file mode 100644
index 000000000..a24d485a2
--- /dev/null
+++ b/scrub/scrub_private.h
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) 2021-2024 Oracle. All Rights Reserved.
+ * Author: Darrick J. Wong
+ */
+#ifndef XFS_SCRUB_SCRUB_PRIVATE_H_
+#define XFS_SCRUB_SCRUB_PRIVATE_H_
+
+/* Shared code between scrub.c and repair.c. */
+
+int format_scrub_descr(struct scrub_ctx *ctx, char *buf, size_t buflen,
+		void *where);
+
+/* Predicates for scrub flag state. */
+
+static inline bool is_corrupt(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_CORRUPT;
+}
+
+static inline bool is_unoptimized(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_PREEN;
+}
+
+static inline bool xref_failed(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_XFAIL;
+}
+
+static inline bool xref_disagrees(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_XCORRUPT;
+}
+
+static inline bool is_incomplete(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_INCOMPLETE;
+}
+
+static inline bool is_suspicious(struct xfs_scrub_metadata *sm)
+{
+	return sm->sm_flags & XFS_SCRUB_OFLAG_WARNING;
+}
+
+/* Should we fix it? */
+static inline bool needs_repair(struct xfs_scrub_metadata *sm)
+{
+	return is_corrupt(sm) || xref_disagrees(sm);
+}
+
+void scrub_warn_incomplete_scrub(struct scrub_ctx *ctx, struct descr *dsc,
+		struct xfs_scrub_metadata *meta);
+
+#endif /* XFS_SCRUB_SCRUB_PRIVATE_H_ */

From patchwork Tue Jul 30 00:58:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13746061
Date: Mon, 29 Jul 2024 17:58:28 -0700
Subject: [PATCH 3/5] xfs_scrub: log when a repair was unnecessary
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: Christoph Hellwig, linux-xfs@vger.kernel.org
Message-ID: <172229845586.1345742.18261199623012055361.stgit@frogsfrogsfrogs>
In-Reply-To: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
References: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

If the kernel tells us that a filesystem object didn't need repairs, we
should log that with a message specific to that outcome.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 scrub/repair.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/scrub/repair.c b/scrub/repair.c
index 54bd09575..50f168d24 100644
--- a/scrub/repair.c
+++ b/scrub/repair.c
@@ -167,6 +167,10 @@ _("Repair unsuccessful; offline repair required."));
  _("Seems correct but cross-referencing failed; will keep checking."));
 			return CHECK_RETRY;
 		}
+	} else if (meta.sm_flags & XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED) {
+		if (verbose)
+			str_info(ctx, descr_render(&dsc),
+					_("No modification needed."));
 	} else {
 		/* Clean operation, no corruption detected. */
 		if (is_corrupt(&oldm))

From patchwork Tue Jul 30 00:58:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J.
 Wong"
X-Patchwork-Id: 13746062
Date: Mon, 29 Jul 2024 17:58:43 -0700
Subject: [PATCH 4/5] xfs_scrub: require primary superblock repairs to complete before proceeding
From: "Darrick J. Wong"
To: djwong@kernel.org, cem@kernel.org
Cc: Christoph Hellwig, linux-xfs@vger.kernel.org
Message-ID: <172229845602.1345742.11240664872191396236.stgit@frogsfrogsfrogs>
In-Reply-To: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
References: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs>
User-Agent: StGit/0.19
Precedence: bulk
X-Mailing-List: linux-xfs@vger.kernel.org

From: Darrick J. Wong

Phase 2 of the xfs_scrub program calls the kernel to check the primary
superblock before scanning the rest of the filesystem. Though doing so
is a no-op now (since the primary super must pass all checks as a
prerequisite for mounting), the goal of this code is to enable future
kernel code to intercept an xfs_scrub run before it actually does
anything. If this some day involves fixing the primary superblock, it
seems reasonable to require that /all/ repairs complete successfully
before moving on to the rest of the filesystem.

Unfortunately, that's not what xfs_scrub does now -- primary super
repairs that fail are theoretically deferred to phase 4! So make this
mandatory.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
---
 scrub/phase2.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scrub/phase2.c b/scrub/phase2.c
index 80c77b287..2d49c604e 100644
--- a/scrub/phase2.c
+++ b/scrub/phase2.c
@@ -174,7 +174,8 @@ phase2_func(
 	ret = scrub_primary_super(ctx, &alist);
 	if (ret)
 		goto out_wq;
-	ret = action_list_process_or_defer(ctx, 0, &alist);
+	ret = action_list_process(ctx, -1, &alist,
+			XRM_FINAL_WARNING | XRM_NOPROGRESS);
 	if (ret)
 		goto out_wq;
 

From patchwork Tue Jul 30 00:58:59 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Darrick J. Wong"
X-Patchwork-Id: 13746063
Authentication-Results: 
smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="islynhe5" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DAEDFC32786; Tue, 30 Jul 2024 00:58:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1722301139; bh=Qor6Rs907wGl9NQAgIkoRNDZGmfAQBk+UF04ff54/tg=; h=Date:Subject:From:To:Cc:In-Reply-To:References:From; b=islynhe5UGbhvL4uAk0RaBN5rT7R4eh2wTA8yDoHv7Hr8GFyi4ucQbZFMVcGCzlKx OZIEYCB6JdBeJVtcV75Oz7azcxz+BqUcnsYLw7mUFNv1kvTk3gFxmGz0VMeG1i46E3 Fb+5OzaksC7VX6GYl+PurW/axs96riLuqXSkyEom0DLUlzshf7zLTCXKRT7NtVwyE/ lQBjEhiaxND3hYgUirjHkz8UsHfEl9wSh78mP86UWTptU0WW6bgEMiOYUjNOFXO6MP Ab2btlwvwGWYAjgoZykeeM+HlR6GGSzwTqMn9lWqe1lbtmuA2t4wTSNebmQez+AFH4 SVZA9w8HOxEWA== Date: Mon, 29 Jul 2024 17:58:59 -0700 Subject: [PATCH 5/5] xfs_scrub: actually try to fix summary counters ahead of repairs From: "Darrick J. Wong" To: djwong@kernel.org, cem@kernel.org Cc: Christoph Hellwig , linux-xfs@vger.kernel.org Message-ID: <172229845614.1345742.16716590378668642512.stgit@frogsfrogsfrogs> In-Reply-To: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs> References: <172229845539.1345742.12185001279081616156.stgit@frogsfrogsfrogs> User-Agent: StGit/0.19 Precedence: bulk X-Mailing-List: linux-xfs@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Darrick J. Wong A while ago, I decided to make phase 4 check the summary counters before it starts any other repairs, having observed that repairs of primary metadata can fail because the summary counters (incorrectly) claim that there aren't enough free resources in the filesystem. However, if problems are found in the summary counters, the repair work will be run as part of the AG 0 repairs, which means that it runs concurrently with other scrubbers. This doesn't quite get us to the intended goal, so try to fix the scrubbers ahead of time. 
If that fails, tough, we'll get back to it in phase 7 if scrub gets that far. Fixes: cbaf1c9d91a0 ("xfs_scrub: check summary counters") Signed-off-by: Darrick J. Wong Reviewed-by: Christoph Hellwig --- scrub/phase4.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/scrub/phase4.c b/scrub/phase4.c index d42e67637..0c67abf64 100644 --- a/scrub/phase4.c +++ b/scrub/phase4.c @@ -129,6 +129,7 @@ phase4_func( struct scrub_ctx *ctx) { struct xfs_fsop_geom fsgeom; + struct action_list alist; int ret; if (!have_action_items(ctx)) @@ -136,11 +137,13 @@ phase4_func( /* * Check the summary counters early. Normally we do this during phase - * seven, but some of the cross-referencing requires fairly-accurate - * counters, so counter repairs have to be put on the list now so that - * they get fixed before we stop retrying unfixed metadata repairs. + * seven, but some of the cross-referencing requires fairly accurate + * summary counters. Check and try to repair them now to minimize the + * chance that repairs of primary metadata fail due to secondary + * metadata. If repairs fails, we'll come back during phase 7. */ - ret = scrub_fs_counters(ctx, &ctx->action_lists[0]); + action_list_init(&alist); + ret = scrub_fs_counters(ctx, &alist); if (ret) return ret; @@ -155,11 +158,18 @@ phase4_func( return ret; if (fsgeom.sick & XFS_FSOP_GEOM_SICK_QUOTACHECK) { - ret = scrub_quotacheck(ctx, &ctx->action_lists[0]); + ret = scrub_quotacheck(ctx, &alist); if (ret) return ret; } + /* Repair counters before starting on the rest. */ + ret = action_list_process(ctx, -1, &alist, + XRM_REPAIR_ONLY | XRM_NOPROGRESS); + if (ret) + return ret; + action_list_discard(&alist); + ret = repair_everything(ctx); if (ret) return ret;
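
The ordering that patch 5/5 aims for (collect the summary-counter repairs on a private list, try them once before the general repair pass starts, and drop whatever is still unfixed because a later phase rechecks it) can be sketched outside xfs_scrub roughly as below. This is an illustrative toy, not xfs_scrub code: struct repair_item, struct repair_list, repair_list_process_early(), fix_ok() and fix_fail() are all hypothetical names invented for the sketch, and the real action_list API is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one queued repair; not an xfs_scrub type. */
struct repair_item {
	const char	*name;
	bool		(*try_fix)(void);	/* returns true if fixed */
	bool		fixed;
};

/* Hypothetical private list, similar in spirit to a local action list. */
struct repair_list {
	struct repair_item	*items;
	size_t			nr;
};

/*
 * Try every item exactly once, before any other repair work starts.
 * Returns the number of items that are still unfixed.  The caller is
 * expected to discard those and rely on a later pass to revisit them.
 */
static size_t
repair_list_process_early(struct repair_list *list)
{
	size_t	unfixed = 0;
	size_t	i;

	assert(list != NULL);
	for (i = 0; i < list->nr; i++) {
		list->items[i].fixed = list->items[i].try_fix();
		if (!list->items[i].fixed)
			unfixed++;
	}
	return unfixed;
}

/* Toy repair callbacks, for illustration only. */
static bool fix_ok(void)   { return true; }
static bool fix_fail(void) { return false; }
```

A caller would fill a repair_list with the counter-related items, call repair_list_process_early() once, ignore a nonzero return, and only then start the concurrent repair work; under this assumption a failed early repair is not an error, it is simply left for the later recheck.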