diff mbox series

[2/2] xfs_scrub: retry scrub (and repair) of items that are ok except for XFAIL

Message ID 164738659907.3191772.7906348523548262156.stgit@magnolia (mailing list archive)
State Accepted
Headers show
Series xfsprogs: various 5.15 fixes | expand

Commit Message

Darrick J. Wong March 15, 2022, 11:23 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Sometimes a metadata object will pass all of the obvious scrubber
checks, but we won't be able to cross-reference the object's records
with other metadata objects (e.g. a file data fork and a free space
btree both claim ownership of an extent).  When this happens during the
checking phase, we should queue the object for a repair, which means
that phase 4 will keep re-evaluating the object as repairs proceed.
Eventually, the hope is that we'll fix the filesystem and everything
will scrub cleanly; if not, we recommend running xfs_repair as a second
attempt to fix the inconsistency.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 scrub/scrub.c |   27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

Eric Sandeen March 16, 2022, 6:10 p.m. UTC | #1
On 3/15/22 6:23 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Sometimes a metadata object will pass all of the obvious scrubber
> checks, but we won't be able to cross-reference the object's records
> with other metadata objects (e.g. a file data fork and a free space
> btree both claim ownership of an extent).  When this happens during the
> checking phase, we should queue the object for a repair, which means
> that phase 4 will keep re-evaluating the object as repairs proceed.
> Eventually, the hope is that we'll fix the filesystem and everything
> will scrub cleanly; if not, we recommend running xfs_repair as a second
> attempt to fix the inconsistency.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

I'm not a scrub-master, but this seems to make sense.

Reviewed-by: Eric Sandeen <sandeen@redhat.com>
diff mbox series

Patch

diff --git a/scrub/scrub.c b/scrub/scrub.c
index 07ae0673..4ef19656 100644
--- a/scrub/scrub.c
+++ b/scrub/scrub.c
@@ -223,6 +223,16 @@  _("Optimization is possible."));
 		return CHECK_REPAIR;
 	}
 
+	/*
+	 * This metadata object itself looks ok, but we noticed inconsistencies
+	 * when comparing it with the other filesystem metadata.  If we're in
+	 * repair mode we need to queue it for a "repair" so that phase 4 will
+	 * re-examine the object as repairs progress to see if the kernel will
+	 * deem it completely consistent at some point.
+	 */
+	if (xref_failed(meta) && ctx->mode == SCRUB_MODE_REPAIR)
+		return CHECK_REPAIR;
+
 	/* Everything is ok. */
 	return CHECK_DONE;
 }
@@ -787,6 +797,23 @@  _("Read-only filesystem; cannot make changes."));
 			return CHECK_RETRY;
 		str_corrupt(ctx, descr_render(&dsc),
 _("Repair unsuccessful; offline repair required."));
+	} else if (xref_failed(&meta)) {
+		/*
+		 * This metadata object itself looks ok, but we still noticed
+		 * inconsistencies when comparing it with the other filesystem
+		 * metadata.  If we're in "final warning" mode, advise the
+		 * caller to run xfs_repair; otherwise, we'll keep trying to
+		 * reverify the cross-referencing as repairs progress.
+		 */
+		if (repair_flags & XRM_COMPLAIN_IF_UNFIXED) {
+			str_info(ctx, descr_render(&dsc),
+ _("Seems correct but cross-referencing failed; offline repair recommended."));
+		} else {
+			if (verbose)
+				str_info(ctx, descr_render(&dsc),
+ _("Seems correct but cross-referencing failed; will keep checking."));
+			return CHECK_RETRY;
+		}
 	} else {
 		/* Clean operation, no corruption detected. */
 		if (needs_repair(&oldm))