[7/7] xfs_repair: try to correct sb_unit value from secondaries
diff mbox series

Message ID 158086364145.2079685.5986767044268901944.stgit@magnolia
State Accepted
Headers show
Series
  • xfs_repair: do not trash valid root dirs
Related show

Commit Message

Darrick J. Wong Feb. 5, 2020, 12:47 a.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

If the primary superblock's sb_unit leads to a rootino calculation that
doesn't match sb_rootino /but/ we can find a secondary superblock whose
sb_unit does match, fix the primary.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 libxfs/libxfs_api_defs.h |    1 +
 repair/xfs_repair.c      |   79 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 80 insertions(+)

Comments

Christoph Hellwig Feb. 17, 2020, 1:55 p.m. UTC | #1
On Tue, Feb 04, 2020 at 04:47:21PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If the primary superblock's sb_unit leads to a rootino calculation that
> doesn't match sb_rootino /but/ we can find a secondary superblock whose
> sb_unit does match, fix the primary.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
Eric Sandeen Feb. 26, 2020, 5:42 p.m. UTC | #2
On 2/4/20 4:47 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If the primary superblock's sb_unit leads to a rootino calculation that
> doesn't match sb_rootino /but/ we can find a secondary superblock whose
> sb_unit does match, fix the primary.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

With the previous patch issuing a warning

+_("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"),

What does this do in the case where the user intentionally changed the
sunit, which is (I think) the situation launched this work in the first
place?

Will that warning persist in the case of an intentional sunit change?

Thanks,
-Eric
Darrick J. Wong Feb. 26, 2020, 5:55 p.m. UTC | #3
On Wed, Feb 26, 2020 at 09:42:01AM -0800, Eric Sandeen wrote:
> On 2/4/20 4:47 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If the primary superblock's sb_unit leads to a rootino calculation that
> > doesn't match sb_rootino /but/ we can find a secondary superblock whose
> > sb_unit does match, fix the primary.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> 
> With the previous patch issuing a warning
> 
> +_("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"),
> 
> What does this do in the case where the user intentionally changed the
> sunit, which is (I think) the situation launched this work in the first
> place?

[echoing what we just discussed on irc]

repair will print the above warning, and then it'll try to find either a
backup sb with a sunit value that makes the computation work, or guess
at sunit values until it finds one that makes the computation work.  If
it finds a reasonable sunit value it'll change it back to that.

If the user mounts an old kernel with the same bad sunit value, the
kernel will set it back to the bad value and we wash, rinse, and repeat.

If the user mounts a new kernel with the same bad sunit value, it'll
decline to change the sunit value.

If the user runs the bad-sunit fs against an old xfs_repair it will
descend into madness nuking the root directory, which is why we're
trying to nudge ourselves away from the bad value.

> Will that warning persist in the case of an intentional sunit change?

Yes.

--D

> Thanks,
> -Eric
>

Patch
diff mbox series

diff --git a/libxfs/libxfs_api_defs.h b/libxfs/libxfs_api_defs.h
index 7c629c62..1a438e58 100644
--- a/libxfs/libxfs_api_defs.h
+++ b/libxfs/libxfs_api_defs.h
@@ -143,6 +143,7 @@ 
 #define xfs_rtfree_extent		libxfs_rtfree_extent
 #define xfs_sb_from_disk		libxfs_sb_from_disk
 #define xfs_sb_quota_from_disk		libxfs_sb_quota_from_disk
+#define xfs_sb_read_secondary		libxfs_sb_read_secondary
 #define xfs_sb_to_disk			libxfs_sb_to_disk
 #define xfs_symlink_blocks		libxfs_symlink_blocks
 #define xfs_symlink_hdr_ok		libxfs_symlink_hdr_ok
diff --git a/repair/xfs_repair.c b/repair/xfs_repair.c
index b34d41d4..eb1ce546 100644
--- a/repair/xfs_repair.c
+++ b/repair/xfs_repair.c
@@ -457,6 +457,84 @@  has_plausible_rootdir(
 	return ret;
 }
 
+/*
+ * If any of the secondary SBs contain a *correct* value for sunit, write that
+ * back to the primary superblock.
+ */
+static void
+guess_correct_sunit(
+	struct xfs_mount	*mp)
+{
+	struct xfs_sb		sb;
+	struct xfs_buf		*bp;
+	xfs_ino_t		calc_rootino = NULLFSINO;
+	xfs_agnumber_t		agno;
+	unsigned int		new_sunit;
+	unsigned int		sunit_guess;
+	int			error;
+
+	/* Try reading secondary supers to see if we find a good sb_unit. */
+	for (agno = 1; agno < mp->m_sb.sb_agcount; agno++) {
+		error = -libxfs_sb_read_secondary(mp, NULL, agno, &bp);
+		if (error)
+			continue;
+		libxfs_sb_from_disk(&sb, XFS_BUF_TO_SBP(bp));
+		libxfs_putbuf(bp);
+
+		calc_rootino = libxfs_ialloc_calc_rootino(mp, sb.sb_unit);
+		if (calc_rootino == mp->m_sb.sb_rootino)
+			break;
+	}
+
+	/* If we found a reasonable value, log where we found it. */
+	if (calc_rootino == mp->m_sb.sb_rootino) {
+		do_warn(_("AG %u superblock contains plausible sb_unit value\n"),
+				agno);
+		new_sunit = sb.sb_unit;
+		goto fix;
+	}
+
+	/* Try successive powers of two. */
+	for (sunit_guess = 1;
+	     sunit_guess <= XFS_AG_MAX_BLOCKS(mp->m_sb.sb_blocklog);
+	     sunit_guess *= 2) {
+		calc_rootino = libxfs_ialloc_calc_rootino(mp, sunit_guess);
+		if (calc_rootino == mp->m_sb.sb_rootino)
+			break;
+	}
+
+	/* If we found a reasonable value, log where we found it. */
+	if (calc_rootino == mp->m_sb.sb_rootino) {
+		do_warn(_("Found an sb_unit value that looks plausible\n"));
+		new_sunit = sunit_guess;
+		goto fix;
+	}
+
+	do_warn(_("Could not estimate a plausible sb_unit value\n"));
+	return;
+
+fix:
+	if (!no_modify)
+		do_warn(_("Resetting sb_unit to %u\n"), new_sunit);
+	else
+		do_warn(_("Would reset sb_unit to %u\n"), new_sunit);
+
+	/*
+	 * Just set the value -- safe since the superblock doesn't get flushed
+	 * out if no_modify is set.
+	 */
+	mp->m_sb.sb_unit = new_sunit;
+
+	/* Make sure that swidth is still a multiple of sunit. */
+	if (mp->m_sb.sb_width % mp->m_sb.sb_unit == 0)
+		return;
+
+	if (!no_modify)
+		do_warn(_("Resetting sb_width to %u\n"), new_sunit);
+	else
+		do_warn(_("Would reset sb_width to %u\n"), new_sunit);
+}
+
 /*
  * Make sure that the first 3 inodes in the filesystem are the root directory,
  * the realtime bitmap, and the realtime summary, in that order.
@@ -480,6 +558,7 @@  calc_mkfs(
 		do_warn(
 _("sb root inode value %" PRIu64 " valid but in unaligned location (expected %"PRIu64") possibly due to sunit change\n"),
 			mp->m_sb.sb_rootino, rootino);
+		guess_correct_sunit(mp);
 		rootino = mp->m_sb.sb_rootino;
 	}