diff mbox series

[6/7] xfs_repair: throw away totally bad clusters

Message ID 159950115513.567790.16525509399719506379.stgit@magnolia (mailing list archive)
State Superseded
Headers show
Series xfs_repair: more fuzzer fixes | expand

Commit Message

Darrick J. Wong Sept. 7, 2020, 5:52 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

If the filesystem supports sparse inodes, we detect that an entire
cluster buffer has no detectable inodes at all, and we can easily mark
that part of the inode chunk sparse, just drop the cluster buffer and
forget about it.  This makes repair less likely to go to great lengths
to try to save something that's totally unsalvageable.

This manifested in recs[2].free=zeroes in xfs/364, wherein the root
directory claimed to own block X and the inobt also claimed that X was
inodes; repair tried to create rmaps for both owners, and then the whole
mess blew up because the rmap code aborts on those kinds of anomalies.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/dino_chunks.c |   35 ++++++++++++++++++++++++++++++++++-
 repair/rmap.c        |    1 +
 2 files changed, 35 insertions(+), 1 deletion(-)

Comments

Christoph Hellwig Sept. 8, 2020, 3:15 p.m. UTC | #1
On Mon, Sep 07, 2020 at 10:52:35AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> If the filesystem supports sparse inodes, we detect that an entire
> cluster buffer has no detectable inodes at all, and we can easily mark
> that part of the inode chunk sparse, just drop the cluster buffer and
> forget about it.  This makes repair less likely to go to great lengths
> to try to save something that's totally unsalvageable.
> 
> This manifested in recs[2].free=zeroes in xfs/364, wherein the root
> directory claimed to own block X and the inobt also claimed that X was
> inodes; repair tried to create rmaps for both owners, and then the whole
> mess blew up because the rmap code aborts on those kinds of anomalies.

How is the rmap.c chunk related to this?  The dino_chunks.c part looks
fine, and the rmap.c at least not bad, but I don't understand how it
fits here.
Darrick J. Wong Sept. 29, 2020, 3:34 p.m. UTC | #2
On Tue, Sep 08, 2020 at 04:15:26PM +0100, Christoph Hellwig wrote:
> On Mon, Sep 07, 2020 at 10:52:35AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > If the filesystem supports sparse inodes, we detect that an entire
> > cluster buffer has no detectable inodes at all, and we can easily mark
> > that part of the inode chunk sparse, just drop the cluster buffer and
> > forget about it.  This makes repair less likely to go to great lengths
> > to try to save something that's totally unsalvageable.
> > 
> > This manifested in recs[2].free=zeroes in xfs/364, wherein the root
> > directory claimed to own block X and the inobt also claimed that X was
> > inodes; repair tried to create rmaps for both owners, and then the whole
> > mess blew up because the rmap code aborts on those kinds of anomalies.
> 
> How is the rmap.c chunk related to this?  The dino_chunks.c part looks
> fine, and the rmap.c at least not bad, but I don't understand how it
> fits here.

I think that's a stray paste error; the chunk can be dropped.

--D
diff mbox series

Patch

diff --git a/repair/dino_chunks.c b/repair/dino_chunks.c
index 50a2003614df..e4a95ff635c8 100644
--- a/repair/dino_chunks.c
+++ b/repair/dino_chunks.c
@@ -601,6 +601,7 @@  process_inode_chunk(
 	xfs_dinode_t		*dino;
 	int			icnt;
 	int			status;
+	int			bp_found;
 	int			is_used;
 	int			ino_dirty;
 	int			irec_offset;
@@ -614,6 +615,7 @@  process_inode_chunk(
 	int			bp_index;
 	int			cluster_offset;
 	struct xfs_ino_geometry	*igeo = M_IGEO(mp);
+	bool			can_punch_sparse = false;
 	int			error;
 
 	ASSERT(first_irec != NULL);
@@ -626,6 +628,10 @@  process_inode_chunk(
 	if (cluster_count == 0)
 		cluster_count = 1;
 
+	if (xfs_sb_version_hassparseinodes(&mp->m_sb) &&
+	    M_IGEO(mp)->inodes_per_cluster >= XFS_INODES_PER_HOLEMASK_BIT)
+		can_punch_sparse = true;
+
 	/*
 	 * get all blocks required to read in this chunk (may wind up
 	 * having to process more chunks in a multi-chunk per block fs)
@@ -700,6 +706,7 @@  process_inode_chunk(
 	cluster_offset = 0;
 	icnt = 0;
 	status = 0;
+	bp_found = 0;
 	bp_index = 0;
 
 	/*
@@ -725,8 +732,10 @@  process_inode_chunk(
 						(agno == 0 &&
 						(mp->m_sb.sb_rootino == agino ||
 						 mp->m_sb.sb_rsumino == agino ||
-						 mp->m_sb.sb_rbmino == agino)))
+						 mp->m_sb.sb_rbmino == agino))) {
 					status++;
+					bp_found++;
+				}
 			}
 
 			irec_offset++;
@@ -749,11 +758,35 @@  process_inode_chunk(
 				irec_offset = 0;
 			}
 			if (cluster_offset == M_IGEO(mp)->inodes_per_cluster) {
+				if (can_punch_sparse &&
+				    bplist[bp_index] != NULL &&
+				    bp_found == 0) {
+					/*
+					 * We didn't find any good inodes in
+					 * this cluster, blow it away before
+					 * moving on to the next one.
+					 */
+					libxfs_buf_relse(bplist[bp_index]);
+					bplist[bp_index] = NULL;
+				}
 				bp_index++;
 				cluster_offset = 0;
+				bp_found = 0;
 			}
 		}
 
+		if (can_punch_sparse &&
+		    bp_index < cluster_count &&
+		    bplist[bp_index] != NULL &&
+		    bp_found == 0) {
+			/*
+			 * We didn't find any good inodes in this cluster, blow
+			 * it away.
+			 */
+			libxfs_buf_relse(bplist[bp_index]);
+			bplist[bp_index] = NULL;
+		}
+
 		/*
 		 * if chunk/block is bad, blow it off.  the inode records
 		 * will be deleted by the caller if appropriate.
diff --git a/repair/rmap.c b/repair/rmap.c
index a4cc6a4937c9..98d2f46fa047 100644
--- a/repair/rmap.c
+++ b/repair/rmap.c
@@ -399,6 +399,7 @@  rmap_add_fixed_ag_rec(
 			nr = 1;
 		agino = ino_rec->ino_startnum + startidx;
 		agbno = XFS_AGINO_TO_AGBNO(mp, agino);
+		ASSERT(get_bmap(agno, agbno) == XR_E_INO);
 		if (XFS_AGINO_TO_OFFSET(mp, agino) == 0) {
 			error = rmap_add_ag_rec(mp, agno, agbno, nr,
 					XFS_RMAP_OWN_INODES);