[v2] xfs_repair: coordinate parallel updates to the rt bitmap

Message ID: 20201002201831.GA49547@magnolia
State: Accepted, archived

Commit Message

Darrick J. Wong Oct. 2, 2020, 8:18 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Actually take the rt lock before updating the bitmap from multiple
threads.  This fixes an infrequent corruption problem when running
generic/013 and rtinherit=1 is set on the root dir.

Fixes: 2556c98bd9e6 ("Perform true sequential bulk read prefetching in xfs_repair Merge of master-melb:xfs-cmds:29147a by kenmcd.")
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
v2: fix review comments per hch
---
 repair/dinode.c  |   16 ++++++++--------
 repair/globals.c |    1 +
 repair/globals.h |    1 +
 repair/incore.c  |    1 +
 4 files changed, 11 insertions(+), 8 deletions(-)

Comments

Christoph Hellwig Oct. 5, 2020, 6:15 a.m. UTC | #1
On Fri, Oct 02, 2020 at 01:18:31PM -0700, Darrick J. Wong wrote:
> +			error2 = process_rt_rec(mp, &irec, ino, tot, check_dups);

This adds an 81-character line.

Except for that:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Patch

diff --git a/repair/dinode.c b/repair/dinode.c
index f65a614702fd..c89f21e08373 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -323,6 +323,7 @@  process_bmbt_reclist_int(
 	xfs_extlen_t		blen;
 	xfs_agnumber_t		locked_agno = -1;
 	int			error = 1;
+	int			error2;
 
 	if (type == XR_INO_RTDATA)
 		ftype = ftype_real_time;
@@ -383,14 +384,14 @@  _("zero length extent (off = %" PRIu64 ", fsbno = %" PRIu64 ") in ino %" PRIu64
 		}
 
 		if (type == XR_INO_RTDATA && whichfork == XFS_DATA_FORK) {
+			pthread_mutex_lock(&rt_lock.lock);
+			error2 = process_rt_rec(mp, &irec, ino, tot, check_dups);
+			pthread_mutex_unlock(&rt_lock.lock);
+			if (error2)
+				return error2;
+
 			/*
-			 * realtime bitmaps don't use AG locks, so returning
-			 * immediately is fine for this code path.
-			 */
-			if (process_rt_rec(mp, &irec, ino, tot, check_dups))
-				return 1;
-			/*
-			 * skip rest of loop processing since that'irec.br_startblock
+			 * skip rest of loop processing since the rest is
 			 * all for regular file forks and attr forks
 			 */
 			continue;
@@ -442,7 +443,6 @@  _("inode %" PRIu64 " - extent exceeds max offset - start %" PRIu64 ", "
 		}
 
 		if (blkmapp && *blkmapp) {
-			int	error2;
 			error2 = blkmap_set_ext(blkmapp, irec.br_startoff,
 					irec.br_startblock, irec.br_blockcount);
 			if (error2) {
diff --git a/repair/globals.c b/repair/globals.c
index 299bacd13132..110d98b6681e 100644
--- a/repair/globals.c
+++ b/repair/globals.c
@@ -110,6 +110,7 @@  uint32_t	sb_unit;
 uint32_t	sb_width;
 
 struct aglock	*ag_locks;
+struct aglock	rt_lock;
 
 int		report_interval;
 uint64_t	*prog_rpt_done;
diff --git a/repair/globals.h b/repair/globals.h
index 953e3dfbb4f2..1d397b351276 100644
--- a/repair/globals.h
+++ b/repair/globals.h
@@ -154,6 +154,7 @@  struct aglock {
 	pthread_mutex_t	lock __attribute__((__aligned__(64)));
 };
 extern struct aglock	*ag_locks;
+extern struct aglock	rt_lock;
 
 extern int		report_interval;
 extern uint64_t		*prog_rpt_done;
diff --git a/repair/incore.c b/repair/incore.c
index 1374ddefe06e..4ffe18aba839 100644
--- a/repair/incore.c
+++ b/repair/incore.c
@@ -290,6 +290,7 @@  init_bmaps(xfs_mount_t *mp)
 		btree_init(&ag_bmap[i]);
 		pthread_mutex_init(&ag_locks[i].lock, NULL);
 	}
+	pthread_mutex_init(&rt_lock.lock, NULL);
 
 	init_rt_bmap(mp);
 	reset_bmaps(mp);
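
The locking pattern in the repair/dinode.c hunk can be sketched in isolation as below. This is only an illustration under stated assumptions, not code from xfs_repair: the names bitmap, bitmap_lock, mark_range, record_extent and run_workers are hypothetical stand-ins for the realtime bitmap, rt_lock and process_rt_rec. Each thread sets bits that share bytes with bits set by other threads, so the read-modify-write on a byte has to happen under the mutex, and the result is captured in a local so the lock is dropped before any early return.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define NBITS		4096
#define NTHREADS	4

/* Hypothetical stand-in for the shared realtime bitmap. */
static uint8_t		bitmap[NBITS / 8];
static pthread_mutex_t	bitmap_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Mark bits [start, start + len) as in use.  Returns 1 if the range is
 * out of bounds or a bit was already set, 0 otherwise.  The caller must
 * hold bitmap_lock.
 */
static int mark_range(unsigned start, unsigned len)
{
	unsigned	i;

	if (start > NBITS || len > NBITS - start)
		return 1;
	for (i = start; i < start + len; i++) {
		if (bitmap[i / 8] & (1u << (i % 8)))
			return 1;
		bitmap[i / 8] |= 1u << (i % 8);
	}
	return 0;
}

/*
 * Same shape as the patched call site: take the lock, do the update,
 * drop the lock, and only then act on the saved return value.
 */
static int record_extent(unsigned start, unsigned len)
{
	int		error;

	pthread_mutex_lock(&bitmap_lock);
	error = mark_range(start, len);
	pthread_mutex_unlock(&bitmap_lock);
	return error;
}

struct worker {
	unsigned	id;
	int		errors;
};

/* Thread t marks bits t, t + NTHREADS, ... so threads share bytes. */
static void *worker_fn(void *arg)
{
	struct worker	*w = arg;
	unsigned	bit;

	for (bit = w->id; bit < NBITS; bit += NTHREADS)
		w->errors += record_extent(bit, 1);
	return NULL;
}

/* Clear the bitmap, run all workers, and return the number of set bits. */
static unsigned run_workers(void)
{
	pthread_t	tids[NTHREADS];
	struct worker	workers[NTHREADS];
	unsigned	i, nset = 0;
	int		rc;

	memset(bitmap, 0, sizeof(bitmap));
	for (i = 0; i < NTHREADS; i++) {
		workers[i].id = i;
		workers[i].errors = 0;
		rc = pthread_create(&tids[i], NULL, worker_fn, &workers[i]);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < NTHREADS; i++)
		pthread_join(tids[i], NULL);
	for (i = 0; i < NBITS; i++)
		nset += (bitmap[i / 8] >> (i % 8)) & 1;
	return nset;
}
```

Calling run_workers() from a main() built with -pthread should leave every bit set. If the lock and unlock calls in record_extent() were removed, two threads could read the same byte, each OR in its own bit, and one store could overwrite the other, which is the kind of lost update the commit message describes.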