diff mbox series

xfs_repair: coordinate parallel updates to the rt bitmap

Message ID 20200923182437.GW7955@magnolia (mailing list archive)
State Superseded
Headers show
Series xfs_repair: coordinate parallel updates to the rt bitmap | expand

Commit Message

Darrick J. Wong Sept. 23, 2020, 6:24 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Actually take the rt lock before updating the bitmap from multiple
threads.  This fixes an infrequent corruption problem when running
generic/013 and rtinherit=1 is set on the root dir.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 repair/dinode.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Christoph Hellwig Sept. 24, 2020, 5:40 a.m. UTC | #1
On Wed, Sep 23, 2020 at 11:24:37AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Actually take the rt lock before updating the bitmap from multiple
> threads.  This fixes an infrequent corruption problem when running
> generic/013 and rtinherit=1 is set on the root dir.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  repair/dinode.c |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/repair/dinode.c b/repair/dinode.c
> index 57013bf149b2..07f3f83aef8c 100644
> --- a/repair/dinode.c
> +++ b/repair/dinode.c
> @@ -184,6 +184,7 @@ process_rt_rec(
>  	xfs_rfsblock_t		*tot,
>  	int			check_dups)
>  {
> +	struct aglock		*lock = &ag_locks[(signed)NULLAGNUMBER];

Err, what is this weird cast doing here?

The rest looks sane.
Darrick J. Wong Sept. 24, 2020, 6 a.m. UTC | #2
On Thu, Sep 24, 2020 at 06:40:41AM +0100, Christoph Hellwig wrote:
> On Wed, Sep 23, 2020 at 11:24:37AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Actually take the rt lock before updating the bitmap from multiple
> > threads.  This fixes an infrequent corruption problem when running
> > generic/013 and rtinherit=1 is set on the root dir.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/dinode.c |    4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/repair/dinode.c b/repair/dinode.c
> > index 57013bf149b2..07f3f83aef8c 100644
> > --- a/repair/dinode.c
> > +++ b/repair/dinode.c
> > @@ -184,6 +184,7 @@ process_rt_rec(
> >  	xfs_rfsblock_t		*tot,
> >  	int			check_dups)
> >  {
> > +	struct aglock		*lock = &ag_locks[(signed)NULLAGNUMBER];
> 
> Err, what is this weird cast doing here?

Well.... ag_locks is allocated with length ag_locks[agcount + 1], and
then the pointer is incremented so that ag_locks[-1] is the rt lock.

NULLAGNUMBER is cast to xfs_agnumber_t, which is uint32_t, so we have to
cast it back to signed so that we actually do the pointer subtraction(!)

Yeah, I know, it's nuts...

--D

> The rest looks sane.
Christoph Hellwig Sept. 24, 2020, 6:19 a.m. UTC | #3
On Wed, Sep 23, 2020 at 11:00:01PM -0700, Darrick J. Wong wrote:
> > > +	struct aglock		*lock = &ag_locks[(signed)NULLAGNUMBER];
> > 
> > Err, what is this weird cast doing here?
> 
> Well.... ag_locks is allocated with length ag_locks[agcount + 1], and
> then the pointer is incremented so that ag_locks[-1] is the rt lock.

At least in the for-next branch it isn't:

	ag_locks = calloc(mp->m_sb.sb_agcount, sizeof(struct aglock));

More importantly, I can't even find other uses of ag_locks for the
RT subvolume.  Is this hidden in one of your series?

Either way I think a separate lock for the RT subvolume would make a
whole lot more sense.
Darrick J. Wong Sept. 24, 2020, 3:06 p.m. UTC | #4
On Thu, Sep 24, 2020 at 07:19:11AM +0100, Christoph Hellwig wrote:
> On Wed, Sep 23, 2020 at 11:00:01PM -0700, Darrick J. Wong wrote:
> > > > +	struct aglock		*lock = &ag_locks[(signed)NULLAGNUMBER];
> > > 
> > > Err, what is this weird cast doing here?
> > 
> > Well.... ag_locks is allocated with length ag_locks[agcount + 1], and
> > then the pointer is incremented so that ag_locks[-1] is the rt lock.
> 
> At least in the for-next branch it isn't:
> 
> 	ag_locks = calloc(mp->m_sb.sb_agcount, sizeof(struct aglock));
> 
> More importantly, I can't even find other uses of ag_locks for the
> RT subvolume.  Is this hidden in one of your series?

Doh.  Yes, it is, in the realtime rmap series. :( :(

> Either way I think a separate lock for the RT subvolume would make a
> whole lot more sense.

Yes, let's do it that way.

--D
diff mbox series

Patch

diff --git a/repair/dinode.c b/repair/dinode.c
index 57013bf149b2..07f3f83aef8c 100644
--- a/repair/dinode.c
+++ b/repair/dinode.c
@@ -184,6 +184,7 @@  process_rt_rec(
 	xfs_rfsblock_t		*tot,
 	int			check_dups)
 {
+	struct aglock		*lock = &ag_locks[(signed)NULLAGNUMBER];
 	xfs_fsblock_t		b, lastb;
 	xfs_rtblock_t		ext;
 	int			state;
@@ -245,6 +246,7 @@  _("data fork in rt ino %" PRIu64 " claims dup rt extent,"
 			continue;
 		}
 
+		pthread_mutex_lock(&lock->lock);
 		state = get_rtbmap(ext);
 		switch (state)  {
 		case XR_E_FREE:
@@ -270,6 +272,7 @@  _("data fork in rt inode %" PRIu64 " found metadata block %" PRIu64 " in rt bmap
 			do_warn(
 _("data fork in rt inode %" PRIu64 " claims used rt block %" PRIu64 "\n"),
 				ino, ext);
+			pthread_mutex_unlock(&lock->lock);
 			return 1;
 		case XR_E_FREE1:
 		default:
@@ -277,6 +280,7 @@  _("data fork in rt inode %" PRIu64 " claims used rt block %" PRIu64 "\n"),
 _("illegal state %d in rt block map %" PRIu64 "\n"),
 				state, b);
 		}
+		pthread_mutex_unlock(&lock->lock);
 	}
 
 	/*