[RFC] xfs: fix buffer check for primary sb in userspace libxfs

Message ID 20170718141337.46255-1-bfoster@redhat.com
State New

Commit Message

Brian Foster July 18, 2017, 2:13 p.m. UTC
Signed-off-by: Brian Foster <bfoster@redhat.com>
---

Hi all,

This patch is actually targeted at userspace. The previous change in commit
f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
breaks the logic in userspace in a similar way to the original problem because
userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
unless the buffer is truly discontiguous.

This would normally result in a segfault but this appears to be hidden
by gcc optimization as -O2 is enabled by default and the
check_inprogress param to xfs_mount_validate_sb() is unused in
userspace. Therefore, the segfault is only reproducible when
optimization is disabled (which is a useful configuration for
debugging).

There are obviously different ways to fix this. I'm floating this (untested)
rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
the objective of keeping the libxfs code the same between the kernel and
userspace. We could alternatively create a custom helper/macro with the
appropriate check in each place. Thoughts?
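
For reference, the helper alternative could look roughly like the sketch below. It is a self-contained model, not libxfs code: the model_* types, the MODEL_* constants (standing in for XFS_SB_DADDR and XFS_BUF_DADDR_NULL) and the helper name are all made up for illustration. It also adds a NULL guard on b_maps that the patch itself does not have.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Model types only; the real struct xfs_buf has many more fields. */
typedef int64_t model_daddr_t;

/* Assumed stand-ins for XFS_SB_DADDR and XFS_BUF_DADDR_NULL. */
#define MODEL_SB_DADDR		((model_daddr_t)0)
#define MODEL_BUF_DADDR_NULL	((model_daddr_t)-1)

struct model_buf_map {
	model_daddr_t		bm_bn;
};

struct model_buf {
	model_daddr_t		b_bn;
	struct model_buf_map	*b_maps;	/* may be NULL in the userspace model */
};

/*
 * Hypothetical helper: true if the buffer covers the primary superblock.
 * b_maps is only consulted when b_bn is the NULL daddr (the kernel-style
 * uncached case); otherwise b_bn alone decides.
 */
static bool
model_buf_is_primary_sb(const struct model_buf *bp)
{
	if (bp->b_bn == MODEL_BUF_DADDR_NULL)
		return bp->b_maps != NULL &&
		       bp->b_maps[0].bm_bn == MODEL_SB_DADDR;
	return bp->b_bn == MODEL_SB_DADDR;
}
```

The point of the sketch is only that a contiguous userspace buffer with b_maps == NULL never reaches the b_maps[0] dereference.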

Brian

 fs/xfs/libxfs/xfs_sb.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

Comments

Darrick J. Wong July 18, 2017, 6:10 p.m. UTC | #1
On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Hi all,
> 
> This patch is actually targeted at userspace. The previous change in commit
> f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> breaks the logic in userspace in a similar way to the original problem because
> userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> unless the buffer is truly discontiguous.
> 
> This would normally result in a segfault but this appears to be hidden
> by gcc optimization as -O2 is enabled by default and the
> check_inprogress param to xfs_mount_validate_sb() is unused in
> userspace. Therefore, the segfault is only reproducible when
> optimization is disabled (which is a useful configuration for
> debugging).
> 
> There are obviously different ways to fix this. I'm floating this (untested)
> rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> the objective of keeping the libxfs code the same between the kernel and
> userspace. We could alternatively create a custom helper/macro with the
> appropriate check in each place. Thoughts?

Eww, macros. :)

> Brian
> 
>  fs/xfs/libxfs/xfs_sb.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 9b5aae2..ec2fd03 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -583,6 +583,7 @@ xfs_sb_verify(
>  {
>  	struct xfs_mount *mp = bp->b_target->bt_mount;
>  	struct xfs_sb	sb;
> +	bool		primary_sb;
>  
>  	/*
>  	 * Use call variant which doesn't convert quota flags from disk 
> @@ -592,11 +593,14 @@ xfs_sb_verify(
>  
>  	/*
>  	 * Only check the in progress field for the primary superblock as
> -	 * mkfs.xfs doesn't clear it from secondary superblocks.
> +	 * mkfs.xfs doesn't clear it from secondary superblocks. Note that
> +	 * userspace libxfs does not have uncached buffers and so b_maps is not
> +	 * used for the sb buffer.
>  	 */
> -	return xfs_mount_validate_sb(mp, &sb,
> -				     bp->b_maps[0].bm_bn == XFS_SB_DADDR,
> -				     check_version);

/me wonders if it'd be appropriate to:

ASSERT(bp->b_maps != NULL || bp->b_bn != XFS_BUF_DADDR_NULL);

here to confirm that uncached buffers are working the way we think
they're supposed to.
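
Spelled out as a predicate, that assertion rejects exactly one state: no map array and no block number. A minimal model of it follows; the model_* names and MODEL_BUF_DADDR_NULL are placeholders for illustration, not the real xfs_buf definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t model_daddr_t;

/* Assumed stand-in for XFS_BUF_DADDR_NULL. */
#define MODEL_BUF_DADDR_NULL	((model_daddr_t)-1)

struct model_buf_map {
	model_daddr_t		bm_bn;
};

struct model_buf {
	model_daddr_t		b_bn;
	struct model_buf_map	*b_maps;
};

/*
 * Mirrors the condition inside the suggested ASSERT: the buffer must
 * carry its disk address in at least one of b_maps or b_bn.
 */
static bool
model_buf_addr_is_valid(const struct model_buf *bp)
{
	return bp->b_maps != NULL || bp->b_bn != MODEL_BUF_DADDR_NULL;
}
```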

Otherwise it looks ok.

--D

> +	primary_sb = (bp->b_bn == XFS_BUF_DADDR_NULL &&
> +		      bp->b_maps[0].bm_bn == XFS_SB_DADDR) ||
> +		     bp->b_bn == XFS_SB_DADDR;
> +	return xfs_mount_validate_sb(mp, &sb, primary_sb, check_version);
>  }
>  
>  /*
> -- 
> 2.9.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Foster July 18, 2017, 6:23 p.m. UTC | #2
On Tue, Jul 18, 2017 at 11:10:41AM -0700, Darrick J. Wong wrote:
> On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > Hi all,
> > 
> > This patch is actually targeted at userspace. The previous change in commit
> > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > breaks the logic in userspace in a similar way to the original problem because
> > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > unless the buffer is truly discontiguous.
> > 
> > This would normally result in a segfault but this appears to be hidden
> > by gcc optimization as -O2 is enabled by default and the
> > check_inprogress param to xfs_mount_validate_sb() is unused in
> > userspace. Therefore, the segfault is only reproducible when
> > optimization is disabled (which is a useful configuration for
> > debugging).
> > 
> > There are obviously different ways to fix this. I'm floating this (untested)
> > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > the objective of keeping the libxfs code the same between the kernel and
> > userspace. We could alternatively create a custom helper/macro with the
> > appropriate check in each place. Thoughts?
> 
> Eww, macros. :)
> 
> > Brian
> > 
> >  fs/xfs/libxfs/xfs_sb.c | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index 9b5aae2..ec2fd03 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -583,6 +583,7 @@ xfs_sb_verify(
> >  {
> >  	struct xfs_mount *mp = bp->b_target->bt_mount;
> >  	struct xfs_sb	sb;
> > +	bool		primary_sb;
> >  
> >  	/*
> >  	 * Use call variant which doesn't convert quota flags from disk 
> > @@ -592,11 +593,14 @@ xfs_sb_verify(
> >  
> >  	/*
> >  	 * Only check the in progress field for the primary superblock as
> > -	 * mkfs.xfs doesn't clear it from secondary superblocks.
> > +	 * mkfs.xfs doesn't clear it from secondary superblocks. Note that
> > +	 * userspace libxfs does not have uncached buffers and so b_maps is not
> > +	 * used for the sb buffer.
> >  	 */
> > -	return xfs_mount_validate_sb(mp, &sb,
> > -				     bp->b_maps[0].bm_bn == XFS_SB_DADDR,
> > -				     check_version);
> 
> /me wonders if it'd be appropriate to:
> 
> ASSERT(bp->b_maps != NULL || bp->b_bn != XFS_BUF_DADDR_NULL);
> 
> here to confirm that uncached buffers are working the way we think
> they're supposed to.
> 

Sure, I think we can add something like that.

> Otherwise it looks ok.
> 

Thanks.

And after some discussion on irc with Darrick and Eric, the next version
will target xfsprogs/libxfs as that is where the fix is primarily needed
(with the expectation that this will eventually sync from xfsprogs ->
kernel).

Brian

> --D
> 
> > +	primary_sb = (bp->b_bn == XFS_BUF_DADDR_NULL &&
> > +		      bp->b_maps[0].bm_bn == XFS_SB_DADDR) ||
> > +		     bp->b_bn == XFS_SB_DADDR;
> > +	return xfs_mount_validate_sb(mp, &sb, primary_sb, check_version);
> >  }
> >  
> >  /*
> > -- 
> > 2.9.4
> > 
Dave Chinner July 18, 2017, 11:12 p.m. UTC | #3
On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
> 
> Hi all,
> 
> This patch is actually targeted at userspace. The previous change in commit
> f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> breaks the logic in userspace in a similar way to the original problem because
> userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> unless the buffer is truly discontiguous.
> 
> This would normally result in a segfault but this appears to be hidden
> by gcc optimization as -O2 is enabled by default and the
> check_inprogress param to xfs_mount_validate_sb() is unused in
> userspace. Therefore, the segfault is only reproducible when
> optimization is disabled (which is a useful configuration for
> debugging).
> 
> There are obviously different ways to fix this. I'm floating this (untested)
> rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> the objective of keeping the libxfs code the same between the kernel and
> userspace. We could alternatively create a custom helper/macro with the
> appropriate check in each place. Thoughts?

Wouldn't it be better to simply fix the userspace buffer
initialisation to always have a valid bp->b_maps, just like the
kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
we don't have a landmine lurking in all the shared libxfs code we
bring from the kernel that may interact with uncached buffers.
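
A rough sketch of that idea, as a standalone model rather than actual xfsprogs or kernel code: a single-map buffer points b_maps at a map embedded in the buffer, so b_maps[0] is always readable, and only multi-map buffers allocate. Every name here (model_buf, b_map_embedded, model_buf_init_maps, model_buf_free_maps) is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef int64_t model_daddr_t;

struct model_buf_map {
	model_daddr_t		bm_bn;
	int			bm_len;
};

struct model_buf {
	struct model_buf_map	*b_maps;	/* valid after init succeeds */
	struct model_buf_map	b_map_embedded;	/* backs the single-map case */
	int			b_nmaps;
};

/*
 * Hypothetical setup helper. Returns 0 on success, -1 on a bad map
 * count or allocation failure.
 */
static int
model_buf_init_maps(struct model_buf *bp, int nmaps)
{
	if (nmaps < 1)
		return -1;

	bp->b_nmaps = nmaps;
	if (nmaps == 1) {
		/* Contiguous buffer: no allocation, use the embedded map. */
		bp->b_maps = &bp->b_map_embedded;
		return 0;
	}

	/* Discontiguous buffer: allocate a zeroed map array. */
	bp->b_maps = calloc((size_t)nmaps, sizeof(*bp->b_maps));
	return bp->b_maps ? 0 : -1;
}

/* Free only what model_buf_init_maps() allocated. */
static void
model_buf_free_maps(struct model_buf *bp)
{
	if (bp->b_maps != &bp->b_map_embedded)
		free(bp->b_maps);
	bp->b_maps = NULL;
}
```

The cost is one extra map structure per buffer; the benefit is that shared code can read b_maps[0] without caring whether the buffer is contiguous.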

Cheers,

Dave.
Brian Foster July 19, 2017, 11:17 a.m. UTC | #4
On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > ---
> > 
> > Hi all,
> > 
> > This patch is actually targeted at userspace. The previous change in commit
> > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > breaks the logic in userspace in a similar way to the original problem because
> > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > unless the buffer is truly discontiguous.
> > 
> > This would normally result in a segfault but this appears to be hidden
> > by gcc optimization as -O2 is enabled by default and the
> > check_inprogress param to xfs_mount_validate_sb() is unused in
> > userspace. Therefore, the segfault is only reproducible when
> > optimization is disabled (which is a useful configuration for
> > debugging).
> > 
> > There are obviously different ways to fix this. I'm floating this (untested)
> > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > the objective of keeping the libxfs code the same between the kernel and
> > userspace. We could alternatively create a custom helper/macro with the
> > appropriate check in each place. Thoughts?
> 
> Wouldn't it be better to simply fix the userspace buffer
> initialisation to always have a valid bp->b_maps, just like the
> kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> we don't have a landmine lurking in all the shared libxfs code we
> bring from the kernel that may interact with uncached buffers.
> 

We could certainly create a bp->__b_map field in xfsprogs libxfs and
initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
the lack of overlap of uncached buffers between xfsprogs and the kernel
(I'm not sure there are other cases where such buffers are commonly
handled), I don't personally think one way is notably better than the
other.

The tradeoffs seem to be that this patch is fairly localized but leaves
the potentially different states for uncached buffers in kernel vs.
xfsprogs context. The above approach addresses that issue at the cost of
slightly increasing the size of xfs_buf in userspace for something that
may not ever be necessary outside of an isolated bit of code. It also
only requires a change to xfsprogs libxfs.

Given the tradeoffs, I have no real preference on which approach we
take. Do you prefer the latter? If so and there are no other objections,
I'll send a patch along those lines.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Dave Chinner July 20, 2017, 2:48 a.m. UTC | #5
On Wed, Jul 19, 2017 at 07:17:41AM -0400, Brian Foster wrote:
> On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> > On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > ---
> > > 
> > > Hi all,
> > > 
> > > This patch is actually targeted at userspace. The previous change in commit
> > > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > > breaks the logic in userspace in a similar way to the original problem because
> > > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > > unless the buffer is truly discontiguous.
> > > 
> > > This would normally result in a segfault but this appears to be hidden
> > > by gcc optimization as -O2 is enabled by default and the
> > > check_inprogress param to xfs_mount_validate_sb() is unused in
> > > userspace. Therefore, the segfault is only reproducible when
> > > optimization is disabled (which is a useful configuration for
> > > debugging).
> > > 
> > > There are obviously different ways to fix this. I'm floating this (untested)
> > > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > > the objective of keeping the libxfs code the same between the kernel and
> > > userspace. We could alternatively create a custom helper/macro with the
> > > appropriate check in each place. Thoughts?
> > 
> > Wouldn't it be better to simply fix the userspace buffer
> > initialisation to always have a valid bp->b_maps, just like the
> > kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> > we don't have a landmine lurking in all the shared libxfs code we
> > bring from the kernel that may interact with uncached buffers.
> > 
> 
> We could certainly create a bp->__b_map field in xfsprogs libxfs and
> initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
> the lack of overlap of uncached buffers between xfsprogs and the kernel
> (I'm not sure there are other cases where such buffers are commonly
> handled), I don't personally think one way is notably better than the
> other.
> 
> The tradeoffs seem to be that this patch is fairly localized but leaves
> the potentially different states for uncached buffers in kernel vs.
> xfsprogs context. The above approach addresses that issue at the cost of
> slightly increasing the size of xfs_buf in userspace for something that
> may not ever be necessary outside of an isolated bit of code. It also
> only requires a change to xfsprogs libxfs.
> 
> Given the tradeoffs, I have no real preference on which approach we
> take. Do you prefer the latter? If so and there are no other objections,
> I'll send a patch along those lines.

I'd prefer the latter (the bp->__b_map solution) simply so we don't
have to worry about it in future. The closer the kernel and
userspace buffer caches are in terms of behaviour, implementation
and structure members the less likely we are to have problems
related to the kernel using uncached buffers...

FWIW, my wish list contains porting the kernel side buffer cache
implementation to userspace to solve the scalability problems in the
current userspace implementation that are exposed when repair
hammers multiple AGs at once. Hence anything that gets kernel +
userspace closer together helps get us (minutely) closer to that
goal....

Cheers,

Dave.
Brian Foster July 20, 2017, 11:52 a.m. UTC | #6
On Thu, Jul 20, 2017 at 12:48:55PM +1000, Dave Chinner wrote:
> On Wed, Jul 19, 2017 at 07:17:41AM -0400, Brian Foster wrote:
> > On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> > > On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > ---
> > > > 
> > > > Hi all,
> > > > 
> > > > This patch is actually targeted at userspace. The previous change in commit
> > > > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > > > breaks the logic in userspace in a similar way to the original problem because
> > > > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > > > unless the buffer is truly discontiguous.
> > > > 
> > > > This would normally result in a segfault but this appears to be hidden
> > > > by gcc optimization as -O2 is enabled by default and the
> > > > check_inprogress param to xfs_mount_validate_sb() is unused in
> > > > userspace. Therefore, the segfault is only reproducible when
> > > > optimization is disabled (which is a useful configuration for
> > > > debugging).
> > > > 
> > > > There are obviously different ways to fix this. I'm floating this (untested)
> > > > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > > > the objective of keeping the libxfs code the same between the kernel and
> > > > userspace. We could alternatively create a custom helper/macro with the
> > > > appropriate check in each place. Thoughts?
> > > 
> > > Wouldn't it be better to simply fix the userspace buffer
> > > initialisation to always have a valid bp->b_maps, just like the
> > > kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> > > we don't have a landmine lurking in all the shared libxfs code we
> > > bring from the kernel that may interact with uncached buffers.
> > > 
> > 
> > We could certainly create a bp->__b_map field in xfsprogs libxfs and
> > initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
> > the lack of overlap of uncached buffers between xfsprogs and the kernel
> > (I'm not sure there are other cases where such buffers are commonly
> > handled), I don't personally think one way is notably better than the
> > other.
> > 
> > The tradeoffs seem to be that this patch is fairly localized but leaves
> > the potentially different states for uncached buffers in kernel vs.
> > xfsprogs context. The above approach addresses that issue at the cost of
> > slightly increasing the size of xfs_buf in userspace for something that
> > may not ever be necessary outside of an isolated bit of code. It also
> > only requires a change to xfsprogs libxfs.
> > 
> > Given the tradeoffs, I have no real preference on which approach we
> > take. Do you prefer the latter? If so and there are no other objections,
> > I'll send a patch along those lines.
> 
> I'd prefer the latter (the bp->__b_map solution) simply so we don't
> have to worry about it in future. The closer the kernel and
> userspace buffer caches are in terms of behaviour, implementation
> and structure members the less likely we are to have problems
> related to the kernel using uncached buffers...
> 
> FWIW, my wish list contains porting the kernel side buffer cache
> implementation to userspace to solve the scalability problems in the
> current userspace implementation that are exposed when repair
> hammers multiple AGs at once. Hence anything that gets kernel +
> userspace closer together helps get us (minutely) closer to that
> goal....
> 

Sounds reasonable to me. I'll post the xfsprogs __b_map patch I cooked
up and started testing yesterday a bit later...

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Darrick J. Wong Aug. 16, 2017, 6:22 a.m. UTC | #7
On Thu, Jul 20, 2017 at 07:52:01AM -0400, Brian Foster wrote:
> On Thu, Jul 20, 2017 at 12:48:55PM +1000, Dave Chinner wrote:
> > On Wed, Jul 19, 2017 at 07:17:41AM -0400, Brian Foster wrote:
> > > On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> > > > On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > ---
> > > > > 
> > > > > Hi all,
> > > > > 
> > > > > This patch is actually targeted at userspace. The previous change in commit
> > > > > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > > > > breaks the logic in userspace in a similar way to the original problem because
> > > > > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > > > > unless the buffer is truly discontiguous.
> > > > > 
> > > > > This would normally result in a segfault but this appears to be hidden
> > > > > by gcc optimization as -O2 is enabled by default and the
> > > > > check_inprogress param to xfs_mount_validate_sb() is unused in
> > > > > userspace. Therefore, the segfault is only reproducible when
> > > > > optimization is disabled (which is a useful configuration for
> > > > > debugging).
> > > > > 
> > > > > There are obviously different ways to fix this. I'm floating this (untested)
> > > > > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > > > > the objective of keeping the libxfs code the same between the kernel and
> > > > > userspace. We could alternatively create a custom helper/macro with the
> > > > > appropriate check in each place. Thoughts?
> > > > 
> > > > Wouldn't it be better to simply fix the userspace buffer
> > > > initialisation to always have a valid bp->b_maps, just like the
> > > > kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> > > > we don't have a landmine lurking in all the shared libxfs code we
> > > > bring from the kernel that may interact with uncached buffers.
> > > > 
> > > 
> > > We could certainly create a bp->__b_map field in xfsprogs libxfs and
> > > initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
> > > the lack of overlap of uncached buffers between xfsprogs and the kernel
> > > (I'm not sure there are other cases where such buffers are commonly
> > > handled), I don't personally think one way is notably better than the
> > > other.
> > > 
> > > The tradeoffs seem to be that this patch is fairly localized but leaves
> > > the potentially different states for uncached buffers in kernel vs.
> > > xfsprogs context. The above approach addresses that issue at the cost of
> > > slightly increasing the size of xfs_buf in userspace for something that
> > > may not ever be necessary outside of an isolated bit of code. It also
> > > only requires a change to xfsprogs libxfs.
> > > 
> > > Given the tradeoffs, I have no real preference on which approach we
> > > take. Do you prefer the latter? If so and there are no other objections,
> > > I'll send a patch along those lines.
> > 
> > I'd prefer the latter (the bp->__b_map solution) simply so we don't
> > have to worry about it in future. The closer the kernel and
> > userspace buffer caches are in terms of behaviour, implementation
> > and structure members the less likely we are to have problems
> > related to the kernel using uncached buffers...
> > 
> > FWIW, my wish list contains porting the kernel side buffer cache
> > implementation to userspace to solve the scalability problems in the
> > current userspace implementation that are exposed when repair
> > hammers multiple AGs at once. Hence anything that gets kernel +
> > userspace closer together helps get us (minutely) closer to that
> > goal....
> > 
> 
> Sounds reasonable to me. I'll post the xfsprogs __b_map patch I cooked
> up and started testing yesterday a bit later...

Did this ever happen?  Someone was complaining about this on IRC just now.

--D

> 
> Brian
> 
> > Cheers,
> > 
> > Dave.
> > -- 
> > Dave Chinner
> > david@fromorbit.com
Brian Foster Aug. 16, 2017, 10:31 a.m. UTC | #8
On Tue, Aug 15, 2017 at 11:22:13PM -0700, Darrick J. Wong wrote:
> On Thu, Jul 20, 2017 at 07:52:01AM -0400, Brian Foster wrote:
> > On Thu, Jul 20, 2017 at 12:48:55PM +1000, Dave Chinner wrote:
> > > On Wed, Jul 19, 2017 at 07:17:41AM -0400, Brian Foster wrote:
> > > > On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> > > > > On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > > ---
> > > > > > 
> > > > > > Hi all,
> > > > > > 
> > > > > > This patch is actually targeted at userspace. The previous change in commit
> > > > > > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > > > > > breaks the logic in userspace in a similar way to the original problem because
> > > > > > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > > > > > unless the buffer is truly discontiguous.
> > > > > > 
> > > > > > This would normally result in a segfault but this appears to be hidden
> > > > > > by gcc optimization as -O2 is enabled by default and the
> > > > > > check_inprogress param to xfs_mount_validate_sb() is unused in
> > > > > > userspace. Therefore, the segfault is only reproducible when
> > > > > > optimization is disabled (which is a useful configuration for
> > > > > > debugging).
> > > > > > 
> > > > > > There are obviously different ways to fix this. I'm floating this (untested)
> > > > > > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > > > > > the objective of keeping the libxfs code the same between the kernel and
> > > > > > userspace. We could alternatively create a custom helper/macro with the
> > > > > > appropriate check in each place. Thoughts?
> > > > > 
> > > > > Wouldn't it be better to simply fix the userspace buffer
> > > > > initialisation to always have a valid bp->b_maps, just like the
> > > > > kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> > > > > we don't have a landmine lurking in all the shared libxfs code we
> > > > > bring from the kernel that may interact with uncached buffers.
> > > > > 
> > > > 
> > > > We could certainly create a bp->__b_map field in xfsprogs libxfs and
> > > > initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
> > > > the lack of overlap of uncached buffers between xfsprogs and the kernel
> > > > (I'm not sure there are other cases where such buffers are commonly
> > > > handled), I don't personally think one way is notably better than the
> > > > other.
> > > > 
> > > > The tradeoffs seem to be that this patch is fairly localized but leaves
> > > > the potentially different states for uncached buffers in kernel vs.
> > > > xfsprogs context. The above approach addresses that issue at the cost of
> > > > slightly increasing the size of xfs_buf in userspace for something that
> > > > may not ever be necessary outside of an isolated bit of code. It also
> > > > only requires a change to xfsprogs libxfs.
> > > > 
> > > > Given the tradeoffs, I have no real preference on which approach we
> > > > take. Do you prefer the latter? If so and there are no other objections,
> > > > I'll send a patch along those lines.
> > > 
> > > I'd prefer the latter (the bp->__b_map solution) simply so we don't
> > > have to worry about it in future. The closer the kernel and
> > > userspace buffer caches are in terms of behaviour, implementation
> > > and structure members the less likely we are to have problems
> > > related to the kernel using uncached buffers...
> > > 
> > > FWIW, my wish list contains porting the kernel side buffer cache
> > > implementation to userspace to solve the scalability problems in the
> > > current userspace implementation that are exposed when repair
> > > hammers multiple AGs at once. Hence anything that gets kernel +
> > > userspace closer together helps get us (minutely) closer to that
> > > goal....
> > > 
> > 
> > Sounds reasonable to me. I'll post the xfsprogs __b_map patch I cooked
> > up and started testing yesterday a bit later...
> 
> Did this ever happen?  Someone was complaining about this on IRC just now.
> 

Yep, see xfsprogs commit 2c6c632820 ("libxfs: init ->b_maps on contig
buffers for uncached compatibility") in for-next.

Brian

> --D
> 
> > 
> > Brian
> > 
> > > Cheers,
> > > 
> > > Dave.
> > > -- 
> > > Dave Chinner
> > > david@fromorbit.com
Darrick J. Wong Aug. 16, 2017, 3:22 p.m. UTC | #9
On Wed, Aug 16, 2017 at 06:31:58AM -0400, Brian Foster wrote:
> On Tue, Aug 15, 2017 at 11:22:13PM -0700, Darrick J. Wong wrote:
> > On Thu, Jul 20, 2017 at 07:52:01AM -0400, Brian Foster wrote:
> > > On Thu, Jul 20, 2017 at 12:48:55PM +1000, Dave Chinner wrote:
> > > > On Wed, Jul 19, 2017 at 07:17:41AM -0400, Brian Foster wrote:
> > > > > On Wed, Jul 19, 2017 at 09:12:02AM +1000, Dave Chinner wrote:
> > > > > > On Tue, Jul 18, 2017 at 10:13:37AM -0400, Brian Foster wrote:
> > > > > > > Signed-off-by: Brian Foster <bfoster@redhat.com>
> > > > > > > ---
> > > > > > > 
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > This patch is actually targeted at userspace. The previous change in commit
> > > > > > > f3d7ebde ("xfs: fix superblock inprogress check") to use ->b_maps technically
> > > > > > > breaks the logic in userspace in a similar way to the original problem because
> > > > > > > userspace has no concept of uncached buffers.  ->b_maps is NULL in userspace
> > > > > > > unless the buffer is truly discontiguous.
> > > > > > > 
> > > > > > > This would normally result in a segfault but this appears to be hidden
> > > > > > > by gcc optimization as -O2 is enabled by default and the
> > > > > > > check_inprogress param to xfs_mount_validate_sb() is unused in
> > > > > > > userspace. Therefore, the segfault is only reproducible when
> > > > > > > optimization is disabled (which is a useful configuration for
> > > > > > > debugging).
> > > > > > > 
> > > > > > > There are obviously different ways to fix this. I'm floating this (untested)
> > > > > > > rfc as a kernel patch (do we ever sync libxfs from xfsprogs -> kernel?) with
> > > > > > > the objective of keeping the libxfs code the same between the kernel and
> > > > > > > userspace. We could alternatively create a custom helper/macro with the
> > > > > > > appropriate check in each place. Thoughts?
> > > > > > 
> > > > > > Wouldn't it be better to simply fix the userspace buffer
> > > > > > initialisation to always have a valid bp->b_maps, just like the
> > > > > > kernel does? (See xfs_buf_get_maps() in the kernel code).  That way
> > > > > > we don't have a landmine lurking in all the shared libxfs code we
> > > > > > bring from the kernel that may interact with uncached buffers.
> > > > > > 
> > > > > 
> > > > > We could certainly create a bp->__b_map field in xfsprogs libxfs and
> > > > > initialize ->b_maps similar to the kernel for nmap == 1 buffers. Given
> > > > > the lack of overlap of uncached buffers between xfsprogs and the kernel
> > > > > (I'm not sure there are other cases where such buffers are commonly
> > > > > handled), I don't personally think one way is notably better than the
> > > > > other.
> > > > > 
> > > > > The tradeoffs seem to be that this patch is fairly localized but leaves
> > > > > the potentially different states for uncached buffers in kernel vs.
> > > > > xfsprogs context. The above approach addresses that issue at the cost of
> > > > > slightly increasing the size of xfs_buf in userspace for something that
> > > > > may not ever be necessary outside of an isolated bit of code. It also
> > > > > only requires a change to xfsprogs libxfs.
> > > > > 
> > > > > Given the tradeoffs, I have no real preference on which approach we
> > > > > take. Do you prefer the latter? If so and there are no other objections,
> > > > > I'll send a patch along those lines.
> > > > 
> > > > I'd prefer the latter (the bp->__b_map solution) simply so we don't
> > > > have to worry about it in future. The closer the kernel and
> > > > userspace buffer caches are in terms of behaviour, implementation
> > > > and structure members the less likely we are to have problems
> > > > related to the kernel using uncached buffers...
> > > > 
> > > > FWIW, my wish list contains porting the kernel side buffer cache
> > > > implementation to userspace to solve the scalability problems in the
> > > > current userspace implementation that are exposed when repair
> > > > hammers multiple AGs at once. Hence anything that gets kernel +
> > > > userspace closer together helps get us (minutely) closer to that
> > > > goal....
> > > > 
> > > 
> > > Sounds reasonable to me. I'll post the xfsprogs __b_map patch I cooked
> > > up and started testing yesterday a bit later...
> > 
> > Did this ever happen?  Someone was complaining about this on IRC just now.
> > 
> 
> Yep, see xfsprogs commit 2c6c632820 ("libxfs: init ->b_maps on contig
> buffers for uncached compatibility") in for-next.

Heh, missed that, thanks!  If that person pops up again I'll know what to
tell them.

--D

> 
> Brian
> 
> > --D
> > 
> > > 
> > > Brian
> > > 
> > > > Cheers,
> > > > 
> > > > Dave.
> > > > -- 
> > > > Dave Chinner
> > > > david@fromorbit.com

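For readers following the thread, the approach settled on above (an inline __b_map in the userspace buffer so that ->b_maps is always valid, as xfs_buf_get_maps() arranges in the kernel) can be sketched roughly as follows. This is a minimal illustration only, not the code from xfsprogs commit 2c6c632820: struct demo_buf, struct demo_buf_map, demo_buf_init_maps() and demo_buf_free_maps() are hypothetical, simplified stand-ins, and only the __b_map/b_maps field names are taken from the discussion.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the real xfsprogs definitions. */
typedef long long demo_daddr_t;

struct demo_buf_map {
	demo_daddr_t	bm_bn;		/* first disk block of this extent */
	int		bm_len;		/* extent length in basic blocks */
};

struct demo_buf {
	struct demo_buf_map	*b_maps;	/* never NULL after a successful init */
	struct demo_buf_map	__b_map;	/* inline map for the nmaps == 1 case */
	int			b_nmaps;
};

/*
 * Point b_maps at the inline map for contiguous buffers and allocate an
 * array only for discontiguous ones, so shared code can always read
 * bp->b_maps[0]. Returns 0 on success, -1 on allocation failure.
 */
static int
demo_buf_init_maps(struct demo_buf *bp, const struct demo_buf_map *maps,
		   int nmaps)
{
	int	i;

	assert(nmaps >= 1);
	if (nmaps == 1) {
		bp->b_maps = &bp->__b_map;
	} else {
		bp->b_maps = calloc(nmaps, sizeof(*bp->b_maps));
		if (!bp->b_maps)
			return -1;
	}
	for (i = 0; i < nmaps; i++)
		bp->b_maps[i] = maps[i];
	bp->b_nmaps = nmaps;
	return 0;
}

/* Free the map array only if it was allocated separately. */
static void
demo_buf_free_maps(struct demo_buf *bp)
{
	if (bp->b_maps != &bp->__b_map)
		free(bp->b_maps);
	bp->b_maps = NULL;
}
```

The cost mentioned in the thread shows up here as the extra struct demo_buf_map embedded in every buffer; in exchange, no caller needs to know whether a buffer is contiguous before dereferencing ->b_maps.
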
Patch
diff mbox

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 9b5aae2..ec2fd03 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -583,6 +583,7 @@  xfs_sb_verify(
 {
 	struct xfs_mount *mp = bp->b_target->bt_mount;
 	struct xfs_sb	sb;
+	bool		primary_sb;
 
 	/*
 	 * Use call variant which doesn't convert quota flags from disk 
@@ -592,11 +593,14 @@  xfs_sb_verify(
 
 	/*
 	 * Only check the in progress field for the primary superblock as
-	 * mkfs.xfs doesn't clear it from secondary superblocks.
+	 * mkfs.xfs doesn't clear it from secondary superblocks. Note that
+	 * userspace libxfs does not have uncached buffers and so b_maps is not
+	 * used for the sb buffer.
 	 */
-	return xfs_mount_validate_sb(mp, &sb,
-				     bp->b_maps[0].bm_bn == XFS_SB_DADDR,
-				     check_version);
+	primary_sb = (bp->b_bn == XFS_BUF_DADDR_NULL &&
+		      bp->b_maps[0].bm_bn == XFS_SB_DADDR) ||
+		     bp->b_bn == XFS_SB_DADDR;
+	return xfs_mount_validate_sb(mp, &sb, primary_sb, check_version);
 }
 
 /*
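
As a standalone illustration of the check in the hunk above, the sketch below evaluates the same condition on simplified types. It is an assumption-laden demo rather than libxfs code: struct demo_buf, demo_is_primary_sb() and the DEMO_* constants are hypothetical stand-ins for the xfs_buf fields, XFS_SB_DADDR and XFS_BUF_DADDR_NULL, and the explicit NULL test on b_maps is an extra guard that the RFC patch itself does not contain.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libxfs types and constants. */
typedef int64_t demo_daddr_t;

#define DEMO_SB_DADDR		((demo_daddr_t)0)	/* stands in for XFS_SB_DADDR */
#define DEMO_BUF_DADDR_NULL	((demo_daddr_t)-1)	/* stands in for XFS_BUF_DADDR_NULL */

struct demo_buf_map {
	demo_daddr_t	bm_bn;
};

struct demo_buf {
	demo_daddr_t		b_bn;		/* cache address, or the NULL daddr if uncached */
	struct demo_buf_map	*b_maps;	/* may be NULL in this userspace-like model */
};

/*
 * Mirror of the patch's primary_sb expression: consult b_maps only when
 * the buffer is uncached (b_bn is the NULL daddr), otherwise compare b_bn
 * directly. The b_maps NULL test is an added guard for this demo.
 */
static bool
demo_is_primary_sb(const struct demo_buf *bp)
{
	if (bp->b_bn == DEMO_BUF_DADDR_NULL)
		return bp->b_maps != NULL &&
		       bp->b_maps[0].bm_bn == DEMO_SB_DADDR;
	return bp->b_bn == DEMO_SB_DADDR;
}
```

Because the b_maps dereference sits behind the b_bn test, a userspace-style buffer with a valid b_bn and a NULL b_maps never reaches it, which is the property the patch relies on.
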