diff mbox

[3/5] xfs: fix bogus minleft manipulations

Message ID 1482436822-31546-4-git-send-email-hch@lst.de (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Christoph Hellwig Dec. 22, 2016, 8 p.m. UTC
We can't just set minleft to 0 when we're low on space - that's exactly
what we need minleft for: to protect space in the AG for btree block
allocations when we are low on free space.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_alloc.c      | 24 +++++++-----------------
 fs/xfs/libxfs/xfs_bmap.c       |  3 ---
 fs/xfs/libxfs/xfs_bmap_btree.c | 14 --------------
 3 files changed, 7 insertions(+), 34 deletions(-)

Comments

Brian Foster Jan. 4, 2017, 6:19 p.m. UTC | #1
On Thu, Dec 22, 2016 at 09:00:20PM +0100, Christoph Hellwig wrote:
> We can't just set minleft to 0 when we're low on space - that's exactly
> what we need minleft for: to protect space in the AG for btree block
> allocations when we are low on free space.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Staring at this some more, I'm still not terribly fond of this, moreso
because I wonder how much more of this can be ripped out and whether the
low space allocator thing is still effective. Aside from that, afaict
the set_aside change should make it robust and addresses my previous
question in that it holds blocks out of all transaction reservations.

I'm also curious whether the set_aside patch alone addresses the
original problem, or setting minleft = 0 in one of these cases was
actually the cause.

>  fs/xfs/libxfs/xfs_alloc.c      | 24 +++++++-----------------
>  fs/xfs/libxfs/xfs_bmap.c       |  3 ---
>  fs/xfs/libxfs/xfs_bmap_btree.c | 14 --------------
>  3 files changed, 7 insertions(+), 34 deletions(-)
> 
...
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 19c05e9..62663a2 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3812,7 +3812,6 @@ xfs_bmap_btalloc(
>  		args.fsbno = 0;
>  		args.type = XFS_ALLOCTYPE_FIRST_AG;
>  		args.total = ap->minlen;
> -		args.minleft = 0;
>  		if ((error = xfs_alloc_vextent(&args)))
>  			return error;
>  		ap->dfops->dop_low = true;
...
> diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
> index d6330c2..9581ee8 100644
> --- a/fs/xfs/libxfs/xfs_bmap_btree.c
> +++ b/fs/xfs/libxfs/xfs_bmap_btree.c
> @@ -499,20 +499,6 @@ xfs_bmbt_alloc_block(
>  		goto try_another_ag;
>  	}
>  
> -	if (args.fsbno == NULLFSBLOCK && args.minleft) {
> -		/*
> -		 * Could not find an AG with enough free space to satisfy
> -		 * a full btree split.  Try again without minleft and if
> -		 * successful activate the lowspace algorithm.
> -		 */
> -		args.fsbno = 0;
> -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> -		args.minleft = 0;
> -		error = xfs_alloc_vextent(&args);
> -		if (error)
> -			goto error0;
> -		cur->bc_private.b.dfops->dop_low = true;
> -	}

We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
just above, we retain the retry that appears analogous to this one (in
that it activates the low space algo) and just drop the minleft = 0 bit.
Here we are dropping the whole thing. Any reason not to be consistent
one way or the other? (Though do note that we don't check firstblock
here...).

Brian

>  	if (args.fsbno == NULLFSBLOCK) {
>  		XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
>  		*stat = 0;
> -- 
> 2.1.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 8, 2017, 10:36 a.m. UTC | #2
On Wed, Jan 04, 2017 at 01:19:33PM -0500, Brian Foster wrote:
> Staring at this some more, I'm still not terribly fond of this, moreso
> because I wonder how much more of this can be ripped out

What else do you have in mind?

> and whether the
> low space allocator thing is still effective.

That's a good question.  Another item added to my not critical allocator
TODO list, which keeps growing..

> Aside from that, afaict
> the set_aside change should make it robust and addresses my previous
> question in that it holds blocks out of all transaction reservations.
> 
> I'm also curious whether the set_aside patch alone addresses the
> original problem, or setting minleft = 0 in one of these cases was
> actually the cause.

I think I needed both changes, but it's been a while and I'll have to
retest.

> > -		/*
> > -		 * Could not find an AG with enough free space to satisfy
> > -		 * a full btree split.  Try again without minleft and if
> > -		 * successful activate the lowspace algorithm.
> > -		 */
> > -		args.fsbno = 0;
> > -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> > -		args.minleft = 0;
> > -		error = xfs_alloc_vextent(&args);
> > -		if (error)
> > -			goto error0;
> > -		cur->bc_private.b.dfops->dop_low = true;
> > -	}
> 
> We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
> just above, we retain the retry that appears analogous to this one (in
> that it activates the low space algo) and just drop the minleft = 0 bit.
> Here we are dropping the whole thing. Any reason not to be consistent
> one way or the other? (Though do note that we don't check firstblock
> here...).

xfs_bmap_btalloc is a bit different because the alignment question
comes into play as well, in addition to the non-binding AG selection
from the higher level allocator code.  But I suspect that there is a lot
of dead or questionable code in it, and it's been on my todo list
to audit xfs_bmap_btalloc, xfs_alloc_vectent and it's subfunction,
and make the argument passing, allocator modes and things like the
lowspace mode aswell as the firstblock magic a lot cleaner and properly
documented.
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brian Foster Jan. 8, 2017, 4:09 p.m. UTC | #3
On Sun, Jan 08, 2017 at 11:36:11AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 04, 2017 at 01:19:33PM -0500, Brian Foster wrote:
> > Staring at this some more, I'm still not terribly fond of this, moreso
> > because I wonder how much more of this can be ripped out
> 
> What else do you have in mind?
> 

Oh, I don't have anything better to offer atm. :P It's just that we're
messing with this little corner of the low space allocation mechanism
when a broader reassessment of the bigger picture seems more appropriate
(hence the low space allocator question). It sounds like doing that is
on your radar, though..

Sorry, I recognize there's a bug to fix here.. I guess I'm just trying
to convince myself this is as minimal as necessary to fix it.

> > and whether the
> > low space allocator thing is still effective.
> 
> That's a good question.  Another item added to my not critical allocator
> TODO list, which keeps growing..
> 

You're welcome. ;)

> > Aside from that, afaict
> > the set_aside change should make it robust and addresses my previous
> > question in that it holds blocks out of all transaction reservations.
> > 
> > I'm also curious whether the set_aside patch alone addresses the
> > original problem, or setting minleft = 0 in one of these cases was
> > actually the cause.
> 
> I think I needed both changes, but it's been a while and I'll have to
> retest.
> 

Ok, thanks.

> > > -		/*
> > > -		 * Could not find an AG with enough free space to satisfy
> > > -		 * a full btree split.  Try again without minleft and if
> > > -		 * successful activate the lowspace algorithm.
> > > -		 */
> > > -		args.fsbno = 0;
> > > -		args.type = XFS_ALLOCTYPE_FIRST_AG;
> > > -		args.minleft = 0;
> > > -		error = xfs_alloc_vextent(&args);
> > > -		if (error)
> > > -			goto error0;
> > > -		cur->bc_private.b.dfops->dop_low = true;
> > > -	}
> > 
> > We have a similar retry pattern in xfs_bmap_btalloc() where, in the hunk
> > just above, we retain the retry that appears analogous to this one (in
> > that it activates the low space algo) and just drop the minleft = 0 bit.
> > Here we are dropping the whole thing. Any reason not to be consistent
> > one way or the other? (Though do note that we don't check firstblock
> > here...).
> 
> xfs_bmap_btalloc is a bit different because the alignment question
> comes into play as well, in addition to the non-binding AG selection
> from the higher level allocator code.  But I suspect that there is a lot
> of dead or questionable code in it, and it's been on my todo list
> to audit xfs_bmap_btalloc, xfs_alloc_vectent and it's subfunction,
> and make the argument passing, allocator modes and things like the
> lowspace mode aswell as the firstblock magic a lot cleaner and properly
> documented.

I agree that is much needed and may very well kill some of this code
off...

In this particular case, I think it's probably safer to defer the
removal of the entire bmbt_alloc_block() hunk to that audit that will
take into consideration the broader context. IOWs, take the same
approach in bmbt_alloc_block() as you have in xfs_bmap_btalloc().

I'm not even sure the code is correct as it is, but if we're in a
situation where multiple bmbt block allocations are required, afaict
that hunk can affect subsequent bmbt block allocs in terms of how
aggressive the allocation request is (via allocation mode). I think that
also provides some logical separation between minleft changes and all of
this retry logic and low space allocator stuff, fwiw.

Brian

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig Jan. 9, 2017, 5:56 p.m. UTC | #4
On Sun, Jan 08, 2017 at 11:09:35AM -0500, Brian Foster wrote:
> In this particular case, I think it's probably safer to defer the
> removal of the entire bmbt_alloc_block() hunk to that audit that will
> take into consideration the broader context. IOWs, take the same
> approach in bmbt_alloc_block() as you have in xfs_bmap_btalloc().

Ok, will do.
--
To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index 0a46f84..fe92570 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -2635,12 +2635,10 @@  xfs_alloc_vextent(
 	xfs_agblock_t	agsize;	/* allocation group size */
 	int		error;
 	int		flags;	/* XFS_ALLOC_FLAG_... locking flags */
-	xfs_extlen_t	minleft;/* minimum left value, temp copy */
 	xfs_mount_t	*mp;	/* mount structure pointer */
 	xfs_agnumber_t	sagno;	/* starting allocation group number */
 	xfs_alloctype_t	type;	/* input allocation type */
 	int		bump_rotor = 0;
-	int		no_min = 0;
 	xfs_agnumber_t	rotorstep = xfs_rotorstep; /* inode32 agf stepper */
 
 	mp = args->mp;
@@ -2669,7 +2667,6 @@  xfs_alloc_vextent(
 		trace_xfs_alloc_vextent_badargs(args);
 		return 0;
 	}
-	minleft = args->minleft;
 
 	switch (type) {
 	case XFS_ALLOCTYPE_THIS_AG:
@@ -2680,9 +2677,7 @@  xfs_alloc_vextent(
 		 */
 		args->agno = XFS_FSB_TO_AGNO(mp, args->fsbno);
 		args->pag = xfs_perag_get(mp, args->agno);
-		args->minleft = 0;
 		error = xfs_alloc_fix_freelist(args, 0);
-		args->minleft = minleft;
 		if (error) {
 			trace_xfs_alloc_vextent_nofix(args);
 			goto error0;
@@ -2747,9 +2742,7 @@  xfs_alloc_vextent(
 		 */
 		for (;;) {
 			args->pag = xfs_perag_get(mp, args->agno);
-			if (no_min) args->minleft = 0;
 			error = xfs_alloc_fix_freelist(args, flags);
-			args->minleft = minleft;
 			if (error) {
 				trace_xfs_alloc_vextent_nofix(args);
 				goto error0;
@@ -2789,20 +2782,17 @@  xfs_alloc_vextent(
 			 * or switch to non-trylock mode.
 			 */
 			if (args->agno == sagno) {
-				if (no_min == 1) {
+				if (flags == 0) {
 					args->agbno = NULLAGBLOCK;
 					trace_xfs_alloc_vextent_allfailed(args);
 					break;
 				}
-				if (flags == 0) {
-					no_min = 1;
-				} else {
-					flags = 0;
-					if (type == XFS_ALLOCTYPE_START_BNO) {
-						args->agbno = XFS_FSB_TO_AGBNO(mp,
-							args->fsbno);
-						args->type = XFS_ALLOCTYPE_NEAR_BNO;
-					}
+
+				flags = 0;
+				if (type == XFS_ALLOCTYPE_START_BNO) {
+					args->agbno = XFS_FSB_TO_AGBNO(mp,
+						args->fsbno);
+					args->type = XFS_ALLOCTYPE_NEAR_BNO;
 				}
 			}
 			xfs_perag_put(args->pag);
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 19c05e9..62663a2 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3812,7 +3812,6 @@  xfs_bmap_btalloc(
 		args.fsbno = 0;
 		args.type = XFS_ALLOCTYPE_FIRST_AG;
 		args.total = ap->minlen;
-		args.minleft = 0;
 		if ((error = xfs_alloc_vextent(&args)))
 			return error;
 		ap->dfops->dop_low = true;
@@ -4344,8 +4343,6 @@  xfs_bmapi_allocate(
 	if (error)
 		return error;
 
-	if (bma->dfops->dop_low)
-		bma->minleft = 0;
 	if (bma->cur)
 		bma->cur->bc_private.b.firstblock = *bma->firstblock;
 	if (bma->blkno == NULLFSBLOCK)
diff --git a/fs/xfs/libxfs/xfs_bmap_btree.c b/fs/xfs/libxfs/xfs_bmap_btree.c
index d6330c2..9581ee8 100644
--- a/fs/xfs/libxfs/xfs_bmap_btree.c
+++ b/fs/xfs/libxfs/xfs_bmap_btree.c
@@ -499,20 +499,6 @@  xfs_bmbt_alloc_block(
 		goto try_another_ag;
 	}
 
-	if (args.fsbno == NULLFSBLOCK && args.minleft) {
-		/*
-		 * Could not find an AG with enough free space to satisfy
-		 * a full btree split.  Try again without minleft and if
-		 * successful activate the lowspace algorithm.
-		 */
-		args.fsbno = 0;
-		args.type = XFS_ALLOCTYPE_FIRST_AG;
-		args.minleft = 0;
-		error = xfs_alloc_vextent(&args);
-		if (error)
-			goto error0;
-		cur->bc_private.b.dfops->dop_low = true;
-	}
 	if (args.fsbno == NULLFSBLOCK) {
 		XFS_BTREE_TRACE_CURSOR(cur, XBT_EXIT);
 		*stat = 0;