diff mbox series

[11/11] xfs: flush speculative space allocations when we run out of space

Message ID 161142798066.2171939.9311024588681972086.stgit@magnolia (mailing list archive)
State Superseded
Headers show
Series xfs: try harder to reclaim space when we run out | expand

Commit Message

Darrick J. Wong Jan. 23, 2021, 6:53 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

If a fs modification (creation, file write, reflink, etc.) is unable to
reserve enough space to handle the modification, try clearing whatever
space the filesystem might have been hanging onto in the hopes of
speeding up the filesystem.  The flushing behavior will become
particularly important when we add deferred inode inactivation because
that will increase the amount of space that isn't actively tied to user
data.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_trans.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Christoph Hellwig Jan. 24, 2021, 9:48 a.m. UTC | #1
> +retry:
>  	/*
>  	 * Allocate the handle before we do our freeze accounting and setting up
>  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> @@ -285,6 +289,22 @@ xfs_trans_alloc(
>  	tp->t_firstblock = NULLFSBLOCK;
>  
>  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> +	if (error == -ENOSPC && tries > 0) {
> +		xfs_trans_cancel(tp);
> +
> +		/*
> +		 * We weren't able to reserve enough space for the transaction.
> +		 * Flush the other speculative space allocations to free space.
> +		 * Do not perform a synchronous scan because callers can hold
> +		 * other locks.
> +		 */
> +		error = xfs_blockgc_free_space(mp, NULL);
> +		if (error)
> +			return error;
> +
> +		tries--;
> +		goto retry;
> +	}
>  	if (error) {
>  		xfs_trans_cancel(tp);
>  		return error;

Why do we need to restart the whole function?  A failing
xfs_trans_reserve should restore tp to its initial state, and keeping
the SB_FREEZE_FS counter increased also doesn't look harmful as far as
I can tell.  So why not:

	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
	if (error == -ENOSPC) {
		/*
		 * We weren't able to reserve enough space for the transaction.
		 * Flush the other speculative space allocations to free space.
		 * Do not perform a synchronous scan because callers can hold
		 * other locks.
		 */
		error = xfs_blockgc_free_space(mp, NULL);
		if (error)
			return error;
		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
	}
 	if (error) {
  		xfs_trans_cancel(tp);
  		return error;

?
Brian Foster Jan. 25, 2021, 6:46 p.m. UTC | #2
On Sun, Jan 24, 2021 at 09:48:16AM +0000, Christoph Hellwig wrote:
> > +retry:
> >  	/*
> >  	 * Allocate the handle before we do our freeze accounting and setting up
> >  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> > @@ -285,6 +289,22 @@ xfs_trans_alloc(
> >  	tp->t_firstblock = NULLFSBLOCK;
> >  
> >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > +	if (error == -ENOSPC && tries > 0) {
> > +		xfs_trans_cancel(tp);
> > +
> > +		/*
> > +		 * We weren't able to reserve enough space for the transaction.
> > +		 * Flush the other speculative space allocations to free space.
> > +		 * Do not perform a synchronous scan because callers can hold
> > +		 * other locks.
> > +		 */
> > +		error = xfs_blockgc_free_space(mp, NULL);
> > +		if (error)
> > +			return error;
> > +
> > +		tries--;
> > +		goto retry;
> > +	}
> >  	if (error) {
> >  		xfs_trans_cancel(tp);
> >  		return error;
> 
> Why do we need to restart the whole function?  A failing
> xfs_trans_reserve should restore tp to its initial state, and keeping
> the SB_FREEZE_FS counter increased also doesn't look harmful as far as
> I can tell.  So why not:
> 
> 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> 	if (error == -ENOSPC) {
> 		/*
> 		 * We weren't able to reserve enough space for the transaction.
> 		 * Flush the other speculative space allocations to free space.
> 		 * Do not perform a synchronous scan because callers can hold
> 		 * other locks.
> 		 */
> 		error = xfs_blockgc_free_space(mp, NULL);
> 		if (error)
> 			return error;
> 		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> 	}
>  	if (error) {
>   		xfs_trans_cancel(tp);
>   		return error;
> 
> ?
> 

That looks cleaner to me, but similar to the earlier quota res patch I'm
wondering if this should be pushed down into xfs_trans_reserve() (or
lifted into a new xfs_trans_reserve_blks() helper called from there)
such that it can handle the various scan/retry scenarios in one place.

Brian
Darrick J. Wong Jan. 25, 2021, 8:02 p.m. UTC | #3
On Sun, Jan 24, 2021 at 09:48:16AM +0000, Christoph Hellwig wrote:
> > +retry:
> >  	/*
> >  	 * Allocate the handle before we do our freeze accounting and setting up
> >  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> > @@ -285,6 +289,22 @@ xfs_trans_alloc(
> >  	tp->t_firstblock = NULLFSBLOCK;
> >  
> >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > +	if (error == -ENOSPC && tries > 0) {
> > +		xfs_trans_cancel(tp);
> > +
> > +		/*
> > +		 * We weren't able to reserve enough space for the transaction.
> > +		 * Flush the other speculative space allocations to free space.
> > +		 * Do not perform a synchronous scan because callers can hold
> > +		 * other locks.
> > +		 */
> > +		error = xfs_blockgc_free_space(mp, NULL);
> > +		if (error)
> > +			return error;
> > +
> > +		tries--;
> > +		goto retry;
> > +	}
> >  	if (error) {
> >  		xfs_trans_cancel(tp);
> >  		return error;
> 
> Why do we need to restart the whole function?  A failing
> xfs_trans_reserve should restore tp to its initial state, and keeping
> the SB_FREEZE_FS counter increased also doesn't look harmful as far as
> I can tell.  So why not:
> 
> 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> 	if (error == -ENOSPC) {
> 		/*
> 		 * We weren't able to reserve enough space for the transaction.
> 		 * Flush the other speculative space allocations to free space.
> 		 * Do not perform a synchronous scan because callers can hold
> 		 * other locks.
> 		 */
> 		error = xfs_blockgc_free_space(mp, NULL);

xfs_blockgc_free_space runs the blockgc scan directly, which means that
it creates transactions to free blocks.  Since we can't have nested
transactions, we have to drop tp here.

--D

> 		if (error)
> 			return error;
> 		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> 	}
>  	if (error) {
>   		xfs_trans_cancel(tp);
>   		return error;
> 
> ?
Brian Foster Jan. 25, 2021, 9:06 p.m. UTC | #4
On Mon, Jan 25, 2021 at 12:02:16PM -0800, Darrick J. Wong wrote:
> On Sun, Jan 24, 2021 at 09:48:16AM +0000, Christoph Hellwig wrote:
> > > +retry:
> > >  	/*
> > >  	 * Allocate the handle before we do our freeze accounting and setting up
> > >  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> > > @@ -285,6 +289,22 @@ xfs_trans_alloc(
> > >  	tp->t_firstblock = NULLFSBLOCK;
> > >  
> > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > +	if (error == -ENOSPC && tries > 0) {
> > > +		xfs_trans_cancel(tp);
> > > +
> > > +		/*
> > > +		 * We weren't able to reserve enough space for the transaction.
> > > +		 * Flush the other speculative space allocations to free space.
> > > +		 * Do not perform a synchronous scan because callers can hold
> > > +		 * other locks.
> > > +		 */
> > > +		error = xfs_blockgc_free_space(mp, NULL);
> > > +		if (error)
> > > +			return error;
> > > +
> > > +		tries--;
> > > +		goto retry;
> > > +	}
> > >  	if (error) {
> > >  		xfs_trans_cancel(tp);
> > >  		return error;
> > 
> > Why do we need to restart the whole function?  A failing
> > xfs_trans_reserve should restore tp to its initial state, and keeping
> > the SB_FREEZE_FS counter increased also doesn't look harmful as far as
> > I can tell.  So why not:
> > 
> > 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > 	if (error == -ENOSPC) {
> > 		/*
> > 		 * We weren't able to reserve enough space for the transaction.
> > 		 * Flush the other speculative space allocations to free space.
> > 		 * Do not perform a synchronous scan because callers can hold
> > 		 * other locks.
> > 		 */
> > 		error = xfs_blockgc_free_space(mp, NULL);
> 
> xfs_blockgc_free_space runs the blockgc scan directly, which means that
> it creates transactions to free blocks.  Since we can't have nested
> transactions, we have to drop tp here.
> 

Technically, I don't think it's a problem to hold a transaction memory
allocation (and superblock write access?) while diving into the scanning
mechanism. BTW, this also looks like a landmine passing a NULL eofb into
the xfs_blockgc_free_space() tracepoint.

Brian

> --D
> 
> > 		if (error)
> > 			return error;
> > 		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > 	}
> >  	if (error) {
> >   		xfs_trans_cancel(tp);
> >   		return error;
> > 
> > ?
>
Darrick J. Wong Jan. 26, 2021, 12:29 a.m. UTC | #5
On Mon, Jan 25, 2021 at 04:06:28PM -0500, Brian Foster wrote:
> On Mon, Jan 25, 2021 at 12:02:16PM -0800, Darrick J. Wong wrote:
> > On Sun, Jan 24, 2021 at 09:48:16AM +0000, Christoph Hellwig wrote:
> > > > +retry:
> > > >  	/*
> > > >  	 * Allocate the handle before we do our freeze accounting and setting up
> > > >  	 * GFP_NOFS allocation context so that we avoid lockdep false positives
> > > > @@ -285,6 +289,22 @@ xfs_trans_alloc(
> > > >  	tp->t_firstblock = NULLFSBLOCK;
> > > >  
> > > >  	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > > +	if (error == -ENOSPC && tries > 0) {
> > > > +		xfs_trans_cancel(tp);
> > > > +
> > > > +		/*
> > > > +		 * We weren't able to reserve enough space for the transaction.
> > > > +		 * Flush the other speculative space allocations to free space.
> > > > +		 * Do not perform a synchronous scan because callers can hold
> > > > +		 * other locks.
> > > > +		 */
> > > > +		error = xfs_blockgc_free_space(mp, NULL);
> > > > +		if (error)
> > > > +			return error;
> > > > +
> > > > +		tries--;
> > > > +		goto retry;
> > > > +	}
> > > >  	if (error) {
> > > >  		xfs_trans_cancel(tp);
> > > >  		return error;
> > > 
> > > Why do we need to restart the whole function?  A failing
> > > xfs_trans_reserve should restore tp to its initial state, and keeping
> > > the SB_FREEZE_FS counter increased also doesn't look harmful as far as

I'm curious about your motivation for letting transaction nest here.
Seeing as the ENOSPC return should be infrequent, are you simply not
wanting to cycle the memory allocators and the FREEZE_FS counters?

Hm.  I guess at this point the only resources we hold are the FREEZE_FS
counter and *tp itself.  The transaction doesn't have any log space
grants or block reservation associated with it, and I guess we're not in
PF_MEMALLOC_NOFS mode either.  So I guess this is ok, except...

> > > I can tell.  So why not:
> > > 
> > > 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > 	if (error == -ENOSPC) {
> > > 		/*
> > > 		 * We weren't able to reserve enough space for the transaction.
> > > 		 * Flush the other speculative space allocations to free space.
> > > 		 * Do not perform a synchronous scan because callers can hold
> > > 		 * other locks.
> > > 		 */
> > > 		error = xfs_blockgc_free_space(mp, NULL);
> > 
> > xfs_blockgc_free_space runs the blockgc scan directly, which means that
> > it creates transactions to free blocks.  Since we can't have nested
> > transactions, we have to drop tp here.
> > 
> 
> Technically, I don't think it's a problem to hold a transaction memory
> allocation (and superblock write access?) while diving into the scanning
> mechanism.

...except that doing so will collide with what we've been telling Yafang
(as part of his series to detect nested transactions) as far as when is
the appropriate time to set current->journal_info/PF_MEMALLOC_NOFS.

> BTW, this also looks like a landmine passing a NULL eofb into
> the xfs_blockgc_free_space() tracepoint.

Errk, will fix that.

--D

> Brian
> 
> > --D
> > 
> > > 		if (error)
> > > 			return error;
> > > 		error = xfs_trans_reserve(tp, resp, blocks, rtextents);
> > > 	}
> > >  	if (error) {
> > >   		xfs_trans_cancel(tp);
> > >   		return error;
> > > 
> > > ?
> > 
>
Christoph Hellwig Jan. 27, 2021, 4:57 p.m. UTC | #6
On Mon, Jan 25, 2021 at 04:29:01PM -0800, Darrick J. Wong wrote:
> ...except that doing so will collide with what we've been telling Yafang
> (as part of his series to detect nested transactions) as far as when is
> the appropriate time to set current->journal_info/PF_MEMALLOC_NOFS.

Can't we do that based on a log/blk reservation?  If not I'm also fine
with going back to your original goto based loop, it just looked rather
cumbersome to me.
Darrick J. Wong Jan. 27, 2021, 9 p.m. UTC | #7
On Wed, Jan 27, 2021 at 04:57:34PM +0000, Christoph Hellwig wrote:
> On Mon, Jan 25, 2021 at 04:29:01PM -0800, Darrick J. Wong wrote:
> > ...except that doing so will collide with what we've been telling Yafang
> > (as part of his series to detect nested transactions) as far as when is
> > the appropriate time to set current->journal_info/PF_MEMALLOC_NOFS.
> 
> Can't we do that based on a log/blk reservation?  If not I'm also fine
> with going back to your original goto based loop, it just looked rather
> cumbersome to me.

Meh, we'll figure that out when that series gets closer to merging...

--D
diff mbox series

Patch

diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index e72730f85af1..2b92a4084bb8 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -20,6 +20,8 @@ 
 #include "xfs_trace.h"
 #include "xfs_error.h"
 #include "xfs_defer.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
 
 kmem_zone_t	*xfs_trans_zone;
 
@@ -256,8 +258,10 @@  xfs_trans_alloc(
 	struct xfs_trans	**tpp)
 {
 	struct xfs_trans	*tp;
+	unsigned int		tries = 1;
 	int			error;
 
+retry:
 	/*
 	 * Allocate the handle before we do our freeze accounting and setting up
 	 * GFP_NOFS allocation context so that we avoid lockdep false positives
@@ -285,6 +289,22 @@  xfs_trans_alloc(
 	tp->t_firstblock = NULLFSBLOCK;
 
 	error = xfs_trans_reserve(tp, resp, blocks, rtextents);
+	if (error == -ENOSPC && tries > 0) {
+		xfs_trans_cancel(tp);
+
+		/*
+		 * We weren't able to reserve enough space for the transaction.
+		 * Flush the other speculative space allocations to free space.
+		 * Do not perform a synchronous scan because callers can hold
+		 * other locks.
+		 */
+		error = xfs_blockgc_free_space(mp, NULL);
+		if (error)
+			return error;
+
+		tries--;
+		goto retry;
+	}
 	if (error) {
 		xfs_trans_cancel(tp);
 		return error;