diff mbox series

[1/1] xfs: check free AG space when making per-AG reservations

Message ID 162181809311.203030.14398379924057321012.stgit@locust (mailing list archive)
State Accepted
Headers show
Series xfs: fixes for online shrink | expand

Commit Message

Darrick J. Wong May 24, 2021, 1:01 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

The new online shrink code exposed a gap in the per-AG reservation
code, which is that we only return ENOSPC to callers if the entire fs
doesn't have enough free blocks.  Except for debugging mode, the
reservation init code doesn't ever check that there's enough free space
in that AG to cover the reservation.

Not having enough space is not considered an immediate fatal error that
requires filesystem offlining because (a) it's shouldn't be possible to
wind up in that state through normal file operations and (b) even if
one did, freeing data blocks would recover the situation.

However, online shrink now needs to know if shrinking would not leave
enough space so that it can abort the shrink operation.  Hence we need
to promote this assertion into an actual error return.

Observed by running xfs/168 with a 1k block size, though in theory this
could happen with any configuration.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_ag_resv.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

Comments

Brian Foster May 24, 2021, 10:42 a.m. UTC | #1
On Sun, May 23, 2021 at 06:01:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> The new online shrink code exposed a gap in the per-AG reservation
> code, which is that we only return ENOSPC to callers if the entire fs
> doesn't have enough free blocks.  Except for debugging mode, the
> reservation init code doesn't ever check that there's enough free space
> in that AG to cover the reservation.
> 
> Not having enough space is not considered an immediate fatal error that
> requires filesystem offlining because (a) it's shouldn't be possible to
> wind up in that state through normal file operations and (b) even if
> one did, freeing data blocks would recover the situation.
> 
> However, online shrink now needs to know if shrinking would not leave
> enough space so that it can abort the shrink operation.  Hence we need
> to promote this assertion into an actual error return.
> 
> Observed by running xfs/168 with a 1k block size, though in theory this
> could happen with any configuration.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_ag_resv.c |   18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
> index e32a1833d523..bbfea8022a3b 100644
> --- a/fs/xfs/libxfs/xfs_ag_resv.c
> +++ b/fs/xfs/libxfs/xfs_ag_resv.c
> @@ -325,10 +325,22 @@ xfs_ag_resv_init(
>  		error2 = xfs_alloc_pagf_init(mp, tp, pag->pag_agno, 0);
>  		if (error2)
>  			return error2;
> -		ASSERT(xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
> -		       xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <=
> -		       pag->pagf_freeblks + pag->pagf_flcount);
> +
> +		/*
> +		 * If there isn't enough space in the AG to satisfy the
> +		 * reservation, let the caller know that there wasn't enough
> +		 * space.  Callers are responsible for deciding what to do
> +		 * next, since (in theory) we can stumble along with
> +		 * insufficient reservation if data blocks are being freed to
> +		 * replenish the AG's free space.
> +		 */
> +		if (!error &&
> +		    xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
> +		    xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved >
> +		    pag->pagf_freeblks + pag->pagf_flcount)
> +			error = -ENOSPC;
>  	}
> +
>  	return error;
>  }
>  
>
Carlos Maiolino May 24, 2021, 1:06 p.m. UTC | #2
On Sun, May 23, 2021 at 06:01:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> The new online shrink code exposed a gap in the per-AG reservation
> code, which is that we only return ENOSPC to callers if the entire fs
> doesn't have enough free blocks.  Except for debugging mode, the
> reservation init code doesn't ever check that there's enough free space
> in that AG to cover the reservation.
> 
> Not having enough space is not considered an immediate fatal error that
> requires filesystem offlining because (a) it's shouldn't be possible to
> wind up in that state through normal file operations and (b) even if
> one did, freeing data blocks would recover the situation.
> 
> However, online shrink now needs to know if shrinking would not leave
> enough space so that it can abort the shrink operation.  Hence we need
> to promote this assertion into an actual error return.
> 
> Observed by running xfs/168 with a 1k block size, though in theory this
> could happen with any configuration.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/libxfs/xfs_ag_resv.c |   18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)

Looks good

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> 
> 
> diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
> index e32a1833d523..bbfea8022a3b 100644
> --- a/fs/xfs/libxfs/xfs_ag_resv.c
> +++ b/fs/xfs/libxfs/xfs_ag_resv.c
> @@ -325,10 +325,22 @@ xfs_ag_resv_init(
>  		error2 = xfs_alloc_pagf_init(mp, tp, pag->pag_agno, 0);
>  		if (error2)
>  			return error2;
> -		ASSERT(xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
> -		       xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <=
> -		       pag->pagf_freeblks + pag->pagf_flcount);
> +
> +		/*
> +		 * If there isn't enough space in the AG to satisfy the
> +		 * reservation, let the caller know that there wasn't enough
> +		 * space.  Callers are responsible for deciding what to do
> +		 * next, since (in theory) we can stumble along with
> +		 * insufficient reservation if data blocks are being freed to
> +		 * replenish the AG's free space.
> +		 */
> +		if (!error &&
> +		    xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
> +		    xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved >
> +		    pag->pagf_freeblks + pag->pagf_flcount)
> +			error = -ENOSPC;
>  	}
> +
>  	return error;
>  }
>  
>
Gao Xiang May 24, 2021, 3:09 p.m. UTC | #3
On Sun, May 23, 2021 at 06:01:33PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> The new online shrink code exposed a gap in the per-AG reservation
> code, which is that we only return ENOSPC to callers if the entire fs
> doesn't have enough free blocks.  Except for debugging mode, the
> reservation init code doesn't ever check that there's enough free space
> in that AG to cover the reservation.
> 
> Not having enough space is not considered an immediate fatal error that
> requires filesystem offlining because (a) it's shouldn't be possible to
> wind up in that state through normal file operations and (b) even if
> one did, freeing data blocks would recover the situation.
> 
> However, online shrink now needs to know if shrinking would not leave
> enough space so that it can abort the shrink operation.  Hence we need
> to promote this assertion into an actual error return.
> 
> Observed by running xfs/168 with a 1k block size, though in theory this
> could happen with any configuration.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Many thanks for the fix!

Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>

Thanks,
Gao Xiang
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
index e32a1833d523..bbfea8022a3b 100644
--- a/fs/xfs/libxfs/xfs_ag_resv.c
+++ b/fs/xfs/libxfs/xfs_ag_resv.c
@@ -325,10 +325,22 @@  xfs_ag_resv_init(
 		error2 = xfs_alloc_pagf_init(mp, tp, pag->pag_agno, 0);
 		if (error2)
 			return error2;
-		ASSERT(xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
-		       xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <=
-		       pag->pagf_freeblks + pag->pagf_flcount);
+
+		/*
+		 * If there isn't enough space in the AG to satisfy the
+		 * reservation, let the caller know that there wasn't enough
+		 * space.  Callers are responsible for deciding what to do
+		 * next, since (in theory) we can stumble along with
+		 * insufficient reservation if data blocks are being freed to
+		 * replenish the AG's free space.
+		 */
+		if (!error &&
+		    xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
+		    xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved >
+		    pag->pagf_freeblks + pag->pagf_flcount)
+			error = -ENOSPC;
 	}
+
 	return error;
 }