diff mbox series

[02/18] xfs: fix inode quota reservation checks

Message ID 159353172261.2864738.3201442261854530990.stgit@magnolia (mailing list archive)
State Superseded, archived
Headers show
Series xfs: remove xfs_disk_quot from incore dquot | expand

Commit Message

Darrick J. Wong June 30, 2020, 3:42 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

xfs_trans_dqresv is the function that we use to make reservations
against resource quotas.  Each resource contains two counters: the
q_core counter, which tracks resources allocated on disk; and the dquot
reservation counter, which tracks how much of that resource has either
been allocated or reserved by threads that are working on metadata
updates.

For disk blocks, we compare the proposed reservation counter against the
hard and soft limits to decide if we're going to fail the operation.
However, for inodes we inexplicably compare against the q_core counter,
not the incore reservation count.

Since the q_core counter is always lower than the reservation count and
we unlock the dquot between reservation and transaction commit, this
means that multiple threads can reserve the last inode count before we
hit the hard limit, and when they commit, we'll be well over the hard
limit.

Fix this by checking against the incore inode reservation counter, since
we would appear to maintain that correctly (and that's what we report in
GETQUOTA).

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_trans_dquot.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Allison Henderson June 30, 2020, 9:35 p.m. UTC | #1
On 6/30/20 8:42 AM, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_trans_dqresv is the function that we use to make reservations
> against resource quotas.  Each resource contains two counters: the
> q_core counter, which tracks resources allocated on disk; and the dquot
> reservation counter, which tracks how much of that resource has either
> been allocated or reserved by threads that are working on metadata
> updates.
> 
> For disk blocks, we compare the proposed reservation counter against the
> hard and soft limits to decide if we're going to fail the operation.
> However, for inodes we inexplicably compare against the q_core counter,
> not the incore reservation count.
> 
> Since the q_core counter is always lower than the reservation count and
> we unlock the dquot between reservation and transaction commit, this
> means that multiple threads can reserve the last inode count before we
> hit the hard limit, and when they commit, we'll be well over the hard
> limit.
> 
> Fix this by checking against the incore inode reservation counter, since
> we would appear to maintain that correctly (and that's what we report in
> GETQUOTA).
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Ok, makes sense
Reviewed-by: Allison Collins <allison.henderson@oracle.com>

> ---
>   fs/xfs/xfs_trans_dquot.c |    2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
> index c0f73b82c055..ed0ce8b301b4 100644
> --- a/fs/xfs/xfs_trans_dquot.c
> +++ b/fs/xfs/xfs_trans_dquot.c
> @@ -647,7 +647,7 @@ xfs_trans_dqresv(
>   			}
>   		}
>   		if (ninos > 0) {
> -			total_count = be64_to_cpu(dqp->q_core.d_icount) + ninos;
> +			total_count = dqp->q_res_icount + ninos;
>   			timer = be32_to_cpu(dqp->q_core.d_itimer);
>   			warns = be16_to_cpu(dqp->q_core.d_iwarns);
>   			warnlimit = defq->iwarnlimit;
>
Chandan Babu R July 1, 2020, 8:33 a.m. UTC | #2
On Tuesday 30 June 2020 9:12:02 PM IST Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_trans_dqresv is the function that we use to make reservations
> against resource quotas.  Each resource contains two counters: the
> q_core counter, which tracks resources allocated on disk; and the dquot
> reservation counter, which tracks how much of that resource has either
> been allocated or reserved by threads that are working on metadata
> updates.
> 
> For disk blocks, we compare the proposed reservation counter against the
> hard and soft limits to decide if we're going to fail the operation.
> However, for inodes we inexplicably compare against the q_core counter,
> not the incore reservation count.
> 
> Since the q_core counter is always lower than the reservation count and
> we unlock the dquot between reservation and transaction commit, this
> means that multiple threads can reserve the last inode count before we
> hit the hard limit, and when they commit, we'll be well over the hard
> limit.
> 
> Fix this by checking against the incore inode reservation counter, since
> we would appear to maintain that correctly (and that's what we report in
> GETQUOTA).
>

As mentioned above, q_core.d_icount's value can be less than what has been
reserved by various running tasks. Hence deriving new reservation count from
q_core.d_icount is incorrect.

Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>

> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_trans_dquot.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
> index c0f73b82c055..ed0ce8b301b4 100644
> --- a/fs/xfs/xfs_trans_dquot.c
> +++ b/fs/xfs/xfs_trans_dquot.c
> @@ -647,7 +647,7 @@ xfs_trans_dqresv(
>  			}
>  		}
>  		if (ninos > 0) {
> -			total_count = be64_to_cpu(dqp->q_core.d_icount) + ninos;
> +			total_count = dqp->q_res_icount + ninos;
>  			timer = be32_to_cpu(dqp->q_core.d_itimer);
>  			warns = be16_to_cpu(dqp->q_core.d_iwarns);
>  			warnlimit = defq->iwarnlimit;
> 
>
Christoph Hellwig July 1, 2020, 8:34 a.m. UTC | #3
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_trans_dquot.c b/fs/xfs/xfs_trans_dquot.c
index c0f73b82c055..ed0ce8b301b4 100644
--- a/fs/xfs/xfs_trans_dquot.c
+++ b/fs/xfs/xfs_trans_dquot.c
@@ -647,7 +647,7 @@  xfs_trans_dqresv(
 			}
 		}
 		if (ninos > 0) {
-			total_count = be64_to_cpu(dqp->q_core.d_icount) + ninos;
+			total_count = dqp->q_res_icount + ninos;
 			timer = be32_to_cpu(dqp->q_core.d_itimer);
 			warns = be16_to_cpu(dqp->q_core.d_iwarns);
 			warnlimit = defq->iwarnlimit;