diff mbox series

[03/11] xfs: don't reclaim dquots with incore reservations

Message ID 161543195719.1947934.8218545606940173264.stgit@magnolia (mailing list archive)
State Deferred, archived
Series xfs: deferred inode inactivation

Commit Message

Darrick J. Wong March 11, 2021, 3:05 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

If a dquot has an incore reservation that exceeds the ondisk count, it
by definition has active incore state and must not be reclaimed.  Up to
this point every inode with an incore dquot reservation has always
retained a reference to the dquot so it was never possible for
xfs_qm_dquot_isolate to be called on a dquot with active state and zero
refcount, but this will soon change.

Deferred inode inactivation is about to reorganize how inodes are
inactivated by shunting all that work to a background workqueue.  In
order to avoid deadlocks with the quotaoff inode scan and reduce overall
memory requirements (since inodes can spend a lot of time waiting for
inactivation), inactive inodes will drop their dquot references while
they're waiting to be inactivated.

However, inactive inodes can have delalloc extents in the data fork or
any extents in the CoW fork.  Either of these contribute to the dquot's
incore reservation being larger than the resource count (i.e. they're
the reason the dquot still has active incore state), so we cannot allow
the dquot to be reclaimed.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_qm.c |   29 ++++++++++++++++++++++++-----
 fs/xfs/xfs_qm.h |   17 +++++++++++++++++
 2 files changed, 41 insertions(+), 5 deletions(-)
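
The condition described in the commit message can be modelled outside the kernel. The following is a minimal userspace sketch, not the kernel code: `struct model_res`, `struct model_dquot`, `model_has_incore_resv()` and `model_can_reclaim()` are hypothetical names introduced for illustration, and the model covers only the check this patch changes (reference count plus incore reservations), not the other conditions that xfs_qm_dquot_isolate tests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one resource counter of a dquot. */
struct model_res {
	uint64_t	reserved;	/* incore reservation */
	uint64_t	count;		/* resource count */
};

/* Hypothetical stand-in for the dquot fields this patch looks at. */
struct model_dquot {
	unsigned int		nrefs;	/* active references */
	struct model_res	blk;	/* data blocks */
	struct model_res	ino;	/* inodes */
	struct model_res	rtb;	/* realtime blocks */
};

/*
 * A reservation larger than the count means something (for example a
 * delalloc or CoW fork extent) still holds incore state in this dquot.
 */
static inline bool
model_has_incore_resv(
	const struct model_dquot	*dqp)
{
	return  dqp->blk.reserved > dqp->blk.count ||
		dqp->ino.reserved > dqp->ino.count ||
		dqp->rtb.reserved > dqp->rtb.count;
}

/* Reclaim is only allowed with zero references and no incore reservation. */
static inline bool
model_can_reclaim(
	const struct model_dquot	*dqp)
{
	return dqp->nrefs == 0 && !model_has_incore_resv(dqp);
}
```

In this model, a dquot with zero references but a block reservation of 8 against a count of 4 is not reclaimable, which is the new case that deferred inactivation makes possible.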

Comments

Christoph Hellwig March 15, 2021, 6:29 p.m. UTC | #1
Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

Dave Chinner March 22, 2021, 11:31 p.m. UTC | #2
On Wed, Mar 10, 2021 at 07:05:57PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If a dquot has an incore reservation that exceeds the ondisk count, it
> by definition has active incore state and must not be reclaimed.  Up to
> this point every inode with an incore dquot reservation has always
> retained a reference to the dquot so it was never possible for
> xfs_qm_dquot_isolate to be called on a dquot with active state and zero
> refcount, but this will soon change.
> 
> Deferred inode inactivation is about to reorganize how inodes are
> inactivated by shunting all that work to a background workqueue.  In
> order to avoid deadlocks with the quotaoff inode scan and reduce overall
> memory requirements (since inodes can spend a lot of time waiting for
> inactivation), inactive inodes will drop their dquot references while
> they're waiting to be inactivated.
> 
> However, inactive inodes can have delalloc extents in the data fork or
> any extents in the CoW fork.  Either of these contribute to the dquot's
> incore reservation being larger than the resource count (i.e. they're
> the reason the dquot still has active incore state), so we cannot allow
> the dquot to be reclaimed.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
.....
>  static enum lru_status
>  xfs_qm_dquot_isolate(
>  	struct list_head	*item,
> @@ -427,10 +441,15 @@ xfs_qm_dquot_isolate(
>  		goto out_miss_busy;
>  
>  	/*
> -	 * This dquot has acquired a reference in the meantime remove it from
> -	 * the freelist and try again.
> +	 * Either this dquot has incore reservations or it has acquired a
> +	 * reference.  Remove it from the freelist and try again.
> +	 *
> +	 * Inodes tagged for inactivation drop their dquot references to avoid
> +	 * deadlocks with quotaoff.  If these inodes have delalloc reservations
> +	 * in the data fork or any extents in the CoW fork, these contribute
> +	 * to the dquot's incore block reservation exceeding the count.
>  	 */
> -	if (dqp->q_nrefs) {
> +	if (xfs_dquot_has_incore_resv(dqp) || dqp->q_nrefs) {
>  		xfs_dqunlock(dqp);
>  		XFS_STATS_INC(dqp->q_mount, xs_qm_dqwants);
>  

This means we can have dquots with no references that aren't on
the free list and aren't actually referenced by any inode, either.

So if we now shut down the filesystem, what frees these dquots?
Are we relying on xfs_qm_dqpurge_all() to find all these dquots
and xfs_qm_dqpurge() guaranteeing that they are always cleaned
and freed?

Cheers,

Dave.

Darrick J. Wong March 23, 2021, 12:01 a.m. UTC | #3
On Tue, Mar 23, 2021 at 10:31:39AM +1100, Dave Chinner wrote:
> This means we can have dquots with no references that aren't on
> the free list and aren't actually referenced by any inode, either.
> 
> So if we now shut down the filesystem, what frees these dquots?
> Are we relying on xfs_qm_dqpurge_all() to find all these dquots
> and xfs_qm_dqpurge() guaranteeing that they are always cleaned
> and freed?

Yes.  Want me to add that to the comment?

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

Dave Chinner March 23, 2021, 1:48 a.m. UTC | #4
On Mon, Mar 22, 2021 at 05:01:11PM -0700, Darrick J. Wong wrote:
> On Tue, Mar 23, 2021 at 10:31:39AM +1100, Dave Chinner wrote:
> > This means we can have dquots with no references that aren't on
> > the free list and aren't actually referenced by any inode, either.
> > 
> > So if we now shut down the filesystem, what frees these dquots?
> > Are we relying on xfs_qm_dqpurge_all() to find all these dquots
> > and xfs_qm_dqpurge() guaranteeing that they are always cleaned
> > and freed?
> 
> Yes.  Want me to add that to the comment?

Yes Please!

-Dave.
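
The exchange above relies on xfs_qm_dqpurge() coping with dquots that are no longer on the LRU, and the patch relaxes the assertion in that function accordingly. The following is a rough userspace sketch of the relaxed check, assuming a simplified model: the `MODEL_*` flag values, `enum model_dqtype`, `model_quota_active_flag()` and `model_purge_state_ok()` are hypothetical names for illustration and do not reproduce the kernel's real flag values.

```c
#include <stdbool.h>

/* Hypothetical quota types, mirroring user/group/project quotas. */
enum model_dqtype {
	MODEL_DQTYPE_USER,
	MODEL_DQTYPE_GROUP,
	MODEL_DQTYPE_PROJ,
};

/* Hypothetical flag bits; the real XFS_*QUOTA_ACTIVE values differ. */
#define MODEL_UQUOTA_ACTIVE	(1u << 0)
#define MODEL_GQUOTA_ACTIVE	(1u << 1)
#define MODEL_PQUOTA_ACTIVE	(1u << 2)

/* Map a quota type to the "accounting is active" flag for that type. */
static inline unsigned int
model_quota_active_flag(
	enum model_dqtype	type)
{
	switch (type) {
	case MODEL_DQTYPE_USER:
		return MODEL_UQUOTA_ACTIVE;
	case MODEL_DQTYPE_GROUP:
		return MODEL_GQUOTA_ACTIVE;
	case MODEL_DQTYPE_PROJ:
		return MODEL_PQUOTA_ACTIVE;
	}
	return 0;
}

/*
 * Model of the relaxed assertion: while the quota type is still active,
 * a dquot being purged must be on the LRU.  Once the type has been
 * deactivated (quotaoff), an off-LRU dquot is tolerated.
 */
static inline bool
model_purge_state_ok(
	unsigned int		qflags,
	enum model_dqtype	type,
	bool			on_lru)
{
	return !(qflags & model_quota_active_flag(type)) || on_lru;
}
```

With user quota still active in `qflags`, an off-LRU user dquot fails the check in this model; once the user flag is cleared, the same dquot passes.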

Patch

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index bfa4164990b1..b3ce04dec181 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -166,9 +166,14 @@  xfs_qm_dqpurge(
 
 	/*
 	 * We move dquots to the freelist as soon as their reference count
-	 * hits zero, so it really should be on the freelist here.
+	 * hits zero, so it really should be on the freelist here.  If we're
+	 * running quotaoff, it's possible that we're purging a zero-refcount
+	 * dquot with active incore reservation because there are inodes
+	 * awaiting inactivation.  Dquots in this state will not be on the LRU
+	 * but it's quotaoff, so we don't care.
 	 */
-	ASSERT(!list_empty(&dqp->q_lru));
+	ASSERT(!(mp->m_qflags & xfs_quota_active_flag(xfs_dquot_type(dqp))) ||
+	       !list_empty(&dqp->q_lru));
 	list_lru_del(&qi->qi_lru, &dqp->q_lru);
 	XFS_STATS_DEC(mp, xs_qm_dquot_unused);
 
@@ -411,6 +416,15 @@  struct xfs_qm_isolate {
 	struct list_head	dispose;
 };
 
+static inline bool
+xfs_dquot_has_incore_resv(
+	struct xfs_dquot	*dqp)
+{
+	return  dqp->q_blk.reserved > dqp->q_blk.count ||
+		dqp->q_ino.reserved > dqp->q_ino.count ||
+		dqp->q_rtb.reserved > dqp->q_rtb.count;
+}
+
 static enum lru_status
 xfs_qm_dquot_isolate(
 	struct list_head	*item,
@@ -427,10 +441,15 @@  xfs_qm_dquot_isolate(
 		goto out_miss_busy;
 
 	/*
-	 * This dquot has acquired a reference in the meantime remove it from
-	 * the freelist and try again.
+	 * Either this dquot has incore reservations or it has acquired a
+	 * reference.  Remove it from the freelist and try again.
+	 *
+	 * Inodes tagged for inactivation drop their dquot references to avoid
+	 * deadlocks with quotaoff.  If these inodes have delalloc reservations
+	 * in the data fork or any extents in the CoW fork, these contribute
+	 * to the dquot's incore block reservation exceeding the count.
 	 */
-	if (dqp->q_nrefs) {
+	if (xfs_dquot_has_incore_resv(dqp) || dqp->q_nrefs) {
 		xfs_dqunlock(dqp);
 		XFS_STATS_INC(dqp->q_mount, xs_qm_dqwants);
 
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index e3dabab44097..78f90935e91e 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -105,6 +105,23 @@  xfs_quota_inode(struct xfs_mount *mp, xfs_dqtype_t type)
 	return NULL;
 }
 
+static inline unsigned int
+xfs_quota_active_flag(
+	xfs_dqtype_t		type)
+{
+	switch (type) {
+	case XFS_DQTYPE_USER:
+		return XFS_UQUOTA_ACTIVE;
+	case XFS_DQTYPE_GROUP:
+		return XFS_GQUOTA_ACTIVE;
+	case XFS_DQTYPE_PROJ:
+		return XFS_PQUOTA_ACTIVE;
+	default:
+		ASSERT(0);
+	}
+	return 0;
+}
+
 extern void	xfs_trans_mod_dquot(struct xfs_trans *tp, struct xfs_dquot *dqp,
 				    uint field, int64_t delta);
 extern void	xfs_trans_dqjoin(struct xfs_trans *, struct xfs_dquot *);