diff mbox series

[2/3] xfs: avoid buffer deadlocks in inumbers/bulkstat

Message ID 161472410813.3421449.1691962515820573818.stgit@magnolia (mailing list archive)
State New, archived
Headers show
Series xfs: small fixes for 5.12 | expand

Commit Message

Darrick J. Wong March 2, 2021, 10:28 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

When we're servicing an INUMBERS or BULKSTAT request, grab an empty
transaction so that we don't hit an ABBA buffer deadlock if the inode
btree contains a cycle.

Found by fuzzing an inode btree pointer to introduce a cycle into the
tree (xfs/365).

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_itable.c |   46 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 4 deletions(-)

Comments

Christoph Hellwig March 3, 2021, 6:45 a.m. UTC | #1
On Tue, Mar 02, 2021 at 02:28:28PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> When we're servicing an INUMBERS or BULKSTAT request, grab an empty
> transaction so that we don't hit an ABBA buffer deadlock if the inode
> btree contains a cycle.
> 
> Found by fuzzing an inode btree pointer to introduce a cycle into the
> tree (xfs/365).
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

So basically you want to piggy back on the per-trans recursion using
xfs_trans_buf_item_match?  Why do we need the sb-counter for that?
Can the comments be a little more clear?  Why don't we want that
elsewhere where we're walking the btrees?
Darrick J. Wong March 3, 2021, 6:57 p.m. UTC | #2
On Wed, Mar 03, 2021 at 07:45:56AM +0100, Christoph Hellwig wrote:
> On Tue, Mar 02, 2021 at 02:28:28PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > When we're servicing an INUMBERS or BULKSTAT request, grab an empty
> > transaction so that we don't hit an ABBA buffer deadlock if the inode
> > btree contains a cycle.
> > 
> > Found by fuzzing an inode btree pointer to introduce a cycle into the
> > tree (xfs/365).
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> 
> So basically you want to piggy back on the per-trans recursion using
> xfs_trans_buf_item_match?

Right.

> Why do we need the sb-counter for that?

I was about to say that we need freeze protection to prevent fsfreeze
and inumbers/bulkstat stalling each other out on the buffer LRU, but
then I remembered that Brian changed freeze to wait for IO to complete
without dumping the buffer cache this cycle.  So, I don't think freeze
protection is necessary anymore.

> Can the comments be a little more clear?

/*
 * Grab an empty transaction so that we can use the recursive locking
 * support to detect tree cycles instead of deadlocking.
 */

How does that sound?

> Why don't we want that elsewhere where we're walking the btrees?

I'm merely playing whackamole with tree walks here.  I would guess that
quotacheck will be next for evaluation.

It also occurs to me that inumbers/bulkstat should probably not be
calling copy_to_user() separately for every single record.

--D
diff mbox series

Patch

diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index ca310a125d1e..2339a1874efa 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -19,6 +19,7 @@ 
 #include "xfs_error.h"
 #include "xfs_icache.h"
 #include "xfs_health.h"
+#include "xfs_trans.h"
 
 /*
  * Bulk Stat
@@ -166,6 +167,7 @@  xfs_bulkstat_one(
 		.formatter	= formatter,
 		.breq		= breq,
 	};
+	struct xfs_trans	*tp;
 	int			error;
 
 	ASSERT(breq->icount == 1);
@@ -175,9 +177,20 @@  xfs_bulkstat_one(
 	if (!bc.buf)
 		return -ENOMEM;
 
+	/*
+	 * Grab freeze protection and allocate an empty transaction so that
+	 * we avoid deadlocking if the inobt is corrupt.
+	 */
+	sb_start_write(breq->mp->m_super);
+	error = xfs_trans_alloc_empty(breq->mp, &tp);
+	if (error)
+		goto out;
+
 	error = xfs_bulkstat_one_int(breq->mp, breq->mnt_userns, NULL,
 				     breq->startino, &bc);
-
+	xfs_trans_cancel(tp);
+out:
+	sb_end_write(breq->mp->m_super);
 	kmem_free(bc.buf);
 
 	/*
@@ -241,6 +254,7 @@  xfs_bulkstat(
 		.formatter	= formatter,
 		.breq		= breq,
 	};
+	struct xfs_trans	*tp;
 	int			error;
 
 	if (breq->mnt_userns != &init_user_ns) {
@@ -256,9 +270,20 @@  xfs_bulkstat(
 	if (!bc.buf)
 		return -ENOMEM;
 
-	error = xfs_iwalk(breq->mp, NULL, breq->startino, breq->flags,
+	/*
+	 * Grab freeze protection and allocate an empty transaction so that
+	 * we avoid deadlocking if the inobt is corrupt.
+	 */
+	sb_start_write(breq->mp->m_super);
+	error = xfs_trans_alloc_empty(breq->mp, &tp);
+	if (error)
+		goto out;
+
+	error = xfs_iwalk(breq->mp, tp, breq->startino, breq->flags,
 			xfs_bulkstat_iwalk, breq->icount, &bc);
-
+	xfs_trans_cancel(tp);
+out:
+	sb_end_write(breq->mp->m_super);
 	kmem_free(bc.buf);
 
 	/*
@@ -371,13 +396,26 @@  xfs_inumbers(
 		.formatter	= formatter,
 		.breq		= breq,
 	};
+	struct xfs_trans	*tp;
 	int			error = 0;
 
 	if (xfs_bulkstat_already_done(breq->mp, breq->startino))
 		return 0;
 
-	error = xfs_inobt_walk(breq->mp, NULL, breq->startino, breq->flags,
+	/*
+	 * Grab freeze protection and allocate an empty transaction so that
+	 * we avoid deadlocking if the inobt is corrupt.
+	 */
+	sb_start_write(breq->mp->m_super);
+	error = xfs_trans_alloc_empty(breq->mp, &tp);
+	if (error)
+		goto out;
+
+	error = xfs_inobt_walk(breq->mp, tp, breq->startino, breq->flags,
 			xfs_inumbers_walk, breq->icount, &ic);
+	xfs_trans_cancel(tp);
+out:
+	sb_end_write(breq->mp->m_super);
 
 	/*
 	 * We found some inode groups, so clear the error status and return