diff mbox series

[1/7] xfs: add a function to deal with corrupt buffers post-verifiers

Message ID 158388763904.939165.1796274155705134706.stgit@magnolia (mailing list archive)
State Superseded
Headers show
Series xfs: fix errors in various verifiers | expand

Commit Message

Darrick J. Wong March 11, 2020, 12:47 a.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Add a helper function to get rid of buffers that we have decided are
corrupt after the verifiers have run.  This function is intended to
handle metadata checks that can't happen in the verifiers, such as
inter-block relationship checking.  Note that we now mark the buffer
stale so that it will not end up on any LRU and will be purged on
release.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_alloc.c     |    2 +-
 fs/xfs/libxfs/xfs_attr_leaf.c |    6 +++---
 fs/xfs/libxfs/xfs_btree.c     |    2 +-
 fs/xfs/libxfs/xfs_da_btree.c  |   10 +++++-----
 fs/xfs/libxfs/xfs_dir2_leaf.c |    2 +-
 fs/xfs/libxfs/xfs_dir2_node.c |    6 +++---
 fs/xfs/xfs_attr_inactive.c    |    6 +++---
 fs/xfs/xfs_attr_list.c        |    2 +-
 fs/xfs/xfs_buf.c              |   25 +++++++++++++++++++++++++
 fs/xfs/xfs_buf.h              |    2 ++
 fs/xfs/xfs_error.c            |    2 ++
 fs/xfs/xfs_inode.c            |    4 ++--
 12 files changed, 49 insertions(+), 20 deletions(-)

Comments

Dave Chinner March 11, 2020, 5:35 a.m. UTC | #1
On Tue, Mar 10, 2020 at 05:47:19PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Add a helper function to get rid of buffers that we have decided are
> corrupt after the verifiers have run.  This function is intended to
> handle metadata checks that can't happen in the verifiers, such as
> inter-block relationship checking.  Note that we now mark the buffer
> stale so that it will not end up on any LRU and will be purged on
> release.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
....
> diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> index 11a97bc35f70..9d26e37f78f5 100644
> --- a/fs/xfs/xfs_buf.c
> +++ b/fs/xfs/xfs_buf.c
> @@ -16,6 +16,8 @@
>  #include "xfs_log.h"
>  #include "xfs_errortag.h"
>  #include "xfs_error.h"
> +#include "xfs_trans.h"
> +#include "xfs_buf_item.h"
>  
>  static kmem_zone_t *xfs_buf_zone;
>  
> @@ -1572,6 +1574,29 @@ xfs_buf_zero(
>  	}
>  }
>  
> +/*
> + * Log a message about and stale a buffer that a caller has decided is corrupt.
> + *
> + * This function should be called for the kinds of metadata corruption that
> + * cannot be detect from a verifier, such as incorrect inter-block relationship
> + * data.  Do /not/ call this function from a verifier function.

So if it's called from a read verifier, the buffer will not have the
XBF_DONE flag set on it. Maybe we should assert that this flag is
set on the buffer? Yes, I know XBF_DONE will be set at write time,
but most verifiers are called from both the read and write path so
it should catch invalid use at read time...

> + * The buffer must not be dirty prior to the call.  Afterwards, the buffer will

Why can't it be dirty?

> + * be marked stale, but b_error will not be set.  The caller is responsible for
> + * releasing the buffer or fixing it.
> + */
> +void
> +__xfs_buf_mark_corrupt(
> +	struct xfs_buf		*bp,
> +	xfs_failaddr_t		fa)
> +{
> +	ASSERT(bp->b_log_item == NULL ||
> +	       !(bp->b_log_item->bli_flags & XFS_BLI_DIRTY));

XFS_BLI_DIRTY isn't a complete definition of a dirty buffer. What it
means is "modifications to this buffer are not yet
committed to the journal". It may have been linked into a
transaction but not modified, but it can still be XFS_BLI_DIRTY
because it is in the CIL. IOWs, transactions can cancel safely
aborted even when the items in it are dirty as long as the
transaction itself is clean and made no modifications to the object.

Hence I'm not sure what you are trying to protect against here?

The rest of the code looks fine.

Cheers,

Dave,
Darrick J. Wong March 11, 2020, 4:42 p.m. UTC | #2
On Wed, Mar 11, 2020 at 04:35:59PM +1100, Dave Chinner wrote:
> On Tue, Mar 10, 2020 at 05:47:19PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Add a helper function to get rid of buffers that we have decided are
> > corrupt after the verifiers have run.  This function is intended to
> > handle metadata checks that can't happen in the verifiers, such as
> > inter-block relationship checking.  Note that we now mark the buffer
> > stale so that it will not end up on any LRU and will be purged on
> > release.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> ....
> > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > index 11a97bc35f70..9d26e37f78f5 100644
> > --- a/fs/xfs/xfs_buf.c
> > +++ b/fs/xfs/xfs_buf.c
> > @@ -16,6 +16,8 @@
> >  #include "xfs_log.h"
> >  #include "xfs_errortag.h"
> >  #include "xfs_error.h"
> > +#include "xfs_trans.h"
> > +#include "xfs_buf_item.h"
> >  
> >  static kmem_zone_t *xfs_buf_zone;
> >  
> > @@ -1572,6 +1574,29 @@ xfs_buf_zero(
> >  	}
> >  }
> >  
> > +/*
> > + * Log a message about and stale a buffer that a caller has decided is corrupt.
> > + *
> > + * This function should be called for the kinds of metadata corruption that
> > + * cannot be detect from a verifier, such as incorrect inter-block relationship
> > + * data.  Do /not/ call this function from a verifier function.
> 
> So if it's called from a read verifier, the buffer will not have the
> XBF_DONE flag set on it. Maybe we should assert that this flag is
> set on the buffer? Yes, I know XBF_DONE will be set at write time,
> but most verifiers are called from both the read and write path so
> it should catch invalid use at read time...

<nod>

> 
> > + * The buffer must not be dirty prior to the call.  Afterwards, the buffer will
> 
> Why can't it be dirty?

There isn't a technical reason why this function cannot handle a dirty
buffer, but it seems like a mistake in data handling to dirty a buffer
and /then/ decide it's garbage.  That's what the lengthy assertion below
is about...

> > + * be marked stale, but b_error will not be set.  The caller is responsible for
> > + * releasing the buffer or fixing it.
> > + */
> > +void
> > +__xfs_buf_mark_corrupt(
> > +	struct xfs_buf		*bp,
> > +	xfs_failaddr_t		fa)
> > +{
> > +	ASSERT(bp->b_log_item == NULL ||
> > +	       !(bp->b_log_item->bli_flags & XFS_BLI_DIRTY));
> 
> XFS_BLI_DIRTY isn't a complete definition of a dirty buffer. What it
> means is "modifications to this buffer are not yet
> committed to the journal". It may have been linked into a
> transaction but not modified, but it can still be XFS_BLI_DIRTY
> because it is in the CIL. IOWs, transactions can cancel safely
> aborted even when the items in it are dirty as long as the
> transaction itself is clean and made no modifications to the object.
> 
> Hence I'm not sure what you are trying to protect against here?

...ah, ok, I'll just get rid of the assertion entirely then.

--D

> 
> The rest of the code looks fine.
> 
> Cheers,
> 
> Dave,
> -- 
> Dave Chinner
> david@fromorbit.com
diff mbox series

Patch

diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
index d8053bc96c4d..24b71e3d401a 100644
--- a/fs/xfs/libxfs/xfs_alloc.c
+++ b/fs/xfs/libxfs/xfs_alloc.c
@@ -721,7 +721,7 @@  xfs_alloc_update_counters(
 	xfs_trans_agblocks_delta(tp, len);
 	if (unlikely(be32_to_cpu(agf->agf_freeblks) >
 		     be32_to_cpu(agf->agf_length))) {
-		xfs_buf_corruption_error(agbp);
+		xfs_buf_mark_corrupt(agbp);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index fed537a4353d..208f8f56be0a 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -2346,7 +2346,7 @@  xfs_attr3_leaf_lookup_int(
 	xfs_attr3_leaf_hdr_from_disk(args->geo, &ichdr, leaf);
 	entries = xfs_attr3_leaf_entryp(leaf);
 	if (ichdr.count >= args->geo->blksize / 8) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		return -EFSCORRUPTED;
 	}
 
@@ -2365,11 +2365,11 @@  xfs_attr3_leaf_lookup_int(
 			break;
 	}
 	if (!(probe >= 0 && (!ichdr.count || probe < ichdr.count))) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		return -EFSCORRUPTED;
 	}
 	if (!(span <= 4 || be32_to_cpu(entry->hashval) == hashval)) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index fd300dc93ca4..0599a3ee7d88 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -1762,7 +1762,7 @@  xfs_btree_lookup_get_block(
 
 out_bad:
 	*blkp = NULL;
-	xfs_buf_corruption_error(bp);
+	xfs_buf_mark_corrupt(bp);
 	xfs_trans_brelse(cur->bc_tp, bp);
 	return -EFSCORRUPTED;
 }
diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c
index 875e04f82541..b6e03b58a5d6 100644
--- a/fs/xfs/libxfs/xfs_da_btree.c
+++ b/fs/xfs/libxfs/xfs_da_btree.c
@@ -590,7 +590,7 @@  xfs_da3_split(
 	node = oldblk->bp->b_addr;
 	if (node->hdr.info.forw) {
 		if (be32_to_cpu(node->hdr.info.forw) != addblk->blkno) {
-			xfs_buf_corruption_error(oldblk->bp);
+			xfs_buf_mark_corrupt(oldblk->bp);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -603,7 +603,7 @@  xfs_da3_split(
 	node = oldblk->bp->b_addr;
 	if (node->hdr.info.back) {
 		if (be32_to_cpu(node->hdr.info.back) != addblk->blkno) {
-			xfs_buf_corruption_error(oldblk->bp);
+			xfs_buf_mark_corrupt(oldblk->bp);
 			error = -EFSCORRUPTED;
 			goto out;
 		}
@@ -1624,7 +1624,7 @@  xfs_da3_node_lookup_int(
 		}
 
 		if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
-			xfs_buf_corruption_error(blk->bp);
+			xfs_buf_mark_corrupt(blk->bp);
 			return -EFSCORRUPTED;
 		}
 
@@ -1639,7 +1639,7 @@  xfs_da3_node_lookup_int(
 
 		/* Tree taller than we can handle; bail out! */
 		if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
-			xfs_buf_corruption_error(blk->bp);
+			xfs_buf_mark_corrupt(blk->bp);
 			return -EFSCORRUPTED;
 		}
 
@@ -1647,7 +1647,7 @@  xfs_da3_node_lookup_int(
 		if (blkno == args->geo->leafblk)
 			expected_level = nodehdr.level - 1;
 		else if (expected_level != nodehdr.level) {
-			xfs_buf_corruption_error(blk->bp);
+			xfs_buf_mark_corrupt(blk->bp);
 			return -EFSCORRUPTED;
 		} else
 			expected_level--;
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index a131b520aac7..95d2a3f92d75 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -1383,7 +1383,7 @@  xfs_dir2_leaf_removename(
 	ltp = xfs_dir2_leaf_tail_p(geo, leaf);
 	bestsp = xfs_dir2_leaf_bests_p(ltp);
 	if (be16_to_cpu(bestsp[db]) != oldbest) {
-		xfs_buf_corruption_error(lbp);
+		xfs_buf_mark_corrupt(lbp);
 		return -EFSCORRUPTED;
 	}
 	/*
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index a0cc5e240306..dbd1e901da92 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -439,7 +439,7 @@  xfs_dir2_leaf_to_node(
 	ltp = xfs_dir2_leaf_tail_p(args->geo, leaf);
 	if (be32_to_cpu(ltp->bestcount) >
 				(uint)dp->i_d.di_size / args->geo->blksize) {
-		xfs_buf_corruption_error(lbp);
+		xfs_buf_mark_corrupt(lbp);
 		return -EFSCORRUPTED;
 	}
 
@@ -513,7 +513,7 @@  xfs_dir2_leafn_add(
 	 * into other peoples memory
 	 */
 	if (index < 0) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		return -EFSCORRUPTED;
 	}
 
@@ -800,7 +800,7 @@  xfs_dir2_leafn_lookup_for_entry(
 
 	xfs_dir3_leaf_check(dp, bp);
 	if (leafhdr.count <= 0) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		return -EFSCORRUPTED;
 	}
 
diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c
index fe8f60b59ec4..c42f90e16b4f 100644
--- a/fs/xfs/xfs_attr_inactive.c
+++ b/fs/xfs/xfs_attr_inactive.c
@@ -145,7 +145,7 @@  xfs_attr3_node_inactive(
 	 * Since this code is recursive (gasp!) we must protect ourselves.
 	 */
 	if (level > XFS_DA_NODE_MAXDEPTH) {
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		xfs_trans_brelse(*trans, bp);	/* no locks for later trans */
 		return -EFSCORRUPTED;
 	}
@@ -194,7 +194,7 @@  xfs_attr3_node_inactive(
 			error = xfs_attr3_leaf_inactive(trans, dp, child_bp);
 			break;
 		default:
-			xfs_buf_corruption_error(child_bp);
+			xfs_buf_mark_corrupt(child_bp);
 			xfs_trans_brelse(*trans, child_bp);
 			error = -EFSCORRUPTED;
 			break;
@@ -289,7 +289,7 @@  xfs_attr3_root_inactive(
 		break;
 	default:
 		error = -EFSCORRUPTED;
-		xfs_buf_corruption_error(bp);
+		xfs_buf_mark_corrupt(bp);
 		xfs_trans_brelse(*trans, bp);
 		break;
 	}
diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index d37743bdf274..945fe9a6428b 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -279,7 +279,7 @@  xfs_attr_node_list_lookup(
 	return 0;
 
 out_corruptbuf:
-	xfs_buf_corruption_error(bp);
+	xfs_buf_mark_corrupt(bp);
 	xfs_trans_brelse(tp, bp);
 	return -EFSCORRUPTED;
 }
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 11a97bc35f70..9d26e37f78f5 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -16,6 +16,8 @@ 
 #include "xfs_log.h"
 #include "xfs_errortag.h"
 #include "xfs_error.h"
+#include "xfs_trans.h"
+#include "xfs_buf_item.h"
 
 static kmem_zone_t *xfs_buf_zone;
 
@@ -1572,6 +1574,29 @@  xfs_buf_zero(
 	}
 }
 
+/*
+ * Log a message about and stale a buffer that a caller has decided is corrupt.
+ *
+ * This function should be called for the kinds of metadata corruption that
+ * cannot be detect from a verifier, such as incorrect inter-block relationship
+ * data.  Do /not/ call this function from a verifier function.
+ *
+ * The buffer must not be dirty prior to the call.  Afterwards, the buffer will
+ * be marked stale, but b_error will not be set.  The caller is responsible for
+ * releasing the buffer or fixing it.
+ */
+void
+__xfs_buf_mark_corrupt(
+	struct xfs_buf		*bp,
+	xfs_failaddr_t		fa)
+{
+	ASSERT(bp->b_log_item == NULL ||
+	       !(bp->b_log_item->bli_flags & XFS_BLI_DIRTY));
+
+	xfs_buf_corruption_error(bp);
+	xfs_buf_stale(bp);
+}
+
 /*
  *	Handling of buffer targets (buftargs).
  */
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index 1c2640771b9e..deaa9c2607af 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -272,6 +272,8 @@  static inline int xfs_buf_submit(struct xfs_buf *bp)
 }
 
 void xfs_buf_zero(struct xfs_buf *bp, size_t boff, size_t bsize);
+void __xfs_buf_mark_corrupt(struct xfs_buf *bp, xfs_failaddr_t fa);
+#define xfs_buf_mark_corrupt(bp) __xfs_buf_mark_corrupt((bp), __this_address)
 
 /* Buffer Utility Routines */
 extern void *xfs_buf_offset(struct xfs_buf *, size_t);
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c
index 331765afc53e..57068d4ffba2 100644
--- a/fs/xfs/xfs_error.c
+++ b/fs/xfs/xfs_error.c
@@ -345,6 +345,8 @@  xfs_corruption_error(
  * Complain about the kinds of metadata corruption that we can't detect from a
  * verifier, such as incorrect inter-block relationship data.  Does not set
  * bp->b_error.
+ *
+ * Call xfs_buf_mark_corrupt, not this function.
  */
 void
 xfs_buf_corruption_error(
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c5077e6326c7..0368347bb348 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2135,7 +2135,7 @@  xfs_iunlink_update_bucket(
 	 * head of the list.
 	 */
 	if (old_value == new_agino) {
-		xfs_buf_corruption_error(agibp);
+		xfs_buf_mark_corrupt(agibp);
 		return -EFSCORRUPTED;
 	}
 
@@ -2269,7 +2269,7 @@  xfs_iunlink(
 	next_agino = be32_to_cpu(agi->agi_unlinked[bucket_index]);
 	if (next_agino == agino ||
 	    !xfs_verify_agino_or_null(mp, agno, next_agino)) {
-		xfs_buf_corruption_error(agibp);
+		xfs_buf_mark_corrupt(agibp);
 		return -EFSCORRUPTED;
 	}