diff mbox series

[4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed

Message ID 166473479567.1083393.7668585289114718845.stgit@magnolia (mailing list archive)
State Superseded
Headers show
Series xfs: fix incorrect return values in online fsck | expand

Commit Message

Darrick J. Wong Oct. 2, 2022, 6:19 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

If we tried to repair something but the repair failed with -EDEADLOCK or
-EAGAIN, that means that the repair function couldn't grab some resource
it needed and wants us to try again.  If we try again (with TRY_HARDER)
but still can't do it, exit back to userspace, since xfs_scrub_metadata
requires xrep_attempt to return -EAGAIN.

This makes the return value diagnostics look less weird, and fixes a
wart that remains from very early in the repair implementation.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/scrub/repair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Dave Chinner Oct. 13, 2022, 10:49 p.m. UTC | #1
On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If we tried to repair something but the repair failed with -EDEADLOCK or
> -EAGAIN, that means that the repair function couldn't grab some resource

Nothing should fail with EAGAIN by this point?

> it needed and wants us to try again.  If we try again (with TRY_HARDER)
> but still can't do it, exit back to userspace, since xfs_scrub_metadata
> requires xrep_attempt to return -EAGAIN.

-EDEADLOCK, not -EAGAIN?

Confused.

-Dave.
Darrick J. Wong Oct. 14, 2022, 9:44 p.m. UTC | #2
On Fri, Oct 14, 2022 at 09:49:11AM +1100, Dave Chinner wrote:
> On Sun, Oct 02, 2022 at 11:19:55AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > If we tried to repair something but the repair failed with -EDEADLOCK or
> > -EAGAIN, that means that the repair function couldn't grab some resource
> 
> Nothing should fail with EAGAIN by this point?

Right.

> > it needed and wants us to try again.  If we try again (with TRY_HARDER)
> > but still can't do it, exit back to userspace, since xfs_scrub_metadata
> > requires xrep_attempt to return -EAGAIN.
> 
> -EDEADLOCK, not -EAGAIN?

That part of the message confused me too.

How about this revision?

"If we tried to repair something but the repair failed with -EDEADLOCK,
that means that the repair function couldn't grab some resource it
needed and wants us to try again.  If we try again (with TRY_HARDER) but
still can't get all the resources we need, the repair fails and errors
remain on the filesystem.

"Right now, repair returns the -EDEADLOCK to the caller, which passes it
up to userspace without copying the xfs_scrub_metadata structure back to
userspace.  This is very confusing for userspace since xfs_scrub merely
reports "Resource deadlock would occur" and gives no indication that
there are uncorrected errors on the filesystem.  Hence we want to return
0 here so that the ioctl code copies the CORRUPTION flag back to
userspace."

Clearer, I hope?

--D

> 
> Confused.
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
diff mbox series

Patch

diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index f6c4cb013346..34fc0dc5f200 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -69,9 +69,9 @@  xrep_attempt(
 		/*
 		 * We tried harder but still couldn't grab all the resources
 		 * we needed to fix it.  The corruption has not been fixed,
-		 * so report back to userspace.
+		 * so exit to userspace.
 		 */
-		return -EFSCORRUPTED;
+		return 0;
 	case -EAGAIN:
 		/* Repair functions should return EDEADLOCK, not EAGAIN. */
 		ASSERT(0);