diff mbox series

[v23.2,4/4] xfs: don't return -EFSCORRUPTED from repair when resources cannot be grabbed

Message ID Y2V3oOMM85/MwK0i@magnolia (mailing list archive)
State Superseded
Headers show
Series None | expand

Commit Message

Darrick J. Wong Nov. 4, 2022, 8:35 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

If we tried to repair something but the repair failed with -EDEADLOCK,
that means that the repair function couldn't grab some resource it
needed and wants us to try again.  If we try again (with TRY_HARDER) but
still can't get all the resources we need, the repair fails and errors
remain on the filesystem.

Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
This is not correct because repair has not determined that anything is
corrupt.  If the repair had been invoked on an object that could be
optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
resources will be reported to userspace as corrupt metadata, and users
will be unnecessarily alarmed that their suboptimal metadata turned into
a corruption.

Fix this by returning zero so that the results of the actual scrub will
be copied back out to userspace.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
v23.2: fix the commit message to discuss what's really going on in this
       patch.
---
 fs/xfs/scrub/repair.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Darrick J. Wong Nov. 8, 2022, 1:27 a.m. UTC | #1
On Fri, Nov 04, 2022 at 01:35:44PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> If we tried to repair something but the repair failed with -EDEADLOCK,
> that means that the repair function couldn't grab some resource it
> needed and wants us to try again.  If we try again (with TRY_HARDER) but
> still can't get all the resources we need, the repair fails and errors
> remain on the filesystem.
> 
> Right now, repair returns the -EDEADLOCK to the caller as -EFSCORRUPTED,
> which results in XFS_SCRUB_OFLAG_CORRUPT being passed out to userspace.
> This is not correct because repair has not determined that anything is
> corrupt.  If the repair had been invoked on an object that could be
> optimized but wasn't corrupt (OFLAG_PREEN), the inability to grab
> resources will be reported to userspace as corrupt metadata, and users
> will be unnecessarily alarmed that their suboptimal metadata turned into
> a corruption.
> 
> Fix this by returning zero so that the results of the actual scrub will
> be copied back out to userspace.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
> v23.2: fix the commit message to discuss what's really going on in this
>        patch.
> ---
>  fs/xfs/scrub/repair.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> index 7323bd9fddfb..86f770af6737 100644
> --- a/fs/xfs/scrub/repair.c
> +++ b/fs/xfs/scrub/repair.c
> @@ -69,9 +69,9 @@ xrep_attempt(
>  		/*
>  		 * We tried harder but still couldn't grab all the resources
>  		 * we needed to fix it.  The corruption has not been fixed,
> -		 * so report back to userspace.
> +		 * so exit to userspace with the corruption flags still set.

<groan> this wording is vague, i'll try again...

--D

>  		 */
> -		return -EFSCORRUPTED;
> +		return 0;
>  	default:
>  		/*
>  		 * EAGAIN tells the caller to re-scrub, so we cannot return
diff mbox series

Patch

diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
index 7323bd9fddfb..86f770af6737 100644
--- a/fs/xfs/scrub/repair.c
+++ b/fs/xfs/scrub/repair.c
@@ -69,9 +69,9 @@  xrep_attempt(
 		/*
 		 * We tried harder but still couldn't grab all the resources
 		 * we needed to fix it.  The corruption has not been fixed,
-		 * so report back to userspace.
+		 * so exit to userspace with the corruption flags still set.
 		 */
-		return -EFSCORRUPTED;
+		return 0;
 	default:
 		/*
 		 * EAGAIN tells the caller to re-scrub, so we cannot return