[V4] xfs: Document error handlers behavior

Message ID	20160914012334.GK30497@dastard (mailing list archive)
State	New, archived
Headers	show Return-Path: <xfs-bounces@oss.sgi.com> Date: Wed, 14 Sep 2016 11:23:34 +1000 From: Dave Chinner <david@fromorbit.com> To: Carlos Maiolino <cmaiolino@redhat.com> Subject: Re: [PATCH V4] xfs: Document error handlers behavior Message-ID: <20160914012334.GK30497@dastard> xfs: Document error handlers behavior References: <1473757385-81633-1-git-send-email-cmaiolino@redhat.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1473757385-81633-1-git-send-email-cmaiolino@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: linux-xfs@vger.kernel.org, xfs@oss.sgi.com Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: xfs-bounces@oss.sgi.com Sender: xfs-bounces@oss.sgi.com

Dave Chinner Sept. 14, 2016, 1:23 a.m. UTC

On Tue, Sep 13, 2016 at 05:03:05AM -0400, Carlos Maiolino wrote:
> Document the implementation of error handlers into sysfs.
> 
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
> Changelog:
> 
> V2:
> 	- Add a description of the precedence order of each option, focusing on
> 	  the behavior of "fail_at_unmount" which was not well explained in V1
> 
> V3:
> 	- Fix English spelling mistakes suggested by Eric
> 
> V4:
> 	- Typo mistakes, document ENODEV default value for max_retries, fix
> 	  directories's hierarchy description

Ok, I had to update this for the change in retry timeout values from
Eric, so I went and fixed all the other things I thought needed
fixing, too. New patch below....

Dave.

Carlos Maiolino Sept. 14, 2016, 10:02 a.m. UTC | #1

On Wed, Sep 14, 2016 at 11:23:34AM +1000, Dave Chinner wrote:
> Ok, I had to update this for the change in retry timeout values from
> Eric, so I went and fixed all the other things I thought needed
> fixing, too. New patch below....
> 

Hi, thanks, this looks good to me, with one exception described below.

> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> xfs: Document error handlers behavior
> 
> From: Carlos Maiolino <cmaiolino@redhat.com>
> 
> + -error handlers:
> +	Defines the behavior for a specific error.
> +
> +The filesystem behavior during an error can be set via sysfs files, Each
> +error handler works independently, the first condition met by and error handler
> +for a specific class will cause the error to be propagated rather than reset and
> +retried.
> +
> +The action taken by the filesystem when the error is propagated is context
> +dependent - it may cause a shut down in the case of an unrecoverable error,
> +it may be reported back to userspace, or it may even be ignored because
> +there's nothing useful we can with the error or anyone we can report it to (e.g.

"there's nothing useful we can do with the error"

> +during unmount).

Also, I apologize if I misunderstand it, but being ignored doesn't look a proper
description here, it sounds to me something like 'we ignore the error and tell
nobody about it", in unmount example, we shut down the filesystem if any error
happens, for me it doesn't sound like ignoring an error, but I might be
interpreting it in the wrong way.


> +
> +The configuration files are organized into the following per-mounted filesystem
> +hierarchy:
> +
> +  /sys/fs/xfs/<dev>/error/<class>/<error>/
> +
> +Where:
> +  <dev>
> +	The short device name of the mounted filesystem. This is the same device
> +	name that shows up in XFS kernel error messages as "XFS(<dev>): ..."
> +
> +  <class>
> +	The subsystem the error configuration belongs to. As of 4.9, the defined
> +	classes are:
> +
> +		- "metadata": applies metadata buffer write IO
> +
> +  <error>
> +	The individual error handler configurations.
> +
> +
> +Each filesystem has "global" error configuration options defined in their top
> +level directory:
> +
> +  /sys/fs/xfs/<dev>/error/
> +
> +  fail_at_unmount		(Min:  0  Default:  1  Max: 1)
> +	Defines the filesystem error behavior at unmount time.
> +
> +	If set to a value of 1, XFS will override all other error configurations
> +	during unmount and replace them with "immediate fail" characteristics.
> +	i.e. no retries, no retry timeout. This will always allow unmount to
> +	succeed when there are persistent errors present.
> +
> +	If set to 0, the configured retry behaviour will continue until all
> +	retries and/or timeouts have been exhausted. This will delay unmount
> +	completion when there are persistent errors, and it may prevent the
> +	filesystem from ever unmounting fully in the case of "retry forever"
> +	handler configurations.
> +
> +	Note: there is no guarantee that fail_at_unmount can be set whilst an
> +	unmount is in progress. It is possible that the sysfs entries are
> +	removed by the unmounting filesystem before a "retry forever" error
> +	handler configuration causes unmount to hang, and hence the filesystem
> +	must be configured appropriately before unmount begins to prevent
> +	unmount hangs.
> +
> +Each filesystem has specific error class handlers that define the error
> +propagation behaviour for specific errors. There is also a "default" error
> +handler defined, which defines the behaviour for all errors that don't have
> +specific handlers defined. The handler configurations are found in the
> +directory:
> +
> +  /sys/fs/xfs/<dev>/error/<class>/<error>/
> +
> +  max_retries			(Min: -1  Default: Varies  Max: INTMAX)
> +	Defines the allowed number of retries of a specific error before
> +	the filesystem will propagate the error. The retry count for a given
> +	error context (e.g. a specific metadata buffer) is reset ever time there
> +	is a successful completion of the operation.
> +
> +	Setting the value to "-1" will cause XFS to retry forever for this
> +	specific error.
> +
> +	Setting the value to "0" will cause XFS to fail immediately when the
> +	specific error is reported.
> +
> +	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
> +	operation "N" times before propagating the error.
> +
> +  retry_timeout_seconds		(Min:  -1  Default:  Varies  Max: 1 day)
> +	Define the amount of time (in seconds) that the filesystem is
> +	allowed to retry its operations when the specific error is
> +	found.
> +
> +	Setting the value to "-1" will set an infinite timeout, causing
> +	error propagation behaviour to be determined solely by the "max_retries"
> +	parameter.
> +
> +	Setting the value to "0" will cause XFS to fail immediately when the
> +	specific error is reported.
> +
> +	Setting the value to  "N" (where 0 < N < Max) will propagate the error
> +	on the first retry that fails at least "N" seconds after the first error
> +	was detected, unless the number of retries defined by max_retries
> +	expires first.
> +
> +Note: The default behaviour for a specific error handler is dependent on both
> +the class and error context. For example, the default values for
> +"metadata/ENODEV" are "0" rather than "-1" so that this error handler defaults
> +to "fail immediately" behaviour. This is done because ENODEV is a fatal,
> +unrecoverable error no matter how many times the metadata IO is retried.
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

Eric Sandeen Sept. 14, 2016, 3:10 p.m. UTC | #2

On 9/13/16 8:23 PM, Dave Chinner wrote:

> Ok, I had to update this for the change in retry timeout values from
> Eric, so I went and fixed all the other things I thought needed
> fixing, too. New patch below....
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> xfs: Document error handlers behavior
> 
> From: Carlos Maiolino <cmaiolino@redhat.com>
> 
> Document the implementation of error handlers into sysfs.
> 
> [dchinner: significant update:
> 	- removed examples from concept descriptions, placed them in
> 	  appropriate detailed descriptions instead
> 	- added explanations for <dev>, <class> and <error> strings
> 	  in sysfs layout description
> 	- added specific definition of "global" per-filesystem error
> 	  configuration parameters.
> 	- reformatted to remove multiple indents
> 	- added more information about fail_at_unmount behaviour and
> 	  constraints
> 	- added comment that there is a "default" handler to
> 	  configure behaviour for all errors that don't have
> 	  specific handlers defined.
> 	- added specific handler value explanations
> 	- added note about handlers having context specific
> 	  defaults with example. ]
> 
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> 
> ---
>  Documentation/filesystems/xfs.txt | 125 +++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 125 insertions(+)
> 
> diff --git a/Documentation/filesystems/xfs.txt b/Documentation/filesystems/xfs.txt
> index 8146e9f..705d064 100644
> --- a/Documentation/filesystems/xfs.txt
> +++ b/Documentation/filesystems/xfs.txt
> @@ -348,3 +348,128 @@ Removed Sysctls
>    ----				-------
>    fs.xfs.xfsbufd_centisec	v4.0
>    fs.xfs.age_buffer_centisecs	v4.0
> +
> +
> +Error handling
> +==============
> +
> +XFS can act differently according to the type of error found during its
> +operation. The implementation introduces the following concepts to the error
> +handler:
> +
> + -failure speed:
> +	Defines how fast XFS should propagate an error upwards when a specific
> +	error is found during the filesystem operation. It can propagate
> +	immediately, after a defined number of retries, after a set time period,
> +	or simply retry forever.
> +
> + -error classes:
> +	Specifies the subsystem the error configuration will apply to, such as
> +	metadata IO or memory allocation. Different subsystems will have
> +	different error handlers for which behaviour can be configured.
> +
> + -error handlers:
> +	Defines the behavior for a specific error.
> +
> +The filesystem behavior during an error can be set via sysfs files, Each

files.  Each error ...

> +error handler works independently, the first condition met by and error handler

works independently - the firest condition met by /an/ error handler ...

> +for a specific class will cause the error to be propagated rather than reset and
> +retried.
> +
> +The action taken by the filesystem when the error is propagated is context
> +dependent - it may cause a shut down in the case of an unrecoverable error,
> +it may be reported back to userspace, or it may even be ignored because
> +there's nothing useful we can with the error or anyone we can report it to (e.g.
> +during unmount).
> +
> +The configuration files are organized into the following per-mounted filesystem
> +hierarchy:

... into the following hierarchy for each mounted filesystem:

> +
> +  /sys/fs/xfs/<dev>/error/<class>/<error>/
> +
> +Where:
> +  <dev>
> +	The short device name of the mounted filesystem. This is the same device
> +	name that shows up in XFS kernel error messages as "XFS(<dev>): ..."
> +
> +  <class>
> +	The subsystem the error configuration belongs to. As of 4.9, the defined
> +	classes are:
> +
> +		- "metadata": applies metadata buffer write IO
> +
> +  <error>
> +	The individual error handler configurations.
> +
> +
> +Each filesystem has "global" error configuration options defined in their top
> +level directory:
> +
> +  /sys/fs/xfs/<dev>/error/
> +
> +  fail_at_unmount		(Min:  0  Default:  1  Max: 1)
> +	Defines the filesystem error behavior at unmount time.
> +
> +	If set to a value of 1, XFS will override all other error configurations
> +	during unmount and replace them with "immediate fail" characteristics.
> +	i.e. no retries, no retry timeout. This will always allow unmount to
> +	succeed when there are persistent errors present.
> +
> +	If set to 0, the configured retry behaviour will continue until all
> +	retries and/or timeouts have been exhausted. This will delay unmount
> +	completion when there are persistent errors, and it may prevent the
> +	filesystem from ever unmounting fully in the case of "retry forever"
> +	handler configurations.
> +
> +	Note: there is no guarantee that fail_at_unmount can be set whilst an
> +	unmount is in progress. It is possible that the sysfs entries are
> +	removed by the unmounting filesystem before a "retry forever" error
> +	handler configuration causes unmount to hang, and hence the filesystem
> +	must be configured appropriately before unmount begins to prevent
> +	unmount hangs.
> +
> +Each filesystem has specific error class handlers that define the error
> +propagation behaviour for specific errors. There is also a "default" error
> +handler defined, which defines the behaviour for all errors that don't have
> +specific handlers defined. The handler configurations are found in the
> +directory:
> +
> +  /sys/fs/xfs/<dev>/error/<class>/<error>/
> +
> +  max_retries			(Min: -1  Default: Varies  Max: INTMAX)
> +	Defines the allowed number of retries of a specific error before
> +	the filesystem will propagate the error. The retry count for a given
> +	error context (e.g. a specific metadata buffer) is reset ever time there

every time there ...

> +	is a successful completion of the operation.
> +
> +	Setting the value to "-1" will cause XFS to retry forever for this
> +	specific error.
> +
> +	Setting the value to "0" will cause XFS to fail immediately when the
> +	specific error is reported.
> +
> +	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
> +	operation "N" times before propagating the error.
> +
> +  retry_timeout_seconds		(Min:  -1  Default:  Varies  Max: 1 day)
> +	Define the amount of time (in seconds) that the filesystem is
> +	allowed to retry its operations when the specific error is
> +	found.
> +
> +	Setting the value to "-1" will set an infinite timeout, causing
> +	error propagation behaviour to be determined solely by the "max_retries"
> +	parameter.

This is asymmetric; if you want this, then max_retries should probably say that
-1 will cause the behavior to be determined solely by retry_timeout_seconds...

Could also say "removing any time limits on retries."  (and above, "removing any
count limits on retries.)

But that's already covered by "the first condition met by ..., " really.

> +
> +	Setting the value to "0" will cause XFS to fail immediately when the
> +	specific error is reported.
> +
> +	Setting the value to  "N" (where 0 < N < Max) will propagate the error
> +	on the first retry that fails at least "N" seconds after the first error
> +	was detected, unless the number of retries defined by max_retries
> +	expires first.

Same issue here, really; they are symmetric, right?  First condition met for
propagation propagates the error, period.  This sounds overly complex, unless
I'm missing something. Seems like:

+	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
+	operation for "N" seconds before propagating the error.

would suffice, no?

> +
> +Note: The default behaviour for a specific error handler is dependent on both
> +the class and error context. For example, the default values for
> +"metadata/ENODEV" are "0" rather than "-1" so that this error handler defaults
> +to "fail immediately" behaviour. This is done because ENODEV is a fatal,
> +unrecoverable error no matter how many times the metadata IO is retried.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Dave Chinner Sept. 14, 2016, 10:09 p.m. UTC | #3

On Wed, Sep 14, 2016 at 12:02:19PM +0200, Carlos Maiolino wrote:
> On Wed, Sep 14, 2016 at 11:23:34AM +1000, Dave Chinner wrote:
> > Ok, I had to update this for the change in retry timeout values from
> > Eric, so I went and fixed all the other things I thought needed
> > fixing, too. New patch below....
> > 
> 
> Hi, thanks, this looks good to me, with one exception described below.
> 
> > Dave.
> > -- 
> > Dave Chinner
> > david@fromorbit.com
> > 
> > xfs: Document error handlers behavior
> > 
> > From: Carlos Maiolino <cmaiolino@redhat.com>
> > 
> > + -error handlers:
> > +	Defines the behavior for a specific error.
> > +
> > +The filesystem behavior during an error can be set via sysfs files, Each
> > +error handler works independently, the first condition met by and error handler
> > +for a specific class will cause the error to be propagated rather than reset and
> > +retried.
> > +
> > +The action taken by the filesystem when the error is propagated is context
> > +dependent - it may cause a shut down in the case of an unrecoverable error,
> > +it may be reported back to userspace, or it may even be ignored because
> > +there's nothing useful we can with the error or anyone we can report it to (e.g.
> 
> "there's nothing useful we can do with the error"
> 
> > +during unmount).
> 
> Also, I apologize if I misunderstand it, but being ignored doesn't look a proper
> description here, it sounds to me something like 'we ignore the error and tell
> nobody about it", in unmount example, we shut down the filesystem if any error
> happens, for me it doesn't sound like ignoring an error, but I might be
> interpreting it in the wrong way.

I think you're making the assumption that the only way we handle
errors once retries are exhausted is to trigger a filesystem shutdown.
That assumption was repeated throughout the documentation.

While that may be true for /metadata write IO errors/, it is not
true for the generic error handling case. e.g. if we extend it to
memory allocation contexts, we may end up returning ENOMEM to
userspace. Or, in certain contexts, we might be able to fall back to
doing a single operation at a time using the stack for storage, in
which case there is no reason at all to report the allocation
failure to anyone.

The infrastructure is generic, as is the documentation, and so it
shouldn't assume anything about what is going to happen once the
retries are exhausted and the error is propagated upwards. What
happens with that error after it is returned is a subsystem and
context dependent behaviour, not something that is defined by the
error retry configuration infrastructure....

Cheers,

Dave.

Dave Chinner Sept. 14, 2016, 10:22 p.m. UTC | #4

On Wed, Sep 14, 2016 at 10:10:05AM -0500, Eric Sandeen wrote:
> On 9/13/16 8:23 PM, Dave Chinner wrote:
....
> > +  max_retries			(Min: -1  Default: Varies  Max: INTMAX)
> > +	Defines the allowed number of retries of a specific error before
> > +	the filesystem will propagate the error. The retry count for a given
> > +	error context (e.g. a specific metadata buffer) is reset ever time there
> 
> every time there ...
> 
> > +	is a successful completion of the operation.
> > +
> > +	Setting the value to "-1" will cause XFS to retry forever for this
> > +	specific error.
> > +
> > +	Setting the value to "0" will cause XFS to fail immediately when the
> > +	specific error is reported.
> > +
> > +	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
> > +	operation "N" times before propagating the error.
> > +
> > +  retry_timeout_seconds		(Min:  -1  Default:  Varies  Max: 1 day)
> > +	Define the amount of time (in seconds) that the filesystem is
> > +	allowed to retry its operations when the specific error is
> > +	found.
> > +
> > +	Setting the value to "-1" will set an infinite timeout, causing
> > +	error propagation behaviour to be determined solely by the "max_retries"
> > +	parameter.
> 
> This is asymmetric; if you want this, then max_retries should probably say that
> -1 will cause the behavior to be determined solely by retry_timeout_seconds...

I could, but...

> Could also say "removing any time limits on retries."  (and above, "removing any
> count limits on retries.)
> 
> But that's already covered by "the first condition met by ..., " really.

this covers it so I can probably just remove it.
> 
> > +
> > +	Setting the value to "0" will cause XFS to fail immediately when the
> > +	specific error is reported.
> > +
> > +	Setting the value to  "N" (where 0 < N < Max) will propagate the error
> > +	on the first retry that fails at least "N" seconds after the first error
> > +	was detected, unless the number of retries defined by max_retries
> > +	expires first.
> 
> Same issue here, really; they are symmetric, right?  First condition met for
> propagation propagates the error, period.  This sounds overly complex, unless
> I'm missing something. Seems like:
> 
> +	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
> +	operation for "N" seconds before propagating the error.
> 
> would suffice, no?

No, because that's not what the implementation does:

	if (retries expired)
		fail
	if (retry timer expired)
		fail

IOWs, the retry count has precedence over the retry timer. if you
set both retry_timeout and max_retries, the timeout only takes
effect if max retries is set high enough that they aren't exhausted
before the timeout fires.

This is for the case where an failure might take a variable time to
report.  (Think interactions with errors that TLER would address).
Normally you might say 10 retries, but if it is taking 5 minutes to
then fail when this specific error condition is hit, you might set a retry
timeout of 1 minute. In that case, we might get an immediate IO
error and retry several times before failing. However, if we hit the
"slow to report" error, we still get failure in the same time frame
as the immediate failures that have been retried many times before
giving up.

It's hard to explain complex stuff like with a simple, concise
description. I'll try again....

Cheers,

Dave.

Eric Sandeen Sept. 14, 2016, 10:31 p.m. UTC | #5

On 9/14/16 5:22 PM, Dave Chinner wrote:
>> Same issue here, really; they are symmetric, right?  First condition met for
>> > propagation propagates the error, period.  This sounds overly complex, unless
>> > I'm missing something. Seems like:
>> > 
>> > +	Setting the value to "N" (where 0 < N < Max) will make XFS retry the
>> > +	operation for "N" seconds before propagating the error.
>> > 
>> > would suffice, no?
> No, because that's not what the implementation does:
> 
> 	if (retries expired)
> 		fail
> 	if (retry timer expired)
> 		fail
> 
> IOWs, the retry count has precedence over the retry timer. if you
> set both retry_timeout and max_retries, the timeout only takes
> effect if max retries is set high enough that they aren't exhausted
> before the timeout fires.

Then:

+	Setting the value to "N" (where 0 < N < Max) will /allow/ XFS to retry
+	the operation for /up to/ "N" seconds before propagating the error.

?  i.e. it could, but only if the retries don't expire first :)

> This is for the case where an failure might take a variable time to
> report.  (Think interactions with errors that TLER would address).
> Normally you might say 10 retries, but if it is taking 5 minutes to
> then fail when this specific error condition is hit, you might set a retry
> timeout of 1 minute. In that case, we might get an immediate IO
> error and retry several times before failing. However, if we hit the
> "slow to report" error, we still get failure in the same time frame
> as the immediate failures that have been retried many times before
> giving up.

Either way, it's still the /first/ condition satisfied which will
bubble it up.

So "first condition satisfied will propagate up," plus
one condition is "retry up to N times," plus
one condition is "retry up for up to N seconds"

seems to cover it all, no?

> It's hard to explain complex stuff like with a simple, concise
> description. I'll try again....
> 
> Cheers,
> 
> Dave.

Carlos Maiolino Sept. 15, 2016, 9:18 a.m. UTC | #6

> > Also, I apologize if I misunderstand it, but being ignored doesn't look a proper
> > description here, it sounds to me something like 'we ignore the error and tell
> > nobody about it", in unmount example, we shut down the filesystem if any error
> > happens, for me it doesn't sound like ignoring an error, but I might be
> > interpreting it in the wrong way.
> 
> I think you're making the assumption that the only way we handle
> errors once retries are exhausted is to trigger a filesystem shutdown.
> That assumption was repeated throughout the documentation.
> 
> While that may be true for /metadata write IO errors/, it is not
> true for the generic error handling case. e.g. if we extend it to
> memory allocation contexts, we may end up returning ENOMEM to
> userspace. Or, in certain contexts, we might be able to fall back to
> doing a single operation at a time using the stack for storage, in
> which case there is no reason at all to report the allocation
> failure to anyone.
> 
> The infrastructure is generic, as is the documentation, and so it
> shouldn't assume anything about what is going to happen once the
> retries are exhausted and the error is propagated upwards. What
> happens with that error after it is returned is a subsystem and
> context dependent behaviour, not something that is defined by the
> error retry configuration infrastructure....
> 
> Cheers,

Thanks for the very detailed explanation Dave
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

[V4] xfs: Document error handlers behavior

Commit Message

Comments

Patch