[13/30] xfs: handle buffer log item IO errors directly

Message ID	20200601214251.4167140-14-david@fromorbit.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <SRS0=XPcr=7O=vger.kernel.org=linux-xfs-owner@kernel.org> From: Dave Chinner <david@fromorbit.com> To: linux-xfs@vger.kernel.org Subject: [PATCH 13/30] xfs: handle buffer log item IO errors directly Date: Tue, 2 Jun 2020 07:42:34 +1000 Message-Id: <20200601214251.4167140-14-david@fromorbit.com> In-Reply-To: <20200601214251.4167140-1-david@fromorbit.com> References: <20200601214251.4167140-1-david@fromorbit.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk
Series	xfs: rework inode flushing to make inode reclaim fully asynchronous \| expand [00/30] xfs: rework inode flushing to make inode reclaim fully asynchronous [01/30] xfs: Don't allow logging of XFS_ISTALE inodes [02/30] xfs: remove logged flag from inode log item [03/30] xfs: add an inode item lock [04/30] xfs: mark inode buffers in cache [05/30] xfs: mark dquot buffers in cache [06/30] xfs: mark log recovery buffers for completion [07/30] xfs: call xfs_buf_iodone directly [08/30] xfs: clean up whacky buffer log item list reinit [09/30] xfs: make inode IO completion buffer centric [10/30] xfs: use direct calls for dquot IO completion [11/30] xfs: clean up the buffer iodone callback functions [12/30] xfs: get rid of log item callbacks [13/30] xfs: handle buffer log item IO errors directly [14/30] xfs: unwind log item error flagging [15/30] xfs: move xfs_clear_li_failed out of xfs_ail_delete_one() [16/30] xfs: pin inode backing buffer to the inode log item [17/30] xfs: make inode reclaim almost non-blocking [18/30] xfs: remove IO submission from xfs_reclaim_inode() [19/30] xfs: allow multiple reclaimers per AG [20/30] xfs: don't block inode reclaim on the ILOCK [21/30] xfs: remove SYNC_TRYLOCK from inode reclaim [22/30] xfs: remove SYNC_WAIT from xfs_reclaim_inodes() [23/30] xfs: clean up inode reclaim comments [24/30] xfs: rework stale inodes in xfs_ifree_cluster [25/30] xfs: attach inodes to the cluster buffer when dirtied [26/30] xfs: xfs_iflush() is no longer necessary [27/30] xfs: rename xfs_iflush_int() [28/30] xfs: rework xfs_iflush_cluster() dirty inode iteration [29/30] xfs: factor xfs_iflush_done [30/30] xfs: remove xfs_inobp_check()

Message ID

20200601214251.4167140-14-david@fromorbit.com (mailing list archive)

State

Superseded

Headers

From: Dave Chinner <david@fromorbit.com>
To: linux-xfs@vger.kernel.org
Subject: [PATCH 13/30] xfs: handle buffer log item IO errors directly
Date: Tue,  2 Jun 2020 07:42:34 +1000
Message-Id: <20200601214251.4167140-14-david@fromorbit.com>
In-Reply-To: <20200601214251.4167140-1-david@fromorbit.com>
References: <20200601214251.4167140-1-david@fromorbit.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-xfs-owner@vger.kernel.org
Precedence: bulk

Series

xfs: rework inode flushing to make inode reclaim fully asynchronous | expand

Commit Message

Dave Chinner June 1, 2020, 9:42 p.m. UTC

From: Dave Chinner <dchinner@redhat.com>

Currently when a buffer with attached log items has an IO error
it called ->iop_error for each attched log item. These all call
xfs_set_li_failed() to handle the error, but we are about to change
the way log items manage buffers. hence we first need to remove the
per-item dependency on buffer handling done by xfs_set_li_failed().

We already have specific buffer type IO completion routines, so move
the log item error handling out of the generic error handling and
into the log item specific functions so we can implement per-type
error handling easily.

This requires a more complex return value from the error handling
code so that we can take the correct action the failure handling
requires.  This results in some repeated boilerplate in the
functions, but that can be cleaned up later once all the changes
cascade through this code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_buf_item.c | 167 ++++++++++++++++++++++++++++--------------
 1 file changed, 112 insertions(+), 55 deletions(-)

Comments

Darrick J. Wong June 2, 2020, 8:39 p.m. UTC | #1

On Tue, Jun 02, 2020 at 07:42:34AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently when a buffer with attached log items has an IO error
> it called ->iop_error for each attched log item. These all call
> xfs_set_li_failed() to handle the error, but we are about to change
> the way log items manage buffers. hence we first need to remove the
> per-item dependency on buffer handling done by xfs_set_li_failed().
> 
> We already have specific buffer type IO completion routines, so move
> the log item error handling out of the generic error handling and
> into the log item specific functions so we can implement per-type
> error handling easily.
> 
> This requires a more complex return value from the error handling
> code so that we can take the correct action the failure handling
> requires.  This results in some repeated boilerplate in the
> functions, but that can be cleaned up later once all the changes
> cascade through this code.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/xfs_buf_item.c | 167 ++++++++++++++++++++++++++++--------------
>  1 file changed, 112 insertions(+), 55 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 09bfe9c52dbdb..b6995719e877b 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
> @@ -987,20 +987,18 @@ xfs_buf_do_callbacks_fail(
>  }
>  
>  static bool
> -xfs_buf_iodone_callback_error(
> +xfs_buf_ioerror_sync(
>  	struct xfs_buf		*bp)
>  {
>  	struct xfs_mount	*mp = bp->b_mount;
>  	static ulong		lasttime;
>  	static xfs_buftarg_t	*lasttarg;
> -	struct xfs_error_cfg	*cfg;
> -
>  	/*

This should preserve the blank line between the declarations and the
start of the code.

>  	 * If we've already decided to shutdown the filesystem because of
>  	 * I/O errors, there's no point in giving this a retry.
>  	 */
>  	if (XFS_FORCED_SHUTDOWN(mp))
> -		goto out_stale;
> +		return true;
>  
>  	if (bp->b_target != lasttarg ||
>  	    time_after(jiffies, (lasttime + 5*HZ))) {
> @@ -1011,19 +1009,15 @@ xfs_buf_iodone_callback_error(
>  
>  	/* synchronous writes will have callers process the error */
>  	if (!(bp->b_flags & XBF_ASYNC))
> -		goto out_stale;
> -
> -	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> -
> -	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
> +		return true;
> +	return false;

What does the return value mean here?  true means "let the caller deal
with the error", false means "attempt a retry, if desired?  So this
function decides if we're going to fail immediately or not?

	if (xfs_buf_ioerr_fail_immediately(bp))
		goto out_stale;

That's a lengthy name though.  On second inspection, I guess this
function decides if the buffer is going to be sent through the io retry
mechanism, and the next two functions advance it through the retry
states until either the write succeeds or we declare permanent failure?

> +}
>  
> -	/*
> -	 * If the write was asynchronous then no one will be looking for the
> -	 * error.  If this is the first failure of this type, clear the error
> -	 * state and write the buffer out again. This means we always retry an
> -	 * async write failure at least once, but we also need to set the buffer
> -	 * up to behave correctly now for repeated failures.
> -	 */
> +static bool
> +xfs_buf_ioerror_retry(

Might be nice to preserve some of this comment, since I initially
missed that this function both decides whether or not to do the retry
and sets up the buffer to do that.

/*
 * Decide if we're going to retry the write after a failure, and prepare
 * the buffer for retrying the write.
 */

Or, adding some newlines in the outer if body to make the two lines
that modify the bp state stand out would also help.

(TBH I'm struggling right now to make sense of what these new functions
do, though I'm fairly convinced that they at least aren't changing much
of the functionality...)

> +	struct xfs_buf		*bp,
> +	struct xfs_error_cfg	*cfg)
> +{
>  	if (!(bp->b_flags & (XBF_STALE | XBF_WRITE_FAIL)) ||
>  	     bp->b_last_error != bp->b_error) {
>  		bp->b_flags |= (XBF_WRITE | XBF_DONE | XBF_WRITE_FAIL);
> @@ -1031,36 +1025,80 @@ xfs_buf_iodone_callback_error(
>  		if (cfg->retry_timeout != XFS_ERR_RETRY_FOREVER &&
>  		    !bp->b_first_retry_time)
>  			bp->b_first_retry_time = jiffies;
> -
> -		xfs_buf_ioerror(bp, 0);
> -		xfs_buf_submit(bp);
>  		return true;
>  	}
> +	return false;
> +}
>  
> -	/*
> -	 * Repeated failure on an async write. Take action according to the
> -	 * error configuration we have been set up to use.
> -	 */
> +static bool
> +xfs_buf_ioerror_permanent(

/*
 * Account for this latest trip around the retry handler, and decide if
 * we've failed enough times to constitute a permanent failure.
 */

> +	struct xfs_buf		*bp,
> +	struct xfs_error_cfg	*cfg)
> +{
> +	struct xfs_mount	*mp = bp->b_mount;
>  
>  	if (cfg->max_retries != XFS_ERR_RETRY_FOREVER &&
>  	    ++bp->b_retries > cfg->max_retries)
> -			goto permanent_error;
> +			return true;

Might as well fix the indentation while you're at it.


>  	if (cfg->retry_timeout != XFS_ERR_RETRY_FOREVER &&
>  	    time_after(jiffies, cfg->retry_timeout + bp->b_first_retry_time))
> -			goto permanent_error;
> +			return true;
>  
>  	/* At unmount we may treat errors differently */
>  	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount)
> +		return true;
> +
> +	return false;
> +}
> +
> +/*
> + * On a sync write or shutdown we just want to stale the buffer and let the
> + * caller handle the error in bp->b_error appropriately.
> + *
> + * If the write was asynchronous then no one will be looking for the error.  If
> + * this is the first failure of this type, clear the error state and write the
> + * buffer out again. This means we always retry an async write failure at least
> + * once, but we also need to set the buffer up to behave correctly now for
> + * repeated failures.
> + *
> + * If we get repeated async write failures, then we take action according to the
> + * error configuration we have been set up to use.
> + *
> + * Multi-state return value:
> + *
> + * 0: clear IO error retry state and run callback completions
> + * 1: resubmitted immediately, do not run any completions
> + * 2: transient error, run failure callback completions and then
> + *    release the buffer

Feels odd not to use an enum here, but as this is a static function
maybe it's not a high risk for screwing up in the callers.

--D

> + */
> +static int
> +xfs_buf_iodone_error(
> +	struct xfs_buf		*bp)
> +{
> +	struct xfs_mount	*mp = bp->b_mount;
> +	struct xfs_error_cfg	*cfg;
> +
> +	if (xfs_buf_ioerror_sync(bp))
> +		goto out_stale;
> +
> +	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> +
> +	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
> +	if (xfs_buf_ioerror_retry(bp, cfg)) {
> +		xfs_buf_ioerror(bp, 0);
> +		xfs_buf_submit(bp);
> +		return 1;
> +	}
> +
> +	if (xfs_buf_ioerror_permanent(bp, cfg))
>  		goto permanent_error;
>  
>  	/*
>  	 * Still a transient error, run IO completion failure callbacks and let
>  	 * the higher layers retry the buffer.
>  	 */
> -	xfs_buf_do_callbacks_fail(bp);
>  	xfs_buf_ioerror(bp, 0);
> -	xfs_buf_relse(bp);
> -	return true;
> +	return 2;
>  
>  	/*
>  	 * Permanent error - we need to trigger a shutdown if we haven't already
> @@ -1072,30 +1110,7 @@ xfs_buf_iodone_callback_error(
>  	xfs_buf_stale(bp);
>  	bp->b_flags |= XBF_DONE;
>  	trace_xfs_buf_error_relse(bp, _RET_IP_);
> -	return false;
> -}
> -
> -static inline bool
> -xfs_buf_had_callback_errors(
> -	struct xfs_buf		*bp)
> -{
> -
> -	/*
> -	 * If there is an error, process it. Some errors require us to run
> -	 * callbacks after failure processing is done so we detect that and take
> -	 * appropriate action.
> -	 */
> -	if (bp->b_error && xfs_buf_iodone_callback_error(bp))
> -		return true;
> -
> -	/*
> -	 * Successful IO or permanent error. Either way, we can clear the
> -	 * retry state here in preparation for the next error that may occur.
> -	 */
> -	bp->b_last_error = 0;
> -	bp->b_retries = 0;
> -	bp->b_first_retry_time = 0;
> -	return false;
> +	return 0;
>  }
>  
>  static void
> @@ -1122,6 +1137,15 @@ xfs_buf_item_done(
>  	xfs_buf_rele(bp);
>  }
>  
> +static inline void
> +xfs_buf_clear_ioerror_retry_state(
> +	struct xfs_buf		*bp)
> +{
> +	bp->b_last_error = 0;
> +	bp->b_retries = 0;
> +	bp->b_first_retry_time = 0;
> +}
> +
>  /*
>   * Inode buffer iodone callback function.
>   */
> @@ -1129,9 +1153,20 @@ void
>  xfs_buf_inode_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	xfs_buf_item_done(bp);
>  	xfs_iflush_done(bp);
>  	xfs_buf_ioend_finish(bp);
> @@ -1144,9 +1179,20 @@ void
>  xfs_buf_dquot_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	/* a newly allocated dquot buffer might have a log item attached */
>  	xfs_buf_item_done(bp);
>  	xfs_dquot_done(bp);
> @@ -1163,9 +1209,20 @@ void
>  xfs_buf_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	xfs_buf_item_done(bp);
>  	xfs_buf_ioend_finish(bp);
>  }
> -- 
> 2.26.2.761.g0e0b3e54be
>

Dave Chinner June 2, 2020, 10:17 p.m. UTC | #2

On Tue, Jun 02, 2020 at 01:39:51PM -0700, Darrick J. Wong wrote:
> On Tue, Jun 02, 2020 at 07:42:34AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Currently when a buffer with attached log items has an IO error
> > it called ->iop_error for each attched log item. These all call
> > xfs_set_li_failed() to handle the error, but we are about to change
> > the way log items manage buffers. hence we first need to remove the
> > per-item dependency on buffer handling done by xfs_set_li_failed().
> > 
> > We already have specific buffer type IO completion routines, so move
> > the log item error handling out of the generic error handling and
> > into the log item specific functions so we can implement per-type
> > error handling easily.
> > 
> > This requires a more complex return value from the error handling
> > code so that we can take the correct action the failure handling
> > requires.  This results in some repeated boilerplate in the
> > functions, but that can be cleaned up later once all the changes
> > cascade through this code.
> > 
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/xfs/xfs_buf_item.c | 167 ++++++++++++++++++++++++++++--------------
> >  1 file changed, 112 insertions(+), 55 deletions(-)
> > 
> > diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> > index 09bfe9c52dbdb..b6995719e877b 100644
> > --- a/fs/xfs/xfs_buf_item.c
> > +++ b/fs/xfs/xfs_buf_item.c
> > @@ -987,20 +987,18 @@ xfs_buf_do_callbacks_fail(
> >  }
> >  
> >  static bool
> > -xfs_buf_iodone_callback_error(
> > +xfs_buf_ioerror_sync(
> >  	struct xfs_buf		*bp)
> >  {
> >  	struct xfs_mount	*mp = bp->b_mount;
> >  	static ulong		lasttime;
> >  	static xfs_buftarg_t	*lasttarg;
> > -	struct xfs_error_cfg	*cfg;
> > -
> >  	/*
> 
> This should preserve the blank line between the declarations and the
> start of the code.
> 
> >  	 * If we've already decided to shutdown the filesystem because of
> >  	 * I/O errors, there's no point in giving this a retry.
> >  	 */
> >  	if (XFS_FORCED_SHUTDOWN(mp))
> > -		goto out_stale;
> > +		return true;
> >  
> >  	if (bp->b_target != lasttarg ||
> >  	    time_after(jiffies, (lasttime + 5*HZ))) {
> > @@ -1011,19 +1009,15 @@ xfs_buf_iodone_callback_error(
> >  
> >  	/* synchronous writes will have callers process the error */
> >  	if (!(bp->b_flags & XBF_ASYNC))
> > -		goto out_stale;
> > -
> > -	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> > -
> > -	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
> > +		return true;
> > +	return false;
> 
> What does the return value mean here?  true means "let the caller deal
> with the error", false means "attempt a retry, if desired?  So this
> function decides if we're going to fail immediately or not?

Effectively, yes.
> 
> 	if (xfs_buf_ioerr_fail_immediately(bp))
> 		goto out_stale;
> 
> That's a lengthy name though.  On second inspection, I guess this
> function decides if the buffer is going to be sent through the io retry
> mechanism, and the next two functions advance it through the retry
> states until either the write succeeds or we declare permanent failure?

Pretty much. I had some difficulty in working out how to break this
large function up sanely because of the 3-4 conditional functions
it performed for error handling, I named the function originally for
handling sync IO errors vs async IO errors which (may) require
retries.

So, yeah, "fail_immediately" is probably a better description, or
"fail_no_retry" sounds like a better name.

> 
> > +}
> >  
> > -	/*
> > -	 * If the write was asynchronous then no one will be looking for the
> > -	 * error.  If this is the first failure of this type, clear the error
> > -	 * state and write the buffer out again. This means we always retry an
> > -	 * async write failure at least once, but we also need to set the buffer
> > -	 * up to behave correctly now for repeated failures.
> > -	 */
> > +static bool
> > +xfs_buf_ioerror_retry(
> 
> Might be nice to preserve some of this comment, since I initially
> missed that this function both decides whether or not to do the retry
> and sets up the buffer to do that.

I thought I preserved it somewhere... yeah, it's above the
xfs_buf_iodone_error() function now.

> 
> /*
>  * Decide if we're going to retry the write after a failure, and prepare
>  * the buffer for retrying the write.
>  */
> 
> Or, adding some newlines in the outer if body to make the two lines
> that modify the bp state stand out would also help.
> 
> (TBH I'm struggling right now to make sense of what these new functions
> do, though I'm fairly convinced that they at least aren't changing much
> of the functionality...)

I had to break up the IO error handling because the log item error
callbacks for the items attached to the buffer needed to be called
only if we want the higher level to issue retries. Later in this
series we end up with different retry error marking for each type of
buffer, but we only want to do that when the error handling code
itself hasn't done an immediate retry or marked it as a permanent
error.

So I had to break up the function in separate parts so that the
caller could tell exactly what action it needed to take on a
failure.

> > +/*
> > + * On a sync write or shutdown we just want to stale the buffer and let the
> > + * caller handle the error in bp->b_error appropriately.
> > + *
> > + * If the write was asynchronous then no one will be looking for the error.  If
> > + * this is the first failure of this type, clear the error state and write the
> > + * buffer out again. This means we always retry an async write failure at least
> > + * once, but we also need to set the buffer up to behave correctly now for
> > + * repeated failures.
> > + *
> > + * If we get repeated async write failures, then we take action according to the
> > + * error configuration we have been set up to use.
> > + *
> > + * Multi-state return value:
> > + *
> > + * 0: clear IO error retry state and run callback completions
> > + * 1: resubmitted immediately, do not run any completions
> > + * 2: transient error, run failure callback completions and then
> > + *    release the buffer
> 
> Feels odd not to use an enum here, but as this is a static function
> maybe it's not a high risk for screwing up in the callers.

I can change it to use an enum. I wrote this expecting that this
code will get further factored and moved to xfs_buf.c once all the
mods have been made and everything settles down. That's about 3-4
patch series down the road at this point, though, so <shrug>. At
least changes in this patch largely don't affect the rest of this
patchset....

Cheers,

Dave.

Brian Foster June 3, 2020, 3:02 p.m. UTC | #3

On Tue, Jun 02, 2020 at 07:42:34AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Currently when a buffer with attached log items has an IO error
> it called ->iop_error for each attched log item. These all call
> xfs_set_li_failed() to handle the error, but we are about to change
> the way log items manage buffers. hence we first need to remove the
> per-item dependency on buffer handling done by xfs_set_li_failed().
> 
> We already have specific buffer type IO completion routines, so move
> the log item error handling out of the generic error handling and
> into the log item specific functions so we can implement per-type
> error handling easily.
> 
> This requires a more complex return value from the error handling
> code so that we can take the correct action the failure handling
> requires.  This results in some repeated boilerplate in the
> functions, but that can be cleaned up later once all the changes
> cascade through this code.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---

I reiterate some of Darrick's comments in that it's slightly annoying to
see refactoring squashed together that looks like it could be done in a
couple smaller and more simple patches. That aside, the only thing that
kind of bothers me is...

>  fs/xfs/xfs_buf_item.c | 167 ++++++++++++++++++++++++++++--------------
>  1 file changed, 112 insertions(+), 55 deletions(-)
> 
> diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
> index 09bfe9c52dbdb..b6995719e877b 100644
> --- a/fs/xfs/xfs_buf_item.c
> +++ b/fs/xfs/xfs_buf_item.c
...
> @@ -1031,36 +1025,80 @@ xfs_buf_iodone_callback_error(
...
> +static int
> +xfs_buf_iodone_error(
> +	struct xfs_buf		*bp)
> +{
> +	struct xfs_mount	*mp = bp->b_mount;
> +	struct xfs_error_cfg	*cfg;
> +
> +	if (xfs_buf_ioerror_sync(bp))
> +		goto out_stale;
> +
> +	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> +
> +	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
> +	if (xfs_buf_ioerror_retry(bp, cfg)) {
> +		xfs_buf_ioerror(bp, 0);
> +		xfs_buf_submit(bp);
> +		return 1;
> +	}
> +
> +	if (xfs_buf_ioerror_permanent(bp, cfg))
>  		goto permanent_error;
>  
>  	/*
>  	 * Still a transient error, run IO completion failure callbacks and let
>  	 * the higher layers retry the buffer.
>  	 */
> -	xfs_buf_do_callbacks_fail(bp);
>  	xfs_buf_ioerror(bp, 0);
> -	xfs_buf_relse(bp);
> -	return true;
> +	return 2;

... that we now clear the buffer error code before running the failure
callbacks. I know that nothing in the callbacks looks at it right now,
but I think it's subtle and inelegant to split it off this way. Can we
just move this entire block together into the type callbacks?

Brian

>  
>  	/*
>  	 * Permanent error - we need to trigger a shutdown if we haven't already
> @@ -1072,30 +1110,7 @@ xfs_buf_iodone_callback_error(
>  	xfs_buf_stale(bp);
>  	bp->b_flags |= XBF_DONE;
>  	trace_xfs_buf_error_relse(bp, _RET_IP_);
> -	return false;
> -}
> -
> -static inline bool
> -xfs_buf_had_callback_errors(
> -	struct xfs_buf		*bp)
> -{
> -
> -	/*
> -	 * If there is an error, process it. Some errors require us to run
> -	 * callbacks after failure processing is done so we detect that and take
> -	 * appropriate action.
> -	 */
> -	if (bp->b_error && xfs_buf_iodone_callback_error(bp))
> -		return true;
> -
> -	/*
> -	 * Successful IO or permanent error. Either way, we can clear the
> -	 * retry state here in preparation for the next error that may occur.
> -	 */
> -	bp->b_last_error = 0;
> -	bp->b_retries = 0;
> -	bp->b_first_retry_time = 0;
> -	return false;
> +	return 0;
>  }
>  
>  static void
> @@ -1122,6 +1137,15 @@ xfs_buf_item_done(
>  	xfs_buf_rele(bp);
>  }
>  
> +static inline void
> +xfs_buf_clear_ioerror_retry_state(
> +	struct xfs_buf		*bp)
> +{
> +	bp->b_last_error = 0;
> +	bp->b_retries = 0;
> +	bp->b_first_retry_time = 0;
> +}
> +
>  /*
>   * Inode buffer iodone callback function.
>   */
> @@ -1129,9 +1153,20 @@ void
>  xfs_buf_inode_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	xfs_buf_item_done(bp);
>  	xfs_iflush_done(bp);
>  	xfs_buf_ioend_finish(bp);
> @@ -1144,9 +1179,20 @@ void
>  xfs_buf_dquot_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	/* a newly allocated dquot buffer might have a log item attached */
>  	xfs_buf_item_done(bp);
>  	xfs_dquot_done(bp);
> @@ -1163,9 +1209,20 @@ void
>  xfs_buf_iodone(
>  	struct xfs_buf		*bp)
>  {
> -	if (xfs_buf_had_callback_errors(bp))
> +	if (bp->b_error) {
> +		int ret = xfs_buf_iodone_error(bp);
> +		if (!ret)
> +			goto finish_iodone;
> +		if (ret == 1)
> +			return;
> +		ASSERT(ret == 2);
> +		xfs_buf_do_callbacks_fail(bp);
> +		xfs_buf_relse(bp);
>  		return;
> +	}
>  
> +finish_iodone:
> +	xfs_buf_clear_ioerror_retry_state(bp);
>  	xfs_buf_item_done(bp);
>  	xfs_buf_ioend_finish(bp);
>  }
> -- 
> 2.26.2.761.g0e0b3e54be
>

Dave Chinner June 3, 2020, 9:34 p.m. UTC | #4

On Wed, Jun 03, 2020 at 11:02:07AM -0400, Brian Foster wrote:
> On Tue, Jun 02, 2020 at 07:42:34AM +1000, Dave Chinner wrote:
> > +	if (xfs_buf_ioerror_sync(bp))
> > +		goto out_stale;
> > +
> > +	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> > +
> > +	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
> > +	if (xfs_buf_ioerror_retry(bp, cfg)) {
> > +		xfs_buf_ioerror(bp, 0);
> > +		xfs_buf_submit(bp);
> > +		return 1;
> > +	}
> > +
> > +	if (xfs_buf_ioerror_permanent(bp, cfg))
> >  		goto permanent_error;
> >  
> >  	/*
> >  	 * Still a transient error, run IO completion failure callbacks and let
> >  	 * the higher layers retry the buffer.
> >  	 */
> > -	xfs_buf_do_callbacks_fail(bp);
> >  	xfs_buf_ioerror(bp, 0);
> > -	xfs_buf_relse(bp);
> > -	return true;
> > +	return 2;
> 
> ... that we now clear the buffer error code before running the failure
> callbacks. I know that nothing in the callbacks looks at it right now,
> but I think it's subtle and inelegant to split it off this way. Can we
> just move this entire block together into the type callbacks?

Sure. It's largely just deck chair rearragnement, though, because
the whole XFS_LI_FAILED ends up going away real soon. The next patchset
gets rid of it entirely for inode log items, and when the same
approach is applied to dquots, it no longer will be used by
anything and will be removed entirely.

IOWs, the future isn't "maybe error callbacks will do something
different", the future is "error callbacks don't exist any more".

Cheers,

Dave.

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 09bfe9c52dbdb..b6995719e877b 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -987,20 +987,18 @@  xfs_buf_do_callbacks_fail(
 }
 
 static bool
-xfs_buf_iodone_callback_error(
+xfs_buf_ioerror_sync(
 	struct xfs_buf		*bp)
 {
 	struct xfs_mount	*mp = bp->b_mount;
 	static ulong		lasttime;
 	static xfs_buftarg_t	*lasttarg;
-	struct xfs_error_cfg	*cfg;
-
 	/*
 	 * If we've already decided to shutdown the filesystem because of
 	 * I/O errors, there's no point in giving this a retry.
 	 */
 	if (XFS_FORCED_SHUTDOWN(mp))
-		goto out_stale;
+		return true;
 
 	if (bp->b_target != lasttarg ||
 	    time_after(jiffies, (lasttime + 5*HZ))) {
@@ -1011,19 +1009,15 @@  xfs_buf_iodone_callback_error(
 
 	/* synchronous writes will have callers process the error */
 	if (!(bp->b_flags & XBF_ASYNC))
-		goto out_stale;
-
-	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
-
-	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
+		return true;
+	return false;
+}
 
-	/*
-	 * If the write was asynchronous then no one will be looking for the
-	 * error.  If this is the first failure of this type, clear the error
-	 * state and write the buffer out again. This means we always retry an
-	 * async write failure at least once, but we also need to set the buffer
-	 * up to behave correctly now for repeated failures.
-	 */
+static bool
+xfs_buf_ioerror_retry(
+	struct xfs_buf		*bp,
+	struct xfs_error_cfg	*cfg)
+{
 	if (!(bp->b_flags & (XBF_STALE | XBF_WRITE_FAIL)) ||
 	     bp->b_last_error != bp->b_error) {
 		bp->b_flags |= (XBF_WRITE | XBF_DONE | XBF_WRITE_FAIL);
@@ -1031,36 +1025,80 @@  xfs_buf_iodone_callback_error(
 		if (cfg->retry_timeout != XFS_ERR_RETRY_FOREVER &&
 		    !bp->b_first_retry_time)
 			bp->b_first_retry_time = jiffies;
-
-		xfs_buf_ioerror(bp, 0);
-		xfs_buf_submit(bp);
 		return true;
 	}
+	return false;
+}
 
-	/*
-	 * Repeated failure on an async write. Take action according to the
-	 * error configuration we have been set up to use.
-	 */
+static bool
+xfs_buf_ioerror_permanent(
+	struct xfs_buf		*bp,
+	struct xfs_error_cfg	*cfg)
+{
+	struct xfs_mount	*mp = bp->b_mount;
 
 	if (cfg->max_retries != XFS_ERR_RETRY_FOREVER &&
 	    ++bp->b_retries > cfg->max_retries)
-			goto permanent_error;
+			return true;
 	if (cfg->retry_timeout != XFS_ERR_RETRY_FOREVER &&
 	    time_after(jiffies, cfg->retry_timeout + bp->b_first_retry_time))
-			goto permanent_error;
+			return true;
 
 	/* At unmount we may treat errors differently */
 	if ((mp->m_flags & XFS_MOUNT_UNMOUNTING) && mp->m_fail_unmount)
+		return true;
+
+	return false;
+}
+
+/*
+ * On a sync write or shutdown we just want to stale the buffer and let the
+ * caller handle the error in bp->b_error appropriately.
+ *
+ * If the write was asynchronous then no one will be looking for the error.  If
+ * this is the first failure of this type, clear the error state and write the
+ * buffer out again. This means we always retry an async write failure at least
+ * once, but we also need to set the buffer up to behave correctly now for
+ * repeated failures.
+ *
+ * If we get repeated async write failures, then we take action according to the
+ * error configuration we have been set up to use.
+ *
+ * Multi-state return value:
+ *
+ * 0: clear IO error retry state and run callback completions
+ * 1: resubmitted immediately, do not run any completions
+ * 2: transient error, run failure callback completions and then
+ *    release the buffer
+ */
+static int
+xfs_buf_iodone_error(
+	struct xfs_buf		*bp)
+{
+	struct xfs_mount	*mp = bp->b_mount;
+	struct xfs_error_cfg	*cfg;
+
+	if (xfs_buf_ioerror_sync(bp))
+		goto out_stale;
+
+	trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
+
+	cfg = xfs_error_get_cfg(mp, XFS_ERR_METADATA, bp->b_error);
+	if (xfs_buf_ioerror_retry(bp, cfg)) {
+		xfs_buf_ioerror(bp, 0);
+		xfs_buf_submit(bp);
+		return 1;
+	}
+
+	if (xfs_buf_ioerror_permanent(bp, cfg))
 		goto permanent_error;
 
 	/*
 	 * Still a transient error, run IO completion failure callbacks and let
 	 * the higher layers retry the buffer.
 	 */
-	xfs_buf_do_callbacks_fail(bp);
 	xfs_buf_ioerror(bp, 0);
-	xfs_buf_relse(bp);
-	return true;
+	return 2;
 
 	/*
 	 * Permanent error - we need to trigger a shutdown if we haven't already
@@ -1072,30 +1110,7 @@  xfs_buf_iodone_callback_error(
 	xfs_buf_stale(bp);
 	bp->b_flags |= XBF_DONE;
 	trace_xfs_buf_error_relse(bp, _RET_IP_);
-	return false;
-}
-
-static inline bool
-xfs_buf_had_callback_errors(
-	struct xfs_buf		*bp)
-{
-
-	/*
-	 * If there is an error, process it. Some errors require us to run
-	 * callbacks after failure processing is done so we detect that and take
-	 * appropriate action.
-	 */
-	if (bp->b_error && xfs_buf_iodone_callback_error(bp))
-		return true;
-
-	/*
-	 * Successful IO or permanent error. Either way, we can clear the
-	 * retry state here in preparation for the next error that may occur.
-	 */
-	bp->b_last_error = 0;
-	bp->b_retries = 0;
-	bp->b_first_retry_time = 0;
-	return false;
+	return 0;
 }
 
 static void
@@ -1122,6 +1137,15 @@  xfs_buf_item_done(
 	xfs_buf_rele(bp);
 }
 
+static inline void
+xfs_buf_clear_ioerror_retry_state(
+	struct xfs_buf		*bp)
+{
+	bp->b_last_error = 0;
+	bp->b_retries = 0;
+	bp->b_first_retry_time = 0;
+}
+
 /*
  * Inode buffer iodone callback function.
  */
@@ -1129,9 +1153,20 @@  void
 xfs_buf_inode_iodone(
 	struct xfs_buf		*bp)
 {
-	if (xfs_buf_had_callback_errors(bp))
+	if (bp->b_error) {
+		int ret = xfs_buf_iodone_error(bp);
+		if (!ret)
+			goto finish_iodone;
+		if (ret == 1)
+			return;
+		ASSERT(ret == 2);
+		xfs_buf_do_callbacks_fail(bp);
+		xfs_buf_relse(bp);
 		return;
+	}
 
+finish_iodone:
+	xfs_buf_clear_ioerror_retry_state(bp);
 	xfs_buf_item_done(bp);
 	xfs_iflush_done(bp);
 	xfs_buf_ioend_finish(bp);
@@ -1144,9 +1179,20 @@  void
 xfs_buf_dquot_iodone(
 	struct xfs_buf		*bp)
 {
-	if (xfs_buf_had_callback_errors(bp))
+	if (bp->b_error) {
+		int ret = xfs_buf_iodone_error(bp);
+		if (!ret)
+			goto finish_iodone;
+		if (ret == 1)
+			return;
+		ASSERT(ret == 2);
+		xfs_buf_do_callbacks_fail(bp);
+		xfs_buf_relse(bp);
 		return;
+	}
 
+finish_iodone:
+	xfs_buf_clear_ioerror_retry_state(bp);
 	/* a newly allocated dquot buffer might have a log item attached */
 	xfs_buf_item_done(bp);
 	xfs_dquot_done(bp);
@@ -1163,9 +1209,20 @@  void
 xfs_buf_iodone(
 	struct xfs_buf		*bp)
 {
-	if (xfs_buf_had_callback_errors(bp))
+	if (bp->b_error) {
+		int ret = xfs_buf_iodone_error(bp);
+		if (!ret)
+			goto finish_iodone;
+		if (ret == 1)
+			return;
+		ASSERT(ret == 2);
+		xfs_buf_do_callbacks_fail(bp);
+		xfs_buf_relse(bp);
 		return;
+	}
 
+finish_iodone:
+	xfs_buf_clear_ioerror_retry_state(bp);
 	xfs_buf_item_done(bp);
 	xfs_buf_ioend_finish(bp);
 }

[13/30] xfs: handle buffer log item IO errors directly

Commit Message

Comments

Patch