Message ID | 20200504141154.55887-6-bfoster@redhat.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | xfs: flush related error handling cleanups | expand |
On 5/4/20 7:11 AM, Brian Foster wrote: > The buffer write failure flag is intended to control the internal > write retry that XFS has historically implemented to help mitigate > the severity of transient I/O errors. The flag is set when a buffer > is resubmitted from the I/O completion path due to a previous > failure. It is checked on subsequent I/O completions to skip the > internal retry and fall through to the higher level configurable > error handling mechanism. The flag is cleared in the synchronous and > delwri submission paths and also checked in various places to log > write failure messages. > > There are a couple minor problems with the current usage of this > flag. One is that we issue an internal retry after every submission > from xfsaild due to how delwri submission clears the flag. This > results in double the expected or configured number of write > attempts when under sustained failures. Another more subtle issue is > that the flag is never cleared on successful I/O completion. This > can cause xfs_wait_buftarg() to suggest that dirty buffers are being > thrown away due to the existence of the flag, when the reality is > that the flag might still be set because the write succeeded on the > retry. > > Clear the write failure flag on successful I/O completion to address > both of these problems. This means that the internal retry attempt > occurs once since the last time a buffer write failed and that > various other contexts only see the flag set when the immediately > previous write attempt has failed. > > Signed-off-by: Brian Foster <bfoster@redhat.com> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> Looks ok to me: Reviewed-by: Allison Collins <allison.henderson@oracle.com> > --- > fs/xfs/xfs_buf.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index d5d6a68bb1e6..fd76a84cefdd 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -1197,8 +1197,10 @@ xfs_buf_ioend( > bp->b_ops->verify_read(bp); > } > > - if (!bp->b_error) > + if (!bp->b_error) { > + bp->b_flags &= ~XBF_WRITE_FAIL; > bp->b_flags |= XBF_DONE; > + } > > if (bp->b_iodone) > (*(bp->b_iodone))(bp); > @@ -1274,7 +1276,7 @@ xfs_bwrite( > > bp->b_flags |= XBF_WRITE; > bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q | > - XBF_WRITE_FAIL | XBF_DONE); > + XBF_DONE); > > error = xfs_buf_submit(bp); > if (error) > @@ -1996,7 +1998,7 @@ xfs_buf_delwri_submit_buffers( > * synchronously. Otherwise, drop the buffer from the delwri > * queue and submit async. > */ > - bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL); > + bp->b_flags &= ~_XBF_DELWRI_Q; > bp->b_flags |= XBF_WRITE; > if (wait_list) { > bp->b_flags &= ~XBF_ASYNC; >
On 5/4/20 7:11 AM, Brian Foster wrote: > The buffer write failure flag is intended to control the internal > write retry that XFS has historically implemented to help mitigate > the severity of transient I/O errors. The flag is set when a buffer > is resubmitted from the I/O completion path due to a previous > failure. It is checked on subsequent I/O completions to skip the > internal retry and fall through to the higher level configurable > error handling mechanism. The flag is cleared in the synchronous and > delwri submission paths and also checked in various places to log > write failure messages. > > There are a couple minor problems with the current usage of this > flag. One is that we issue an internal retry after every submission > from xfsaild due to how delwri submission clears the flag. This > results in double the expected or configured number of write > attempts when under sustained failures. Another more subtle issue is > that the flag is never cleared on successful I/O completion. This > can cause xfs_wait_buftarg() to suggest that dirty buffers are being > thrown away due to the existence of the flag, when the reality is > that the flag might still be set because the write succeeded on the > retry. > > Clear the write failure flag on successful I/O completion to address > both of these problems. This means that the internal retry attempt > occurs once since the last time a buffer write failed and that > various other contexts only see the flag set when the immediately > previous write attempt has failed. > > Signed-off-by: Brian Foster <bfoster@redhat.com> > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> Looks ok to me: Reviewed-by: Allison Collins <allison.henderson@oracle.com> > --- > fs/xfs/xfs_buf.c | 8 +++++--- > 1 file changed, 5 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c > index d5d6a68bb1e6..fd76a84cefdd 100644 > --- a/fs/xfs/xfs_buf.c > +++ b/fs/xfs/xfs_buf.c > @@ -1197,8 +1197,10 @@ xfs_buf_ioend( > bp->b_ops->verify_read(bp); > } > > - if (!bp->b_error) > + if (!bp->b_error) { > + bp->b_flags &= ~XBF_WRITE_FAIL; > bp->b_flags |= XBF_DONE; > + } > > if (bp->b_iodone) > (*(bp->b_iodone))(bp); > @@ -1274,7 +1276,7 @@ xfs_bwrite( > > bp->b_flags |= XBF_WRITE; > bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q | > - XBF_WRITE_FAIL | XBF_DONE); > + XBF_DONE); > > error = xfs_buf_submit(bp); > if (error) > @@ -1996,7 +1998,7 @@ xfs_buf_delwri_submit_buffers( > * synchronously. Otherwise, drop the buffer from the delwri > * queue and submit async. > */ > - bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL); > + bp->b_flags &= ~_XBF_DELWRI_Q; > bp->b_flags |= XBF_WRITE; > if (wait_list) { > bp->b_flags &= ~XBF_ASYNC; >
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index d5d6a68bb1e6..fd76a84cefdd 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -1197,8 +1197,10 @@ xfs_buf_ioend( bp->b_ops->verify_read(bp); } - if (!bp->b_error) + if (!bp->b_error) { + bp->b_flags &= ~XBF_WRITE_FAIL; bp->b_flags |= XBF_DONE; + } if (bp->b_iodone) (*(bp->b_iodone))(bp); @@ -1274,7 +1276,7 @@ xfs_bwrite( bp->b_flags |= XBF_WRITE; bp->b_flags &= ~(XBF_ASYNC | XBF_READ | _XBF_DELWRI_Q | - XBF_WRITE_FAIL | XBF_DONE); + XBF_DONE); error = xfs_buf_submit(bp); if (error) @@ -1996,7 +1998,7 @@ xfs_buf_delwri_submit_buffers( * synchronously. Otherwise, drop the buffer from the delwri * queue and submit async. */ - bp->b_flags &= ~(_XBF_DELWRI_Q | XBF_WRITE_FAIL); + bp->b_flags &= ~_XBF_DELWRI_Q; bp->b_flags |= XBF_WRITE; if (wait_list) { bp->b_flags &= ~XBF_ASYNC;