[00/12] xfs: flush related error handling cleanups

Message ID 20200417150859.14734-1-bfoster@redhat.com
Series xfs: flush related error handling cleanups

Message

Brian Foster April 17, 2020, 3:08 p.m. UTC
Hi all,

This actually started as what I intended to be a cleanup of xfsaild
error handling and the fact that unexpected errors are kind of lost in
the ->iop_push() handlers of flushable log items. Some discussion with
Dave on that is available here[1]. I was thinking of genericizing the
behavior, but I'm not so sure that is possible now given the error
handling requirements of the associated items.
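
To illustrate what I mean, here's a rough sketch of the shape of a
flushable item push handler (a hypothetical xfs_example_item_push(),
loosely modeled on the inode item push path with the trylock/locking
details trimmed out; this is not the actual kernel code):

/*
 * Schematic example only. The flush return value is only used to
 * decide whether to delwri queue the buffer; an unexpected
 * (non-shutdown) error falls through silently and the item just gets
 * retried on the next AIL push.
 */
STATIC uint
xfs_example_item_push(
	struct xfs_log_item	*lip,
	struct list_head	*buffer_list)
{
	struct xfs_inode	*ip = INODE_ITEM(lip)->ili_inode;
	struct xfs_buf		*bp;
	uint			rval = XFS_ITEM_SUCCESS;
	int			error;

	error = xfs_iflush(ip, &bp);
	if (!error) {
		if (!xfs_buf_delwri_queue(bp, buffer_list))
			rval = XFS_ITEM_FLUSHING;
		xfs_buf_relse(bp);
	}
	/* error != 0: nothing is logged and no state is updated here */
	return rval;
}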

While thinking through that, I ended up incorporating various cleanups
in the somewhat confusing and erratic error handling on the periphery of
xfsaild, such as the flush handlers. Most of these are straightforward
cleanups except for patch 9, which I think requires careful review and
is of debatable value. I have used patch 12 to run an hour or so of
highly concurrent fsstress load against it and will execute a longer run
over the weekend now that fstests has completed.

Thoughts, reviews, flames appreciated.

Brian

[1] https://lore.kernel.org/linux-xfs/20200331114653.GA53541@bfoster/

Brian Foster (12):
  xfs: refactor failed buffer resubmission into xfsaild
  xfs: factor out buffer I/O failure simulation code
  xfs: always attach iflush_done and simplify error handling
  xfs: remove unnecessary shutdown check from xfs_iflush()
  xfs: ratelimit unmount time per-buffer I/O error warning
  xfs: remove duplicate verification from xfs_qm_dqflush()
  xfs: abort consistently on dquot flush failure
  xfs: remove unnecessary quotaoff intent item push handler
  xfs: elide the AIL lock on log item failure tracking
  xfs: clean up AIL log item removal functions
  xfs: remove unused iflush stale parameter
  xfs: random buffer write failure errortag

 fs/xfs/libxfs/xfs_errortag.h  |   4 +-
 fs/xfs/libxfs/xfs_inode_buf.c |   7 +--
 fs/xfs/xfs_bmap_item.c        |   2 +-
 fs/xfs/xfs_buf.c              |  36 ++++++++---
 fs/xfs/xfs_buf.h              |   1 +
 fs/xfs/xfs_buf_item.c         |  86 ++++----------------------
 fs/xfs/xfs_buf_item.h         |   2 -
 fs/xfs/xfs_dquot.c            |  84 ++++++++------------------
 fs/xfs/xfs_dquot_item.c       |  31 +---------
 fs/xfs/xfs_error.c            |   3 +
 fs/xfs/xfs_extfree_item.c     |   2 +-
 fs/xfs/xfs_icache.c           |   2 +-
 fs/xfs/xfs_inode.c            | 110 +++++++++-------------------------
 fs/xfs/xfs_inode_item.c       |  39 +++---------
 fs/xfs/xfs_inode_item.h       |   2 +-
 fs/xfs/xfs_refcount_item.c    |   2 +-
 fs/xfs/xfs_rmap_item.c        |   2 +-
 fs/xfs/xfs_trans_ail.c        |  52 +++++++++++++++-
 fs/xfs/xfs_trans_priv.h       |  22 +++----
 19 files changed, 175 insertions(+), 314 deletions(-)

Comments

Dave Chinner April 19, 2020, 10:53 p.m. UTC | #1
On Fri, Apr 17, 2020 at 11:08:47AM -0400, Brian Foster wrote:
> Hi all,
> 
> This actually started as what I intended to be a cleanup of xfsaild
> error handling and the fact that unexpected errors are kind of lost in
> the ->iop_push() handlers of flushable log items. Some discussion with
> Dave on that is available here[1]. I was thinking of genericizing the
> behavior, but I'm not so sure that is possible now given the error
> handling requirements of the associated items.
> 
> While thinking through that, I ended up incorporating various cleanups
> in the somewhat confusing and erratic error handling on the periphery of
> xfsaild, such as the flush handlers. Most of these are straightforward
> cleanups except for patch 9, which I think requires careful review and
> is of debatable value. I have used patch 12 to run an hour or so of
> highly concurrent fsstress load against it and will execute a longer run
> over the weekend now that fstests has completed.
> 
> Thoughts, reviews, flames appreciated.

I'll need to do some thinking on this patchset - I have a
patchset I'm working on right now that touches a lot of the same
code: it pins inode cluster buffers in memory when the inode is
dirtied so we don't get RMW cycles in AIL flushing.

That code gets rid of xfs_iflush() completely, removes dirty inodes
from the AIL and tracks only ordered cluster buffers in the AIL for
inode writeback (i.e. reduces AIL tracked log items by up to 30x).
It also only does inode writeback from the ordered cluster buffers.

The idea behind this is to make inode flushing completely
non-blocking, and to simplify inode cluster flushing so that it
just iterates all the dirty inodes attached to the buffer. This
gets rid of radix
tree lookups and races with reclaim, and gets rid of having to
special case a locked inode in the cluster iteration code.
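
Roughly speaking, inode writeback then becomes a walk of the log
items attached to the cluster buffer. Hand-wavy sketch only -- the
names are made up, this is not the actual patch, and it assumes only
inode log items are attached to the buffer:

/*
 * Sketch: flush every dirty inode attached to the cluster buffer by
 * walking the buffer's log item list instead of doing radix tree
 * lookups. Skip-if-busy, locking and error handling omitted.
 */
static void
xfs_example_buf_inode_flush(
	struct xfs_buf		*bp)
{
	struct xfs_log_item	*lip;

	list_for_each_entry(lip, &bp->b_li_list, li_bio_list) {
		struct xfs_inode	*ip = INODE_ITEM(lip)->ili_inode;

		/* flush lock handling omitted for brevity */
		xfs_iflush_int(ip, bp);	/* write inode core into bp */
	}
}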

I was looking at this as the model to then apply to dquot flushing,
too, because it currently does not have cluster flushing, and hence
flushes dquots individually, even though there can be multiple dirty
dquots per buffer. Some of this patchset moves the dquot flushing a
bit closer to the inode code, so those parts are going to be useful
regardless of everything else....
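
For reference, the dquot item push currently does something like the
following for every single dirty dquot, i.e. one flush call per dquot
rather than one pass per buffer. Again just a sketch with a made-up
name and the locking and shutdown checks trimmed:

/*
 * Sketch of the current per-dquot flush: each pushed dquot item
 * flushes a single dquot into its backing buffer, even when several
 * dirty dquots share that buffer. There is no dquot equivalent of
 * xfs_iflush_cluster().
 */
STATIC uint
xfs_example_dquot_item_push(
	struct xfs_log_item	*lip,
	struct list_head	*buffer_list)
{
	struct xfs_dquot	*dqp = DQUOT_ITEM(lip)->qli_dquot;
	struct xfs_buf		*bp;
	uint			rval = XFS_ITEM_SUCCESS;
	int			error;

	error = xfs_qm_dqflush(dqp, &bp);
	if (!error) {
		if (!xfs_buf_delwri_queue(bp, buffer_list))
			rval = XFS_ITEM_FLUSHING;
		xfs_buf_relse(bp);
	}
	return rval;
}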

Do you have a git tree I could pull this from to see how bad the
conflicts are?

Cheers,

Dave.
Brian Foster April 20, 2020, 2:06 p.m. UTC | #2
On Mon, Apr 20, 2020 at 08:53:06AM +1000, Dave Chinner wrote:
> On Fri, Apr 17, 2020 at 11:08:47AM -0400, Brian Foster wrote:
> > Hi all,
> > 
> > This actually started as what I intended to be a cleanup of xfsaild
> > error handling and the fact that unexpected errors are kind of lost in
> > the ->iop_push() handlers of flushable log items. Some discussion with
> > Dave on that is available here[1]. I was thinking of genericizing the
> > behavior, but I'm not so sure that is possible now given the error
> > handling requirements of the associated items.
> > 
> > While thinking through that, I ended up incorporating various cleanups
> > in the somewhat confusing and erratic error handling on the periphery of
> > xfsaild, such as the flush handlers. Most of these are straightforward
> > cleanups except for patch 9, which I think requires careful review and
> > is of debatable value. I have used patch 12 to run an hour or so of
> > highly concurrent fsstress load against it and will execute a longer run
> > over the weekend now that fstests has completed.
> > 
> > Thoughts, reviews, flames appreciated.
> 
> I'll need to do some thinking on this patchset - I have a
> patchset I'm working on right now that touches a lot of the same
> code: it pins inode cluster buffers in memory when the inode is
> dirtied so we don't get RMW cycles in AIL flushing.
> 
> That code gets rid of xfs_iflush() completely, removes dirty inodes
> from the AIL and tracks only ordered cluster buffers in the AIL for
> inode writeback (i.e. reduces AIL tracked log items by up to 30x).
> It also only does inode writeback from the ordered cluster buffers.
> 

Ok. I could see that being reason enough to drop the iflush iodone
patch, given that it depends on a bit of a rework/hack. A cleaner
solution requires more thought and it might not be worth the time if the
code is going away. Most of the rest are straightforward cleanups though
so I wouldn't expect complex conflict resolution. It's hard to say
for sure without seeing the code, of course..

> The idea behind this is to make inode flushing completely
> non-blocking, and to simplify inode cluster flushing so that it
> just iterates all the dirty inodes attached to the buffer. This
> gets rid of radix
> tree lookups and races with reclaim, and gets rid of having to
> special case a locked inode in the cluster iteration code.
> 

Sounds interesting, but it's not really clear to me what the general
flushing dynamic looks like in this model. I.e., you mention
xfs_iflush() goes away, but cluster flushing still exists in some form,
so I can't really tell if xfs_iflush() going away is tied to a
functional change or primarily a refactoring/cleanup. Anyways, no need
to go into the weeds if the code will eventually clarify..

> I was looking at this as the model to then apply to dquot flushing,
> too, because it currently does not have cluster flushing, and hence
> flushes dquots individually, even though there can be multiple dirty
> dquots per buffer. Some of this patchset moves the dquot flushing a
> bit closer to the inode code, so those parts are going to be useful
> regardless of everything else....
> 

Makes sense.

> Do you have a git tree I could pull this from to see how bad the
> conflicts are?
> 

I don't have a public tree. I suppose I could look into getting
kernel.org access if somebody could point me in the right direction for
that. :) In the meantime I could make a private tree accessible to you
directly if that's helpful..

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>
Dave Chinner April 20, 2020, 10:53 p.m. UTC | #3
On Mon, Apr 20, 2020 at 10:06:04AM -0400, Brian Foster wrote:
> On Mon, Apr 20, 2020 at 08:53:06AM +1000, Dave Chinner wrote:
> > On Fri, Apr 17, 2020 at 11:08:47AM -0400, Brian Foster wrote:
> > > Hi all,
> > > 
> > > This actually started as what I intended to be a cleanup of xfsaild
> > > error handling and the fact that unexpected errors are kind of lost in
> > > the ->iop_push() handlers of flushable log items. Some discussion with
> > > Dave on that is available here[1]. I was thinking of genericizing the
> > > behavior, but I'm not so sure that is possible now given the error
> > > handling requirements of the associated items.
> > > 
> > > While thinking through that, I ended up incorporating various cleanups
> > > in the somewhat confusing and erratic error handling on the periphery of
> > > xfsaild, such as the flush handlers. Most of these are straightforward
> > > cleanups except for patch 9, which I think requires careful review and
> > > is of debatable value. I have used patch 12 to run an hour or so of
> > > highly concurrent fsstress load against it and will execute a longer run
> > > over the weekend now that fstests has completed.
> > > 
> > > Thoughts, reviews, flames appreciated.
> > 
> > I'll need to do some thinking on this patchset - I have a
> > patchset I'm working on right now that touches a lot of the same
> > code: it pins inode cluster buffers in memory when the inode is
> > dirtied so we don't get RMW cycles in AIL flushing.
> > 
> > That code gets rid of xfs_iflush() completely, removes dirty inodes
> > from the AIL and tracks only ordered cluster buffers in the AIL for
> > inode writeback (i.e. reduces AIL tracked log items by up to 30x).
> > It also only does inode writeback from the ordered cluster buffers.
> > 
> 
> Ok. I could see that being reason enough to drop the iflush iodone
> patch, given that it depends on a bit of a rework/hack. A cleaner
> solution requires more thought and it might not be worth the time if the
> code is going away. Most of the rest are straightforward cleanups though
> so I wouldn't expect complex conflict resolution. It's hard to say
> for sure without seeing the code, of course..

Yeah, now I've been though most of it there isn't a huge impact on
my patchset. Mainly just the conflicts in the mods to xfs_iflush and
friends.

> > The idea behind this is to make inode flushing completely
> > non-blocking, and to simplify inode cluster flushing so that it
> > just iterates all the dirty inodes attached to the buffer. This
> > gets rid of radix
> > tree lookups and races with reclaim, and gets rid of having to
> > special case a locked inode in the cluster iteration code.
> > 
> 
> Sounds interesting, but it's not really clear to me what the general
> flushing dynamic looks like in this model. I.e., you mention
> xfs_iflush() goes away, but cluster flushing still exists in some form,
> so I can't really tell if xfs_iflush() going away is tied to a
> functional change or primarily a refactoring/cleanup. Anyways, no need
> to go into the weeds if the code will eventually clarify..

It's primarily a clean-up to try to reduce AIL pushing overhead as
I'm regularly seeing the xfsaild CPU bound trying to push inodes
that are already on their way to disk. So I'm trying to rework
cluster flushing so it is driven by a buffer item push rather than
by pushing repeatedly on every inode item that is attached to the
buffer.

> > Do you have a git tree I could pull this from to see how bad the
> > conflicts are?
> > 
> 
> I don't have a public tree. I suppose I could look into getting
> kernel.org access if somebody could point me in the right
> direction for that. :) In the meantime I could make a private tree
> accessible to you directly if that's helpful..

Send a request for an account and git tree to helpdesk@kernel.org
and cc Darrick, Eric and myself so we can ACK the request.

Details here:

https://korg.wiki.kernel.org/userdoc/accounts

and all the userdoc is here:

https://korg.wiki.kernel.org/start

Cheers,

Dave.