[PATCHSET,v2,0/3] fstests: fix some hangs in crash recovery

Message ID: 165950054404.199222.5615656337332007333.stgit@magnolia

Message

Darrick J. Wong Aug. 3, 2022, 4:22 a.m. UTC
Hi all,

There are several tests in fstests (generic/019, generic/388,
generic/475, xfs/057, etc.) that test filesystem crash recovery by
starting a loop that kicks off a filesystem exerciser, waits a few
seconds, and offlines the filesystem somehow.  Some of them use the
block layer's error injector, some use dm-error, and some use the
shutdown ioctl.
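
For readers unfamiliar with these tests, the loop described above can be
sketched roughly as follows.  This is only an illustration and is not copied
from any fstests test: crash_loop is a made-up name, and start_exerciser,
offline_fs and recover_fs are hypothetical hooks that the caller would
define to run the workload, take the filesystem down (error injection,
dm-error, or shutdown), and bring it back for the next pass.

```shell
#!/bin/bash
# Hypothetical sketch of a crash-recovery loop; not taken from fstests.
# The caller must define start_exerciser, offline_fs and recover_fs.

crash_loop() {
	local iterations="$1" delay="$2"
	local i pid

	for ((i = 0; i < iterations; i++)); do
		start_exerciser &		# workload runs in the background
		pid=$!
		sleep "$delay"			# let some I/O get in flight
		offline_fs			# pull the rug out from under it
		# Stop the workload if the I/O errors have not killed it yet.
		kill "$pid" 2>/dev/null || true
		wait "$pid" 2>/dev/null || true
		recover_fs || return 1		# e.g. unmount, undo fault, remount
	done
}
```

If offline_fs leaves the filesystem retrying writes forever, the recovery
step in a loop like this never completes, which is the hang discussed next.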

The crash tests that employ error injection have the unfortunate trait
of causing occasional livelocks when tested against XFS because XFS
allows administrators to configure the filesystem to retry some failed
writes indefinitely.  If the offlining races with a full log trying to
update the filesystem, the fs will hang forever.  Fix this by allowing
XFS to go offline immediately.
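
As a rough illustration of what "go offline immediately" can mean, XFS
exposes per-error retry knobs in sysfs under
/sys/fs/xfs/<dev>/error/metadata/, where 0 means fail at once and -1 means
retry forever.  The helper below is a minimal sketch under those
assumptions, not the code in this series: xfs_fail_fast_on_eio is a made-up
name, and the optional second argument (an alternate sysfs root) exists
only so the function can be dry-run against a scratch directory.

```shell
#!/bin/bash
# Sketch only: make XFS give up on EIO metadata writes immediately.
# xfs_fail_fast_on_eio is a hypothetical helper, not an fstests function.
#   $1: short device name as listed under /sys/fs/xfs (e.g. sdb1)
#   $2: sysfs root; defaults to /sys/fs/xfs, overridable for dry runs

xfs_fail_fast_on_eio() {
	local dev="$1" root="${2:-/sys/fs/xfs}"
	local dir="$root/$dev/error/metadata/EIO"
	local knob

	if [ ! -d "$dir" ]; then
		echo "no XFS error configuration found for $dev" >&2
		return 1
	fi

	# 0 = fail immediately; -1 would mean retry forever.
	for knob in max_retries retry_timeout_seconds; do
		echo 0 > "$dir/$knob" || return 1
	done
}
```

On a real system this needs root and a mounted XFS filesystem, for example
"xfs_fail_fast_on_eio sdb1" run before the fault is injected.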

While we're at it, fix the dmesg scrapers so they don't trip over XFS
reporting these IO errors as internal errors.

v2: add hch reviews

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

fstests git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fix-shutdown-test-hangs
---
 check                    |    1 +
 common/dmerror           |    4 ++++
 common/fail_make_request |    1 +
 common/rc                |   50 +++++++++++++++++++++++++++++++++++++++++-----
 common/xfs               |   38 ++++++++++++++++++++++++++++++++++-
 tests/xfs/006.out        |    6 +++---
 tests/xfs/264.out        |   12 ++++++-----
 7 files changed, 97 insertions(+), 15 deletions(-)

Comments

Zorro Lang Sept. 4, 2022, 1:18 p.m. UTC | #1
On Tue, Aug 02, 2022 at 09:22:24PM -0700, Darrick J. Wong wrote:
> Hi all,
> 
> There are several tests in fstests (generic/019, generic/388,
> generic/475, xfs/057, etc.) that test filesystem crash recovery by
> starting a loop that kicks off a filesystem exerciser, waits a few
> seconds, and offlines the filesystem somehow.  Some of them use the
> block layer's error injector, some use dm-error, and some use the
> shutdown ioctl.
> 
> The crash tests that employ error injection have the unfortunate trait
> of causing occasional livelocks when tested against XFS because XFS
> allows administrators to configure the filesystem to retry some failed
> writes indefinitely.  If the offlining races with a full log trying to
> update the filesystem, the fs will hang forever.  Fix this by allowing
> XFS to go offline immediately.
> 
> While we're at it, fix the dmesg scrapers so they don't trip over XFS
> reporting these IO errors as internal errors.
> 
> v2: add hch reviews
> 
> If you're going to start using this mess, you probably ought to just
> pull from my git trees, which are linked below.
> 
> This is an extraordinary way to destroy everything.  Enjoy!
> Comments and questions are, as always, welcome.
> 
> --D

I'd like to merge this patchset. After a long period of testing, it doesn't
appear to introduce any regressions. It has been stuck for a long time, so
let's keep moving :)

Reviewed-by: Zorro Lang <zlang@redhat.com>

> 
> fstests git tree:
> https://git.kernel.org/cgit/linux/kernel/git/djwong/xfstests-dev.git/log/?h=fix-shutdown-test-hangs
> ---
>  check                    |    1 +
>  common/dmerror           |    4 ++++
>  common/fail_make_request |    1 +
>  common/rc                |   50 +++++++++++++++++++++++++++++++++++++++++-----
>  common/xfs               |   38 ++++++++++++++++++++++++++++++++++-
>  tests/xfs/006.out        |    6 +++---
>  tests/xfs/264.out        |   12 ++++++-----
>  7 files changed, 97 insertions(+), 15 deletions(-)
>