[2/8] xfs/155: fail the test if xfs_repair hangs for too long

Message ID 170899915247.896550.12193016117687961302.stgit@frogsfrogsfrogs
State Superseded
Series [1/8] generic/604: try to make race occur reliably

Commit Message

Darrick J. Wong Feb. 27, 2024, 2:01 a.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

There are a few hard to reproduce bugs in xfs_repair where it can
deadlock trying to lock a buffer that it already owns.  These stalls
cause fstests never to finish, which is annoying!  To fix this, set up
the xfs_repair run to abort after 10 minutes, which will affect the
golden output and capture a core file.

This doesn't fix xfs_repair, obviously.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 tests/xfs/155 |    4 ++++
 1 file changed, 4 insertions(+)
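
As a rough illustration of the mechanism described in the commit message, the sketch below runs a stand-in for a hung xfs_repair under "timeout -s ABRT". The sleep command and the 1 second limit are assumptions chosen so the demo finishes quickly; the patch itself wraps $XFS_REPAIR_PROG with a 10m limit. With GNU coreutils, timeout exits with status 124 once the limit expires, and SIGABRT lets the kernel write a core file when core dumps are enabled.

```shell
#!/bin/sh
# Stand-in for a deadlocked xfs_repair: a sleep that outlasts the limit.
# The 1 second limit is illustrative only; the patch uses 10m.
status=0
(
	# Suppress the core file for this demo; the real test wants one.
	ulimit -c 0
	timeout -s ABRT 1 sleep 30
) || status=$?

# GNU timeout reports 124 when the limit expired and it had to signal
# the command.
echo "exit status: $status"
```

In the real test the core limit would be left alone, so the aborted xfs_repair can leave a core file behind for debugging the deadlock.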

Comments

Zorro Lang Feb. 27, 2024, 4:16 a.m. UTC | #1
On Mon, Feb 26, 2024 at 06:01:03PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> There are a few hard to reproduce bugs in xfs_repair where it can
> deadlock trying to lock a buffer that it already owns.  These stalls
> cause fstests never to finish, which is annoying!  To fix this, set up
> the xfs_repair run to abort after 10 minutes, which will affect the
> golden output and capture a core file.
> 
> This doesn't fix xfs_repair, obviously.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  tests/xfs/155 |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> 
> diff --git a/tests/xfs/155 b/tests/xfs/155
> index 302607b510..fba557bff6 100755
> --- a/tests/xfs/155
> +++ b/tests/xfs/155
> @@ -27,6 +27,10 @@ _require_scratch_xfs_crc		# needsrepair only exists for v5
>  _require_populate_commands
>  _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
>  
> +# Inject a 10 minute abortive timeout on the repair program so that deadlocks
> +# in the program do not cause fstests to hang indefinitely.
> +XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"

Other fstests cases always do:
  _require_command "$TIMEOUT_PROG" timeout
before using timeout.

The rest looks good to me; since you only change a single case, it won't affect other testing.
Just hope 10 minutes is enough, even on big storage :)

Thanks,
Zorro

> +
>  # Populate the filesystem
>  _scratch_populate_cached nofill >> $seqres.full 2>&1
>  
>
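
One possible reading of the review comment above, sketched outside of fstests: check for the timeout binary first, then wrap the repair program. The _require_command function here is a simplified, hypothetical stand-in for the fstests helper of the same name (the real one skips the test via _notrun), the TIMEOUT_PROG lookup imitates what the fstests configuration normally provides, and the default xfs_repair path is an assumption. Using $TIMEOUT_PROG in the wrapper instead of a bare "timeout" is also an assumption, not something the reviewer asked for.

```shell
#!/bin/sh
# Stand-ins so the sketch runs on its own. Inside a real test these come
# from the fstests environment; the fallback path below is an assumption.
TIMEOUT_PROG="$(command -v timeout)"
XFS_REPAIR_PROG="${XFS_REPAIR_PROG:-/usr/sbin/xfs_repair}"

# Simplified stand-in for the fstests helper: complain if the command is
# missing. The real helper would stop the test as "not run".
_require_command()
{
	if [ -z "$1" ] || [ ! -x "$1" ]; then
		echo "$2 utility required, skipped this test"
		return 1
	fi
}

# Reviewer's suggestion: require timeout before relying on it.
_require_command "$TIMEOUT_PROG" timeout

# Same wrapper as in the patch, but going through the checked binary.
XFS_REPAIR_PROG="$TIMEOUT_PROG -s ABRT 10m $XFS_REPAIR_PROG"
echo "$XFS_REPAIR_PROG"
```

With the check in place, a system without timeout would skip the test instead of failing when the wrapped repair command cannot be executed.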
Patch

diff --git a/tests/xfs/155 b/tests/xfs/155
index 302607b510..fba557bff6 100755
--- a/tests/xfs/155
+++ b/tests/xfs/155
@@ -27,6 +27,10 @@  _require_scratch_xfs_crc		# needsrepair only exists for v5
 _require_populate_commands
 _require_libxfs_debug_flag LIBXFS_DEBUG_WRITE_CRASH
 
+# Inject a 10 minute abortive timeout on the repair program so that deadlocks
+# in the program do not cause fstests to hang indefinitely.
+XFS_REPAIR_PROG="timeout -s ABRT 10m $XFS_REPAIR_PROG"
+
 # Populate the filesystem
 _scratch_populate_cached nofill >> $seqres.full 2>&1