
xfs/076 takes a long long time testing with a realtime volume

Message ID YYXhNip3PctJAaDY@mit.edu (mailing list archive)
State New, archived

Commit Message

Theodore Ts'o Nov. 6, 2021, 1:58 a.m. UTC
After committing some exclusions into my test runner framework (see
below), I tested a potential fix to xfs/076 which disables the
real-time volume when creating the scratch volume.  Should I send it
as a formal patch to fstests?

Comments

Darrick J. Wong Nov. 6, 2021, 2:08 a.m. UTC | #1
On Fri, Nov 05, 2021 at 09:58:14PM -0400, Theodore Ts'o wrote:
> After committing some exclusions into my test runner framework (see
> below), I tested a potential fix to xfs/076 which disables the
> real-time volume when creating the scratch volume.  Should I send it
> as a formal patch to fstests?

Does adding:

_xfs_force_bdev data $SCRATCH_MNT

right after _scratch_mount make the performance problem go away?  Sparse
inodes and realtime are a supported configuration.

--D

> diff --git a/tests/xfs/076 b/tests/xfs/076
> index eac7410e..5628c08f 100755
> --- a/tests/xfs/076
> +++ b/tests/xfs/076
> @@ -60,6 +60,7 @@ _require_xfs_io_command "falloc"
>  _require_xfs_io_command "fpunch"
>  _require_xfs_sparse_inodes
>  
> +unset SCRATCH_RTDEV
>  _scratch_mkfs "-d size=50m -m crc=1 -i sparse" |
>  	_filter_mkfs > /dev/null 2> $tmp.mkfs
>  . $tmp.mkfs	# for isize
> 
> 						- Ted
> 
> For why this is needed, see the commit description below:
> 
> commit c41ae1cc0b21eafd2858541c0bc195f951c0726c
> Author: Theodore Ts'o <tytso@mit.edu>
> Date:   Fri Nov 5 20:46:19 2021 -0400
> 
>     test-appliance: exclude xfs/076 from the realtime configs
>     
>     The xfs/076 test takes two minutes on a normal xfs file system (e.g.,
>     a normal 4k block size file system).  However, when there is a
>     real-time volume attached, this test takes over 80 minutes.  The
>     reason seems to be that the test is spending a lot more time
>     failing to create files due to missing directories.  Compare:
>     
>     root@xfstests-2:~# ls -sh /results/xfs/results-4k/xfs/076.full
>     48K /results/xfs/results-4k/xfs/076.full
>     root@xfstests-2:~# ls -sh /tmp/realtime-076.full
>     25M /tmp/realtime-076.full
>     
>     and:
>     
>     root@xfstests-2:~# grep "cannot touch" /results/xfs/results-4k/xfs/076.full | wc -l
>     656
>     root@xfstests-2:~# grep "cannot touch" /tmp/realtime-076.full | wc -l
>     327664
>     
>     The failures from 076.full look like this:
>     
>     touch: cannot touch '/xt-vdc/offset.21473722368/25659': No space left on device
>     touch: cannot touch '/xt-vdc/offset.21473656832/0': No such file or directory
>     touch: cannot touch '/xt-vdc/offset.21473591296/0': No such file or directory
>     ...
>     touch: cannot touch '/xt-vdc/offset.196608/0': No such file or directory
>     touch: cannot touch '/xt-vdc/offset.131072/0': No such file or directory
>     touch: cannot touch '/xt-vdc/offset.65536/0': No such file or directory
>     touch: cannot touch '/xt-vdc/offset.0/0': No such file or directory
>     
>     What seems to be going on is that xfs/076 tries to create a small
>     scratch file system --- but when we attach a real-time volume this
>     balloons the available size of the file system.  Of course, that space
>     can't be used for normal files.  As a result, xfs/076 is incorrectly
>     estimating how many files it needs to create to fill the file system.
>     
>     I'm not sure what's the best way to fix this in the test; perhaps the
>     test should forcibly unset the SCRATCH_RTDEV environment variable
>     before running _scratch_mkfs?  Anyway, for now, we'll just skip
>     running xfs/076 for the xfs/realtime* configs.
>     
>     Signed-off-by: Theodore Ts'o <tytso@mit.edu>
> 
> diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
> index a9acba9c..bafce552 100644
> --- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
> +++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
> @@ -1,2 +1,7 @@
>  # Normal configurations don't support dax
>  -g dax
> +
> +# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
> +# PD/SSD) when run with an external realtime device, which triggers
> +# the ltm "test is stalled" failsafe which aborts the VM.
> +xfs/076
> diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
> index a9acba9c..bafce552 100644
> --- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
> +++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
> @@ -1,2 +1,7 @@
>  # Normal configurations don't support dax
>  -g dax
> +
> +# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
> +# PD/SSD) when run with an external realtime device, which triggers
> +# the ltm "test is stalled" failsafe which aborts the VM.
> +xfs/076
> diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
> index a9acba9c..bafce552 100644
> --- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
> +++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
> @@ -1,2 +1,7 @@
>  # Normal configurations don't support dax
>  -g dax
> +
> +# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
> +# PD/SSD) when run with an external realtime device, which triggers
> +# the ltm "test is stalled" failsafe which aborts the VM.
> +xfs/076
Theodore Ts'o Nov. 6, 2021, 4:43 p.m. UTC | #2
On Fri, Nov 05, 2021 at 07:08:04PM -0700, Darrick J. Wong wrote:
> On Fri, Nov 05, 2021 at 09:58:14PM -0400, Theodore Ts'o wrote:
> > After committing some exclusions into my test runner framework (see
> > below), I tested a potential fix to xfs/076 which disables the
> > real-time volume when creating the scratch volume.  Should I send it
> > as a formal patch to fstests?
> 
> Does adding:
> 
> _xfs_force_bdev data $SCRATCH_MNT
> 
> right after _scratch_mount make the performance problem go away?  Sparse
> inodes and realtime are a supported configuration.

The test fails with an "fpunch failed" in 076.out.bad, and nothing
enlightening in 076.full.  But it does complete in roughly two
minutes.

						- Ted

Patch

diff --git a/tests/xfs/076 b/tests/xfs/076
index eac7410e..5628c08f 100755
--- a/tests/xfs/076
+++ b/tests/xfs/076
@@ -60,6 +60,7 @@  _require_xfs_io_command "falloc"
 _require_xfs_io_command "fpunch"
 _require_xfs_sparse_inodes
 
+unset SCRATCH_RTDEV
 _scratch_mkfs "-d size=50m -m crc=1 -i sparse" |
 	_filter_mkfs > /dev/null 2> $tmp.mkfs
 . $tmp.mkfs	# for isize

						- Ted

For why this is needed, see the commit description below:

commit c41ae1cc0b21eafd2858541c0bc195f951c0726c
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Fri Nov 5 20:46:19 2021 -0400

    test-appliance: exclude xfs/076 from the realtime configs
    
    The xfs/076 test takes two minutes on a normal xfs file system (e.g.,
    a normal 4k block size file system).  However, when there is a
    real-time volume attached, this test takes over 80 minutes.  The
    reason seems to be that the test is spending a lot more time
    failing to create files due to missing directories.  Compare:
    
    root@xfstests-2:~# ls -sh /results/xfs/results-4k/xfs/076.full
    48K /results/xfs/results-4k/xfs/076.full
    root@xfstests-2:~# ls -sh /tmp/realtime-076.full
    25M /tmp/realtime-076.full
    
    and:
    
    root@xfstests-2:~# grep "cannot touch" /results/xfs/results-4k/xfs/076.full | wc -l
    656
    root@xfstests-2:~# grep "cannot touch" /tmp/realtime-076.full | wc -l
    327664
    
    The failures from 076.full look like this:
    
    touch: cannot touch '/xt-vdc/offset.21473722368/25659': No space left on device
    touch: cannot touch '/xt-vdc/offset.21473656832/0': No such file or directory
    touch: cannot touch '/xt-vdc/offset.21473591296/0': No such file or directory
    ...
    touch: cannot touch '/xt-vdc/offset.196608/0': No such file or directory
    touch: cannot touch '/xt-vdc/offset.131072/0': No such file or directory
    touch: cannot touch '/xt-vdc/offset.65536/0': No such file or directory
    touch: cannot touch '/xt-vdc/offset.0/0': No such file or directory
    
    What seems to be going on is that xfs/076 tries to create a small
    scratch file system --- but when we attach a real-time volume this
    balloons the available size of the file system.  Of course, that space
    can't be used for normal files.  As a result, xfs/076 is incorrectly
    estimating how many files it needs to create to fill the file system.
    
    I'm not sure what's the best way to fix this in the test; perhaps the
    test should forcibly unset the SCRATCH_RTDEV environment variable
    before running _scratch_mkfs?  Anyway, for now, we'll just skip
    running xfs/076 for the xfs/realtime* configs.
    
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>

diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
index a9acba9c..bafce552 100644
--- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
+++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime.exclude
@@ -1,2 +1,7 @@ 
 # Normal configurations don't support dax
 -g dax
+
+# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
+# PD/SSD) when run with an external realtime device, which triggers
+# the ltm "test is stalled" failsafe which aborts the VM.
+xfs/076
diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
index a9acba9c..bafce552 100644
--- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
+++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_28k_logdev.exclude
@@ -1,2 +1,7 @@ 
 # Normal configurations don't support dax
 -g dax
+
+# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
+# PD/SSD) when run with an external realtime device, which triggers
+# the ltm "test is stalled" failsafe which aborts the VM.
+xfs/076
diff --git a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
index a9acba9c..bafce552 100644
--- a/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
+++ b/kvm-xfstests/test-appliance/files/root/fs/xfs/cfg/realtime_logdev.exclude
@@ -1,2 +1,7 @@ 
 # Normal configurations don't support dax
 -g dax
+
+# The xfs/076 test takes well over an hour (80 minutes using 100GB GCE
+# PD/SSD) when run with an external realtime device, which triggers
+# the ltm "test is stalled" failsafe which aborts the VM.
+xfs/076
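
The log comparison in the commit description (counting "cannot touch" lines in each 076.full) can be taken one step further by breaking the failures down by error reason, which separates the expected ENOSPC hits from the "No such file or directory" noise. The following is only a minimal sketch: summarize_touch_failures is a hypothetical helper name, not part of fstests or xfstests-bld, and it assumes the log lines have the "touch: cannot touch '<path>': <reason>" shape shown in the commit description.

```shell
#!/bin/sh
# Hypothetical helper (not part of fstests): summarize the
# "touch: cannot touch" failures in a .full log by error reason.
# Assumes lines shaped like:
#   touch: cannot touch '<path>': <reason>
summarize_touch_failures() {
	# $NF is the text after the last ": ", i.e. the error reason.
	grep "cannot touch" "$1" |
		awk -F': ' '{ count[$NF]++ }
			END { for (r in count) printf "%d %s\n", count[r], r }' |
		sort -rn
}

# Example invocation (path taken from the commit description):
#   summarize_touch_failures /tmp/realtime-076.full
```

Running it against both the 4k and the realtime 076.full files should show whether the extra failures in the realtime run are almost entirely of the "No such file or directory" kind, as the excerpt in the commit description suggests.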