[10/23] mkfs: don't hardcode log size

Message ID 173706974228.1927324.17714311358227511791.stgit@frogsfrogsfrogs (mailing list archive)
State New
Series [01/23] generic/476: fix fsstress process management

Commit Message

Darrick J. Wong Jan. 16, 2025, 11:27 p.m. UTC
From: Darrick J. Wong <djwong@kernel.org>

Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
xfs/502, and generic/530.  This seems to be an attempt to reduce test
run times by increasing the log size so that more background threads can
run in parallel.  Unfortunately, this breaks a couple of my test
configurations:

 - External logs smaller than 256MB
 - Internal logs where the AG size is less than 256MB

For example, here's seqres.full from a failed xfs/501 invocation:

** mkfs failed with extra mkfs options added to " -m metadir=2,autofsck=1,uquota,gquota,pquota, -d rtinherit=1," by test 501 **
** attempting to mkfs using only test 501 options: -l size=256m **
size 256m specified for log subvolume is too large, maximum is 32768 blocks
<snip>
mount -ortdev=/dev/sdb4 -ologdev=/dev/sdb2 /dev/sda4 /opt failed
umount: /dev/sda4: not mounted.

Note that the initial format attempt fails, so we jettison the entire
rt configuration to force the log size option, but then the mount fails
because we didn't edit out the rtdev option there too.

Fortunately, mkfs.xfs already /has/ a few options to try to improve
parallelism in the filesystem by avoiding contention on the log grant
heads by scaling up the log size.  These options are aware of log and AG
size constraints so they won't conflict with other geometry options.

Use them.
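
For reference, here is what that looks like (a sketch only; the
concurrency= options exist in sufficiently new xfsprogs, and mkfs.xfs
clamps the requested scaling to whatever the log and AG geometry
allow):

	# scale the AG count and log size for ~16 concurrent writers
	mkfs.xfs -d concurrency=16 -l concurrency=16 /dev/sdc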

Cc: <fstests@vger.kernel.org> # v2024.12.08
Fixes: 000813899afb46 ("fstests: scale some tests for high CPU count sanity")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
---
 common/rc         |   27 +++++++++++++++++++++++++++
 tests/generic/530 |    6 +-----
 tests/generic/531 |    6 +-----
 tests/xfs/501     |    2 +-
 tests/xfs/502     |    2 +-
 5 files changed, 31 insertions(+), 12 deletions(-)

Comments

Dave Chinner Jan. 21, 2025, 3:58 a.m. UTC | #1
On Thu, Jan 16, 2025 at 03:27:46PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
> xfs/502, and generic/530.  This seems to be an attempt to reduce test
> run times by increasing the log size so that more background threads can
> run in parallel.  Unfortunately, this breaks a couple of my test
> configurations:
> 
>  - External logs smaller than 256MB
>  - Internal logs where the AG size is less than 256MB
....

> diff --git a/common/rc b/common/rc
> index 9e34c301b0deb0..885669beeb5e26 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -689,6 +689,33 @@ _test_cycle_mount()
>      _test_mount
>  }
>  
> +# Are there mkfs options to try to improve concurrency?
> +_scratch_mkfs_concurrency_options()
> +{
> +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"

caller does not need to pass a number of CPUs. This function can
simply do:

	local nr_cpus=$(getconf _NPROCESSORS_CONF)

And that will set concurrency to be "optimal" for the number of CPUs
in the machine the test is going to run on. That way tests don't
need to hard code some number that is going to be too large for
small systems and too small for large systems...
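
Concretely (a sketch, not a tested patch), the helper and its call
sites would become:

	-	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
	+	local nr_cpus=$(getconf _NPROCESSORS_CONF)

	-_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
	+_scratch_mkfs $(_scratch_mkfs_concurrency_options) >> $seqres.full 2>&1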

-Dave.
Theodore Ts'o Jan. 21, 2025, 12:44 p.m. UTC | #2
On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > +# Are there mkfs options to try to improve concurrency?
> > +_scratch_mkfs_concurrency_options()
> > +{
> > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> 
> caller does not need to pass a number of CPUs. This function can
> simply do:
> 
> 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> 
> And that will set concurrency to be "optimal" for the number of CPUs
> in the machine the test is going to run on. That way tests don't
> need to hard code some number that is going to be too large for
> small systems and too small for large systems...

Hmm, but is this the right thing if you are using check-parallel?  If
you are running multiple tests that are all running some kind of load
or stress-testing antagonist at the same time, then having 3x to 5x
the number of necessary antagonist threads is going to unnecessarily
slow down the test run, which goes against the original goal of what
we were hoping to achieve with check-parallel.

How many tests are you currently able to run in parallel today, and
what's the ultimate goal?  We could have some kind of antagonist load
which is shared across multiple tests, but it's not clear to me that
it's worth the complexity.  (And note that it's not just fs and cpu
load antagonists; there could also be memory stress antagonists, where
having multiple antagonists could lead to OOM kills...)

							- Ted
Dave Chinner Jan. 21, 2025, 10:05 p.m. UTC | #3
On Tue, Jan 21, 2025 at 07:44:30AM -0500, Theodore Ts'o wrote:
> On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > > +# Are there mkfs options to try to improve concurrency?
> > > +_scratch_mkfs_concurrency_options()
> > > +{
> > > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> > 
> > caller does not need to pass a number of CPUs. This function can
> > simply do:
> > 
> > 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> > 
> > And that will set concurrency to be "optimal" for the number of CPUs
> > in the machine the test is going to run on. That way tests don't
> > need to hard code some number that is going to be too large for
> > small systems and too small for large systems...
> 
> Hmm, but is this the right thing if you are using check-parallel?

Yes. The whole point of check-parallel is to run the tests in such a
way as to max out the resources of the test machine for the entire
test run. Everything that can be run concurrently should be run
concurrently, and we should not be cutting down on the concurrency
just because we are running check-parallel.

> If
> you are running multiple tests that are all running some kind of load
> or stress-testing antagonist at the same time, then having 3x to 5x
> the number of necessary antagonist threads is going to unnecessarily
> slow down the test run, which goes against the original goal of what
> we were hoping to achieve with check-parallel.

There are tests that run a thousand concurrent fsstress processes -
check-parallel still runs all those thousand fsstress processes.

> How many tests are you currently able to run in parallel today,

All of them if I wanted. Default is to run one test per CPU at a
time, but also to allow tests that use concurrency to maximise it.

> and
> what's the ultimate goal?

My initial goal was to maximise the utilisation of the machine when
testing XFS. If I can't max out a 64p server with 1.5 million
IOPS/7GB/s IO and 64GB RAM with check-parallel, then check-parallel
is not running enough tests in parallel.

Right now with 64 runner threads (one per CPU), I'm seeing an
average utilisation across the whole auto group XFS test run of:

- 50 CPUs
- 2.5GB/s IO @ 30k IOPS
- 40GB RAM

The utilisation on ext4 is much lower and runtimes are much longer
for (as yet) unknown reasons. Concurrent fsstress loads, in
particular, appear to run much slower on ext4 than XFS...

> We could have some kind of antagonist load
> which is shared across multiple tests, but it's not clear to me that
> it's worth the complexity.

Yes, that's the plan further down the track - stuff like background
CPU hotplug (instead of a test that specifically runs hotplug with
fsstress that takes about 5 minutes to run), cache dropping to add
memory reclaim during tests, etc

> (And note that it's not just fs and cpu
> load antagonists; there could also be memory stress antagonists, where
> having multiple antagonists could lead to OOM kills...)

Yes, I eventually plan to use the src/usemem.c memory locker to
create changing levels of background memory stress to the test
runs...

Right now "perturbations" are exercised as a side effect of random
tests performing these actions. I want to make them controllable by
check-parallel so we can exercise the system functionality across
the entire range of correctness tests we have, not just an isolated
test case.

IOWs, the whole point of check-parallel is to make use of large
machines to stress the whole OS at the same time as we are testing
for filesystem behavioural correctness.

I also want to do it in as short a time period as possible - outside
of dedicated QE environments, I don't believe that long running
stress tests actually provide value for the machine time they
consume. i.e. returns rapidly diminish because stress tests
cover 99.99% of the code paths they are going to exercise in the
first few minutes of running.

Yes, letting them run for longer will -eventually- cover rarely
travelled code paths, but for developers, CI systems and
first/second level QE verification of bug fixes we don't need
extended stress tests.

Further, when we run fstests in the normal way, we never cover
things like memory reclaim racing against unmount, freeze, sync,
etc. And we never cover them when the system is under extremely
heavy load running multiple GB/s of IO whilst CPU hotplug is running
whilst the scheduler is running at nearly a million context
switches/s, etc.

That's exactly the sort of loads that check-parallel is generating
on a machine just running the correctness tests in parallel. It
combines correctness testing with a dynamic, stressful environment,
and it runs the tests -fast-. The coverage I get in a single 10
minute auto-group run of check-parallel is -much higher- than I get
in a single auto-group run of check that takes 4 hours on the same
hardware to complete....

-Dave.
Darrick J. Wong Jan. 22, 2025, 3:30 a.m. UTC | #4
On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> On Thu, Jan 16, 2025 at 03:27:46PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
> > xfs/502, and generic/530.  This seems to be an attempt to reduce test
> > run times by increasing the log size so that more background threads can
> > run in parallel.  Unfortunately, this breaks a couple of my test
> > configurations:
> > 
> >  - External logs smaller than 256MB
> >  - Internal logs where the AG size is less than 256MB
> ....
> 
> > diff --git a/common/rc b/common/rc
> > index 9e34c301b0deb0..885669beeb5e26 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -689,6 +689,33 @@ _test_cycle_mount()
> >      _test_mount
> >  }
> >  
> > +# Are there mkfs options to try to improve concurrency?
> > +_scratch_mkfs_concurrency_options()
> > +{
> > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> 
> caller does not need to pass a number of CPUs. This function can
> simply do:
> 
> 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> 
> And that will set concurrency to be "optimal" for the number of CPUs
> in the machine the test is going to run on. That way tests don't
> need to hard code some number that is going to be too large for
> small systems and too small for large systems...

Sounds good to me.

--D

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>
Darrick J. Wong Jan. 22, 2025, 3:36 a.m. UTC | #5
On Tue, Jan 21, 2025 at 07:44:30AM -0500, Theodore Ts'o wrote:
> On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > > +# Are there mkfs options to try to improve concurrency?
> > > +_scratch_mkfs_concurrency_options()
> > > +{
> > > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> > 
> > caller does not need to pass a number of CPUs. This function can
> > simply do:
> > 
> > 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> > 
> > And that will set concurrency to be "optimal" for the number of CPUs
> > in the machine the test is going to run on. That way tests don't
> > need to hard code some number that is going to be too large for
> > small systems and too small for large systems...
> 
> Hmm, but is this the right thing if you are using check-parallel?  If
> you are running multiple tests that are all running some kind of load
> or stress-testing antagonist at the same time, then having 3x to 5x
> the number of necessary antagonist threads is going to unnecessarily
> slow down the test run, which goes against the original goal of what
> we were hoping to achieve with check-parallel.

<shrug> Maybe a more appropriate thing to do is:

	local nr_cpus=$(grep Cpus_allowed /proc/self/status | hweight)

So a check-parallel user could (if they see such problems) constrain the
parallelism through cpu pinning.  I think getconf _NPROCESSORS_CONF is
probably fine for now.

(The other day I /did/ see some program in either util-linux or
coreutils that told you the number of "available" cpus based on checking
the affinity mask and whatever cgroups constraints are applied.  I can't
find it now, alas...)
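
For illustration only (none of this is in the patch): coreutils nproc
already honours the affinity mask, and the hweight could be open-coded
with bc:

	nr_cpus=$(nproc)

	# or count the set bits in Cpus_allowed by hand
	mask=$(sed -n 's/^Cpus_allowed:[[:space:]]*//p' /proc/self/status | tr -d ,)
	nr_cpus=$(echo "obase=2; ibase=16; ${mask^^}" | bc | tr -cd 1 | wc -c)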

> How many tests are you currently able to run in parallel today, and
> what's the ultimate goal?  We could have some kind of antagonist load
> which is shared across multiple tests, but it's not clear to me that
> it's worth the complexity.  (And note that it's not just fs and cpu
> load antagonists; there could also be memory stress antagonists, where
> having multiple antagonists could lead to OOM kills...)

On the other hand, perhaps having random antagonistic processes from
other ./check instances is exactly the kind of stress testing that we
want to shake out weirder bugs?  It's clear from Dave's RFC that the
generic/650 cpu hotplug shenanigans had some effect. ;)

--D

> 							- Ted
>
Darrick J. Wong Jan. 22, 2025, 3:40 a.m. UTC | #6
On Wed, Jan 22, 2025 at 09:05:18AM +1100, Dave Chinner wrote:
> On Tue, Jan 21, 2025 at 07:44:30AM -0500, Theodore Ts'o wrote:
> > On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > > > +# Are there mkfs options to try to improve concurrency?
> > > > +_scratch_mkfs_concurrency_options()
> > > > +{
> > > > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
> > > 
> > > caller does not need to pass a number of CPUs. This function can
> > > simply do:
> > > 
> > > 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
> > > 
> > > And that will set concurrency to be "optimal" for the number of CPUs
> > > in the machine the test is going to run on. That way tests don't
> > > need to hard code some number that is going to be too large for
> > > small systems and too small for large systems...
> > 
> > Hmm, but is this the right thing if you are using check-parallel?
> 
> Yes. The whole point of check-parallel is to run the tests in such a
> way as to max out the resources of the test machine for the entire
> test run. Everything that can be run concurrently should be run
> concurrently, and we should not be cutting down on the concurrency
> just because we are running check-parallel.
> 
> > If
> > you are running multiple tests that are all running some kind of load
> > or stress-testing antagonist at the same time, then having 3x to 5x
> > the number of necessary antagonist threads is going to unnecessarily
> > slow down the test run, which goes against the original goal of what
> > we were hoping to achieve with check-parallel.
> 
> There are tests that run a thousand concurrent fsstress processes -
> check-parallel still runs all those thousand fsstress processes.
> 
> > How many tests are you currently able to run in parallel today,
> 
> All of them if I wanted. Default is to run one test per CPU at a
> time, but also to allow tests that use concurrency to maximise it.
> 
> > and
> > what's the ultimate goal?
> 
> My initial goal was to maximise the utilisation of the machine when
> testing XFS. If I can't max out a 64p server with 1.5 million
> IOPS/7GB/s IO and 64GB RAM with check-parallel, then check-parallel
> is not running enough tests in parallel.
> 
> Right now with 64 runner threads (one per CPU), I'm seeing an
> average utilisation across the whole auto group XFS test run of:
> 
> - 50 CPUs
> - 2.5GB/s IO @ 30k IOPS
> - 40GB RAM
> 
> The utilisation on ext4 is much lower and runtimes are much longer
> for (as yet) unknown reasons. Concurrent fsstress loads, in
> particular, appear to run much slower on ext4 than XFS...
> 
> > We could have some kind of antagonist load
> > which is shared across multiple tests, but it's not clear to me that
> > it's worth the complexity.
> 
> Yes, that's the plan further down the track - stuff like background
> CPU hotplug (instead of a test that specifically runs hotplug with
> fsstress that takes about 5 minutes to run), cache dropping to add
> memory reclaim during tests, etc
> 
> > (And note that it's not just fs and cpu
> > load antagonists; there could also be memory stress antagonists, where
> > having multiple antagonists could lead to OOM kills...)
> 
> Yes, I eventually plan to use the src/usemem.c memory locker to
> create changing levels of background memory stress to the test
> runs...
> 
> Right now "perturbations" are exercised as a side effect of random
> tests performing these actions. I want to make them controllable by
> check-parallel so we can exercise the system functionality across
> the entire range of correctness tests we have, not just an isolated
> test case.
> 
> IOWs, the whole point of check-parallel is to make use of large
> machines to stress the whole OS at the same time as we are testing
> for filesystem behavioural correctness.
> 
> I also want to do it in as short a time period as possible - outside
> of dedicated QE environments, I don't believe that long running
> stress tests actually provide value for the machine time they
> consume. i.e. returns rapidly diminish because stress tests
> cover 99.99% of the code paths they are going to exercise in the
> first few minutes of running.
> 
> Yes, letting them run for longer will -eventually- cover rarely
> travelled code paths, but for developers, CI systems and
> first/second level QE verification of bug fixes we don't need
> extended stress tests.

Admittedly the long soak tests probably don't add much once the scratch
device has filled up and been cleaned out a few times.  Maybe that
sacrificial usemem would be useful sooner than later.

ATM the online repair vs fsstress soak test is still pretty useful for
probing just how bad things can get in terms of system stress and
stalling, but that's only because repairs are resource intensive. :)

--D

> Further, when we run fstests in the normal way, we never cover
> things like memory reclaim racing against unmount, freeze, sync,
> etc. And we never cover them when the system is under extremely
> heavy load running multiple GB/s of IO whilst CPU hotplug is running
> whilst the scheduler is running at nearly a million context
> switches/s, etc.
> 
> That's exactly the sort of loads that check-parallel is generating
> on a machine just running the correctness tests in parallel. It
> combines correctness testing with a dynamic, stressful environment,
> and it runs the tests -fast-. The coverage I get in a single 10
> minute auto-group run of check-parallel is -much higher- than I get
> in a single auto-group run of check that takes 4 hours on the same
> hardware to complete....
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
>

Patch

diff --git a/common/rc b/common/rc
index 9e34c301b0deb0..885669beeb5e26 100644
--- a/common/rc
+++ b/common/rc
@@ -689,6 +689,33 @@  _test_cycle_mount()
     _test_mount
 }
 
+# Are there mkfs options to try to improve concurrency?
+_scratch_mkfs_concurrency_options()
+{
+	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
+
+	case "$FSTYP" in
+	xfs)
+		# If any concurrency options are already specified, don't
+		# compute our own conflicting ones.
+		echo "$SCRATCH_OPTIONS $MKFS_OPTIONS" | \
+			grep -q 'concurrency=' &&
+			return
+
+		local sections=(d r)
+
+		# -l concurrency does not work with external logs
+	_has_logdev || sections+=(l)
+
+		for section in "${sections[@]}"; do
+			$MKFS_XFS_PROG -$section concurrency=$nr_cpus 2>&1 | \
+				grep -q "unknown option -$section" ||
+				echo "-$section concurrency=$nr_cpus "
+		done
+		;;
+	esac
+}
+
 _scratch_mkfs_options()
 {
     _scratch_options mkfs
diff --git a/tests/generic/530 b/tests/generic/530
index f2513156a920e8..7413840476b588 100755
--- a/tests/generic/530
+++ b/tests/generic/530
@@ -25,11 +25,7 @@  _require_test_program "t_open_tmpfiles"
 # For XFS, pushing 50000 unlinked inode inactivations through a small xfs log
 # can result in bottlenecks on the log grant heads, so try to make the log
 # larger to reduce runtime.
-if [ "$FSTYP" = "xfs" ] && ! _has_logdev; then
-    _scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
-else
-    _scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Set ULIMIT_NOFILE to min(file-max / 2, 50000 files per LOAD_FACTOR)
diff --git a/tests/generic/531 b/tests/generic/531
index ed6c3f91153ecc..3ba2790c923464 100755
--- a/tests/generic/531
+++ b/tests/generic/531
@@ -23,11 +23,7 @@  _require_test_program "t_open_tmpfiles"
 
 # On high CPU count machines, this runs a -lot- of create and unlink
 concurrency. Set the filesystem up to handle this.
-if [ $FSTYP = "xfs" ]; then
-	_scratch_mkfs "-d agcount=32" >> $seqres.full 2>&1
-else
-	_scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Try to load up all the CPUs, two threads per CPU.
diff --git a/tests/xfs/501 b/tests/xfs/501
index 678c51b52948c5..4b29ef97d36c1a 100755
--- a/tests/xfs/501
+++ b/tests/xfs/501
@@ -33,7 +33,7 @@  _require_xfs_sysfs debug/log_recovery_delay
 _require_scratch
 _require_test_program "t_open_tmpfiles"
 
-_scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Set ULIMIT_NOFILE to min(file-max / 2, 30000 files per LOAD_FACTOR)
diff --git a/tests/xfs/502 b/tests/xfs/502
index 10b0017f6b2eb2..df3e7bcb17872d 100755
--- a/tests/xfs/502
+++ b/tests/xfs/502
@@ -23,7 +23,7 @@  _require_xfs_io_error_injection "iunlink_fallback"
 _require_scratch
 _require_test_program "t_open_tmpfiles"
 
-_scratch_mkfs "-l size=256m" | _filter_mkfs 2> $tmp.mkfs > /dev/null
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) | _filter_mkfs 2> $tmp.mkfs > /dev/null
 cat $tmp.mkfs >> $seqres.full
 . $tmp.mkfs