| Message ID | 173706974228.1927324.17714311358227511791.stgit@frogsfrogsfrogs (mailing list archive) |
|---|---|
| State | New |
| Series | [01/23] generic/476: fix fsstress process management |
On Thu, Jan 16, 2025 at 03:27:46PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
>
> Commit 000813899afb46 hardcoded a log size of 256MB into xfs/501,
> xfs/502, and generic/530. This seems to be an attempt to reduce test
> run times by increasing the log size so that more background threads
> can run in parallel. Unfortunately, this breaks a couple of my test
> configurations:
>
> - External logs smaller than 256MB
> - Internal logs where the AG size is less than 256MB
....
> diff --git a/common/rc b/common/rc
> index 9e34c301b0deb0..885669beeb5e26 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -689,6 +689,33 @@ _test_cycle_mount()
>  	_test_mount
>  }
>
> +# Are there mkfs options to try to improve concurrency?
> +_scratch_mkfs_concurrency_options()
> +{
> +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"

The caller does not need to pass a number of CPUs. This function can
simply do:

	local nr_cpus=$(getconf _NPROCESSORS_CONF)

and that will set concurrency to be "optimal" for the number of CPUs
in the machine the test is going to run on. That way tests don't need
to hard code some number that is going to be too large for small
systems and too small for large systems...

-Dave.
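Dave's suggestion can be sketched as a standalone snippet, runnable outside fstests: derive the thread count from the machine's CPU count instead of a caller-supplied constant. LOAD_FACTOR defaults to 1 here, matching the fstests default; the variable names mirror the patch, but this is only an illustration, not the final helper.

```shell
# Sketch: compute a per-machine concurrency figure. LOAD_FACTOR is the
# usual fstests multiplier; it defaults to 1 when unset.
LOAD_FACTOR=${LOAD_FACTOR:-1}

# _NPROCESSORS_CONF counts configured CPUs, including offline ones.
nr_cpus=$(getconf _NPROCESSORS_CONF)
nr_cpus=$(( nr_cpus * LOAD_FACTOR ))

echo "nr_cpus=$nr_cpus"
```

On a given machine this yields the same value regardless of which test calls it, which is the point of Dave's objection to the hardcoded `32`.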
On Tue, Jan 21, 2025 at 02:58:25PM +1100, Dave Chinner wrote:
> > +# Are there mkfs options to try to improve concurrency?
> > +_scratch_mkfs_concurrency_options()
> > +{
> > +	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
>
> caller does not need to pass a number of CPUs. This function can
> simply do:
>
> 	local nr_cpus=$(getconf _NPROCESSORS_CONF)
>
> And that will set concurrency to be "optimal" for the number of CPUs
> in the machine the test is going to run on. That way tests don't
> need to hard code some number that is going to be too large for
> small systems and too small for large systems...

Hmm, but is this the right thing to do if you are using check-parallel?
If you are running multiple tests that are all running some kind of
load or stress-testing antagonist at the same time, then having 3x to
5x the necessary number of antagonist threads is going to unnecessarily
slow down the test run, which goes against the original goal of what we
were hoping to achieve with check-parallel.

How many tests are you currently able to run in parallel today, and
what's the ultimate goal?

We could have some kind of antagonist load which is shared across
multiple tests, but it's not clear to me that it's worth the
complexity. (And note that it's not just fs and CPU load antagonists;
there could also be memory stress antagonists, where having multiple
antagonists could lead to OOM kills...)

- Ted
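One hypothetical way to reconcile the two positions would be to scale the per-test figure down by the number of tests running concurrently. `CHECK_PARALLEL_NPROC` below is an assumed knob, not an existing fstests variable; this is a sketch of the idea, not anyone's proposed implementation.

```shell
# Sketch: divide the machine-wide CPU count by the number of tests
# check-parallel is running at once, so the total antagonist thread
# count stays roughly proportional to the CPU count.
# CHECK_PARALLEL_NPROC is a hypothetical variable; it defaults to 1
# (plain serial check) when unset.
nr_cpus=$(getconf _NPROCESSORS_CONF)
parallel=${CHECK_PARALLEL_NPROC:-1}

nr_cpus=$(( nr_cpus / parallel ))
if [ "$nr_cpus" -lt 1 ]; then
	nr_cpus=1	# never drop below one thread per test
fi

echo "per-test concurrency: $nr_cpus"
```

A shared antagonist, as Ted notes, would avoid the division entirely, at the cost of coordinating load generation across otherwise independent tests.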
diff --git a/common/rc b/common/rc
index 9e34c301b0deb0..885669beeb5e26 100644
--- a/common/rc
+++ b/common/rc
@@ -689,6 +689,33 @@ _test_cycle_mount()
 	_test_mount
 }
 
+# Are there mkfs options to try to improve concurrency?
+_scratch_mkfs_concurrency_options()
+{
+	local nr_cpus="$(( $1 * LOAD_FACTOR ))"
+
+	case "$FSTYP" in
+	xfs)
+		# If any concurrency options are already specified, don't
+		# compute our own conflicting ones.
+		echo "$SCRATCH_OPTIONS $MKFS_OPTIONS" | \
+			grep -q 'concurrency=' &&
+			return
+
+		local sections=(d r)
+
+		# -l concurrency does not work with external logs
+		_has_logdev || sections+=(l)
+
+		for section in "${sections[@]}"; do
+			$MKFS_XFS_PROG -$section concurrency=$nr_cpus 2>&1 | \
+				grep -q "unknown option -$section" ||
+				echo "-$section concurrency=$nr_cpus "
+		done
+		;;
+	esac
+}
+
 _scratch_mkfs_options()
 {
 	_scratch_options mkfs
diff --git a/tests/generic/530 b/tests/generic/530
index f2513156a920e8..7413840476b588 100755
--- a/tests/generic/530
+++ b/tests/generic/530
@@ -25,11 +25,7 @@ _require_test_program "t_open_tmpfiles"
 # For XFS, pushing 50000 unlinked inode inactivations through a small xfs log
 # can result in bottlenecks on the log grant heads, so try to make the log
 # larger to reduce runtime.
-if [ "$FSTYP" = "xfs" ] && ! _has_logdev; then
-	_scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
-else
-	_scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Set ULIMIT_NOFILE to min(file-max / 2, 50000 files per LOAD_FACTOR)
diff --git a/tests/generic/531 b/tests/generic/531
index ed6c3f91153ecc..3ba2790c923464 100755
--- a/tests/generic/531
+++ b/tests/generic/531
@@ -23,11 +23,7 @@ _require_test_program "t_open_tmpfiles"
 # On high CPU count machines, this runs a -lot- of create and unlink
 # concurrency. Set the filesystem up to handle this.
-if [ $FSTYP = "xfs" ]; then
-	_scratch_mkfs "-d agcount=32" >> $seqres.full 2>&1
-else
-	_scratch_mkfs >> $seqres.full 2>&1
-fi
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Try to load up all the CPUs, two threads per CPU.
diff --git a/tests/xfs/501 b/tests/xfs/501
index 678c51b52948c5..4b29ef97d36c1a 100755
--- a/tests/xfs/501
+++ b/tests/xfs/501
@@ -33,7 +33,7 @@ _require_xfs_sysfs debug/log_recovery_delay
 _require_scratch
 _require_test_program "t_open_tmpfiles"
 
-_scratch_mkfs "-l size=256m" >> $seqres.full 2>&1
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) >> $seqres.full 2>&1
 _scratch_mount
 
 # Set ULIMIT_NOFILE to min(file-max / 2, 30000 files per LOAD_FACTOR)
diff --git a/tests/xfs/502 b/tests/xfs/502
index 10b0017f6b2eb2..df3e7bcb17872d 100755
--- a/tests/xfs/502
+++ b/tests/xfs/502
@@ -23,7 +23,7 @@ _require_xfs_io_error_injection "iunlink_fallback"
 _require_scratch
 _require_test_program "t_open_tmpfiles"
 
-_scratch_mkfs "-l size=256m" | _filter_mkfs 2> $tmp.mkfs > /dev/null
+_scratch_mkfs $(_scratch_mkfs_concurrency_options 32) | _filter_mkfs 2> $tmp.mkfs > /dev/null
 cat $tmp.mkfs >> $seqres.full
 . $tmp.mkfs
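The core trick in the helper above is probe-and-gate: run mkfs with the candidate option and only emit it when the tool does not complain about an unknown option, so the test keeps working on older mkfs.xfs binaries. The pattern can be illustrated standalone; `fake_mkfs` below is a stub standing in for `$MKFS_XFS_PROG` so the sketch runs anywhere, and in this stub only the `-d` section accepts `concurrency=`.

```shell
# Stub for $MKFS_XFS_PROG: pretend only "-d concurrency=" is supported,
# and reject the option for the -r and -l sections the way a real mkfs
# rejects unknown suboptions.
fake_mkfs() {
	if [ "$1" = "d" ]; then
		echo "meta-data=/dev/null ..."	# option parsed OK
	else
		echo "unknown option -$1"	# option rejected
	fi
}

# Probe each section; keep an option only if the probe did not fail.
opts=""
for section in d r l; do
	if ! fake_mkfs "$section" "concurrency=8" 2>&1 | \
		grep -q "unknown option -$section"; then
		opts="$opts-$section concurrency=8 "
	fi
done

echo "$opts"
```

With this stub the loop emits only `-d concurrency=8 `, mirroring how the real helper degrades gracefully when a section does not support the suboption.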