Message ID | 7a257c21d2efe2706bac1f2fac8f7faff1d0423f.1732796051.git.fdmanana@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | btrfs/028: kill lingering processes when test is interrupted | expand |
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > If we interrupt the test after it spawned the fsstress and balance > processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't > kill them and they stay around for a long time, making it impossible to > unmount the scratch filesystem (failing with -EBUSY). > > Fix this by adding a _cleanup function that kills the processes and > waits for them to exit. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com>
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > If we interrupt the test after it spawned the fsstress and balance > processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't > kill them and they stay around for a long time, making it impossible to > unmount the scratch filesystem (failing with -EBUSY). > > Fix this by adding a _cleanup function that kills the processes and > waits for them to exit. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > --- Good to me, I think I can help to unset balance_pid and fsstress_pid if these processes have been killed before doing _cleanup. Reviewed-by: Zorro Lang <zlang@redhat.com> > tests/btrfs/028 | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/tests/btrfs/028 b/tests/btrfs/028 > index f64fc831..4ef83423 100755 > --- a/tests/btrfs/028 > +++ b/tests/btrfs/028 > @@ -13,6 +13,19 @@ > . ./common/preamble > _begin_fstest auto qgroup balance > > +_cleanup() > +{ > + cd / > + rm -r -f $tmp.* > + if [ ! -z "$balance_pid" ]; then > + _btrfs_kill_stress_balance_pid $balance_pid > + fi > + if [ ! -z "$fsstress_pid" ]; then > + kill $fsstress_pid &> /dev/null > + wait $fsstress_pid &> /dev/null > + fi > +} > + > . ./common/filter > > _require_scratch > -- > 2.45.2 > >
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote: > From: Filipe Manana <fdmanana@suse.com> > > If we interrupt the test after it spawned the fsstress and balance > processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't > kill them and they stay around for a long time, making it impossible to > unmount the scratch filesystem (failing with -EBUSY). > > Fix this by adding a _cleanup function that kills the processes and > waits for them to exit. > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > --- > tests/btrfs/028 | 13 +++++++++++++ > 1 file changed, 13 insertions(+) IIRC there are several btrfs tests that run fsstress that need similar fixes. I fixed this problem for fsstress across the entire fstests code base in this patch I recently sent: https://lore.kernel.org/fstests/20241127045403.3665299-3-david@fromorbit.com/ Avoiding needing to repeated rebase that work by having people randomly fixing one-off instances like this is why I'd like those large "fix everything" patches reviewed and merged sooner rather than later... Regardless, > diff --git a/tests/btrfs/028 b/tests/btrfs/028 > index f64fc831..4ef83423 100755 > --- a/tests/btrfs/028 > +++ b/tests/btrfs/028 > @@ -13,6 +13,19 @@ > . ./common/preamble > _begin_fstest auto qgroup balance > > +_cleanup() > +{ > + cd / > + rm -r -f $tmp.* > + if [ ! -z "$balance_pid" ]; then "! -z <var>" is the same as "-n <var>" > + _btrfs_kill_stress_balance_pid $balance_pid > + fi > + if [ ! -z "$fsstress_pid" ]; then > + kill $fsstress_pid &> /dev/null > + wait $fsstress_pid &> /dev/null > + fi > +} This really wants the balance_pid and fsstress_pid variables to be unset after they are killed in normal execution so we don't try to kill and wait for them a second time on exit. -Dave.
On Fri, Nov 29, 2024 at 07:30:31AM +1100, Dave Chinner wrote: > On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote: > > From: Filipe Manana <fdmanana@suse.com> > > > > If we interrupt the test after it spawned the fsstress and balance > > processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't > > kill them and they stay around for a long time, making it impossible to > > unmount the scratch filesystem (failing with -EBUSY). > > > > Fix this by adding a _cleanup function that kills the processes and > > waits for them to exit. > > > > Signed-off-by: Filipe Manana <fdmanana@suse.com> > > --- > > tests/btrfs/028 | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > IIRC there are several btrfs tests that run fsstress that need > similar fixes. I fixed this problem for fsstress across the entire > fstests code base in this patch I recently sent: > > https://lore.kernel.org/fstests/20241127045403.3665299-3-david@fromorbit.com/ > > Avoiding needing to repeated rebase that work by having people > randomly fixing one-off instances like this is why I'd like those > large "fix everything" patches reviewed and merged sooner rather > than later... Sure Dave, I'll make an individual fstests release just for your big patchset next week. This week I already prepared to have other patches. Your patchset will be reviewed, tested (it changes many things) and merged soon, if there's not big issue from it. Hope that makes sense to you :) Thanks, Zorro > > Regardless, > > > diff --git a/tests/btrfs/028 b/tests/btrfs/028 > > index f64fc831..4ef83423 100755 > > --- a/tests/btrfs/028 > > +++ b/tests/btrfs/028 > > @@ -13,6 +13,19 @@ > > . ./common/preamble > > _begin_fstest auto qgroup balance > > > > +_cleanup() > > +{ > > + cd / > > + rm -r -f $tmp.* > > + if [ ! -z "$balance_pid" ]; then > > "! -z <var>" is the same as "-n <var>" > > > + _btrfs_kill_stress_balance_pid $balance_pid > > + fi > > + if [ ! -z "$fsstress_pid" ]; then > > + kill $fsstress_pid &> /dev/null > > + wait $fsstress_pid &> /dev/null > > + fi > > +} > > This really wants the balance_pid and fsstress_pid variables to be > unset after they are killed in normal execution so we don't try to > kill and wait for them a second time on exit. > > -Dave. > > -- > Dave Chinner > david@fromorbit.com >
diff --git a/tests/btrfs/028 b/tests/btrfs/028 index f64fc831..4ef83423 100755 --- a/tests/btrfs/028 +++ b/tests/btrfs/028 @@ -13,6 +13,19 @@ . ./common/preamble _begin_fstest auto qgroup balance +_cleanup() +{ + cd / + rm -r -f $tmp.* + if [ ! -z "$balance_pid" ]; then + _btrfs_kill_stress_balance_pid $balance_pid + fi + if [ ! -z "$fsstress_pid" ]; then + kill $fsstress_pid &> /dev/null + wait $fsstress_pid &> /dev/null + fi +} + . ./common/filter _require_scratch