diff mbox series

btrfs/028: kill lingering processes when test is interrupted

Message ID 7a257c21d2efe2706bac1f2fac8f7faff1d0423f.1732796051.git.fdmanana@suse.com (mailing list archive)
State New
Headers show
Series btrfs/028: kill lingering processes when test is interrupted | expand

Commit Message

Filipe Manana Nov. 28, 2024, 12:14 p.m. UTC
From: Filipe Manana <fdmanana@suse.com>

If we interrupt the test after it spawned the fsstress and balance
processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
kill them and they stay around for a long time, making it impossible to
unmount the scratch filesystem (failing with -EBUSY).

Fix this by adding a _cleanup function that kills the processes and
waits for them to exit.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
---
 tests/btrfs/028 | 13 +++++++++++++
 1 file changed, 13 insertions(+)

Comments

David Sterba Nov. 28, 2024, 1:14 p.m. UTC | #1
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we interrupt the test after it spawned the fsstress and balance
> processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
> kill them and they stay around for a long time, making it impossible to
> unmount the scratch filesystem (failing with -EBUSY).
> 
> Fix this by adding a _cleanup function that kills the processes and
> waits for them to exit.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>
Zorro Lang Nov. 28, 2024, 1:27 p.m. UTC | #2
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we interrupt the test after it spawned the fsstress and balance
> processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
> kill them and they stay around for a long time, making it impossible to
> unmount the scratch filesystem (failing with -EBUSY).
> 
> Fix this by adding a _cleanup function that kills the processes and
> waits for them to exit.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---

Good to me, I think I can help to unset balance_pid and fsstress_pid
if these processes have been killed before doing _cleanup.

Reviewed-by: Zorro Lang <zlang@redhat.com>

>  tests/btrfs/028 | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/tests/btrfs/028 b/tests/btrfs/028
> index f64fc831..4ef83423 100755
> --- a/tests/btrfs/028
> +++ b/tests/btrfs/028
> @@ -13,6 +13,19 @@
>  . ./common/preamble
>  _begin_fstest auto qgroup balance
>  
> +_cleanup()
> +{
> +	cd /
> +	rm -r -f $tmp.*
> +	if [ ! -z "$balance_pid" ]; then
> +		_btrfs_kill_stress_balance_pid $balance_pid
> +	fi
> +	if [ ! -z "$fsstress_pid" ]; then
> +		kill $fsstress_pid &> /dev/null
> +		wait $fsstress_pid &> /dev/null
> +	fi
> +}
> +
>  . ./common/filter
>  
>  _require_scratch
> -- 
> 2.45.2
> 
>
Dave Chinner Nov. 28, 2024, 8:30 p.m. UTC | #3
On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote:
> From: Filipe Manana <fdmanana@suse.com>
> 
> If we interrupt the test after it spawned the fsstress and balance
> processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
> kill them and they stay around for a long time, making it impossible to
> unmount the scratch filesystem (failing with -EBUSY).
> 
> Fix this by adding a _cleanup function that kills the processes and
> waits for them to exit.
> 
> Signed-off-by: Filipe Manana <fdmanana@suse.com>
> ---
>  tests/btrfs/028 | 13 +++++++++++++
>  1 file changed, 13 insertions(+)

IIRC there are several btrfs tests that run fsstress that need
similar fixes. I fixed this problem for fsstress across the entire
fstests code base in this patch I recently sent:

https://lore.kernel.org/fstests/20241127045403.3665299-3-david@fromorbit.com/

Avoiding needing to repeated rebase that work by having people
randomly fixing one-off instances like this is why I'd like those
large "fix everything" patches reviewed and merged sooner rather
than later...

Regardless,

> diff --git a/tests/btrfs/028 b/tests/btrfs/028
> index f64fc831..4ef83423 100755
> --- a/tests/btrfs/028
> +++ b/tests/btrfs/028
> @@ -13,6 +13,19 @@
>  . ./common/preamble
>  _begin_fstest auto qgroup balance
>  
> +_cleanup()
> +{
> +	cd /
> +	rm -r -f $tmp.*
> +	if [ ! -z "$balance_pid" ]; then

"! -z <var>" is the same as "-n <var>"

> +		_btrfs_kill_stress_balance_pid $balance_pid
> +	fi
> +	if [ ! -z "$fsstress_pid" ]; then
> +		kill $fsstress_pid &> /dev/null
> +		wait $fsstress_pid &> /dev/null
> +	fi
> +}

This really wants the balance_pid and fsstress_pid variables to be
unset after they are killed in normal execution so we don't try to
kill and wait for them a second time on exit.

-Dave.
Zorro Lang Nov. 29, 2024, 3:18 a.m. UTC | #4
On Fri, Nov 29, 2024 at 07:30:31AM +1100, Dave Chinner wrote:
> On Thu, Nov 28, 2024 at 12:14:56PM +0000, fdmanana@kernel.org wrote:
> > From: Filipe Manana <fdmanana@suse.com>
> > 
> > If we interrupt the test after it spawned the fsstress and balance
> > processes (while it's sleeping for 30 seconds * $TIME_FACTOR), we don't
> > kill them and they stay around for a long time, making it impossible to
> > unmount the scratch filesystem (failing with -EBUSY).
> > 
> > Fix this by adding a _cleanup function that kills the processes and
> > waits for them to exit.
> > 
> > Signed-off-by: Filipe Manana <fdmanana@suse.com>
> > ---
> >  tests/btrfs/028 | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> 
> IIRC there are several btrfs tests that run fsstress that need
> similar fixes. I fixed this problem for fsstress across the entire
> fstests code base in this patch I recently sent:
> 
> https://lore.kernel.org/fstests/20241127045403.3665299-3-david@fromorbit.com/
> 
> Avoiding needing to repeated rebase that work by having people
> randomly fixing one-off instances like this is why I'd like those
> large "fix everything" patches reviewed and merged sooner rather
> than later...

Sure Dave, I'll make an individual fstests release just for your big patchset next
week. This week I already prepared to have other patches. Your patchset will
be reviewed, tested (it changes many things) and merged soon, if there's not big
issue from it. Hope that makes sense to you :)

Thanks,
Zorro

> 
> Regardless,
> 
> > diff --git a/tests/btrfs/028 b/tests/btrfs/028
> > index f64fc831..4ef83423 100755
> > --- a/tests/btrfs/028
> > +++ b/tests/btrfs/028
> > @@ -13,6 +13,19 @@
> >  . ./common/preamble
> >  _begin_fstest auto qgroup balance
> >  
> > +_cleanup()
> > +{
> > +	cd /
> > +	rm -r -f $tmp.*
> > +	if [ ! -z "$balance_pid" ]; then
> 
> "! -z <var>" is the same as "-n <var>"
> 
> > +		_btrfs_kill_stress_balance_pid $balance_pid
> > +	fi
> > +	if [ ! -z "$fsstress_pid" ]; then
> > +		kill $fsstress_pid &> /dev/null
> > +		wait $fsstress_pid &> /dev/null
> > +	fi
> > +}
> 
> This really wants the balance_pid and fsstress_pid variables to be
> unset after they are killed in normal execution so we don't try to
> kill and wait for them a second time on exit.
> 
> -Dave.
> 
> -- 
> Dave Chinner
> david@fromorbit.com
>
diff mbox series

Patch

diff --git a/tests/btrfs/028 b/tests/btrfs/028
index f64fc831..4ef83423 100755
--- a/tests/btrfs/028
+++ b/tests/btrfs/028
@@ -13,6 +13,19 @@ 
 . ./common/preamble
 _begin_fstest auto qgroup balance
 
+_cleanup()
+{
+	cd /
+	rm -r -f $tmp.*
+	if [ ! -z "$balance_pid" ]; then
+		_btrfs_kill_stress_balance_pid $balance_pid
+	fi
+	if [ ! -z "$fsstress_pid" ]; then
+		kill $fsstress_pid &> /dev/null
+		wait $fsstress_pid &> /dev/null
+	fi
+}
+
 . ./common/filter
 
 _require_scratch