diff mbox series

[blktests,v2] common/cpuhotplug: allow _offline_cpu() call in sub-shell

Message ID 20220719025216.1395353-1-shinichiro.kawasaki@wdc.com (mailing list archive)
State New, archived
Headers show
Series [blktests,v2] common/cpuhotplug: allow _offline_cpu() call in sub-shell | expand

Commit Message

Shin'ichiro Kawasaki July 19, 2022, 2:52 a.m. UTC
The helper function _offline_cpu() sets a value to RESTORE_CPUS_ONLINE.
However, the commit bd6b882b2650 ("block/008: check CPU offline failure
due to many IRQs") put _offline_cpu() call in sub-shell, then the set
value to RESTORE_CPUS_ONLINE no longer affects function caller's
environment. This resulted in off-lined CPUs not restored by _cleanup()
when the test case block/008 calls only _offline_cpu() and does not call
_online_cpu().

To fix the issue, set RESTORE_CPUS_ONLINE in _have_cpu_hotplug() in
place of _offline_cpu(). _have_cpu_hotplug() is less likely to be called
in sub-shell. In same manner, do not set RESTORE_CPUS_ONLINE in
_online_cpu() either. Check that RESTORE_CPUS_ONLINE is set in
_offline_cpu() to avoid unexpected CPUs left off-lined. This check also
avoids a shellcheck warning.

Fixes: bd6b882b2650 ("block/008: check CPU offline failure due to many IRQs")
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
---
Changes from v1:
* Change fix approach: fix _offline_cpu() instaed of block/008 test script.

 common/cpuhotplug | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

Comments

Yi Zhang July 26, 2022, 8:33 a.m. UTC | #1
Hi Shin'ichiro

This change introduced one new issue that the offline operation will
be skipped when there are more than one test devs.

# cat config
TEST_DEVS=(/dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1)
# ./check block/008
block/008 => nvme0n1 (do IO while hotplugging CPUs)          [passed]
    read iops  336729   ...  343317
    runtime    12.791s  ...  12.498s
block/008 => nvme1n1 (do IO while hotplugging CPUs)          [failed]
    read iops          ...
    runtime    0.225s  ...  0.225s
    --- tests/block/008.out 2022-07-26 01:26:32.733912082 -0400
    +++ /mnt/tests/kernel/storage/SSD/nvme_blktest/blktests/results/nvme1n1/block/008.out.bad
2022-07-26 04:29:01.940374953 -0400
    @@ -1,2 +1,3 @@
     Running block/008
    +Failed to offline CPU: offlining cpu but RESTORE_CPUS_ONLINE is not set
     Test complete
block/008 => nvme2n1 (do IO while hotplugging CPUs)          [failed]
    read iops          ...
    runtime    0.231s  ...  0.231s
    --- tests/block/008.out 2022-07-26 01:26:32.733912082 -0400
    +++ /mnt/tests/kernel/storage/SSD/nvme_blktest/blktests/results/nvme2n1/block/008.out.bad
2022-07-26 04:29:02.210381850 -0400
    @@ -1,2 +1,3 @@
     Running block/008
    +Failed to offline CPU: offlining cpu but RESTORE_CPUS_ONLINE is not set
     Test complete

On Tue, Jul 19, 2022 at 11:04 AM Shin'ichiro Kawasaki
<shinichiro.kawasaki@wdc.com> wrote:
>
> The helper function _offline_cpu() sets a value to RESTORE_CPUS_ONLINE.
> However, the commit bd6b882b2650 ("block/008: check CPU offline failure
> due to many IRQs") put _offline_cpu() call in sub-shell, then the set
> value to RESTORE_CPUS_ONLINE no longer affects function caller's
> environment. This resulted in off-lined CPUs not restored by _cleanup()
> when the test case block/008 calls only _offline_cpu() and does not call
> _online_cpu().
>
> To fix the issue, set RESTORE_CPUS_ONLINE in _have_cpu_hotplug() in
> place of _offline_cpu(). _have_cpu_hotplug() is less likely to be called
> in sub-shell. In same manner, do not set RESTORE_CPUS_ONLINE in
> _online_cpu() either. Check that RESTORE_CPUS_ONLINE is set in
> _offline_cpu() to avoid unexpected CPUs left off-lined. This check also
> avoids a shellcheck warning.
>
> Fixes: bd6b882b2650 ("block/008: check CPU offline failure due to many IRQs")
> Reported-by: Yi Zhang <yi.zhang@redhat.com>
> Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
> ---
> Changes from v1:
> * Change fix approach: fix _offline_cpu() instaed of block/008 test script.
>
>  common/cpuhotplug | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/common/cpuhotplug b/common/cpuhotplug
> index 7facd0d..e91c3bc 100644
> --- a/common/cpuhotplug
> +++ b/common/cpuhotplug
> @@ -27,17 +27,20 @@ _have_cpu_hotplug() {
>                 SKIP_REASONS+=("CPU hotplugging is not supported")
>                 return 1
>         fi
> +
> +       RESTORE_CPUS_ONLINE=1
>         return 0
>  }
>
>  _online_cpu() {
> -       # shellcheck disable=SC2034
> -       RESTORE_CPUS_ONLINE=1
>         echo 1 > "/sys/devices/system/cpu/cpu$1/online"
>  }
>
>  _offline_cpu() {
> -       # shellcheck disable=SC2034
> -       RESTORE_CPUS_ONLINE=1
> +       if [[ -z ${RESTORE_CPUS_ONLINE} ]]; then
> +               echo "offlining cpu but RESTORE_CPUS_ONLINE is not set"
> +               return 1
> +       fi
> +
>         echo 0 > "/sys/devices/system/cpu/cpu$1/online"
>  }
> --
> 2.36.1
>
Shin'ichiro Kawasaki July 26, 2022, 10:45 a.m. UTC | #2
On Jul 26, 2022 / 16:33, Yi Zhang wrote:
> Hi Shin'ichiro
> 
> This change introduced one new issue that the offline operation will
> be skipped when there are more than one test devs.

Yi, thanks for finding this. Then this v2 fix is not good. Tomorrow, I'll do
some more confirmation and will revert v2 and apply v1 instead. v1 fix is a bit
dirty, but should work well.
Shin'ichiro Kawasaki July 27, 2022, 4:14 a.m. UTC | #3
On Jul 26, 2022 / 10:45, Shinichiro Kawasaki wrote:
> On Jul 26, 2022 / 16:33, Yi Zhang wrote:
> > Hi Shin'ichiro
> > 
> > This change introduced one new issue that the offline operation will
> > be skipped when there are more than one test devs.
> 
> Yi, thanks for finding this. Then this v2 fix is not good. Tomorrow, I'll do
> some more confirmation and will revert v2 and apply v1 instead. v1 fix is a bit
> dirty, but should work well.

Yi, I have reverted v2 and merged v1 to the blktests upstream. In case you still
see issues, please let me know.
diff mbox series

Patch

diff --git a/common/cpuhotplug b/common/cpuhotplug
index 7facd0d..e91c3bc 100644
--- a/common/cpuhotplug
+++ b/common/cpuhotplug
@@ -27,17 +27,20 @@  _have_cpu_hotplug() {
 		SKIP_REASONS+=("CPU hotplugging is not supported")
 		return 1
 	fi
+
+	RESTORE_CPUS_ONLINE=1
 	return 0
 }
 
 _online_cpu() {
-	# shellcheck disable=SC2034
-	RESTORE_CPUS_ONLINE=1
 	echo 1 > "/sys/devices/system/cpu/cpu$1/online"
 }
 
 _offline_cpu() {
-	# shellcheck disable=SC2034
-	RESTORE_CPUS_ONLINE=1
+	if [[ -z ${RESTORE_CPUS_ONLINE} ]]; then
+		echo "offlining cpu but RESTORE_CPUS_ONLINE is not set"
+		return 1
+	fi
+
 	echo 0 > "/sys/devices/system/cpu/cpu$1/online"
 }