diff mbox series

[kvm-unit-tests,v3,1/2] s390x: sclp: consider monoprocessor on read_info error

Message ID 20230530124056.18332-2-pmorel@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series Fixing infinite loop on SCLP READ SCP INFO error | expand

Commit Message

Pierre Morel May 30, 2023, 12:40 p.m. UTC
A kvm-unit-test would hang if an abort happens before SCLP Read SCP
Information has completed if sclp_get_cpu_num() does not report at
least one CPU.
Since we obviously have one, report it.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 lib/s390x/sclp.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Nico Boehr June 1, 2023, 11:52 a.m. UTC | #1
Quoting Pierre Morel (2023-05-30 14:40:55)
> A kvm-unit-test would hang if an abort happens before SCLP Read SCP
> Information has completed if sclp_get_cpu_num() does not report at
> least one CPU.
> Since we obviously have one, report it.

Sorry for complaining again, in a discussion with Janosch we found that the
description and commit below can be easily misunderstood. I suggest the
following wording in the commit description:

s390x: sclp: treat system as single processor when read_info is NULL

When a test abort()s before SCLP read info is completed, the assertion on
read_info in sclp_read_info() will fail. Since abort() eventually calls
smp_teardown() which in turn calls sclp_get_cpu_num(), this will cause an
infinite abort() chain, causing the test to hang.

Fix this by considering the system single processor when read_info is missing.

[...]
> diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
> index 12919ca..34a31da 100644
> --- a/lib/s390x/sclp.c
> +++ b/lib/s390x/sclp.c
> @@ -121,6 +121,12 @@ int sclp_get_cpu_num(void)
>  {
>         if (read_info)
>                 return read_info->entries_cpu;
> +       /*
> +        * If we fail here and read_info has not being set,
> +        * it means we failed early and we try to abort the test.
> +        * We need to return at least one CPU, and obviously we have
> +        * at least one, for the smp_teardown to correctly work.
> +        */

Please make this:

Don't abort here if read_info is NULL since abort() calls smp_teardown() which
eventually calls this function and thus causes an infinite abort() chain,
causing the test to hang. Since we obviously have at least one CPU, just return
one.

With these changes:

Reviewed-by: Nico Boehr <nrb@linux.ibm.com>

Sorry for the back and forth.
Pierre Morel June 1, 2023, 12:32 p.m. UTC | #2
On 6/1/23 13:52, Nico Boehr wrote:
> Quoting Pierre Morel (2023-05-30 14:40:55)
>> A kvm-unit-test would hang if an abort happens before SCLP Read SCP
>> Information has completed if sclp_get_cpu_num() does not report at
>> least one CPU.
>> Since we obviously have one, report it.
> Sorry for complaining again, in a discussion with Janosch we found that the
> description and commit below can be easily misunderstood. I suggest the
> following wording in the commit description:
>
> s390x: sclp: treat system as single processor when read_info is NULL
>
> When a test abort()s before SCLP read info is completed, the assertion on
> read_info in sclp_read_info() will fail. Since abort() eventually calls
> smp_teardown() which in turn calls sclp_get_cpu_num(), this will cause an
> infinite abort() chain, causing the test to hang.
>
> Fix this by considering the system single processor when read_info is missing.
>
> [...]

better, I take it

thx



>> diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
>> index 12919ca..34a31da 100644
>> --- a/lib/s390x/sclp.c
>> +++ b/lib/s390x/sclp.c
>> @@ -121,6 +121,12 @@ int sclp_get_cpu_num(void)
>>   {
>>          if (read_info)
>>                  return read_info->entries_cpu;
>> +       /*
>> +        * If we fail here and read_info has not being set,
>> +        * it means we failed early and we try to abort the test.
>> +        * We need to return at least one CPU, and obviously we have
>> +        * at least one, for the smp_teardown to correctly work.
>> +        */
> Please make this:
>
> Don't abort here if read_info is NULL since abort() calls smp_teardown() which
> eventually calls this function and thus causes an infinite abort() chain,
> causing the test to hang. Since we obviously have at least one CPU, just return
> one.
>
> With these changes:
>
> Reviewed-by: Nico Boehr <nrb@linux.ibm.com>
>
> Sorry for the back and forth.
diff mbox series

Patch

diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
index 12919ca..34a31da 100644
--- a/lib/s390x/sclp.c
+++ b/lib/s390x/sclp.c
@@ -121,6 +121,12 @@  int sclp_get_cpu_num(void)
 {
 	if (read_info)
 		return read_info->entries_cpu;
+	/*
+	 * If we fail here and read_info has not being set,
+	 * it means we failed early and we try to abort the test.
+	 * We need to return at least one CPU, and obviously we have
+	 * at least one, for the smp_teardown to correctly work.
+	 */
 	return 1;
 }