diff mbox series

[kvm-unit-tests,v1,3/6] s390x: smp: Fix checks for SIGP STOP STORE STATUS

Message ID 20220303210425.1693486-4-farman@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series s390x: SIGP fixes | expand

Commit Message

Eric Farman March 3, 2022, 9:04 p.m. UTC
In the routine test_stop_store_status(), the "running" part of
the test checks a few of the fields in lowcore (to verify the
"STORE STATUS" part of the SIGP order), and then ensures that
the CPU has stopped. But this is backwards, and leads to false
errors.

According to the Principles of Operation:
  The addressed CPU performs the stop function, fol-
  lowed by the store-status operation (see “Store Sta-
  tus” on page 4-82).

By checking the results how they are today, the contents of
the lowcore fields are unreliable until the CPU is stopped.
Thus, check that the CPU is stopped first, before ensuring
that the STORE STATUS was performed correctly.

While here, add the same check to the second part of the test,
even though the CPU is explicitly stopped prior to the SIGP.

Fixes: fc67b07a4 ("s390x: smp: Test stop and store status on a running and stopped cpu")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 s390x/smp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Janosch Frank March 4, 2022, 10:40 a.m. UTC | #1
On 3/3/22 22:04, Eric Farman wrote:
> In the routine test_stop_store_status(), the "running" part of
> the test checks a few of the fields in lowcore (to verify the
> "STORE STATUS" part of the SIGP order), and then ensures that
> the CPU has stopped. But this is backwards, and leads to false
> errors.
> 
> According to the Principles of Operation:
>    The addressed CPU performs the stop function, fol-
>    lowed by the store-status operation (see “Store Sta-
>    tus” on page 4-82).
> 
> By checking the results how they are today, the contents of
> the lowcore fields are unreliable until the CPU is stopped.
> Thus, check that the CPU is stopped first, before ensuring
> that the STORE STATUS was performed correctly.

The results are undefined until the cpu is not busy via SIGP sense, no?
You cover that via doing the smp_cpu_stopped() check since that does a 
sigp sense.

Where the stop check is located doesn't really matter since the library 
waits until the cpu is stopped and it does that via smp_cpu_stopped()


So:
Are we really fixing something here?

Please improve the commit description.
For me this looks more like making checks more explicit and symmetrical 
which I'm generally ok with. We just need to specify correctly why we're 
doing that.

> 
> While here, add the same check to the second part of the test,
> even though the CPU is explicitly stopped prior to the SIGP.
> 
> Fixes: fc67b07a4 ("s390x: smp: Test stop and store status on a running and stopped cpu")
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>   s390x/smp.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/s390x/smp.c b/s390x/smp.c
> index 2f4af820..50811bd0 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -98,9 +98,9 @@ static void test_stop_store_status(void)
>   	lc->grs_sa[15] = 0;
>   	smp_cpu_stop_store_status(1);
>   	mb();
> +	report(smp_cpu_stopped(1), "cpu stopped");
>   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
>   	report(lc->grs_sa[15], "stack");
> -	report(smp_cpu_stopped(1), "cpu stopped");
>   	report_prefix_pop();
>   
>   	report_prefix_push("stopped");
> @@ -108,6 +108,7 @@ static void test_stop_store_status(void)
>   	lc->grs_sa[15] = 0;
>   	smp_cpu_stop_store_status(1);
>   	mb();
> +	report(smp_cpu_stopped(1), "cpu stopped");
>   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
>   	report(lc->grs_sa[15], "stack");
>   	report_prefix_pop();
Eric Farman March 4, 2022, 2:38 p.m. UTC | #2
On Fri, 2022-03-04 at 11:40 +0100, Janosch Frank wrote:
> On 3/3/22 22:04, Eric Farman wrote:
> > In the routine test_stop_store_status(), the "running" part of
> > the test checks a few of the fields in lowcore (to verify the
> > "STORE STATUS" part of the SIGP order), and then ensures that
> > the CPU has stopped. But this is backwards, and leads to false
> > errors.
> > 
> > According to the Principles of Operation:
> >    The addressed CPU performs the stop function, fol-
> >    lowed by the store-status operation (see “Store Sta-
> >    tus” on page 4-82).
> > 
> > By checking the results how they are today, the contents of
> > the lowcore fields are unreliable until the CPU is stopped.
> > Thus, check that the CPU is stopped first, before ensuring
> > that the STORE STATUS was performed correctly.
> 
> The results are undefined until the cpu is not busy via SIGP sense,
> no?
> You cover that via doing the smp_cpu_stopped() check since that does
> a 
> sigp sense.
> 
> Where the stop check is located doesn't really matter since the
> library 
> waits until the cpu is stopped and it does that via smp_cpu_stopped()
> 
> 
> So:
> Are we really fixing something here?

Hrm, I thought so, but I got focused on the order of these checks and
overlooked the point that the library already does this looping. I do
trip up on these checks; let me revisit them.

> 
> Please improve the commit description.
> For me this looks more like making checks more explicit and
> symmetrical 
> which I'm generally ok with. We just need to specify correctly why
> we're 
> doing that.
> 
> > While here, add the same check to the second part of the test,
> > even though the CPU is explicitly stopped prior to the SIGP.
> > 
> > Fixes: fc67b07a4 ("s390x: smp: Test stop and store status on a
> > running and stopped cpu")
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >   s390x/smp.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/s390x/smp.c b/s390x/smp.c
> > index 2f4af820..50811bd0 100644
> > --- a/s390x/smp.c
> > +++ b/s390x/smp.c
> > @@ -98,9 +98,9 @@ static void test_stop_store_status(void)
> >   	lc->grs_sa[15] = 0;
> >   	smp_cpu_stop_store_status(1);
> >   	mb();
> > +	report(smp_cpu_stopped(1), "cpu stopped");
> >   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
> > "prefix");
> >   	report(lc->grs_sa[15], "stack");
> > -	report(smp_cpu_stopped(1), "cpu stopped");
> >   	report_prefix_pop();
> >   
> >   	report_prefix_push("stopped");
> > @@ -108,6 +108,7 @@ static void test_stop_store_status(void)
> >   	lc->grs_sa[15] = 0;
> >   	smp_cpu_stop_store_status(1);
> >   	mb();
> > +	report(smp_cpu_stopped(1), "cpu stopped");
> >   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore,
> > "prefix");
> >   	report(lc->grs_sa[15], "stack");
> >   	report_prefix_pop();
Eric Farman March 7, 2022, 6:30 p.m. UTC | #3
On Fri, 2022-03-04 at 09:38 -0500, Eric Farman wrote:
> On Fri, 2022-03-04 at 11:40 +0100, Janosch Frank wrote:
> > On 3/3/22 22:04, Eric Farman wrote:
> > > In the routine test_stop_store_status(), the "running" part of
> > > the test checks a few of the fields in lowcore (to verify the
> > > "STORE STATUS" part of the SIGP order), and then ensures that
> > > the CPU has stopped. But this is backwards, and leads to false
> > > errors.
> > > 
> > > According to the Principles of Operation:
> > >    The addressed CPU performs the stop function, fol-
> > >    lowed by the store-status operation (see “Store Sta-
> > >    tus” on page 4-82).
> > > 
> > > By checking the results how they are today, the contents of
> > > the lowcore fields are unreliable until the CPU is stopped.
> > > Thus, check that the CPU is stopped first, before ensuring
> > > that the STORE STATUS was performed correctly.
> > 
> > The results are undefined until the cpu is not busy via SIGP sense,
> > no?
> > You cover that via doing the smp_cpu_stopped() check since that
> > does
> > a 
> > sigp sense.
> > 
> > Where the stop check is located doesn't really matter since the
> > library 
> > waits until the cpu is stopped and it does that via
> > smp_cpu_stopped()
> > 
> > 
> > So:
> > Are we really fixing something here?
> 
> Hrm, I thought so, but I got focused on the order of these checks and
> overlooked the point that the library already does this looping. I do
> trip up on these checks; let me revisit them.

Ah, my turn to fool myself. To test all the different combinations, I
had both old/new SIGP behavior in otherwise identical kernels and QEMU
binaries. But I appear to have mislabeled QEMU, so the failures I was
seeing was due to running the old QEMU, and not anything in kvm-unit-
tests itself. My apologies.

So, per your next paragraph, I'll keep this patch but tidy up the
commit message accordingly.

> 
> > Please improve the commit description.
> > For me this looks more like making checks more explicit and
> > symmetrical 
> > which I'm generally ok with. We just need to specify correctly why
> > we're 
> > doing that.
> > 
> > > While here, add the same check to the second part of the test,
> > > even though the CPU is explicitly stopped prior to the SIGP.
> > > 
> > > Fixes: fc67b07a4 ("s390x: smp: Test stop and store status on a
> > > running and stopped cpu")
> > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > ---
> > >   s390x/smp.c | 3 ++-
> > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/s390x/smp.c b/s390x/smp.c
> > > index 2f4af820..50811bd0 100644
> > > --- a/s390x/smp.c
> > > +++ b/s390x/smp.c
> > > @@ -98,9 +98,9 @@ static void test_stop_store_status(void)
> > >   	lc->grs_sa[15] = 0;
> > >   	smp_cpu_stop_store_status(1);
> > >   	mb();
> > > +	report(smp_cpu_stopped(1), "cpu stopped");
> > >   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu-
> > > >lowcore,
> > > "prefix");
> > >   	report(lc->grs_sa[15], "stack");
> > > -	report(smp_cpu_stopped(1), "cpu stopped");
> > >   	report_prefix_pop();
> > >   
> > >   	report_prefix_push("stopped");
> > > @@ -108,6 +108,7 @@ static void test_stop_store_status(void)
> > >   	lc->grs_sa[15] = 0;
> > >   	smp_cpu_stop_store_status(1);
> > >   	mb();
> > > +	report(smp_cpu_stopped(1), "cpu stopped");
> > >   	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu-
> > > >lowcore,
> > > "prefix");
> > >   	report(lc->grs_sa[15], "stack");
> > >   	report_prefix_pop();
diff mbox series

Patch

diff --git a/s390x/smp.c b/s390x/smp.c
index 2f4af820..50811bd0 100644
--- a/s390x/smp.c
+++ b/s390x/smp.c
@@ -98,9 +98,9 @@  static void test_stop_store_status(void)
 	lc->grs_sa[15] = 0;
 	smp_cpu_stop_store_status(1);
 	mb();
+	report(smp_cpu_stopped(1), "cpu stopped");
 	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
 	report(lc->grs_sa[15], "stack");
-	report(smp_cpu_stopped(1), "cpu stopped");
 	report_prefix_pop();
 
 	report_prefix_push("stopped");
@@ -108,6 +108,7 @@  static void test_stop_store_status(void)
 	lc->grs_sa[15] = 0;
 	smp_cpu_stop_store_status(1);
 	mb();
+	report(smp_cpu_stopped(1), "cpu stopped");
 	report(lc->prefix_sa == (uint32_t)(uintptr_t)cpu->lowcore, "prefix");
 	report(lc->grs_sa[15], "stack");
 	report_prefix_pop();