diff mbox series

[kvm-unit-tests,v1,4/6] s390x: smp: Create and use a non-waiting CPU stop

Message ID 20220303210425.1693486-5-farman@linux.ibm.com (mailing list archive)
State New, archived
Headers show
Series s390x: SIGP fixes | expand

Commit Message

Eric Farman March 3, 2022, 9:04 p.m. UTC
When stopping a CPU, kvm-unit-tests serializes/waits for everything
to finish, in order to get a consistent result whenever those
functions are used.

But to test the SIGP STOP itself, these additional measures could
mask other problems. For example, did the STOP work, or is the CPU
still operating?

Let's create a non-waiting SIGP STOP and use it here, to ensure that
the CPU is correctly stopped. A smp_cpu_stopped() call will still
be used to see that the SIGP STOP has been processed, and the state
of the CPU can be used to determine whether the test passes/fails.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---
 lib/s390x/smp.c | 25 +++++++++++++++++++++++++
 lib/s390x/smp.h |  1 +
 s390x/smp.c     | 10 ++--------
 3 files changed, 28 insertions(+), 8 deletions(-)

Comments

Nico Boehr March 7, 2022, 1:31 p.m. UTC | #1
On Thu, 2022-03-03 at 22:04 +0100, Eric Farman wrote:
[...]

> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -76,14 +76,8 @@ static void test_restart(void)
>  
>  static void test_stop(void)
>  {
> -       smp_cpu_stop(1);
> -       /*
> -        * The smp library waits for the CPU to shut down, but let's
> -        * also do it here, so we don't rely on the library
> -        * implementation
> -        */
> -       while (!smp_cpu_stopped(1)) {}
> -       report_pass("stop");
> +       smp_cpu_stop_nowait(1);

Now that this can fail because of CC=2, should we check the return
value here?
Claudio Imbrenda March 7, 2022, 3:30 p.m. UTC | #2
On Thu,  3 Mar 2022 22:04:23 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> When stopping a CPU, kvm-unit-tests serializes/waits for everything
> to finish, in order to get a consistent result whenever those
> functions are used.
> 
> But to test the SIGP STOP itself, these additional measures could
> mask other problems. For example, did the STOP work, or is the CPU
> still operating?
> 
> Let's create a non-waiting SIGP STOP and use it here, to ensure that
> the CPU is correctly stopped. A smp_cpu_stopped() call will still
> be used to see that the SIGP STOP has been processed, and the state
> of the CPU can be used to determine whether the test passes/fails.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
>  lib/s390x/smp.c | 25 +++++++++++++++++++++++++
>  lib/s390x/smp.h |  1 +
>  s390x/smp.c     | 10 ++--------
>  3 files changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
> index 368d6add..84e536e8 100644
> --- a/lib/s390x/smp.c
> +++ b/lib/s390x/smp.c
> @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx)
>  	return rc;
>  }
>  
> +/*
> + * Functionally equivalent to smp_cpu_stop(), but without the
> + * elements that wait/serialize matters itself.
> + * Used to see if KVM itself is serialized correctly.
> + */
> +int smp_cpu_stop_nowait(uint16_t idx)
> +{
> +	/* refuse to work on the boot CPU */
> +	if (idx == 0)
> +		return -1;
> +
> +	spin_lock(&lock);
> +
> +	/* Don't suppress a CC2 with sigp_retry() */
> +	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
> +		spin_unlock(&lock);
> +		return -1;
> +	}
> +
> +	cpus[idx].active = false;
> +	spin_unlock(&lock);
> +
> +	return 0;
> +}
> +
>  int smp_cpu_stop_store_status(uint16_t idx)
>  {
>  	int rc;
> diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
> index 1e69a7de..bae03dfd 100644
> --- a/lib/s390x/smp.h
> +++ b/lib/s390x/smp.h
> @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx);
>  int smp_cpu_restart(uint16_t idx);
>  int smp_cpu_start(uint16_t idx, struct psw psw);
>  int smp_cpu_stop(uint16_t idx);
> +int smp_cpu_stop_nowait(uint16_t idx);
>  int smp_cpu_stop_store_status(uint16_t idx);
>  int smp_cpu_destroy(uint16_t idx);
>  int smp_cpu_setup(uint16_t idx, struct psw psw);
> diff --git a/s390x/smp.c b/s390x/smp.c
> index 50811bd0..11c2c673 100644
> --- a/s390x/smp.c
> +++ b/s390x/smp.c
> @@ -76,14 +76,8 @@ static void test_restart(void)
>  
>  static void test_stop(void)
>  {
> -	smp_cpu_stop(1);
> -	/*
> -	 * The smp library waits for the CPU to shut down, but let's
> -	 * also do it here, so we don't rely on the library
> -	 * implementation
> -	 */
> -	while (!smp_cpu_stopped(1)) {}
> -	report_pass("stop");
> +	smp_cpu_stop_nowait(1);

can it happen that the SIGP STOP order is accepted, but the target CPU
is still running (and not even busy)?

> +	report(smp_cpu_stopped(1), "stop");

e.g. can this ^ check race with the actual stopping of the CPU?

>  }
>  
>  static void test_stop_store_status(void)
Eric Farman March 7, 2022, 7:01 p.m. UTC | #3
On Mon, 2022-03-07 at 14:31 +0100, Nico Boehr wrote:
> On Thu, 2022-03-03 at 22:04 +0100, Eric Farman wrote:
> [...]
> 
> > --- a/s390x/smp.c
> > +++ b/s390x/smp.c
> > @@ -76,14 +76,8 @@ static void test_restart(void)
> >  
> >  static void test_stop(void)
> >  {
> > -       smp_cpu_stop(1);
> > -       /*
> > -        * The smp library waits for the CPU to shut down, but
> > let's
> > -        * also do it here, so we don't rely on the library
> > -        * implementation
> > -        */
> > -       while (!smp_cpu_stopped(1)) {}
> > -       report_pass("stop");
> > +       smp_cpu_stop_nowait(1);
> 
> Now that this can fail because of CC=2, should we check the return
> value here?

Ah, yes it should. Thanks,

Eric
Eric Farman March 7, 2022, 7:03 p.m. UTC | #4
On Mon, 2022-03-07 at 16:30 +0100, Claudio Imbrenda wrote:
> On Thu,  3 Mar 2022 22:04:23 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > When stopping a CPU, kvm-unit-tests serializes/waits for everything
> > to finish, in order to get a consistent result whenever those
> > functions are used.
> > 
> > But to test the SIGP STOP itself, these additional measures could
> > mask other problems. For example, did the STOP work, or is the CPU
> > still operating?
> > 
> > Let's create a non-waiting SIGP STOP and use it here, to ensure
> > that
> > the CPU is correctly stopped. A smp_cpu_stopped() call will still
> > be used to see that the SIGP STOP has been processed, and the state
> > of the CPU can be used to determine whether the test passes/fails.
> > 
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> >  lib/s390x/smp.c | 25 +++++++++++++++++++++++++
> >  lib/s390x/smp.h |  1 +
> >  s390x/smp.c     | 10 ++--------
> >  3 files changed, 28 insertions(+), 8 deletions(-)
> > 
> > diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
> > index 368d6add..84e536e8 100644
> > --- a/lib/s390x/smp.c
> > +++ b/lib/s390x/smp.c
> > @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx)
> >  	return rc;
> >  }
> >  
> > +/*
> > + * Functionally equivalent to smp_cpu_stop(), but without the
> > + * elements that wait/serialize matters itself.
> > + * Used to see if KVM itself is serialized correctly.
> > + */
> > +int smp_cpu_stop_nowait(uint16_t idx)
> > +{
> > +	/* refuse to work on the boot CPU */
> > +	if (idx == 0)
> > +		return -1;
> > +
> > +	spin_lock(&lock);
> > +
> > +	/* Don't suppress a CC2 with sigp_retry() */
> > +	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
> > +		spin_unlock(&lock);
> > +		return -1;
> > +	}
> > +
> > +	cpus[idx].active = false;
> > +	spin_unlock(&lock);
> > +
> > +	return 0;
> > +}
> > +
> >  int smp_cpu_stop_store_status(uint16_t idx)
> >  {
> >  	int rc;
> > diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
> > index 1e69a7de..bae03dfd 100644
> > --- a/lib/s390x/smp.h
> > +++ b/lib/s390x/smp.h
> > @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx);
> >  int smp_cpu_restart(uint16_t idx);
> >  int smp_cpu_start(uint16_t idx, struct psw psw);
> >  int smp_cpu_stop(uint16_t idx);
> > +int smp_cpu_stop_nowait(uint16_t idx);
> >  int smp_cpu_stop_store_status(uint16_t idx);
> >  int smp_cpu_destroy(uint16_t idx);
> >  int smp_cpu_setup(uint16_t idx, struct psw psw);
> > diff --git a/s390x/smp.c b/s390x/smp.c
> > index 50811bd0..11c2c673 100644
> > --- a/s390x/smp.c
> > +++ b/s390x/smp.c
> > @@ -76,14 +76,8 @@ static void test_restart(void)
> >  
> >  static void test_stop(void)
> >  {
> > -	smp_cpu_stop(1);
> > -	/*
> > -	 * The smp library waits for the CPU to shut down, but let's
> > -	 * also do it here, so we don't rely on the library
> > -	 * implementation
> > -	 */
> > -	while (!smp_cpu_stopped(1)) {}
> > -	report_pass("stop");
> > +	smp_cpu_stop_nowait(1);
> 
> can it happen that the SIGP STOP order is accepted, but the target
> CPU
> is still running (and not even busy)?

Of course. A SIGP that's processed by userspace (which is many of them)
injects a STOP IRQ back to the kernel, which means the CPU might not be
stopped for some time. But...

> 
> > +	report(smp_cpu_stopped(1), "stop");
> 
> e.g. can this ^ check race with the actual stopping of the CPU?

...the smp_cpu_stopped() routine now loops on the CC2 that SIGP SENSE
returns because of that pending IRQ. If SIGP SENSE returns CC0/1, then
the CPU can correctly be identified stopped/operating, and the test can
correctly pass/fail.

> 
> >  }
> >  
> >  static void test_stop_store_status(void)
Claudio Imbrenda March 8, 2022, 10:31 a.m. UTC | #5
On Mon, 07 Mar 2022 14:03:45 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On Mon, 2022-03-07 at 16:30 +0100, Claudio Imbrenda wrote:
> > On Thu,  3 Mar 2022 22:04:23 +0100
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > When stopping a CPU, kvm-unit-tests serializes/waits for everything
> > > to finish, in order to get a consistent result whenever those
> > > functions are used.
> > > 
> > > But to test the SIGP STOP itself, these additional measures could
> > > mask other problems. For example, did the STOP work, or is the CPU
> > > still operating?
> > > 
> > > Let's create a non-waiting SIGP STOP and use it here, to ensure
> > > that
> > > the CPU is correctly stopped. A smp_cpu_stopped() call will still
> > > be used to see that the SIGP STOP has been processed, and the state
> > > of the CPU can be used to determine whether the test passes/fails.
> > > 
> > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > ---
> > >  lib/s390x/smp.c | 25 +++++++++++++++++++++++++
> > >  lib/s390x/smp.h |  1 +
> > >  s390x/smp.c     | 10 ++--------
> > >  3 files changed, 28 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
> > > index 368d6add..84e536e8 100644
> > > --- a/lib/s390x/smp.c
> > > +++ b/lib/s390x/smp.c
> > > @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx)
> > >  	return rc;
> > >  }
> > >  
> > > +/*
> > > + * Functionally equivalent to smp_cpu_stop(), but without the
> > > + * elements that wait/serialize matters itself.
> > > + * Used to see if KVM itself is serialized correctly.
> > > + */
> > > +int smp_cpu_stop_nowait(uint16_t idx)
> > > +{
> > > +	/* refuse to work on the boot CPU */
> > > +	if (idx == 0)
> > > +		return -1;
> > > +
> > > +	spin_lock(&lock);
> > > +
> > > +	/* Don't suppress a CC2 with sigp_retry() */
> > > +	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
> > > +		spin_unlock(&lock);
> > > +		return -1;
> > > +	}
> > > +
> > > +	cpus[idx].active = false;
> > > +	spin_unlock(&lock);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  int smp_cpu_stop_store_status(uint16_t idx)
> > >  {
> > >  	int rc;
> > > diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
> > > index 1e69a7de..bae03dfd 100644
> > > --- a/lib/s390x/smp.h
> > > +++ b/lib/s390x/smp.h
> > > @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx);
> > >  int smp_cpu_restart(uint16_t idx);
> > >  int smp_cpu_start(uint16_t idx, struct psw psw);
> > >  int smp_cpu_stop(uint16_t idx);
> > > +int smp_cpu_stop_nowait(uint16_t idx);
> > >  int smp_cpu_stop_store_status(uint16_t idx);
> > >  int smp_cpu_destroy(uint16_t idx);
> > >  int smp_cpu_setup(uint16_t idx, struct psw psw);
> > > diff --git a/s390x/smp.c b/s390x/smp.c
> > > index 50811bd0..11c2c673 100644
> > > --- a/s390x/smp.c
> > > +++ b/s390x/smp.c
> > > @@ -76,14 +76,8 @@ static void test_restart(void)
> > >  
> > >  static void test_stop(void)
> > >  {
> > > -	smp_cpu_stop(1);
> > > -	/*
> > > -	 * The smp library waits for the CPU to shut down, but let's
> > > -	 * also do it here, so we don't rely on the library
> > > -	 * implementation
> > > -	 */
> > > -	while (!smp_cpu_stopped(1)) {}
> > > -	report_pass("stop");
> > > +	smp_cpu_stop_nowait(1);  
> > 
> > can it happen that the SIGP STOP order is accepted, but the target
> > CPU
> > is still running (and not even busy)?  
> 
> Of course. A SIGP that's processed by userspace (which is many of them)
> injects a STOP IRQ back to the kernel, which means the CPU might not be
> stopped for some time. But...
> 
> >   
> > > +	report(smp_cpu_stopped(1), "stop");  
> > 
> > e.g. can this ^ check race with the actual stopping of the CPU?  
> 
> ...the smp_cpu_stopped() routine now loops on the CC2 that SIGP SENSE
> returns because of that pending IRQ. If SIGP SENSE returns CC0/1, then
> the CPU can correctly be identified stopped/operating, and the test can
> correctly pass/fail.

my question was: is it possible architecturally that there is a window
where the STOP order is accepted, but a SENSE on the target CPU still
successfully returns that the CPU is running?

in other words: is it specified architecturally that, once an order is
accepted for a target CPU, that CPU can't accept any other order (and
will return CC2), including SENSE, until the order has been completed
successfully?

> 
> >   
> > >  }
> > >  
> > >  static void test_stop_store_status(void)  
>
Eric Farman March 8, 2022, 9:18 p.m. UTC | #6
On Tue, 2022-03-08 at 11:31 +0100, Claudio Imbrenda wrote:
> On Mon, 07 Mar 2022 14:03:45 -0500
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > On Mon, 2022-03-07 at 16:30 +0100, Claudio Imbrenda wrote:
> > > On Thu,  3 Mar 2022 22:04:23 +0100
> > > Eric Farman <farman@linux.ibm.com> wrote:
> > >   
> > > > When stopping a CPU, kvm-unit-tests serializes/waits for
> > > > everything
> > > > to finish, in order to get a consistent result whenever those
> > > > functions are used.
> > > > 
> > > > But to test the SIGP STOP itself, these additional measures
> > > > could
> > > > mask other problems. For example, did the STOP work, or is the
> > > > CPU
> > > > still operating?
> > > > 
> > > > Let's create a non-waiting SIGP STOP and use it here, to ensure
> > > > that
> > > > the CPU is correctly stopped. A smp_cpu_stopped() call will
> > > > still
> > > > be used to see that the SIGP STOP has been processed, and the
> > > > state
> > > > of the CPU can be used to determine whether the test
> > > > passes/fails.
> > > > 
> > > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > > ---
> > > >  lib/s390x/smp.c | 25 +++++++++++++++++++++++++
> > > >  lib/s390x/smp.h |  1 +
> > > >  s390x/smp.c     | 10 ++--------
> > > >  3 files changed, 28 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
> > > > index 368d6add..84e536e8 100644
> > > > --- a/lib/s390x/smp.c
> > > > +++ b/lib/s390x/smp.c
> > > > @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx)
> > > >  	return rc;
> > > >  }
> > > >  
> > > > +/*
> > > > + * Functionally equivalent to smp_cpu_stop(), but without the
> > > > + * elements that wait/serialize matters itself.
> > > > + * Used to see if KVM itself is serialized correctly.
> > > > + */
> > > > +int smp_cpu_stop_nowait(uint16_t idx)
> > > > +{
> > > > +	/* refuse to work on the boot CPU */
> > > > +	if (idx == 0)
> > > > +		return -1;
> > > > +
> > > > +	spin_lock(&lock);
> > > > +
> > > > +	/* Don't suppress a CC2 with sigp_retry() */
> > > > +	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
> > > > +		spin_unlock(&lock);
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	cpus[idx].active = false;
> > > > +	spin_unlock(&lock);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  int smp_cpu_stop_store_status(uint16_t idx)
> > > >  {
> > > >  	int rc;
> > > > diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
> > > > index 1e69a7de..bae03dfd 100644
> > > > --- a/lib/s390x/smp.h
> > > > +++ b/lib/s390x/smp.h
> > > > @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx);
> > > >  int smp_cpu_restart(uint16_t idx);
> > > >  int smp_cpu_start(uint16_t idx, struct psw psw);
> > > >  int smp_cpu_stop(uint16_t idx);
> > > > +int smp_cpu_stop_nowait(uint16_t idx);
> > > >  int smp_cpu_stop_store_status(uint16_t idx);
> > > >  int smp_cpu_destroy(uint16_t idx);
> > > >  int smp_cpu_setup(uint16_t idx, struct psw psw);
> > > > diff --git a/s390x/smp.c b/s390x/smp.c
> > > > index 50811bd0..11c2c673 100644
> > > > --- a/s390x/smp.c
> > > > +++ b/s390x/smp.c
> > > > @@ -76,14 +76,8 @@ static void test_restart(void)
> > > >  
> > > >  static void test_stop(void)
> > > >  {
> > > > -	smp_cpu_stop(1);
> > > > -	/*
> > > > -	 * The smp library waits for the CPU to shut down, but
> > > > let's
> > > > -	 * also do it here, so we don't rely on the library
> > > > -	 * implementation
> > > > -	 */
> > > > -	while (!smp_cpu_stopped(1)) {}
> > > > -	report_pass("stop");
> > > > +	smp_cpu_stop_nowait(1);  
> > > 
> > > can it happen that the SIGP STOP order is accepted, but the
> > > target
> > > CPU
> > > is still running (and not even busy)?  
> > 
> > Of course. A SIGP that's processed by userspace (which is many of
> > them)
> > injects a STOP IRQ back to the kernel, which means the CPU might
> > not be
> > stopped for some time. But...
> > 
> > >   
> > > > +	report(smp_cpu_stopped(1), "stop");  
> > > 
> > > e.g. can this ^ check race with the actual stopping of the CPU?  
> > 
> > ...the smp_cpu_stopped() routine now loops on the CC2 that SIGP
> > SENSE
> > returns because of that pending IRQ. If SIGP SENSE returns CC0/1,
> > then
> > the CPU can correctly be identified stopped/operating, and the test
> > can
> > correctly pass/fail.
> 
> my question was: is it possible architecturally that there is a
> window
> where the STOP order is accepted, but a SENSE on the target CPU still
> successfully returns that the CPU is running?

Not to my knowledge. The "Conditions Determining Response" section of
POPS (v12, page 4-95; also below) specifically states that the SIGP
SENSE will return CC2 when a SIGP STOP order is outstanding.

In our implementation, the stop IRQ will be injected before the STOP
order gets accepted, such that a SIGP SENSE would see it immediately.

"""
A previously issued start, stop, restart, stop-
and-store-status, set-prefix, store-status-at-
address order, or store-additional-status-at-
address has been accepted by the
addressed CPU, and execution of the func-
tion requested by the order has not yet been
completed.
...
If the currently specified order is sense, external
call, emergency signal, start, stop, restart, stop
and store status, set prefix, store status at
address, set architecture, set multithreading, or
store additional status at address, then the order
is rejected, and condition code 2 is set.
"""

> 
> in other words: is it specified architecturally that, once an order
> is
> accepted for a target CPU, that CPU can't accept any other order (and
> will return CC2), including SENSE, until the order has been completed
> successfully?

The POPS quote I placed above excludes store-extended-status-address,
conditional-emergency-signal, and sense-running-status orders as
returning a CC2. That's something Janosch and I have chatted about
offline, but don't have an answer to at this time. Besides that, any
other order would get a CC2 until the STOP has been completed.

> 
> > >   
> > > >  }
> > > >  
> > > >  static void test_stop_store_status(void)
Claudio Imbrenda March 9, 2022, 9:27 a.m. UTC | #7
On Tue, 08 Mar 2022 16:18:16 -0500
Eric Farman <farman@linux.ibm.com> wrote:

> On Tue, 2022-03-08 at 11:31 +0100, Claudio Imbrenda wrote:
> > On Mon, 07 Mar 2022 14:03:45 -0500
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > On Mon, 2022-03-07 at 16:30 +0100, Claudio Imbrenda wrote:  
> > > > On Thu,  3 Mar 2022 22:04:23 +0100
> > > > Eric Farman <farman@linux.ibm.com> wrote:
> > > >     
> > > > > When stopping a CPU, kvm-unit-tests serializes/waits for
> > > > > everything
> > > > > to finish, in order to get a consistent result whenever those
> > > > > functions are used.
> > > > > 
> > > > > But to test the SIGP STOP itself, these additional measures
> > > > > could
> > > > > mask other problems. For example, did the STOP work, or is the
> > > > > CPU
> > > > > still operating?
> > > > > 
> > > > > Let's create a non-waiting SIGP STOP and use it here, to ensure
> > > > > that
> > > > > the CPU is correctly stopped. A smp_cpu_stopped() call will
> > > > > still
> > > > > be used to see that the SIGP STOP has been processed, and the
> > > > > state
> > > > > of the CPU can be used to determine whether the test
> > > > > passes/fails.
> > > > > 
> > > > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > > > ---
> > > > >  lib/s390x/smp.c | 25 +++++++++++++++++++++++++
> > > > >  lib/s390x/smp.h |  1 +
> > > > >  s390x/smp.c     | 10 ++--------
> > > > >  3 files changed, 28 insertions(+), 8 deletions(-)
> > > > > 
> > > > > diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
> > > > > index 368d6add..84e536e8 100644
> > > > > --- a/lib/s390x/smp.c
> > > > > +++ b/lib/s390x/smp.c
> > > > > @@ -119,6 +119,31 @@ int smp_cpu_stop(uint16_t idx)
> > > > >  	return rc;
> > > > >  }
> > > > >  
> > > > > +/*
> > > > > + * Functionally equivalent to smp_cpu_stop(), but without the
> > > > > + * elements that wait/serialize matters itself.
> > > > > + * Used to see if KVM itself is serialized correctly.
> > > > > + */
> > > > > +int smp_cpu_stop_nowait(uint16_t idx)
> > > > > +{
> > > > > +	/* refuse to work on the boot CPU */
> > > > > +	if (idx == 0)
> > > > > +		return -1;
> > > > > +
> > > > > +	spin_lock(&lock);
> > > > > +
> > > > > +	/* Don't suppress a CC2 with sigp_retry() */
> > > > > +	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
> > > > > +		spin_unlock(&lock);
> > > > > +		return -1;
> > > > > +	}
> > > > > +
> > > > > +	cpus[idx].active = false;
> > > > > +	spin_unlock(&lock);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >  int smp_cpu_stop_store_status(uint16_t idx)
> > > > >  {
> > > > >  	int rc;
> > > > > diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
> > > > > index 1e69a7de..bae03dfd 100644
> > > > > --- a/lib/s390x/smp.h
> > > > > +++ b/lib/s390x/smp.h
> > > > > @@ -44,6 +44,7 @@ bool smp_sense_running_status(uint16_t idx);
> > > > >  int smp_cpu_restart(uint16_t idx);
> > > > >  int smp_cpu_start(uint16_t idx, struct psw psw);
> > > > >  int smp_cpu_stop(uint16_t idx);
> > > > > +int smp_cpu_stop_nowait(uint16_t idx);
> > > > >  int smp_cpu_stop_store_status(uint16_t idx);
> > > > >  int smp_cpu_destroy(uint16_t idx);
> > > > >  int smp_cpu_setup(uint16_t idx, struct psw psw);
> > > > > diff --git a/s390x/smp.c b/s390x/smp.c
> > > > > index 50811bd0..11c2c673 100644
> > > > > --- a/s390x/smp.c
> > > > > +++ b/s390x/smp.c
> > > > > @@ -76,14 +76,8 @@ static void test_restart(void)
> > > > >  
> > > > >  static void test_stop(void)
> > > > >  {
> > > > > -	smp_cpu_stop(1);
> > > > > -	/*
> > > > > -	 * The smp library waits for the CPU to shut down, but
> > > > > let's
> > > > > -	 * also do it here, so we don't rely on the library
> > > > > -	 * implementation
> > > > > -	 */
> > > > > -	while (!smp_cpu_stopped(1)) {}
> > > > > -	report_pass("stop");
> > > > > +	smp_cpu_stop_nowait(1);    
> > > > 
> > > > can it happen that the SIGP STOP order is accepted, but the
> > > > target
> > > > CPU
> > > > is still running (and not even busy)?    
> > > 
> > > Of course. A SIGP that's processed by userspace (which is many of
> > > them)
> > > injects a STOP IRQ back to the kernel, which means the CPU might
> > > not be
> > > stopped for some time. But...
> > >   
> > > >     
> > > > > +	report(smp_cpu_stopped(1), "stop");    
> > > > 
> > > > e.g. can this ^ check race with the actual stopping of the CPU?    
> > > 
> > > ...the smp_cpu_stopped() routine now loops on the CC2 that SIGP
> > > SENSE
> > > returns because of that pending IRQ. If SIGP SENSE returns CC0/1,
> > > then
> > > the CPU can correctly be identified stopped/operating, and the test
> > > can
> > > correctly pass/fail.  
> > 
> > my question was: is it possible architecturally that there is a
> > window
> > where the STOP order is accepted, but a SENSE on the target CPU still
> > successfully returns that the CPU is running?  
> 
> Not to my knowledge. The "Conditions Determining Response" section of
> POPS (v12, page 4-95; also below) specifically states that the SIGP
> SENSE will return CC2 when a SIGP STOP order is outstanding.
> 
> In our implementation, the stop IRQ will be injected before the STOP
> order gets accepted, such that a SIGP SENSE would see it immediately.
> 
> """
> A previously issued start, stop, restart, stop-
> and-store-status, set-prefix, store-status-at-
> address order, or store-additional-status-at-
> address has been accepted by the
> addressed CPU, and execution of the func-
> tion requested by the order has not yet been
> completed.
> ...
> If the currently specified order is sense, external
> call, emergency signal, start, stop, restart, stop
> and store status, set prefix, store status at
> address, set architecture, set multithreading, or
> store additional status at address, then the order
> is rejected, and condition code 2 is set.
> """
> 
> > 
> > in other words: is it specified architecturally that, once an order
> > is
> > accepted for a target CPU, that CPU can't accept any other order (and
> > will return CC2), including SENSE, until the order has been completed
> > successfully?  
> 
> The POPS quote I placed above excludes store-extended-status-address,
> conditional-emergency-signal, and sense-running-status orders as
> returning a CC2. That's something Janosch and I have chatted about
> offline, but don't have an answer to at this time. Besides that, any
> other order would get a CC2 until the STOP has been completed.

this is what I wanted to know, thanks

you can add

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>


> 
> >   
> > > >     
> > > > >  }
> > > > >  
> > > > >  static void test_stop_store_status(void)    
>
diff mbox series

Patch

diff --git a/lib/s390x/smp.c b/lib/s390x/smp.c
index 368d6add..84e536e8 100644
--- a/lib/s390x/smp.c
+++ b/lib/s390x/smp.c
@@ -119,6 +119,31 @@  int smp_cpu_stop(uint16_t idx)
 	return rc;
 }
 
+/*
+ * Functionally equivalent to smp_cpu_stop(), but without the
+ * elements that wait/serialize matters itself.
+ * Used to see if KVM itself is serialized correctly.
+ */
+int smp_cpu_stop_nowait(uint16_t idx)
+{
+	/* refuse to work on the boot CPU */
+	if (idx == 0)
+		return -1;
+
+	spin_lock(&lock);
+
+	/* Don't suppress a CC2 with sigp_retry() */
+	if (smp_sigp(idx, SIGP_STOP, 0, NULL)) {
+		spin_unlock(&lock);
+		return -1;
+	}
+
+	cpus[idx].active = false;
+	spin_unlock(&lock);
+
+	return 0;
+}
+
 int smp_cpu_stop_store_status(uint16_t idx)
 {
 	int rc;
diff --git a/lib/s390x/smp.h b/lib/s390x/smp.h
index 1e69a7de..bae03dfd 100644
--- a/lib/s390x/smp.h
+++ b/lib/s390x/smp.h
@@ -44,6 +44,7 @@  bool smp_sense_running_status(uint16_t idx);
 int smp_cpu_restart(uint16_t idx);
 int smp_cpu_start(uint16_t idx, struct psw psw);
 int smp_cpu_stop(uint16_t idx);
+int smp_cpu_stop_nowait(uint16_t idx);
 int smp_cpu_stop_store_status(uint16_t idx);
 int smp_cpu_destroy(uint16_t idx);
 int smp_cpu_setup(uint16_t idx, struct psw psw);
diff --git a/s390x/smp.c b/s390x/smp.c
index 50811bd0..11c2c673 100644
--- a/s390x/smp.c
+++ b/s390x/smp.c
@@ -76,14 +76,8 @@  static void test_restart(void)
 
 static void test_stop(void)
 {
-	smp_cpu_stop(1);
-	/*
-	 * The smp library waits for the CPU to shut down, but let's
-	 * also do it here, so we don't rely on the library
-	 * implementation
-	 */
-	while (!smp_cpu_stopped(1)) {}
-	report_pass("stop");
+	smp_cpu_stop_nowait(1);
+	report(smp_cpu_stopped(1), "stop");
 }
 
 static void test_stop_store_status(void)