diff mbox series

[1/3] x86/ACPI: re-park previously parked CPUs upon resume from S3

Message ID 5D0386DC0200007800238470@prv1-mh.provo.novell.com (mailing list archive)
State New, archived
Headers show
Series x86: S3 resume adjustments | expand

Commit Message

Jan Beulich June 14, 2019, 11:37 a.m. UTC
Aiui when resuming from S3, CPUs come back out of RESET/INIT. Therefore
they need to undergo the same procedure as was added elsewhere by
commits d8f974f1a6 ("x86: command line option to avoid use of secondary
hyper-threads") and 8797d20a6e ("x86: possibly bring up all CPUs even
if not all are supposed to be used").

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Julien Grall June 14, 2019, 4:52 p.m. UTC | #1
Hi Jan,

The title and commit message are a bit odd to read because you are modifying 
common code but everything is geared towards x86.

On 14/06/2019 12:37, Jan Beulich wrote:
> Aiui when resuming from S3, CPUs come back out of RESET/INIT. Therefore
> they need to undergo the same procedure as was added elsewhere by
> commits d8f974f1a6 ("x86: command line option to avoid use of secondary
> hyper-threads") and 8797d20a6e ("x86: possibly bring up all CPUs even
> if not all are supposed to be used").
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/common/cpu.c
> +++ b/xen/common/cpu.c
> @@ -105,7 +105,7 @@ int cpu_down(unsigned int cpu)
>       if ( err )
>           goto fail;
>   
> -    if ( unlikely(system_state < SYS_STATE_active) )
> +    if ( system_state < SYS_STATE_active || system_state == SYS_STATE_resume )

So this change here is necessary because enable_nonboot_cpus() may call 
cpu_down(), am I right? If so, could you please mention it in the commit message?

>           on_selected_cpus(cpumask_of(cpu), _take_cpu_down, NULL, true);
>       else if ( (err = stop_machine_run(take_cpu_down, NULL, cpu)) < 0 )
>           goto fail;
> @@ -207,15 +207,19 @@ void enable_nonboot_cpus(void)
>   
>       printk("Enabling non-boot CPUs  ...\n");
>   
> -    for_each_cpu ( cpu, &frozen_cpus )
> +    for_each_present_cpu ( cpu )
>       {
> +        if ( park_offline_cpus ? cpu == smp_processor_id()

park_offline_cpus is x86 specific, so it will not build on Arm.

> +                               : !cpumask_test_cpu(cpu, &frozen_cpus) )
> +            continue;
>           if ( (error = cpu_up(cpu)) )
>           {
>               printk("Error bringing CPU%d up: %d\n", cpu, error);
>               BUG_ON(error == -EBUSY);
>           }
> -        else
> -            __cpumask_clear_cpu(cpu, &frozen_cpus);
> +        else if ( !__cpumask_test_and_clear_cpu(cpu, &frozen_cpus) &&
> +                  (error = cpu_down(cpu)) )
> +            printk("Error re-offlining CPU%d: %d\n", cpu, error);
>       }
>   
>       for_each_cpu ( cpu, &frozen_cpus )

Cheers,
Jan Beulich June 17, 2019, 6:40 a.m. UTC | #2
>>> On 14.06.19 at 18:52, <julien.grall@arm.com> wrote:
> The title and commit message are a bit odd to read because you are modifying 
> common code but everything is geared towards x86.

Indeed. There's no caller of {en,dis}able_nonboot_cpus() in Arm code
at present, afaics. Hence the code changed (but not the file) is truly
x86-specific at the moment. I've explicitly thought about the
"inconsistency" between title and contents, but I've deliberately put it
as is: The change _is_ x86 / ACPI only, _despite_ touching common
code (and hence needing a REST maintainer ack).

>> --- a/xen/common/cpu.c
>> +++ b/xen/common/cpu.c
>> @@ -105,7 +105,7 @@ int cpu_down(unsigned int cpu)
>>       if ( err )
>>           goto fail;
>>   
>> -    if ( unlikely(system_state < SYS_STATE_active) )
>> +    if ( system_state < SYS_STATE_active || system_state == SYS_STATE_resume )
> 
> So this change here is necessary because enable_nonboot_cpus() may call 
> cpu_down(), am I right?

Yes (albeit likely s/necessary/wanted/).

> If so, could you please mention it in the commit message?

Hmm, I could. But this is just paralleling what we're already doing for
the boot path, so it didn't seem imperative to me to call it out. But
anyway, I've added a sentence.

>> @@ -207,15 +207,19 @@ void enable_nonboot_cpus(void)
>>   
>>       printk("Enabling non-boot CPUs  ...\n");
>>   
>> -    for_each_cpu ( cpu, &frozen_cpus )
>> +    for_each_present_cpu ( cpu )
>>       {
>> +        if ( park_offline_cpus ? cpu == smp_processor_id()
> 
> park_offline_cpus is x86 specific, so it will not build on Arm.

And that's intentional, even more so that (as said above) Arm doesn't
call here in the first place. And even if it did - whether to do things the
"new" way would then still (intentionally) depend on whether Arm had
any way of park_offline_cpus being "true".

Jan
Julien Grall June 17, 2019, 8:12 a.m. UTC | #3
Hi Jan,

On 6/17/19 7:40 AM, Jan Beulich wrote:
>>>> On 14.06.19 at 18:52, <julien.grall@arm.com> wrote:
>> The title and commit message are a bit odd to read because you are modifying
>> common code but everything is geared towards x86.
> 
> Indeed. There's no caller of {en,dis}able_nonboot_cpus() in Arm code
> at present, afaics. Hence the code changed (but not the file) is truly
> x86-specific at the moment. I've explicitly thought about the
> "inconsistency" between title and contents, but I've deliberately put it
> as is: The change _is_ x86 / ACPI only, _despite_ touching common
> code (and hence needing a REST maintainer ack).

Bear in mind that I have nearly no knowledge of x86, so trying to write 
a commit message fully the x86 way is not going to help me understand 
why this makes sense for everyone (today or in the future).

>>> @@ -207,15 +207,19 @@ void enable_nonboot_cpus(void)
>>>    
>>>        printk("Enabling non-boot CPUs  ...\n");
>>>    
>>> -    for_each_cpu ( cpu, &frozen_cpus )
>>> +    for_each_present_cpu ( cpu )
>>>        {
>>> +        if ( park_offline_cpus ? cpu == smp_processor_id()
>>
>> park_offline_cpus is x86 specific, so it will not build on Arm.
> 
> And that's intentional, even more so that (as said above) Arm doesn't
> call here in the first place. 

Calling and building are two separate things... A function may be built 
even if it is not called.

> And even if it did - whether to do things the
> "new" way would then still (intentionally) depend on whether Arm had
> any way of park_offline_cpus being "true".

Looking again, we are defining park_offline_cpus to false on Arm (see
a6448adfd3 "xen/cpu: Fix ARM build following c/s 597fbb8"). So there are 
no build issue as I first thought.

Cheers,
Andrew Cooper Aug. 29, 2019, 1:37 p.m. UTC | #4
On 14/06/2019 12:37, Jan Beulich wrote:
> Aiui when resuming from S3, CPUs come back out of RESET/INIT.

From a programmers point of view, all APs are in Wait-for-SIPI after
resume, and this is confirmed by several of the BWG flowcharts.  This is
a logical consequence of the CPU losing all power, meaning that its
startup sequence will be the same whether it is booting from S5 or S3.

>  Therefore
> they need to undergo the same procedure as was added elsewhere by
> commits d8f974f1a6 ("x86: command line option to avoid use of secondary
> hyper-threads") and 8797d20a6e ("x86: possibly bring up all CPUs even
> if not all are supposed to be used").
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
diff mbox series

Patch

--- a/xen/common/cpu.c
+++ b/xen/common/cpu.c
@@ -105,7 +105,7 @@  int cpu_down(unsigned int cpu)
     if ( err )
         goto fail;
 
-    if ( unlikely(system_state < SYS_STATE_active) )
+    if ( system_state < SYS_STATE_active || system_state == SYS_STATE_resume )
         on_selected_cpus(cpumask_of(cpu), _take_cpu_down, NULL, true);
     else if ( (err = stop_machine_run(take_cpu_down, NULL, cpu)) < 0 )
         goto fail;
@@ -207,15 +207,19 @@  void enable_nonboot_cpus(void)
 
     printk("Enabling non-boot CPUs  ...\n");
 
-    for_each_cpu ( cpu, &frozen_cpus )
+    for_each_present_cpu ( cpu )
     {
+        if ( park_offline_cpus ? cpu == smp_processor_id()
+                               : !cpumask_test_cpu(cpu, &frozen_cpus) )
+            continue;
         if ( (error = cpu_up(cpu)) )
         {
             printk("Error bringing CPU%d up: %d\n", cpu, error);
             BUG_ON(error == -EBUSY);
         }
-        else
-            __cpumask_clear_cpu(cpu, &frozen_cpus);
+        else if ( !__cpumask_test_and_clear_cpu(cpu, &frozen_cpus) &&
+                  (error = cpu_down(cpu)) )
+            printk("Error re-offlining CPU%d: %d\n", cpu, error);
     }
 
     for_each_cpu ( cpu, &frozen_cpus )