diff mbox

[2/2] ARM: tegra: moving the clock gating procedure to tegra_cpu_kill

Message ID 1355970519-28200-2-git-send-email-josephl@nvidia.com (mailing list archive)
State New, archived
Headers show

Commit Message

Joseph Lo Dec. 20, 2012, 2:28 a.m. UTC
The tegra_cpu_die was be executed by the CPU itslf. So the clock gating
procedure won't be executed after the CPU hardware shutdown code. Moving
the clock gating procedure to tegra_cpu_kill that will be run by another
CPU after the CPU died.

Signed-off-by: Joseph Lo <josephl@nvidia.com>
---
 arch/arm/mach-tegra/common.h  |    1 +
 arch/arm/mach-tegra/hotplug.c |   15 +++++++++++----
 arch/arm/mach-tegra/platsmp.c |    1 +
 3 files changed, 13 insertions(+), 4 deletions(-)

Comments

Stephen Warren Dec. 20, 2012, 5:01 p.m. UTC | #1
On 12/19/2012 07:28 PM, Joseph Lo wrote:
> The tegra_cpu_die was be executed by the CPU itslf. So the clock gating
> procedure won't be executed after the CPU hardware shutdown code. Moving
> the clock gating procedure to tegra_cpu_kill that will be run by another
> CPU after the CPU died.

Hmmm. I wonder if this is enough to make kexec-on-Tegra-with-SMP-enabled
work without explicitly hot-unplugging all the CPUs first... An
implementation of cpu_kill() was a major part of what was missing. I
thought an implementation of cpu_kill() would require a bunch of code
from cpu_die() too. Does this patch assume cpu_die() has executed first,
and only then cpu_kill() will work, or can cpu_kill() be used on its own?

> diff --git a/arch/arm/mach-tegra/hotplug.c b/arch/arm/mach-tegra/hotplug.c

> @@ -34,10 +45,6 @@ void __ref tegra_cpu_die(unsigned int cpu)
>  	/* Shut down the current CPU. */
>  	tegra_hotplug_shutdown();
>  
> -	/* Clock gate the CPU */
> -	tegra_wait_cpu_in_reset(cpu);
> -	tegra_disable_cpu_clock(cpu);

I believe that's the last use of the calculation performed right at the
start of this function:

        cpu = cpu_logical_map(cpu);

Can you remove that?

Also, Peter, can you review these 2 patches as well?
Joseph Lo Dec. 21, 2012, 9:58 a.m. UTC | #2
On Fri, 2012-12-21 at 01:01 +0800, Stephen Warren wrote:
> On 12/19/2012 07:28 PM, Joseph Lo wrote:
> > The tegra_cpu_die was be executed by the CPU itslf. So the clock gating
> > procedure won't be executed after the CPU hardware shutdown code. Moving
> > the clock gating procedure to tegra_cpu_kill that will be run by another
> > CPU after the CPU died.
> 
> Hmmm. I wonder if this is enough to make kexec-on-Tegra-with-SMP-enabled
> work without explicitly hot-unplugging all the CPUs first... An
> implementation of cpu_kill() was a major part of what was missing. I
> thought an implementation of cpu_kill() would require a bunch of code
> from cpu_die() too. Does this patch assume cpu_die() has executed first,
> and only then cpu_kill() will work, or can cpu_kill() be used on its own?
> 
Yes, the tegra_cpu_die must be run first then tegra_cpu_kill. Because
the tegra_cpu_die shutdown the CPU by itself, then we can clock gate the
CPU in tegra_cpu_kill. If the tegra_cpu_kill be run itself, the system
will be hung in wait_for_reset(CPU).

About the kexec issue, it's complicate. Your solution should be the
right solution. And we should implement the hardware shutdown code in
cpu_die not cpu_kill. I suspect this issue can be reproduced on all ARM
platform. Because the smp_send_stop can't really shutdown the secondary
CPU core if just call machine_shutdown without disable_nonboot_cpu. And
in platform_shutdown and platform_reboot case, the secondary CPU already
been offlined and killed before this function be called. Does any other
ARM SMP platform can run kexec without any issue?

(I will check if we can do some HW shutdown in tegra_cpu_kill. For ex,
just clock gate the CPU without wait_for_reset.)

Thanks,
Joseph
Stephen Warren Dec. 21, 2012, 9:23 p.m. UTC | #3
On 12/21/2012 02:58 AM, Joseph Lo wrote:
> On Fri, 2012-12-21 at 01:01 +0800, Stephen Warren wrote:
>> On 12/19/2012 07:28 PM, Joseph Lo wrote:
>>> The tegra_cpu_die was be executed by the CPU itslf. So the clock gating
>>> procedure won't be executed after the CPU hardware shutdown code. Moving
>>> the clock gating procedure to tegra_cpu_kill that will be run by another
>>> CPU after the CPU died.
>>
>> Hmmm. I wonder if this is enough to make kexec-on-Tegra-with-SMP-enabled
>> work without explicitly hot-unplugging all the CPUs first... An
>> implementation of cpu_kill() was a major part of what was missing. I
>> thought an implementation of cpu_kill() would require a bunch of code
>> from cpu_die() too. Does this patch assume cpu_die() has executed first,
>> and only then cpu_kill() will work, or can cpu_kill() be used on its own?
>>
> Yes, the tegra_cpu_die must be run first then tegra_cpu_kill. Because
> the tegra_cpu_die shutdown the CPU by itself, then we can clock gate the
> CPU in tegra_cpu_kill. If the tegra_cpu_kill be run itself, the system
> will be hung in wait_for_reset(CPU).
> 
> About the kexec issue, it's complicate. Your solution should be the
> right solution. And we should implement the hardware shutdown code in
> cpu_die not cpu_kill. I suspect this issue can be reproduced on all ARM
> platform. Because the smp_send_stop can't really shutdown the secondary
> CPU core if just call machine_shutdown without disable_nonboot_cpu. And
> in platform_shutdown and platform_reboot case, the secondary CPU already
> been offlined and killed before this function be called. Does any other
> ARM SMP platform can run kexec without any issue?
> 
> (I will check if we can do some HW shutdown in tegra_cpu_kill. For ex,
> just clock gate the CPU without wait_for_reset.)

Will, Joseph's thoughts above pretty much match mine re: kexec. In the
kexec thread I started earlier, I talked about replacing:

void machine_shutdown(void)
{
#ifdef CONFIG_SMP
	smp_send_stop();
#endif
}

with something like:

void machine_shutdown(void)
{
#ifdef CONFIG_HOTPLUG_CPU
 	disable_nonboot_cpus();
#elifdef CONFIG_SMP
	smp_send_stop();
#endif
}

and you'd responded "I think you're better off using what we currently
have and hanging your code off platform_cpu_kill.".

What exactly did you mean by "what we currently have"; did you mean that
cpu_kill() should work without cpu_die() having executed on the target
CPU itself first, so Tegra should simply implement cpu_kill()? As Joseph
says above, I'm not sure that will work. Any more detailed thoughts you
have here would be greatly appreciated. Thanks.
Will Deacon Dec. 23, 2012, 11:25 a.m. UTC | #4
On Fri, Dec 21, 2012 at 09:23:17PM +0000, Stephen Warren wrote:
> On 12/21/2012 02:58 AM, Joseph Lo wrote:
> > About the kexec issue, it's complicate. Your solution should be the
> > right solution. And we should implement the hardware shutdown code in
> > cpu_die not cpu_kill. I suspect this issue can be reproduced on all ARM
> > platform. Because the smp_send_stop can't really shutdown the secondary
> > CPU core if just call machine_shutdown without disable_nonboot_cpu. And
> > in platform_shutdown and platform_reboot case, the secondary CPU already
> > been offlined and killed before this function be called. Does any other
> > ARM SMP platform can run kexec without any issue?
> > 
> > (I will check if we can do some HW shutdown in tegra_cpu_kill. For ex,
> > just clock gate the CPU without wait_for_reset.)
> 
> Will, Joseph's thoughts above pretty much match mine re: kexec. In the
> kexec thread I started earlier, I talked about replacing:
> 
> void machine_shutdown(void)
> {
> #ifdef CONFIG_SMP
> 	smp_send_stop();
> #endif
> }
> 
> with something like:
> 
> void machine_shutdown(void)
> {
> #ifdef CONFIG_HOTPLUG_CPU
>  	disable_nonboot_cpus();
> #elifdef CONFIG_SMP
> 	smp_send_stop();
> #endif
> }
> 
> and you'd responded "I think you're better off using what we currently
> have and hanging your code off platform_cpu_kill.".

Sorry for being vague: I just meant that the code as it stands in mainline
is better than calling disable_nonboot_cpus(), because the latter seems to
require all the secondary cores to hit the idle loop, notice that they are
not online, and then call cpu_die. We really need this sequence to be
synchronous wrt the CPU initiating the kexec.

> What exactly did you mean by "what we currently have"; did you mean that
> cpu_kill() should work without cpu_die() having executed on the target
> CPU itself first, so Tegra should simply implement cpu_kill()? As Joseph
> says above, I'm not sure that will work. Any more detailed thoughts you
> have here would be greatly appreciated. Thanks.

I guess this comes down to the different between cpu_die/cpu_kill and
whether you actually need cpu_die if you're not planning on returning to
the same kernel that you left. If you really need to call cpu_die, then we
need to figure out a way to do that whilst ensuring that only one CPU is
left standing at the time we start copying the new kernel into place.

Will
Will Deacon Dec. 23, 2012, 12:12 p.m. UTC | #5
On Sun, Dec 23, 2012 at 11:25:40AM +0000, Will Deacon wrote:
> On Fri, Dec 21, 2012 at 09:23:17PM +0000, Stephen Warren wrote:
> > What exactly did you mean by "what we currently have"; did you mean that
> > cpu_kill() should work without cpu_die() having executed on the target
> > CPU itself first, so Tegra should simply implement cpu_kill()? As Joseph
> > says above, I'm not sure that will work. Any more detailed thoughts you
> > have here would be greatly appreciated. Thanks.
> 
> I guess this comes down to the different between cpu_die/cpu_kill and
> whether you actually need cpu_die if you're not planning on returning to
> the same kernel that you left. If you really need to call cpu_die, then we
> need to figure out a way to do that whilst ensuring that only one CPU is
> left standing at the time we start copying the new kernel into place.

I took a closer look at the code to try and figure out a way to solve
this...

First of all, disable_nonboot_cpus *does* actually wait for the secondaries
to get out of the way, because it calls __cpu_die and we end up waiting on a
completion. The problem is that we IPI the secondaries in smp_send_stop,
which puts then into a while(1) cpu_relax(); loop with interrupts disabled,
so that completion won't be able to finish (the cores won't pass through
idle and call cpu_die() themselves).

This means we must call disable_nonboot_cpus before sending the IPI. My
first attempt was to make a hotplug-aware smp_send_stop, using
disable_nonboot_cpus and smp_kill_cpus but this is also problematic because
it's called by panic, where the simpler IPI-based code is probably a lot
better.

The best bet is probably to call disable_nonboot_cpus in machine shutdown,
before sending the stop and subsequent kill. It's very similar to your
initial suggestion, but without the #elif.

Does that work for you?

Will
Stephen Warren Jan. 2, 2013, 9:08 p.m. UTC | #6
On 12/23/2012 05:12 AM, Will Deacon wrote:
... [kexec on ARM SMP discussion]
> The best bet is probably to call disable_nonboot_cpus in machine shutdown,
> before sending the stop and subsequent kill. It's very similar to your
> initial suggestion, but without the #elif.
> 
> Does that work for you?

Yes, it does. I just sent a patch implementing that. Thanks.
diff mbox

Patch

diff --git a/arch/arm/mach-tegra/common.h b/arch/arm/mach-tegra/common.h
index 02f71b4..32f8eb3 100644
--- a/arch/arm/mach-tegra/common.h
+++ b/arch/arm/mach-tegra/common.h
@@ -1,4 +1,5 @@ 
 extern struct smp_operations tegra_smp_ops;
 
+extern int tegra_cpu_kill(unsigned int cpu);
 extern void tegra_cpu_die(unsigned int cpu);
 extern int tegra_cpu_disable(unsigned int cpu);
diff --git a/arch/arm/mach-tegra/hotplug.c b/arch/arm/mach-tegra/hotplug.c
index d8c683b..b8f26c5 100644
--- a/arch/arm/mach-tegra/hotplug.c
+++ b/arch/arm/mach-tegra/hotplug.c
@@ -19,6 +19,17 @@ 
 
 static void (*tegra_hotplug_shutdown)(void);
 
+int tegra_cpu_kill(unsigned cpu)
+{
+	cpu = cpu_logical_map(cpu);
+
+	/* Clock gate the CPU */
+	tegra_wait_cpu_in_reset(cpu);
+	tegra_disable_cpu_clock(cpu);
+
+	return 1;
+}
+
 /*
  * platform-specific code to shutdown a CPU
  *
@@ -34,10 +45,6 @@  void __ref tegra_cpu_die(unsigned int cpu)
 	/* Shut down the current CPU. */
 	tegra_hotplug_shutdown();
 
-	/* Clock gate the CPU */
-	tegra_wait_cpu_in_reset(cpu);
-	tegra_disable_cpu_clock(cpu);
-
 	/* Should never return here. */
 	BUG();
 }
diff --git a/arch/arm/mach-tegra/platsmp.c b/arch/arm/mach-tegra/platsmp.c
index 1b926df..6cedb3e 100644
--- a/arch/arm/mach-tegra/platsmp.c
+++ b/arch/arm/mach-tegra/platsmp.c
@@ -175,6 +175,7 @@  struct smp_operations tegra_smp_ops __initdata = {
 	.smp_secondary_init	= tegra_secondary_init,
 	.smp_boot_secondary	= tegra_boot_secondary,
 #ifdef CONFIG_HOTPLUG_CPU
+	.cpu_kill		= tegra_cpu_kill,
 	.cpu_die		= tegra_cpu_die,
 	.cpu_disable		= tegra_cpu_disable,
 #endif