diff mbox

[v6,1/4] arm: introduce psci_smp_ops

Message ID 1365167495-18508-1-git-send-email-stefano.stabellini@eu.citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Stefano Stabellini April 5, 2013, 1:11 p.m. UTC
Rename virt_smp_ops to psci_smp_ops and move them to arch/arm/kernel/psci_smp.c.
Remove mach-virt/platsmp.c, now unused.
Compile psci_smp if CONFIG_ARM_PSCI and CONFIG_SMP.

Add a cpu_die smp_op based on psci_ops.cpu_off.

Initialize PSCI before setting smp_ops in setup_arch.
Use psci_smp_ops if the platform doesn't provide its own smp_ops.

Changes in v6:
- fixed return values for psci_smp_available and psci_init ifndef
CONFIG_ARM_PSCI.

Changes in v5:
- document psci_operations;
- psci_init returns NULL.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Rob Herring <rob.herring@calxeda.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>

CC: will.deacon@arm.com
CC: arnd@arndb.de
CC: marc.zyngier@arm.com
CC: linux@arm.linux.org.uk
CC: nico@linaro.org
---
 arch/arm/include/asm/psci.h  |   29 ++++++++++++++++++
 arch/arm/kernel/Makefile     |    5 ++-
 arch/arm/kernel/psci.c       |    7 ++--
 arch/arm/kernel/psci_smp.c   |   67 ++++++++++++++++++++++++++++++++++++++++++
 arch/arm/kernel/setup.c      |    7 ++++-
 arch/arm/mach-virt/Makefile  |    1 -
 arch/arm/mach-virt/platsmp.c |   58 ------------------------------------
 arch/arm/mach-virt/virt.c    |    3 --
 8 files changed, 109 insertions(+), 68 deletions(-)
 create mode 100644 arch/arm/kernel/psci_smp.c
 delete mode 100644 arch/arm/mach-virt/platsmp.c

Comments

Russell King - ARM Linux April 18, 2013, 4:13 p.m. UTC | #1
On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> +	psci_init();
>  #ifdef CONFIG_SMP
>  	if (is_smp()) {
> -		smp_set_ops(mdesc->smp);
> +		if (mdesc->smp)
> +			smp_set_ops(mdesc->smp);
> +		else if (psci_smp_available())
> +			smp_set_ops(&psci_smp_ops);

So, I have a vague recollection that the ordering of the above got discussed
but I can't find it amongst the 21k of messages so far this year.

The above looks weird to me.  Surely this should be:

		if (psci_smp_available())
			smp_set_ops(&psci_smp_ops);
		else if (mdesc->smp)
			smp_set_ops(mdesc->ops);

This means that if PSCI is available, and provides a set of operations,
we override whatever the platform has statically provided.

Remember, we're trying to move away from using "mdesc"s for platform
stuff, relying on things like DT and such like.  We really should not
be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.

Now, if the psci stuff can't be relied upon to provide the correct
functionality, then that's a separate problem which needs addressing
differently.

This should allow the Xen problem to be resolved, because Xen will
provide the PSCI operations, and it's correct in that case to override
the platform's SMP operations.
Stefano Stabellini April 18, 2013, 4:20 p.m. UTC | #2
On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> > +	psci_init();
> >  #ifdef CONFIG_SMP
> >  	if (is_smp()) {
> > -		smp_set_ops(mdesc->smp);
> > +		if (mdesc->smp)
> > +			smp_set_ops(mdesc->smp);
> > +		else if (psci_smp_available())
> > +			smp_set_ops(&psci_smp_ops);
> 
> So, I have a vague recollection that the ordering of the above got discussed
> but I can't find it amongst the 21k of messages so far this year.
> 
> The above looks weird to me.  Surely this should be:
> 
> 		if (psci_smp_available())
> 			smp_set_ops(&psci_smp_ops);
> 		else if (mdesc->smp)
> 			smp_set_ops(mdesc->ops);
> 
> This means that if PSCI is available, and provides a set of operations,
> we override whatever the platform has statically provided.
> 
> Remember, we're trying to move away from using "mdesc"s for platform
> stuff, relying on things like DT and such like.  We really should not
> be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.

That's correct, in fact if you look at the next patch you'll see that it
changes the order.

I introduced the mechanism first and changed the priority later - it
should help bisectability.
I can fold the two patches into one if you prefer.


> Now, if the psci stuff can't be relied upon to provide the correct
> functionality, then that's a separate problem which needs addressing
> differently.
> 
> This should allow the Xen problem to be resolved, because Xen will
> provide the PSCI operations, and it's correct in that case to override
> the platform's SMP operations.

Yes, increasing the priority of PSCI helps Xen a lot.
In order to completely solve the issue for Xen though, another patch is
needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
of the introduction of smp_init.
Nicolas Pitre April 18, 2013, 4:35 p.m. UTC | #3
On Thu, 18 Apr 2013, Stefano Stabellini wrote:

> On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> > > +	psci_init();
> > >  #ifdef CONFIG_SMP
> > >  	if (is_smp()) {
> > > -		smp_set_ops(mdesc->smp);
> > > +		if (mdesc->smp)
> > > +			smp_set_ops(mdesc->smp);
> > > +		else if (psci_smp_available())
> > > +			smp_set_ops(&psci_smp_ops);
> > 
> > So, I have a vague recollection that the ordering of the above got discussed
> > but I can't find it amongst the 21k of messages so far this year.
> > 
> > The above looks weird to me.  Surely this should be:
> > 
> > 		if (psci_smp_available())
> > 			smp_set_ops(&psci_smp_ops);
> > 		else if (mdesc->smp)
> > 			smp_set_ops(mdesc->ops);
> > 
> > This means that if PSCI is available, and provides a set of operations,
> > we override whatever the platform has statically provided.
> > 
> > Remember, we're trying to move away from using "mdesc"s for platform
> > stuff, relying on things like DT and such like.  We really should not
> > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.
> 
> That's correct, in fact if you look at the next patch you'll see that it
> changes the order.
> 
> I introduced the mechanism first and changed the priority later - it
> should help bisectability.
> I can fold the two patches into one if you prefer.

Please let's keep the order as we discussed.  Otherwise this is just too 
confusing (Russell's comment is a good example of that).


Nicolas
Nicolas Pitre April 18, 2013, 4:40 p.m. UTC | #4
On Thu, 18 Apr 2013, Stefano Stabellini wrote:

> On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > This should allow the Xen problem to be resolved, because Xen will
> > provide the PSCI operations, and it's correct in that case to override
> > the platform's SMP operations.
> 
> Yes, increasing the priority of PSCI helps Xen a lot.
> In order to completely solve the issue for Xen though, another patch is
> needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> of the introduction of smp_init.

Please look at the latest smp_init patch version I sent to you.  It 
shouldn't conflict with Xen any longer.  It now returns a bool result 
depending on whether it did set up smp_ops or not.


Nicolas
Stefano Stabellini April 18, 2013, 4:49 p.m. UTC | #5
On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> 
> > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> > > > +	psci_init();
> > > >  #ifdef CONFIG_SMP
> > > >  	if (is_smp()) {
> > > > -		smp_set_ops(mdesc->smp);
> > > > +		if (mdesc->smp)
> > > > +			smp_set_ops(mdesc->smp);
> > > > +		else if (psci_smp_available())
> > > > +			smp_set_ops(&psci_smp_ops);
> > > 
> > > So, I have a vague recollection that the ordering of the above got discussed
> > > but I can't find it amongst the 21k of messages so far this year.
> > > 
> > > The above looks weird to me.  Surely this should be:
> > > 
> > > 		if (psci_smp_available())
> > > 			smp_set_ops(&psci_smp_ops);
> > > 		else if (mdesc->smp)
> > > 			smp_set_ops(mdesc->ops);
> > > 
> > > This means that if PSCI is available, and provides a set of operations,
> > > we override whatever the platform has statically provided.
> > > 
> > > Remember, we're trying to move away from using "mdesc"s for platform
> > > stuff, relying on things like DT and such like.  We really should not
> > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.
> > 
> > That's correct, in fact if you look at the next patch you'll see that it
> > changes the order.
> > 
> > I introduced the mechanism first and changed the priority later - it
> > should help bisectability.
> > I can fold the two patches into one if you prefer.
> 
> Please let's keep the order as we discussed.  Otherwise this is just too 
> confusing (Russell's comment is a good example of that).

You are right, it is confusing.
By "keep the order as we discussed", do you mean merge the second patch
into the first one, correct?
Stefano Stabellini April 18, 2013, 5:01 p.m. UTC | #6
On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> 
> > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > This should allow the Xen problem to be resolved, because Xen will
> > > provide the PSCI operations, and it's correct in that case to override
> > > the platform's SMP operations.
> > 
> > Yes, increasing the priority of PSCI helps Xen a lot.
> > In order to completely solve the issue for Xen though, another patch is
> > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > of the introduction of smp_init.
> 
> Please look at the latest smp_init patch version I sent to you.  It 
> shouldn't conflict with Xen any longer.  It now returns a bool result 
> depending on whether it did set up smp_ops or not.

CPUs are virtualized by Xen and do not reflect or expose the underlying
SMP hardware and firmware features, so an hardware specific smp_init
cannot run.

So the smp_init patch still breaks Xen because even if smp_init can fail
graciously, executing a platform specific smp_init function that tries
to access registers and memory regions that are not present is going to
cause an undefined behaviour.

Under Xen, smp_init won't have a chance to return any value because it
is going to crash first.
Nicolas Pitre April 18, 2013, 5:38 p.m. UTC | #7
On Thu, 18 Apr 2013, Stefano Stabellini wrote:

> On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > 
> > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > > This should allow the Xen problem to be resolved, because Xen will
> > > > provide the PSCI operations, and it's correct in that case to override
> > > > the platform's SMP operations.
> > > 
> > > Yes, increasing the priority of PSCI helps Xen a lot.
> > > In order to completely solve the issue for Xen though, another patch is
> > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > > of the introduction of smp_init.
> > 
> > Please look at the latest smp_init patch version I sent to you.  It 
> > shouldn't conflict with Xen any longer.  It now returns a bool result 
> > depending on whether it did set up smp_ops or not.
> 
> CPUs are virtualized by Xen and do not reflect or expose the underlying
> SMP hardware and firmware features, so an hardware specific smp_init
> cannot run.
> 
> So the smp_init patch still breaks Xen because even if smp_init can fail
> graciously, executing a platform specific smp_init function that tries
> to access registers and memory regions that are not present is going to
> cause an undefined behaviour.

It won't access hardware but just look into the DT and return false if 
nothing interesting is found.  At which point the next attempt in the 
priority list is PSCI by default.


Nicolas
Nicolas Pitre April 18, 2013, 5:40 p.m. UTC | #8
On Thu, 18 Apr 2013, Stefano Stabellini wrote:

> On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > 
> > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> > > > > +	psci_init();
> > > > >  #ifdef CONFIG_SMP
> > > > >  	if (is_smp()) {
> > > > > -		smp_set_ops(mdesc->smp);
> > > > > +		if (mdesc->smp)
> > > > > +			smp_set_ops(mdesc->smp);
> > > > > +		else if (psci_smp_available())
> > > > > +			smp_set_ops(&psci_smp_ops);
> > > > 
> > > > So, I have a vague recollection that the ordering of the above got discussed
> > > > but I can't find it amongst the 21k of messages so far this year.
> > > > 
> > > > The above looks weird to me.  Surely this should be:
> > > > 
> > > > 		if (psci_smp_available())
> > > > 			smp_set_ops(&psci_smp_ops);
> > > > 		else if (mdesc->smp)
> > > > 			smp_set_ops(mdesc->ops);
> > > > 
> > > > This means that if PSCI is available, and provides a set of operations,
> > > > we override whatever the platform has statically provided.
> > > > 
> > > > Remember, we're trying to move away from using "mdesc"s for platform
> > > > stuff, relying on things like DT and such like.  We really should not
> > > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.
> > > 
> > > That's correct, in fact if you look at the next patch you'll see that it
> > > changes the order.
> > > 
> > > I introduced the mechanism first and changed the priority later - it
> > > should help bisectability.
> > > I can fold the two patches into one if you prefer.
> > 
> > Please let's keep the order as we discussed.  Otherwise this is just too 
> > confusing (Russell's comment is a good example of that).
> 
> You are right, it is confusing.
> By "keep the order as we discussed", do you mean merge the second patch
> into the first one, correct?

Correct.  And merge the commit log message too.


Nicolas
Stefano Stabellini April 19, 2013, 9:40 a.m. UTC | #9
On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> 
> > On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > > 
> > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > > > This should allow the Xen problem to be resolved, because Xen will
> > > > > provide the PSCI operations, and it's correct in that case to override
> > > > > the platform's SMP operations.
> > > > 
> > > > Yes, increasing the priority of PSCI helps Xen a lot.
> > > > In order to completely solve the issue for Xen though, another patch is
> > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > > > of the introduction of smp_init.
> > > 
> > > Please look at the latest smp_init patch version I sent to you.  It 
> > > shouldn't conflict with Xen any longer.  It now returns a bool result 
> > > depending on whether it did set up smp_ops or not.
> > 
> > CPUs are virtualized by Xen and do not reflect or expose the underlying
> > SMP hardware and firmware features, so an hardware specific smp_init
> > cannot run.
> > 
> > So the smp_init patch still breaks Xen because even if smp_init can fail
> > graciously, executing a platform specific smp_init function that tries
> > to access registers and memory regions that are not present is going to
> > cause an undefined behaviour.
> 
> It won't access hardware but just look into the DT and return false if 
> nothing interesting is found.  At which point the next attempt in the 
> priority list is PSCI by default.

OK, that should work.
I am going to drop "xen/arm: introduce xen_early_init, use PSCI on xen".

We'll have to be careful enforcing that future smp_init implementations
use DT rather than probing registers. Maybe I should add a comment to
the smp_init patch to clarify that?
Ian Campbell April 19, 2013, 9:52 a.m. UTC | #10
On Thu, 2013-04-18 at 18:38 +0100, Nicolas Pitre wrote:
> On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> 
> > On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > > 
> > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > > > This should allow the Xen problem to be resolved, because Xen will
> > > > > provide the PSCI operations, and it's correct in that case to override
> > > > > the platform's SMP operations.
> > > > 
> > > > Yes, increasing the priority of PSCI helps Xen a lot.
> > > > In order to completely solve the issue for Xen though, another patch is
> > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > > > of the introduction of smp_init.
> > > 
> > > Please look at the latest smp_init patch version I sent to you.  It 
> > > shouldn't conflict with Xen any longer.  It now returns a bool result 
> > > depending on whether it did set up smp_ops or not.
> > 
> > CPUs are virtualized by Xen and do not reflect or expose the underlying
> > SMP hardware and firmware features, so an hardware specific smp_init
> > cannot run.
> > 
> > So the smp_init patch still breaks Xen because even if smp_init can fail
> > graciously, executing a platform specific smp_init function that tries
> > to access registers and memory regions that are not present is going to
> > cause an undefined behaviour.
> 
> It won't access hardware but just look into the DT and return false if 
> nothing interesting is found.  At which point the next attempt in the 
> priority list is PSCI by default.

I think there might be some confusion about the semantics of smp_init,
since it is in mdesc I had interpreted it as a per-platform hook to
allow "magic" SMP setup, which I at least had assumed would (be
permitted to) involve hardware specific frobbing, including touching
platform specific devices etc.

Is that not the case?

Can we guarantee that this hook won't be used by hardware platforms to
e.g. probe NVRAM for SMP topology information or other activities which
touch hardware?

If it isn't hardware specific then does this hook really belong in
mdesc? Or if it is purely driven by DT can we not implement it in terms
of DT at the top level rather than abstracting via a hardware specific
hook?

Ian.
Nicolas Pitre April 19, 2013, 3:47 p.m. UTC | #11
On Fri, 19 Apr 2013, Stefano Stabellini wrote:

> On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > 
> > > On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > > > On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> > > > 
> > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > > > > This should allow the Xen problem to be resolved, because Xen will
> > > > > > provide the PSCI operations, and it's correct in that case to override
> > > > > > the platform's SMP operations.
> > > > > 
> > > > > Yes, increasing the priority of PSCI helps Xen a lot.
> > > > > In order to completely solve the issue for Xen though, another patch is
> > > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > > > > of the introduction of smp_init.
> > > > 
> > > > Please look at the latest smp_init patch version I sent to you.  It 
> > > > shouldn't conflict with Xen any longer.  It now returns a bool result 
> > > > depending on whether it did set up smp_ops or not.
> > > 
> > > CPUs are virtualized by Xen and do not reflect or expose the underlying
> > > SMP hardware and firmware features, so an hardware specific smp_init
> > > cannot run.
> > > 
> > > So the smp_init patch still breaks Xen because even if smp_init can fail
> > > graciously, executing a platform specific smp_init function that tries
> > > to access registers and memory regions that are not present is going to
> > > cause an undefined behaviour.
> > 
> > It won't access hardware but just look into the DT and return false if 
> > nothing interesting is found.  At which point the next attempt in the 
> > priority list is PSCI by default.
> 
> OK, that should work.
> I am going to drop "xen/arm: introduce xen_early_init, use PSCI on xen".
> 
> We'll have to be careful enforcing that future smp_init implementations
> use DT rather than probing registers. Maybe I should add a comment to
> the smp_init patch to clarify that?

No one should be probing registers without making sure it is safe to do 
so.  Even on non virtualized hardware this can be a dangerous thing to 
do.


Nicolas
Ian Campbell April 19, 2013, 4:16 p.m. UTC | #12
On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote:
> No one should be probing registers without making sure it is safe to do 
> so.  Even on non virtualized hardware this can be a dangerous thing to 
> do.

Won't people writing per machine code consider, not unreasonably, that
having been called through a mdesc machine specific hook constitutes
enough "making sure" that they think it is safe to touch registers which
are specific to that machine?

Ian.
Nicolas Pitre April 19, 2013, 4:33 p.m. UTC | #13
On Fri, 19 Apr 2013, Ian Campbell wrote:

> On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote:
> > No one should be probing registers without making sure it is safe to do 
> > so.  Even on non virtualized hardware this can be a dangerous thing to 
> > do.
> 
> Won't people writing per machine code consider, not unreasonably, that
> having been called through a mdesc machine specific hook constitutes
> enough "making sure" that they think it is safe to touch registers which
> are specific to that machine?

Remember that this hook was introduced to decide at run time between 
different possibilities for SMP ops on the _same_ machine configuration.  
That machine shouldn't do things which is not permitted on either 
possible alternatives already.  So far the actual usage of that hook 
only looks in the DT to make a decision.  But even if it were to touch 
the hardware, that means it has to be safe to do so on all the possible 
hardware variations this mdesc is associated to.

And if Xen wants to emulate one of those hardware alternatives, then it 
better be ready to emulate it properly, or manage for _another_ mdesc to 
be selected instead.  That's why mach-virt was introduced.

So in my opinion there is just no issue here.


Nicolas
Stefano Stabellini April 19, 2013, 5:06 p.m. UTC | #14
On Fri, 19 Apr 2013, Nicolas Pitre wrote:
> On Fri, 19 Apr 2013, Ian Campbell wrote:
> 
> > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote:
> > > No one should be probing registers without making sure it is safe to do 
> > > so.  Even on non virtualized hardware this can be a dangerous thing to 
> > > do.
> > 
> > Won't people writing per machine code consider, not unreasonably, that
> > having been called through a mdesc machine specific hook constitutes
> > enough "making sure" that they think it is safe to touch registers which
> > are specific to that machine?
> 
> Remember that this hook was introduced to decide at run time between 
> different possibilities for SMP ops on the _same_ machine configuration.  
> That machine shouldn't do things which is not permitted on either 
> possible alternatives already.  So far the actual usage of that hook 
> only looks in the DT to make a decision.  But even if it were to touch 
> the hardware, that means it has to be safe to do so on all the possible 
> hardware variations this mdesc is associated to.

This last sentence is the key.

It is not guaranteed that being safe on "all the possible hardware
variations this mdesc is associated to" and being safe on Xen are the
same thing.

What if the same magic configuration register exists on all the possible
variations of the platform?
I understand that it is unlikely but I think it is a possibility.

If we don't take further precautions I'll be forced to regularly scan
the list for smp_init implementations and look for possible breakages,
pointing the unaware author of the patches to this conversation.
Honestly it is not my idea of fun.


> And if Xen wants to emulate one of those hardware alternatives, then it 
> better be ready to emulate it properly, or manage for _another_ mdesc to 
> be selected instead.  That's why mach-virt was introduced.

I don't mean to argue pointlessly here, but I thought that it was clear
that Xen does not emulate any hardware interfaces at all.

Of course when a convenient virtualization aware hardware interface is
available (the vGIC for example) Xen is going to use it, but other than
that no emulation is going to be done.


> So in my opinion there is just no issue here.

Not now, and maybe not even in the near future.
But I feel that we are taking an error prone approach.
Russell King - ARM Linux April 22, 2013, 2:06 p.m. UTC | #15
On Thu, Apr 18, 2013 at 05:20:23PM +0100, Stefano Stabellini wrote:
> On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > Remember, we're trying to move away from using "mdesc"s for platform
> > stuff, relying on things like DT and such like.  We really should not
> > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.
> 
> That's correct, in fact if you look at the next patch you'll see that it
> changes the order.

You may have noticed that I've been catching up with email, and it's
exceedingly difficult to track what patches are obsolete and have been
overridden by new versions.

> I introduced the mechanism first and changed the priority later - it
> should help bisectability.
> I can fold the two patches into one if you prefer.

On the face of it, I think that would be better.  I don't remember what
your last version looks like though.

> > Now, if the psci stuff can't be relied upon to provide the correct
> > functionality, then that's a separate problem which needs addressing
> > differently.
> > 
> > This should allow the Xen problem to be resolved, because Xen will
> > provide the PSCI operations, and it's correct in that case to override
> > the platform's SMP operations.
> 
> Yes, increasing the priority of PSCI helps Xen a lot.
> In order to completely solve the issue for Xen though, another patch is
> needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> of the introduction of smp_init.

Given the way the code in setup.c will work:

+static bool __init xen_smp_init(void)
+{
+#ifdef CONFIG_SMP
+	/* If we are running on Xen, use PSCI if available.
+	 * In any case do not try to use the native smp_ops. */
+	if (psci_smp_available())
+		smp_set_ops(&psci_smp_ops);
+#endif
+	return true;
+}

Doesn't this just need to return false, and then we'll drop down to
using PSCI if those operations are available?
Russell King - ARM Linux April 22, 2013, 2:20 p.m. UTC | #16
On Thu, Apr 18, 2013 at 12:35:57PM -0400, Nicolas Pitre wrote:
> On Thu, 18 Apr 2013, Stefano Stabellini wrote:
> 
> > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote:
> > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote:
> > > > +	psci_init();
> > > >  #ifdef CONFIG_SMP
> > > >  	if (is_smp()) {
> > > > -		smp_set_ops(mdesc->smp);
> > > > +		if (mdesc->smp)
> > > > +			smp_set_ops(mdesc->smp);
> > > > +		else if (psci_smp_available())
> > > > +			smp_set_ops(&psci_smp_ops);
> > > 
> > > So, I have a vague recollection that the ordering of the above got discussed
> > > but I can't find it amongst the 21k of messages so far this year.
> > > 
> > > The above looks weird to me.  Surely this should be:
> > > 
> > > 		if (psci_smp_available())
> > > 			smp_set_ops(&psci_smp_ops);
> > > 		else if (mdesc->smp)
> > > 			smp_set_ops(mdesc->ops);
> > > 
> > > This means that if PSCI is available, and provides a set of operations,
> > > we override whatever the platform has statically provided.
> > > 
> > > Remember, we're trying to move away from using "mdesc"s for platform
> > > stuff, relying on things like DT and such like.  We really should not
> > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc.
> > 
> > That's correct, in fact if you look at the next patch you'll see that it
> > changes the order.
> > 
> > I introduced the mechanism first and changed the priority later - it
> > should help bisectability.
> > I can fold the two patches into one if you prefer.
> 
> Please let's keep the order as we discussed.  Otherwise this is just too 
> confusing (Russell's comment is a good example of that).

My comment is based on the code which I had read and quoted.  Things
had moved on from that point, but, because I had 2000 odd messages to
get through, there was no way to know at that point that there had been
further discussion.

This is why catching up is such a problem - not only the amount of time
it takes (I'm _still_ reading messages - I've only just read your reply
above, so I'm still back in Thursday's email...)

I don't think there's any confusion though - the order I suggested seems
to be the order which I recall later patches adopted.
Russell King - ARM Linux April 22, 2013, 2:23 p.m. UTC | #17
On Thu, Apr 18, 2013 at 05:49:21PM +0100, Stefano Stabellini wrote:
> On Thu, 18 Apr 2013, Nicolas Pitre wrote:
> > Please let's keep the order as we discussed.  Otherwise this is just too 
> > confusing (Russell's comment is a good example of that).
> 
> You are right, it is confusing.
> By "keep the order as we discussed", do you mean merge the second patch
> into the first one, correct?

Well, the solution to this kind of confusion is to ignore comments on the
older patches and post a new set of patches and ask for all relevant
comments to be against that set and that set only.
Ian Campbell April 22, 2013, 3:21 p.m. UTC | #18
On Fri, 2013-04-19 at 18:06 +0100, Stefano Stabellini wrote:
> On Fri, 19 Apr 2013, Nicolas Pitre wrote:
> > On Fri, 19 Apr 2013, Ian Campbell wrote:
> > 
> > > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote:
> > > > No one should be probing registers without making sure it is safe to do 
> > > > so.  Even on non virtualized hardware this can be a dangerous thing to 
> > > > do.
> > > 
> > > Won't people writing per machine code consider, not unreasonably, that
> > > having been called through a mdesc machine specific hook constitutes
> > > enough "making sure" that they think it is safe to touch registers which
> > > are specific to that machine?
> > 
> > Remember that this hook was introduced to decide at run time between 
> > different possibilities for SMP ops on the _same_ machine configuration.  
> > That machine shouldn't do things which is not permitted on either 
> > possible alternatives already.  So far the actual usage of that hook 
> > only looks in the DT to make a decision.  But even if it were to touch 
> > the hardware, that means it has to be safe to do so on all the possible 
> > hardware variations this mdesc is associated to.
> 
> This last sentence is the key.
> 
> It is not guaranteed that being safe on "all the possible hardware
> variations this mdesc is associated to" and being safe on Xen are the
> same thing.
> 
> What if the same magic configuration register exists on all the possible
> variations of the platform?
> I understand that it is unlikely but I think it is a possibility.

I think it is actually quite likely for such configuration registers,
nvram etc to exist on all variants of a platform and not all that
unlikely that smp_init might want to consult them.

> If we don't take further precautions I'll be forced to regularly scan
> the list for smp_init implementations and look for possible breakages,
> pointing the unaware author of the patches to this conversation.
> Honestly it is not my idea of fun.
> 
> 
> > And if Xen wants to emulate one of those hardware alternatives, then it 
> > better be ready to emulate it properly, or manage for _another_ mdesc to 
> > be selected instead.  That's why mach-virt was introduced.
> 
> I don't mean to argue pointlessly here, but I thought that it was clear
> that Xen does not emulate any hardware interfaces at all.
> 
> Of course when a convenient virtualization aware hardware interface is
> available (the vGIC for example) Xen is going to use it, but other than
> that no emulation is going to be done.

I think the key thing that non-Xen folks aren't aware of is that
although Xen passes the majority of the underlying hardware to the dom0
kernel to manage one of the few things it does not expose to guests
(where "guests" includes dom0) is the SMP topology of the underlying system.

The physical CPUs are managed/scheduled/etc by the hypervisor and it is
the hypervisor which will need to be aware of big.LITTLE and MCPM etc.

The guests, and again this includes dom0, see only a set of VCPUs which
are scheduled amongst the PCPUs by the hypervisor. It is not unusual to
configure dom0 to have less VCPUs than there are PCPUs in the system.
Currently the VCPUs are all considered to be uniform i.e. there is no
big.LITTLE exposed to guests, although the hypervisor will most likely
eventually support it and so may be scheduling the VCPUs among the big
and LITTLE PCPUs according to $THINGS.

I don't think we want to go the route of mandating a 1:1 PCPU:VCPU
relationship for dom0 for Xen on ARM.

Ian.
Nicolas Pitre April 22, 2013, 4:07 p.m. UTC | #19
On Mon, 22 Apr 2013, Ian Campbell wrote:

> On Fri, 2013-04-19 at 18:06 +0100, Stefano Stabellini wrote:
> > On Fri, 19 Apr 2013, Nicolas Pitre wrote:
> > > On Fri, 19 Apr 2013, Ian Campbell wrote:
> > > 
> > > > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote:
> > > > > No one should be probing registers without making sure it is safe to do 
> > > > > so.  Even on non virtualized hardware this can be a dangerous thing to 
> > > > > do.
> > > > 
> > > > Won't people writing per machine code consider, not unreasonably, that
> > > > having been called through a mdesc machine specific hook constitutes
> > > > enough "making sure" that they think it is safe to touch registers which
> > > > are specific to that machine?
> > > 
> > > Remember that this hook was introduced to decide at run time between 
> > > different possibilities for SMP ops on the _same_ machine configuration.  
> > > That machine shouldn't do things which is not permitted on either 
> > > possible alternatives already.  So far the actual usage of that hook 
> > > only looks in the DT to make a decision.  But even if it were to touch 
> > > the hardware, that means it has to be safe to do so on all the possible 
> > > hardware variations this mdesc is associated to.
> > 
> > This last sentence is the key.
> > 
> > It is not guaranteed that being safe on "all the possible hardware
> > variations this mdesc is associated to" and being safe on Xen are the
> > same thing.
> > 
> > What if the same magic configuration register exists on all the possible
> > variations of the platform?
> > I understand that it is unlikely but I think it is a possibility.
> 
> I think it is actually quite likely for such configuration registers,
> nvram etc to exist on all variants of a platform and not all that
> unlikely that smp_init might want to consult them.

In which case the rest of the kernel is also going to consult those 
registers for other purposes.

Again, what's the problem please?

> I think the key thing that non-Xen folks aren't aware of is that
> although Xen passes the majority of the underlying hardware to the dom0
> kernel to manage one of the few things it does not expose to guests
> (where "guests" includes dom0) is the SMP topology of the underlying system.

I suspect it is unlikely to pass on the DT information about the CCI 
either.  Which means that, in the case that started this thread, the 
smp_init function needed by MCPM will simply return false and the next 
priority in the list i.e. plain PSCI will be initialized instead.

I don't forsee the need to poke at the hardware directly within 
->smp_init, not since everything is moving to DT now.

Sure there are ways to screw up Xen support from within this hook, but 
that can be achieved in many other places.  Will Xen take over every 
possible hooks in the kernel to prevent that from happening?

So please let's not cry wolf needlessly.

> Currently the VCPUs are all considered to be uniform i.e. there is no
> big.LITTLE exposed to guests, although the hypervisor will most likely
> eventually support it and so may be scheduling the VCPUs among the big
> and LITTLE PCPUs according to $THINGS.

Indeed.  That's what I suggested earlier.


Nicolas
Stefano Stabellini April 24, 2013, 6:13 p.m. UTC | #20
On Mon, 22 Apr 2013, Nicolas Pitre wrote:
> > I think the key thing that non-Xen folks aren't aware of is that
> > although Xen passes the majority of the underlying hardware to the dom0
> > kernel to manage one of the few things it does not expose to guests
> > (where "guests" includes dom0) is the SMP topology of the underlying system.
> 
> I suspect it is unlikely to pass on the DT information about the CCI 
> either.  Which means that, in the case that started this thread, the 
> smp_init function needed by MCPM will simply return false and the next 
> priority in the list i.e. plain PSCI will be initialized instead.
> 
> I don't forsee the need to poke at the hardware directly within 
> ->smp_init, not since everything is moving to DT now.
> 
> Sure there are ways to screw up Xen support from within this hook, but 
> that can be achieved in many other places.  Will Xen take over every 
> possible hooks in the kernel to prevent that from happening?
> 
> So please let's not cry wolf needlessly.

OK, you convinced me :-)
I'll drop the patch.
Stefano Stabellini April 24, 2013, 6:25 p.m. UTC | #21
On Mon, 22 Apr 2013, Russell King - ARM Linux wrote:
> > > Now, if the psci stuff can't be relied upon to provide the correct
> > > functionality, then that's a separate problem which needs addressing
> > > differently.
> > > 
> > > This should allow the Xen problem to be resolved, because Xen will
> > > provide the PSCI operations, and it's correct in that case to override
> > > the platform's SMP operations.
> > 
> > Yes, increasing the priority of PSCI helps Xen a lot.
> > In order to completely solve the issue for Xen though, another patch is
> > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because
> > of the introduction of smp_init.
> 
> Given the way the code in setup.c will work:
> 
> +static bool __init xen_smp_init(void)
> +{
> +#ifdef CONFIG_SMP
> +	/* If we are running on Xen, use PSCI if available.
> +	 * In any case do not try to use the native smp_ops. */
> +	if (psci_smp_available())
> +		smp_set_ops(&psci_smp_ops);
> +#endif
> +	return true;
> +}
> 
> Doesn't this just need to return false, and then we'll drop down to
> using PSCI if those operations are available?

If we are running on Xen, we are sure that the only available mechanism
to boot secondary cpus is PSCI; there is no need to try anything else if
that one is not available. This is way xen_smp_init is returning true
unconditionally.

However I am going to drop the "xen/arm: introduce xen_early_init, use
PSCI on xen" patch linked above entirely, because I'll just assume that
if we are running on Xen smp_init is going to fail and return false. The
next option on the list is PSCI so in theory we won't have to do
anything special to make it work.
Ian Campbell April 25, 2013, 7:48 a.m. UTC | #22
On Mon, 2013-04-22 at 17:07 +0100, Nicolas Pitre wrote:
> Sure there are ways to screw up Xen support from within this hook, but 
> that can be achieved in many other places.  Will Xen take over every 
> possible hooks in the kernel to prevent that from happening?

In the majority of the other cases we would consider it a bug in Xen for
not exposing these bits of the underlying hardware correctly to dom0,
however the SMP stuff is a bit of a special case for the reasons I
explained earlier which is why we are worrying about it more.

> So please let's not cry wolf needlessly.

Yes, perhaps those worries unfounded at this stage and we should just
run with it and reference this discussion if it turns out to not work
out well in practice.

Ian.
diff mbox

Patch

diff --git a/arch/arm/include/asm/psci.h b/arch/arm/include/asm/psci.h
index ce0dbe7..fa44417 100644
--- a/arch/arm/include/asm/psci.h
+++ b/arch/arm/include/asm/psci.h
@@ -23,6 +23,26 @@  struct psci_power_state {
 	u8	affinity_level;
 };
 
+/*
+ * cpu_suspend   Suspend the execution on a CPU
+ * @state        we don't currently describe affinity levels, so just pass 0.
+ * @entry_point  the first instruction to be executed on return
+ * returns 0  success, < 0 on failure
+ *
+ * cpu_off       Power down a CPU
+ * @state        we don't currently describe affinity levels, so just pass 0.
+ * no return on successful call
+ *
+ * cpu_on        Power up a CPU
+ * @cpuid        cpuid of target CPU, as from MPIDR
+ * @entry_point  the first instruction to be executed on return
+ * returns 0  success, < 0 on failure
+ *
+ * migrate       Migrate the context to a different CPU
+ * @cpuid        cpuid of target CPU, as from MPIDR
+ * returns 0  success, < 0 on failure
+ *
+ */
 struct psci_operations {
 	int (*cpu_suspend)(struct psci_power_state state,
 			   unsigned long entry_point);
@@ -32,5 +52,14 @@  struct psci_operations {
 };
 
 extern struct psci_operations psci_ops;
+extern struct smp_operations psci_smp_ops;
+
+#ifdef CONFIG_ARM_PSCI
+void psci_init(void);
+bool psci_smp_available(void);
+#else
+static inline void psci_init(void) { }
+static inline bool psci_smp_available(void) { return false; }
+#endif
 
 #endif /* __ASM_ARM_PSCI_H */
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 5f3338e..dd9d90a 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -82,6 +82,9 @@  obj-$(CONFIG_DEBUG_LL)	+= debug.o
 obj-$(CONFIG_EARLY_PRINTK)	+= early_printk.o
 
 obj-$(CONFIG_ARM_VIRT_EXT)	+= hyp-stub.o
-obj-$(CONFIG_ARM_PSCI)		+= psci.o
+ifeq ($(CONFIG_ARM_PSCI),y)
+obj-y				+= psci.o
+obj-$(CONFIG_SMP)		+= psci_smp.o
+endif
 
 extra-y := $(head-y) vmlinux.lds
diff --git a/arch/arm/kernel/psci.c b/arch/arm/kernel/psci.c
index 3653164..4693188 100644
--- a/arch/arm/kernel/psci.c
+++ b/arch/arm/kernel/psci.c
@@ -158,7 +158,7 @@  static const struct of_device_id psci_of_match[] __initconst = {
 	{},
 };
 
-static int __init psci_init(void)
+void __init psci_init(void)
 {
 	struct device_node *np;
 	const char *method;
@@ -166,7 +166,7 @@  static int __init psci_init(void)
 
 	np = of_find_matching_node(NULL, psci_of_match);
 	if (!np)
-		return 0;
+		return;
 
 	pr_info("probing function IDs from device-tree\n");
 
@@ -206,6 +206,5 @@  static int __init psci_init(void)
 
 out_put_node:
 	of_node_put(np);
-	return 0;
+	return;
 }
-early_initcall(psci_init);
diff --git a/arch/arm/kernel/psci_smp.c b/arch/arm/kernel/psci_smp.c
new file mode 100644
index 0000000..66b0f77
--- /dev/null
+++ b/arch/arm/kernel/psci_smp.c
@@ -0,0 +1,67 @@ 
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/init.h>
+#include <linux/irqchip/arm-gic.h>
+#include <linux/smp.h>
+#include <linux/of.h>
+
+#include <asm/psci.h>
+#include <asm/smp_plat.h>
+
+extern void secondary_startup(void);
+
+static int __cpuinit psci_boot_secondary(unsigned int cpu,
+					 struct task_struct *idle)
+{
+	if (psci_ops.cpu_on)
+		return psci_ops.cpu_on(cpu_logical_map(cpu),
+				       __pa(secondary_startup));
+	return -ENODEV;
+}
+
+static void __cpuinit psci_secondary_init(unsigned int cpu)
+{
+	gic_secondary_init(0);
+}
+
+#ifdef CONFIG_HOTPLUG_CPU
+void __ref psci_cpu_die(unsigned int cpu)
+{
+       const struct psci_power_state ps = {
+               .type = PSCI_POWER_STATE_TYPE_POWER_DOWN,
+       };
+
+       if (psci_ops.cpu_off)
+               psci_ops.cpu_off(ps);
+
+       /* We should never return */
+       panic("psci: cpu %d failed to shutdown\n", cpu);
+}
+#else
+#define psci_cpu_die NULL
+#endif
+
+bool __init psci_smp_available(void)
+{
+	/* is cpu_on available at least? */
+	return (psci_ops.cpu_on != NULL);
+}
+
+struct smp_operations __initdata psci_smp_ops = {
+	.smp_secondary_init	= psci_secondary_init,
+	.smp_boot_secondary	= psci_boot_secondary,
+	.cpu_die		= psci_cpu_die,
+};
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 3f6cbb2..341efaa 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -36,6 +36,7 @@ 
 #include <asm/cputype.h>
 #include <asm/elf.h>
 #include <asm/procinfo.h>
+#include <asm/psci.h>
 #include <asm/sections.h>
 #include <asm/setup.h>
 #include <asm/smp_plat.h>
@@ -766,9 +767,13 @@  void __init setup_arch(char **cmdline_p)
 	unflatten_device_tree();
 
 	arm_dt_init_cpu_maps();
+	psci_init();
 #ifdef CONFIG_SMP
 	if (is_smp()) {
-		smp_set_ops(mdesc->smp);
+		if (mdesc->smp)
+			smp_set_ops(mdesc->smp);
+		else if (psci_smp_available())
+			smp_set_ops(&psci_smp_ops);
 		smp_init_cpus();
 	}
 #endif
diff --git a/arch/arm/mach-virt/Makefile b/arch/arm/mach-virt/Makefile
index 042afc1..7ddbfa6 100644
--- a/arch/arm/mach-virt/Makefile
+++ b/arch/arm/mach-virt/Makefile
@@ -3,4 +3,3 @@ 
 #
 
 obj-y					:= virt.o
-obj-$(CONFIG_SMP)			+= platsmp.o
diff --git a/arch/arm/mach-virt/platsmp.c b/arch/arm/mach-virt/platsmp.c
deleted file mode 100644
index 8badaab..0000000
--- a/arch/arm/mach-virt/platsmp.c
+++ /dev/null
@@ -1,58 +0,0 @@ 
-/*
- * Dummy Virtual Machine - does what it says on the tin.
- *
- * Copyright (C) 2012 ARM Ltd
- * Author: Will Deacon <will.deacon@arm.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <linux/init.h>
-#include <linux/smp.h>
-#include <linux/of.h>
-
-#include <linux/irqchip/arm-gic.h>
-
-#include <asm/psci.h>
-#include <asm/smp_plat.h>
-
-extern void secondary_startup(void);
-
-static void __init virt_smp_init_cpus(void)
-{
-}
-
-static void __init virt_smp_prepare_cpus(unsigned int max_cpus)
-{
-}
-
-static int __cpuinit virt_boot_secondary(unsigned int cpu,
-					 struct task_struct *idle)
-{
-	if (psci_ops.cpu_on)
-		return psci_ops.cpu_on(cpu_logical_map(cpu),
-				       __pa(secondary_startup));
-	return -ENODEV;
-}
-
-static void __cpuinit virt_secondary_init(unsigned int cpu)
-{
-	gic_secondary_init(0);
-}
-
-struct smp_operations __initdata virt_smp_ops = {
-	.smp_init_cpus		= virt_smp_init_cpus,
-	.smp_prepare_cpus	= virt_smp_prepare_cpus,
-	.smp_secondary_init	= virt_secondary_init,
-	.smp_boot_secondary	= virt_boot_secondary,
-};
diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c
index 528c05e..c417752 100644
--- a/arch/arm/mach-virt/virt.c
+++ b/arch/arm/mach-virt/virt.c
@@ -44,12 +44,9 @@  static const char *virt_dt_match[] = {
 	NULL
 };
 
-extern struct smp_operations virt_smp_ops;
-
 DT_MACHINE_START(VIRT, "Dummy Virtual Machine")
 	.init_irq	= irqchip_init,
 	.init_time	= virt_timer_init,
 	.init_machine	= virt_init,
-	.smp		= smp_ops(virt_smp_ops),
 	.dt_compat	= virt_dt_match,
 MACHINE_END