Message ID | 1365167495-18508-1-git-send-email-stefano.stabellini@eu.citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > + psci_init(); > #ifdef CONFIG_SMP > if (is_smp()) { > - smp_set_ops(mdesc->smp); > + if (mdesc->smp) > + smp_set_ops(mdesc->smp); > + else if (psci_smp_available()) > + smp_set_ops(&psci_smp_ops); So, I have a vague recollection that the ordering of the above got discussed but I can't find it amongst the 21k of messages so far this year. The above looks weird to me. Surely this should be: if (psci_smp_available()) smp_set_ops(&psci_smp_ops); else if (mdesc->smp) smp_set_ops(mdesc->ops); This means that if PSCI is available, and provides a set of operations, we override whatever the platform has statically provided. Remember, we're trying to move away from using "mdesc"s for platform stuff, relying on things like DT and such like. We really should not be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. Now, if the psci stuff can't be relied upon to provide the correct functionality, then that's a separate problem which needs addressing differently. This should allow the Xen problem to be resolved, because Xen will provide the PSCI operations, and it's correct in that case to override the platform's SMP operations.
On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > > + psci_init(); > > #ifdef CONFIG_SMP > > if (is_smp()) { > > - smp_set_ops(mdesc->smp); > > + if (mdesc->smp) > > + smp_set_ops(mdesc->smp); > > + else if (psci_smp_available()) > > + smp_set_ops(&psci_smp_ops); > > So, I have a vague recollection that the ordering of the above got discussed > but I can't find it amongst the 21k of messages so far this year. > > The above looks weird to me. Surely this should be: > > if (psci_smp_available()) > smp_set_ops(&psci_smp_ops); > else if (mdesc->smp) > smp_set_ops(mdesc->ops); > > This means that if PSCI is available, and provides a set of operations, > we override whatever the platform has statically provided. > > Remember, we're trying to move away from using "mdesc"s for platform > stuff, relying on things like DT and such like. We really should not > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. That's correct, in fact if you look at the next patch you'll see that it changes the order. I introduced the mechanism first and changed the priority later - it should help bisectability. I can fold the two patches into one if you prefer. > Now, if the psci stuff can't be relied upon to provide the correct > functionality, then that's a separate problem which needs addressing > differently. > > This should allow the Xen problem to be resolved, because Xen will > provide the PSCI operations, and it's correct in that case to override > the platform's SMP operations. Yes, increasing the priority of PSCI helps Xen a lot. In order to completely solve the issue for Xen though, another patch is needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because of the introduction of smp_init.
On Thu, 18 Apr 2013, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > > > + psci_init(); > > > #ifdef CONFIG_SMP > > > if (is_smp()) { > > > - smp_set_ops(mdesc->smp); > > > + if (mdesc->smp) > > > + smp_set_ops(mdesc->smp); > > > + else if (psci_smp_available()) > > > + smp_set_ops(&psci_smp_ops); > > > > So, I have a vague recollection that the ordering of the above got discussed > > but I can't find it amongst the 21k of messages so far this year. > > > > The above looks weird to me. Surely this should be: > > > > if (psci_smp_available()) > > smp_set_ops(&psci_smp_ops); > > else if (mdesc->smp) > > smp_set_ops(mdesc->ops); > > > > This means that if PSCI is available, and provides a set of operations, > > we override whatever the platform has statically provided. > > > > Remember, we're trying to move away from using "mdesc"s for platform > > stuff, relying on things like DT and such like. We really should not > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. > > That's correct, in fact if you look at the next patch you'll see that it > changes the order. > > I introduced the mechanism first and changed the priority later - it > should help bisectability. > I can fold the two patches into one if you prefer. Please let's keep the order as we discussed. Otherwise this is just too confusing (Russell's comment is a good example of that). Nicolas
On Thu, 18 Apr 2013, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > This should allow the Xen problem to be resolved, because Xen will > > provide the PSCI operations, and it's correct in that case to override > > the platform's SMP operations. > > Yes, increasing the priority of PSCI helps Xen a lot. > In order to completely solve the issue for Xen though, another patch is > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > of the introduction of smp_init. Please look at the latest smp_init patch version I sent to you. It shouldn't conflict with Xen any longer. It now returns a bool result depending on whether it did set up smp_ops or not. Nicolas
On Thu, 18 Apr 2013, Nicolas Pitre wrote: > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > > > > + psci_init(); > > > > #ifdef CONFIG_SMP > > > > if (is_smp()) { > > > > - smp_set_ops(mdesc->smp); > > > > + if (mdesc->smp) > > > > + smp_set_ops(mdesc->smp); > > > > + else if (psci_smp_available()) > > > > + smp_set_ops(&psci_smp_ops); > > > > > > So, I have a vague recollection that the ordering of the above got discussed > > > but I can't find it amongst the 21k of messages so far this year. > > > > > > The above looks weird to me. Surely this should be: > > > > > > if (psci_smp_available()) > > > smp_set_ops(&psci_smp_ops); > > > else if (mdesc->smp) > > > smp_set_ops(mdesc->ops); > > > > > > This means that if PSCI is available, and provides a set of operations, > > > we override whatever the platform has statically provided. > > > > > > Remember, we're trying to move away from using "mdesc"s for platform > > > stuff, relying on things like DT and such like. We really should not > > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. > > > > That's correct, in fact if you look at the next patch you'll see that it > > changes the order. > > > > I introduced the mechanism first and changed the priority later - it > > should help bisectability. > > I can fold the two patches into one if you prefer. > > Please let's keep the order as we discussed. Otherwise this is just too > confusing (Russell's comment is a good example of that). You are right, it is confusing. By "keep the order as we discussed", do you mean merge the second patch into the first one, correct?
On Thu, 18 Apr 2013, Nicolas Pitre wrote: > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > This should allow the Xen problem to be resolved, because Xen will > > > provide the PSCI operations, and it's correct in that case to override > > > the platform's SMP operations. > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > In order to completely solve the issue for Xen though, another patch is > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > of the introduction of smp_init. > > Please look at the latest smp_init patch version I sent to you. It > shouldn't conflict with Xen any longer. It now returns a bool result > depending on whether it did set up smp_ops or not. CPUs are virtualized by Xen and do not reflect or expose the underlying SMP hardware and firmware features, so an hardware specific smp_init cannot run. So the smp_init patch still breaks Xen because even if smp_init can fail graciously, executing a platform specific smp_init function that tries to access registers and memory regions that are not present is going to cause an undefined behaviour. Under Xen, smp_init won't have a chance to return any value because it is going to crash first.
On Thu, 18 Apr 2013, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > > This should allow the Xen problem to be resolved, because Xen will > > > > provide the PSCI operations, and it's correct in that case to override > > > > the platform's SMP operations. > > > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > > In order to completely solve the issue for Xen though, another patch is > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > > of the introduction of smp_init. > > > > Please look at the latest smp_init patch version I sent to you. It > > shouldn't conflict with Xen any longer. It now returns a bool result > > depending on whether it did set up smp_ops or not. > > CPUs are virtualized by Xen and do not reflect or expose the underlying > SMP hardware and firmware features, so an hardware specific smp_init > cannot run. > > So the smp_init patch still breaks Xen because even if smp_init can fail > graciously, executing a platform specific smp_init function that tries > to access registers and memory regions that are not present is going to > cause an undefined behaviour. It won't access hardware but just look into the DT and return false if nothing interesting is found. At which point the next attempt in the priority list is PSCI by default. Nicolas
On Thu, 18 Apr 2013, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > > > > > + psci_init(); > > > > > #ifdef CONFIG_SMP > > > > > if (is_smp()) { > > > > > - smp_set_ops(mdesc->smp); > > > > > + if (mdesc->smp) > > > > > + smp_set_ops(mdesc->smp); > > > > > + else if (psci_smp_available()) > > > > > + smp_set_ops(&psci_smp_ops); > > > > > > > > So, I have a vague recollection that the ordering of the above got discussed > > > > but I can't find it amongst the 21k of messages so far this year. > > > > > > > > The above looks weird to me. Surely this should be: > > > > > > > > if (psci_smp_available()) > > > > smp_set_ops(&psci_smp_ops); > > > > else if (mdesc->smp) > > > > smp_set_ops(mdesc->ops); > > > > > > > > This means that if PSCI is available, and provides a set of operations, > > > > we override whatever the platform has statically provided. > > > > > > > > Remember, we're trying to move away from using "mdesc"s for platform > > > > stuff, relying on things like DT and such like. We really should not > > > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. > > > > > > That's correct, in fact if you look at the next patch you'll see that it > > > changes the order. > > > > > > I introduced the mechanism first and changed the priority later - it > > > should help bisectability. > > > I can fold the two patches into one if you prefer. > > > > Please let's keep the order as we discussed. Otherwise this is just too > > confusing (Russell's comment is a good example of that). > > You are right, it is confusing. > By "keep the order as we discussed", do you mean merge the second patch > into the first one, correct? Correct. And merge the commit log message too. Nicolas
On Thu, 18 Apr 2013, Nicolas Pitre wrote: > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > > > This should allow the Xen problem to be resolved, because Xen will > > > > > provide the PSCI operations, and it's correct in that case to override > > > > > the platform's SMP operations. > > > > > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > > > In order to completely solve the issue for Xen though, another patch is > > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > > > of the introduction of smp_init. > > > > > > Please look at the latest smp_init patch version I sent to you. It > > > shouldn't conflict with Xen any longer. It now returns a bool result > > > depending on whether it did set up smp_ops or not. > > > > CPUs are virtualized by Xen and do not reflect or expose the underlying > > SMP hardware and firmware features, so an hardware specific smp_init > > cannot run. > > > > So the smp_init patch still breaks Xen because even if smp_init can fail > > graciously, executing a platform specific smp_init function that tries > > to access registers and memory regions that are not present is going to > > cause an undefined behaviour. > > It won't access hardware but just look into the DT and return false if > nothing interesting is found. At which point the next attempt in the > priority list is PSCI by default. OK, that should work. I am going to drop "xen/arm: introduce xen_early_init, use PSCI on xen". We'll have to be careful enforcing that future smp_init implementations use DT rather than probing registers. Maybe I should add a comment to the smp_init patch to clarify that?
On Thu, 2013-04-18 at 18:38 +0100, Nicolas Pitre wrote: > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > > > This should allow the Xen problem to be resolved, because Xen will > > > > > provide the PSCI operations, and it's correct in that case to override > > > > > the platform's SMP operations. > > > > > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > > > In order to completely solve the issue for Xen though, another patch is > > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > > > of the introduction of smp_init. > > > > > > Please look at the latest smp_init patch version I sent to you. It > > > shouldn't conflict with Xen any longer. It now returns a bool result > > > depending on whether it did set up smp_ops or not. > > > > CPUs are virtualized by Xen and do not reflect or expose the underlying > > SMP hardware and firmware features, so an hardware specific smp_init > > cannot run. > > > > So the smp_init patch still breaks Xen because even if smp_init can fail > > graciously, executing a platform specific smp_init function that tries > > to access registers and memory regions that are not present is going to > > cause an undefined behaviour. > > It won't access hardware but just look into the DT and return false if > nothing interesting is found. At which point the next attempt in the > priority list is PSCI by default. I think there might be some confusion about the semantics of smp_init, since it is in mdesc I had interpreted it as a per-platform hook to allow "magic" SMP setup, which I at least had assumed would (be permitted to) involve hardware specific frobbing, including touching platform specific devices etc. Is that not the case? Can we guarantee that this hook won't be used by hardware platforms to e.g. probe NVRAM for SMP topology information or other activities which touch hardware? If it isn't hardware specific then does this hook really belong in mdesc? Or if it is purely driven by DT can we not implement it in terms of DT at the top level rather than abstracting via a hardware specific hook? Ian.
On Fri, 19 Apr 2013, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > > > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > > > > > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > > > > This should allow the Xen problem to be resolved, because Xen will > > > > > > provide the PSCI operations, and it's correct in that case to override > > > > > > the platform's SMP operations. > > > > > > > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > > > > In order to completely solve the issue for Xen though, another patch is > > > > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > > > > of the introduction of smp_init. > > > > > > > > Please look at the latest smp_init patch version I sent to you. It > > > > shouldn't conflict with Xen any longer. It now returns a bool result > > > > depending on whether it did set up smp_ops or not. > > > > > > CPUs are virtualized by Xen and do not reflect or expose the underlying > > > SMP hardware and firmware features, so an hardware specific smp_init > > > cannot run. > > > > > > So the smp_init patch still breaks Xen because even if smp_init can fail > > > graciously, executing a platform specific smp_init function that tries > > > to access registers and memory regions that are not present is going to > > > cause an undefined behaviour. > > > > It won't access hardware but just look into the DT and return false if > > nothing interesting is found. At which point the next attempt in the > > priority list is PSCI by default. > > OK, that should work. > I am going to drop "xen/arm: introduce xen_early_init, use PSCI on xen". > > We'll have to be careful enforcing that future smp_init implementations > use DT rather than probing registers. Maybe I should add a comment to > the smp_init patch to clarify that? No one should be probing registers without making sure it is safe to do so. Even on non virtualized hardware this can be a dangerous thing to do. Nicolas
On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote: > No one should be probing registers without making sure it is safe to do > so. Even on non virtualized hardware this can be a dangerous thing to > do. Won't people writing per machine code consider, not unreasonably, that having been called through a mdesc machine specific hook constitutes enough "making sure" that they think it is safe to touch registers which are specific to that machine? Ian.
On Fri, 19 Apr 2013, Ian Campbell wrote: > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote: > > No one should be probing registers without making sure it is safe to do > > so. Even on non virtualized hardware this can be a dangerous thing to > > do. > > Won't people writing per machine code consider, not unreasonably, that > having been called through a mdesc machine specific hook constitutes > enough "making sure" that they think it is safe to touch registers which > are specific to that machine? Remember that this hook was introduced to decide at run time between different possibilities for SMP ops on the _same_ machine configuration. That machine shouldn't do things which is not permitted on either possible alternatives already. So far the actual usage of that hook only looks in the DT to make a decision. But even if it were to touch the hardware, that means it has to be safe to do so on all the possible hardware variations this mdesc is associated to. And if Xen wants to emulate one of those hardware alternatives, then it better be ready to emulate it properly, or manage for _another_ mdesc to be selected instead. That's why mach-virt was introduced. So in my opinion there is just no issue here. Nicolas
On Fri, 19 Apr 2013, Nicolas Pitre wrote: > On Fri, 19 Apr 2013, Ian Campbell wrote: > > > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote: > > > No one should be probing registers without making sure it is safe to do > > > so. Even on non virtualized hardware this can be a dangerous thing to > > > do. > > > > Won't people writing per machine code consider, not unreasonably, that > > having been called through a mdesc machine specific hook constitutes > > enough "making sure" that they think it is safe to touch registers which > > are specific to that machine? > > Remember that this hook was introduced to decide at run time between > different possibilities for SMP ops on the _same_ machine configuration. > That machine shouldn't do things which is not permitted on either > possible alternatives already. So far the actual usage of that hook > only looks in the DT to make a decision. But even if it were to touch > the hardware, that means it has to be safe to do so on all the possible > hardware variations this mdesc is associated to. This last sentence is the key. It is not guaranteed that being safe on "all the possible hardware variations this mdesc is associated to" and being safe on Xen are the same thing. What if the same magic configuration register exists on all the possible variations of the platform? I understand that it is unlikely but I think it is a possibility. If we don't take further precautions I'll be forced to regularly scan the list for smp_init implementations and look for possible breakages, pointing the unaware author of the patches to this conversation. Honestly it is not my idea of fun. > And if Xen wants to emulate one of those hardware alternatives, then it > better be ready to emulate it properly, or manage for _another_ mdesc to > be selected instead. That's why mach-virt was introduced. I don't mean to argue pointlessly here, but I thought that it was clear that Xen does not emulate any hardware interfaces at all. Of course when a convenient virtualization aware hardware interface is available (the vGIC for example) Xen is going to use it, but other than that no emulation is going to be done. > So in my opinion there is just no issue here. Not now, and maybe not even in the near future. But I feel that we are taking an error prone approach.
On Thu, Apr 18, 2013 at 05:20:23PM +0100, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > Remember, we're trying to move away from using "mdesc"s for platform > > stuff, relying on things like DT and such like. We really should not > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. > > That's correct, in fact if you look at the next patch you'll see that it > changes the order. You may have noticed that I've been catching up with email, and it's exceedingly difficult to track what patches are obsolete and have been overridden by new versions. > I introduced the mechanism first and changed the priority later - it > should help bisectability. > I can fold the two patches into one if you prefer. On the face of it, I think that would be better. I don't remember what your last version looks like though. > > Now, if the psci stuff can't be relied upon to provide the correct > > functionality, then that's a separate problem which needs addressing > > differently. > > > > This should allow the Xen problem to be resolved, because Xen will > > provide the PSCI operations, and it's correct in that case to override > > the platform's SMP operations. > > Yes, increasing the priority of PSCI helps Xen a lot. > In order to completely solve the issue for Xen though, another patch is > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > of the introduction of smp_init. Given the way the code in setup.c will work: +static bool __init xen_smp_init(void) +{ +#ifdef CONFIG_SMP + /* If we are running on Xen, use PSCI if available. + * In any case do not try to use the native smp_ops. */ + if (psci_smp_available()) + smp_set_ops(&psci_smp_ops); +#endif + return true; +} Doesn't this just need to return false, and then we'll drop down to using PSCI if those operations are available?
On Thu, Apr 18, 2013 at 12:35:57PM -0400, Nicolas Pitre wrote: > On Thu, 18 Apr 2013, Stefano Stabellini wrote: > > > On Thu, 18 Apr 2013, Russell King - ARM Linux wrote: > > > On Fri, Apr 05, 2013 at 02:11:32PM +0100, Stefano Stabellini wrote: > > > > + psci_init(); > > > > #ifdef CONFIG_SMP > > > > if (is_smp()) { > > > > - smp_set_ops(mdesc->smp); > > > > + if (mdesc->smp) > > > > + smp_set_ops(mdesc->smp); > > > > + else if (psci_smp_available()) > > > > + smp_set_ops(&psci_smp_ops); > > > > > > So, I have a vague recollection that the ordering of the above got discussed > > > but I can't find it amongst the 21k of messages so far this year. > > > > > > The above looks weird to me. Surely this should be: > > > > > > if (psci_smp_available()) > > > smp_set_ops(&psci_smp_ops); > > > else if (mdesc->smp) > > > smp_set_ops(mdesc->ops); > > > > > > This means that if PSCI is available, and provides a set of operations, > > > we override whatever the platform has statically provided. > > > > > > Remember, we're trying to move away from using "mdesc"s for platform > > > stuff, relying on things like DT and such like. We really should not > > > be going for mdesc-overriding-newstuff but newstuff-overriding-mdesc. > > > > That's correct, in fact if you look at the next patch you'll see that it > > changes the order. > > > > I introduced the mechanism first and changed the priority later - it > > should help bisectability. > > I can fold the two patches into one if you prefer. > > Please let's keep the order as we discussed. Otherwise this is just too > confusing (Russell's comment is a good example of that). My comment is based on the code which I had read and quoted. Things had moved on from that point, but, because I had 2000 odd messages to get through, there was no way to know at that point that there had been further discussion. This is why catching up is such a problem - not only the amount of time it takes (I'm _still_ reading messages - I've only just read your reply above, so I'm still back in Thursday's email...) I don't think there's any confusion though - the order I suggested seems to be the order which I recall later patches adopted.
On Thu, Apr 18, 2013 at 05:49:21PM +0100, Stefano Stabellini wrote: > On Thu, 18 Apr 2013, Nicolas Pitre wrote: > > Please let's keep the order as we discussed. Otherwise this is just too > > confusing (Russell's comment is a good example of that). > > You are right, it is confusing. > By "keep the order as we discussed", do you mean merge the second patch > into the first one, correct? Well, the solution to this kind of confusion is to ignore comments on the older patches and post a new set of patches and ask for all relevant comments to be against that set and that set only.
On Fri, 2013-04-19 at 18:06 +0100, Stefano Stabellini wrote: > On Fri, 19 Apr 2013, Nicolas Pitre wrote: > > On Fri, 19 Apr 2013, Ian Campbell wrote: > > > > > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote: > > > > No one should be probing registers without making sure it is safe to do > > > > so. Even on non virtualized hardware this can be a dangerous thing to > > > > do. > > > > > > Won't people writing per machine code consider, not unreasonably, that > > > having been called through a mdesc machine specific hook constitutes > > > enough "making sure" that they think it is safe to touch registers which > > > are specific to that machine? > > > > Remember that this hook was introduced to decide at run time between > > different possibilities for SMP ops on the _same_ machine configuration. > > That machine shouldn't do things which is not permitted on either > > possible alternatives already. So far the actual usage of that hook > > only looks in the DT to make a decision. But even if it were to touch > > the hardware, that means it has to be safe to do so on all the possible > > hardware variations this mdesc is associated to. > > This last sentence is the key. > > It is not guaranteed that being safe on "all the possible hardware > variations this mdesc is associated to" and being safe on Xen are the > same thing. > > What if the same magic configuration register exists on all the possible > variations of the platform? > I understand that it is unlikely but I think it is a possibility. I think it is actually quite likely for such configuration registers, nvram etc to exist on all variants of a platform and not all that unlikely that smp_init might want to consult them. > If we don't take further precautions I'll be forced to regularly scan > the list for smp_init implementations and look for possible breakages, > pointing the unaware author of the patches to this conversation. > Honestly it is not my idea of fun. > > > > And if Xen wants to emulate one of those hardware alternatives, then it > > better be ready to emulate it properly, or manage for _another_ mdesc to > > be selected instead. That's why mach-virt was introduced. > > I don't mean to argue pointlessly here, but I thought that it was clear > that Xen does not emulate any hardware interfaces at all. > > Of course when a convenient virtualization aware hardware interface is > available (the vGIC for example) Xen is going to use it, but other than > that no emulation is going to be done. I think the key thing that non-Xen folks aren't aware of is that although Xen passes the majority of the underlying hardware to the dom0 kernel to manage one of the few things it does not expose to guests (where "guests" includes dom0) is the SMP topology of the underlying system. The physical CPUs are managed/scheduled/etc by the hypervisor and it is the hypervisor which will need to be aware of big.LITTLE and MCPM etc. The guests, and again this includes dom0, see only a set of VCPUs which are scheduled amongst the PCPUs by the hypervisor. It is not unusual to configure dom0 to have less VCPUs than there are PCPUs in the system. Currently the VCPUs are all considered to be uniform i.e. there is no big.LITTLE exposed to guests, although the hypervisor will most likely eventually support it and so may be scheduling the VCPUs among the big and LITTLE PCPUs according to $THINGS. I don't think we want to go the route of mandating a 1:1 PCPU:VCPU relationship for dom0 for Xen on ARM. Ian.
On Mon, 22 Apr 2013, Ian Campbell wrote: > On Fri, 2013-04-19 at 18:06 +0100, Stefano Stabellini wrote: > > On Fri, 19 Apr 2013, Nicolas Pitre wrote: > > > On Fri, 19 Apr 2013, Ian Campbell wrote: > > > > > > > On Fri, 2013-04-19 at 16:47 +0100, Nicolas Pitre wrote: > > > > > No one should be probing registers without making sure it is safe to do > > > > > so. Even on non virtualized hardware this can be a dangerous thing to > > > > > do. > > > > > > > > Won't people writing per machine code consider, not unreasonably, that > > > > having been called through a mdesc machine specific hook constitutes > > > > enough "making sure" that they think it is safe to touch registers which > > > > are specific to that machine? > > > > > > Remember that this hook was introduced to decide at run time between > > > different possibilities for SMP ops on the _same_ machine configuration. > > > That machine shouldn't do things which is not permitted on either > > > possible alternatives already. So far the actual usage of that hook > > > only looks in the DT to make a decision. But even if it were to touch > > > the hardware, that means it has to be safe to do so on all the possible > > > hardware variations this mdesc is associated to. > > > > This last sentence is the key. > > > > It is not guaranteed that being safe on "all the possible hardware > > variations this mdesc is associated to" and being safe on Xen are the > > same thing. > > > > What if the same magic configuration register exists on all the possible > > variations of the platform? > > I understand that it is unlikely but I think it is a possibility. > > I think it is actually quite likely for such configuration registers, > nvram etc to exist on all variants of a platform and not all that > unlikely that smp_init might want to consult them. In which case the rest of the kernel is also going to consult those registers for other purposes. Again, what's the problem please? > I think the key thing that non-Xen folks aren't aware of is that > although Xen passes the majority of the underlying hardware to the dom0 > kernel to manage one of the few things it does not expose to guests > (where "guests" includes dom0) is the SMP topology of the underlying system. I suspect it is unlikely to pass on the DT information about the CCI either. Which means that, in the case that started this thread, the smp_init function needed by MCPM will simply return false and the next priority in the list i.e. plain PSCI will be initialized instead. I don't forsee the need to poke at the hardware directly within ->smp_init, not since everything is moving to DT now. Sure there are ways to screw up Xen support from within this hook, but that can be achieved in many other places. Will Xen take over every possible hooks in the kernel to prevent that from happening? So please let's not cry wolf needlessly. > Currently the VCPUs are all considered to be uniform i.e. there is no > big.LITTLE exposed to guests, although the hypervisor will most likely > eventually support it and so may be scheduling the VCPUs among the big > and LITTLE PCPUs according to $THINGS. Indeed. That's what I suggested earlier. Nicolas
On Mon, 22 Apr 2013, Nicolas Pitre wrote: > > I think the key thing that non-Xen folks aren't aware of is that > > although Xen passes the majority of the underlying hardware to the dom0 > > kernel to manage one of the few things it does not expose to guests > > (where "guests" includes dom0) is the SMP topology of the underlying system. > > I suspect it is unlikely to pass on the DT information about the CCI > either. Which means that, in the case that started this thread, the > smp_init function needed by MCPM will simply return false and the next > priority in the list i.e. plain PSCI will be initialized instead. > > I don't forsee the need to poke at the hardware directly within > ->smp_init, not since everything is moving to DT now. > > Sure there are ways to screw up Xen support from within this hook, but > that can be achieved in many other places. Will Xen take over every > possible hooks in the kernel to prevent that from happening? > > So please let's not cry wolf needlessly. OK, you convinced me :-) I'll drop the patch.
On Mon, 22 Apr 2013, Russell King - ARM Linux wrote: > > > Now, if the psci stuff can't be relied upon to provide the correct > > > functionality, then that's a separate problem which needs addressing > > > differently. > > > > > > This should allow the Xen problem to be resolved, because Xen will > > > provide the PSCI operations, and it's correct in that case to override > > > the platform's SMP operations. > > > > Yes, increasing the priority of PSCI helps Xen a lot. > > In order to completely solve the issue for Xen though, another patch is > > needed (http://marc.info/?l=linux-kernel&m=136630106201968&w=2) because > > of the introduction of smp_init. > > Given the way the code in setup.c will work: > > +static bool __init xen_smp_init(void) > +{ > +#ifdef CONFIG_SMP > + /* If we are running on Xen, use PSCI if available. > + * In any case do not try to use the native smp_ops. */ > + if (psci_smp_available()) > + smp_set_ops(&psci_smp_ops); > +#endif > + return true; > +} > > Doesn't this just need to return false, and then we'll drop down to > using PSCI if those operations are available? If we are running on Xen, we are sure that the only available mechanism to boot secondary cpus is PSCI; there is no need to try anything else if that one is not available. This is way xen_smp_init is returning true unconditionally. However I am going to drop the "xen/arm: introduce xen_early_init, use PSCI on xen" patch linked above entirely, because I'll just assume that if we are running on Xen smp_init is going to fail and return false. The next option on the list is PSCI so in theory we won't have to do anything special to make it work.
On Mon, 2013-04-22 at 17:07 +0100, Nicolas Pitre wrote: > Sure there are ways to screw up Xen support from within this hook, but > that can be achieved in many other places. Will Xen take over every > possible hooks in the kernel to prevent that from happening? In the majority of the other cases we would consider it a bug in Xen for not exposing these bits of the underlying hardware correctly to dom0, however the SMP stuff is a bit of a special case for the reasons I explained earlier which is why we are worrying about it more. > So please let's not cry wolf needlessly. Yes, perhaps those worries unfounded at this stage and we should just run with it and reference this discussion if it turns out to not work out well in practice. Ian.
diff --git a/arch/arm/include/asm/psci.h b/arch/arm/include/asm/psci.h index ce0dbe7..fa44417 100644 --- a/arch/arm/include/asm/psci.h +++ b/arch/arm/include/asm/psci.h @@ -23,6 +23,26 @@ struct psci_power_state { u8 affinity_level; }; +/* + * cpu_suspend Suspend the execution on a CPU + * @state we don't currently describe affinity levels, so just pass 0. + * @entry_point the first instruction to be executed on return + * returns 0 success, < 0 on failure + * + * cpu_off Power down a CPU + * @state we don't currently describe affinity levels, so just pass 0. + * no return on successful call + * + * cpu_on Power up a CPU + * @cpuid cpuid of target CPU, as from MPIDR + * @entry_point the first instruction to be executed on return + * returns 0 success, < 0 on failure + * + * migrate Migrate the context to a different CPU + * @cpuid cpuid of target CPU, as from MPIDR + * returns 0 success, < 0 on failure + * + */ struct psci_operations { int (*cpu_suspend)(struct psci_power_state state, unsigned long entry_point); @@ -32,5 +52,14 @@ struct psci_operations { }; extern struct psci_operations psci_ops; +extern struct smp_operations psci_smp_ops; + +#ifdef CONFIG_ARM_PSCI +void psci_init(void); +bool psci_smp_available(void); +#else +static inline void psci_init(void) { } +static inline bool psci_smp_available(void) { return false; } +#endif #endif /* __ASM_ARM_PSCI_H */ diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile index 5f3338e..dd9d90a 100644 --- a/arch/arm/kernel/Makefile +++ b/arch/arm/kernel/Makefile @@ -82,6 +82,9 @@ obj-$(CONFIG_DEBUG_LL) += debug.o obj-$(CONFIG_EARLY_PRINTK) += early_printk.o obj-$(CONFIG_ARM_VIRT_EXT) += hyp-stub.o -obj-$(CONFIG_ARM_PSCI) += psci.o +ifeq ($(CONFIG_ARM_PSCI),y) +obj-y += psci.o +obj-$(CONFIG_SMP) += psci_smp.o +endif extra-y := $(head-y) vmlinux.lds diff --git a/arch/arm/kernel/psci.c b/arch/arm/kernel/psci.c index 3653164..4693188 100644 --- a/arch/arm/kernel/psci.c +++ b/arch/arm/kernel/psci.c @@ -158,7 +158,7 @@ static const struct of_device_id psci_of_match[] __initconst = { {}, }; -static int __init psci_init(void) +void __init psci_init(void) { struct device_node *np; const char *method; @@ -166,7 +166,7 @@ static int __init psci_init(void) np = of_find_matching_node(NULL, psci_of_match); if (!np) - return 0; + return; pr_info("probing function IDs from device-tree\n"); @@ -206,6 +206,5 @@ static int __init psci_init(void) out_put_node: of_node_put(np); - return 0; + return; } -early_initcall(psci_init); diff --git a/arch/arm/kernel/psci_smp.c b/arch/arm/kernel/psci_smp.c new file mode 100644 index 0000000..66b0f77 --- /dev/null +++ b/arch/arm/kernel/psci_smp.c @@ -0,0 +1,67 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * Copyright (C) 2012 ARM Limited + * + * Author: Will Deacon <will.deacon@arm.com> + */ + +#include <linux/init.h> +#include <linux/irqchip/arm-gic.h> +#include <linux/smp.h> +#include <linux/of.h> + +#include <asm/psci.h> +#include <asm/smp_plat.h> + +extern void secondary_startup(void); + +static int __cpuinit psci_boot_secondary(unsigned int cpu, + struct task_struct *idle) +{ + if (psci_ops.cpu_on) + return psci_ops.cpu_on(cpu_logical_map(cpu), + __pa(secondary_startup)); + return -ENODEV; +} + +static void __cpuinit psci_secondary_init(unsigned int cpu) +{ + gic_secondary_init(0); +} + +#ifdef CONFIG_HOTPLUG_CPU +void __ref psci_cpu_die(unsigned int cpu) +{ + const struct psci_power_state ps = { + .type = PSCI_POWER_STATE_TYPE_POWER_DOWN, + }; + + if (psci_ops.cpu_off) + psci_ops.cpu_off(ps); + + /* We should never return */ + panic("psci: cpu %d failed to shutdown\n", cpu); +} +#else +#define psci_cpu_die NULL +#endif + +bool __init psci_smp_available(void) +{ + /* is cpu_on available at least? */ + return (psci_ops.cpu_on != NULL); +} + +struct smp_operations __initdata psci_smp_ops = { + .smp_secondary_init = psci_secondary_init, + .smp_boot_secondary = psci_boot_secondary, + .cpu_die = psci_cpu_die, +}; diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c index 3f6cbb2..341efaa 100644 --- a/arch/arm/kernel/setup.c +++ b/arch/arm/kernel/setup.c @@ -36,6 +36,7 @@ #include <asm/cputype.h> #include <asm/elf.h> #include <asm/procinfo.h> +#include <asm/psci.h> #include <asm/sections.h> #include <asm/setup.h> #include <asm/smp_plat.h> @@ -766,9 +767,13 @@ void __init setup_arch(char **cmdline_p) unflatten_device_tree(); arm_dt_init_cpu_maps(); + psci_init(); #ifdef CONFIG_SMP if (is_smp()) { - smp_set_ops(mdesc->smp); + if (mdesc->smp) + smp_set_ops(mdesc->smp); + else if (psci_smp_available()) + smp_set_ops(&psci_smp_ops); smp_init_cpus(); } #endif diff --git a/arch/arm/mach-virt/Makefile b/arch/arm/mach-virt/Makefile index 042afc1..7ddbfa6 100644 --- a/arch/arm/mach-virt/Makefile +++ b/arch/arm/mach-virt/Makefile @@ -3,4 +3,3 @@ # obj-y := virt.o -obj-$(CONFIG_SMP) += platsmp.o diff --git a/arch/arm/mach-virt/platsmp.c b/arch/arm/mach-virt/platsmp.c deleted file mode 100644 index 8badaab..0000000 --- a/arch/arm/mach-virt/platsmp.c +++ /dev/null @@ -1,58 +0,0 @@ -/* - * Dummy Virtual Machine - does what it says on the tin. - * - * Copyright (C) 2012 ARM Ltd - * Author: Will Deacon <will.deacon@arm.com> - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, - * but WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the - * GNU General Public License for more details. - * - * You should have received a copy of the GNU General Public License - * along with this program. If not, see <http://www.gnu.org/licenses/>. - */ - -#include <linux/init.h> -#include <linux/smp.h> -#include <linux/of.h> - -#include <linux/irqchip/arm-gic.h> - -#include <asm/psci.h> -#include <asm/smp_plat.h> - -extern void secondary_startup(void); - -static void __init virt_smp_init_cpus(void) -{ -} - -static void __init virt_smp_prepare_cpus(unsigned int max_cpus) -{ -} - -static int __cpuinit virt_boot_secondary(unsigned int cpu, - struct task_struct *idle) -{ - if (psci_ops.cpu_on) - return psci_ops.cpu_on(cpu_logical_map(cpu), - __pa(secondary_startup)); - return -ENODEV; -} - -static void __cpuinit virt_secondary_init(unsigned int cpu) -{ - gic_secondary_init(0); -} - -struct smp_operations __initdata virt_smp_ops = { - .smp_init_cpus = virt_smp_init_cpus, - .smp_prepare_cpus = virt_smp_prepare_cpus, - .smp_secondary_init = virt_secondary_init, - .smp_boot_secondary = virt_boot_secondary, -}; diff --git a/arch/arm/mach-virt/virt.c b/arch/arm/mach-virt/virt.c index 528c05e..c417752 100644 --- a/arch/arm/mach-virt/virt.c +++ b/arch/arm/mach-virt/virt.c @@ -44,12 +44,9 @@ static const char *virt_dt_match[] = { NULL }; -extern struct smp_operations virt_smp_ops; - DT_MACHINE_START(VIRT, "Dummy Virtual Machine") .init_irq = irqchip_init, .init_time = virt_timer_init, .init_machine = virt_init, - .smp = smp_ops(virt_smp_ops), .dt_compat = virt_dt_match, MACHINE_END