
drivers: cpuidle: don't initialize big.LITTLE driver if MCPM is unavailable

Message ID 1420698544-10277-1-git-send-email-sudeep.holla@arm.com (mailing list archive)
State New, archived

Commit Message

Sudeep Holla Jan. 8, 2015, 6:29 a.m. UTC
If the big.LITTLE cpuidle driver is initialized even when MCPM is
unavailable, we get the warning below the first time a CPU tries to
enter a deeper C-state.

------------[ cut here ]------------
WARNING: CPU: 4 PID: 0 at kernel/arch/arm/common/mcpm_entry.c:130 mcpm_cpu_suspend+0x6d/0x74()
Modules linked in:
CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc3-00007-gaf5a2cb1ad5c-dirty #11
Hardware name: ARM-Versatile Express
[<c0013fa5>] (unwind_backtrace) from [<c001084d>] (show_stack+0x11/0x14)
[<c001084d>] (show_stack) from [<c04fe7f1>] (dump_stack+0x6d/0x78)
[<c04fe7f1>] (dump_stack) from [<c0020645>] (warn_slowpath_common+0x69/0x90)
[<c0020645>] (warn_slowpath_common) from [<c00206db>] (warn_slowpath_null+0x17/0x1c)
[<c00206db>] (warn_slowpath_null) from [<c001cbdd>] (mcpm_cpu_suspend+0x6d/0x74)
[<c001cbdd>] (mcpm_cpu_suspend) from [<c03c6919>] (bl_powerdown_finisher+0x21/0x24)
[<c03c6919>] (bl_powerdown_finisher) from [<c001218d>] (cpu_suspend_abort+0x1/0x14)
[<c001218d>] (cpu_suspend_abort) from [<00000000>] (  (null))
---[ end trace d098e3fd00000008 ]---

This patch fixes the issue by checking for the availability of MCPM
before initializing the big.LITTLE cpuidle driver.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
---
 drivers/cpuidle/cpuidle-big_little.c | 4 ++++
 1 file changed, 4 insertions(+)
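
The intent of the change can be sketched as a small user-space model. This is only an illustration under stated assumptions: `struct platform_model` and `bl_idle_init_model()` are hypothetical stand-ins for the kernel's `of_match_node()` / `mcpm_is_available()` checks in `bl_idle_init()`, not the driver code itself.

```c
#include <errno.h>
#include <stdbool.h>

#ifndef EUNATCH
#define EUNATCH 49	/* Linux value; fallback for libcs that lack it */
#endif

/* Hypothetical stand-in for the platform state the real driver queries. */
struct platform_model {
	bool machine_matches;	/* models of_match_node() finding a match */
	bool mcpm_available;	/* models mcpm_is_available() returning true */
};

/* Models the order of the early checks in bl_idle_init() with the patch. */
int bl_idle_init_model(const struct platform_model *p)
{
	if (!p->machine_matches)
		return -ENODEV;

	/* New check: bail out before registering any idle states. */
	if (!p->mcpm_available)
		return -EUNATCH;

	return 0;	/* the real driver would go on to register itself */
}
```

With this ordering, a matching machine booted without MCPM gets `-EUNATCH` from init, so no powerdown state is ever registered and `mcpm_cpu_suspend()` is never reached.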

Comments

Daniel Lezcano Jan. 8, 2015, 8:53 a.m. UTC | #1
On 01/08/2015 07:29 AM, Sudeep Holla wrote:
> If big.LITTLE driver is initialized even when MCPM is unavailable,
> we get the below warning the first time cpu tries to enter deeper
> C-states.

Can you elaborate on why MCPM could be unavailable when the TC2 PM code
registers the MCPM platform ops before the cpuidle driver?



> ------------[ cut here ]------------
> WARNING: CPU: 4 PID: 0 at kernel/arch/arm/common/mcpm_entry.c:130 mcpm_cpu_suspend+0x6d/0x74()
> Modules linked in:
> CPU: 4 PID: 0 Comm: swapper/4 Not tainted 3.19.0-rc3-00007-gaf5a2cb1ad5c-dirty #11
> Hardware name: ARM-Versatile Express
> [<c0013fa5>] (unwind_backtrace) from [<c001084d>] (show_stack+0x11/0x14)
> [<c001084d>] (show_stack) from [<c04fe7f1>] (dump_stack+0x6d/0x78)
> [<c04fe7f1>] (dump_stack) from [<c0020645>] (warn_slowpath_common+0x69/0x90)
> [<c0020645>] (warn_slowpath_common) from [<c00206db>] (warn_slowpath_null+0x17/0x1c)
> [<c00206db>] (warn_slowpath_null) from [<c001cbdd>] (mcpm_cpu_suspend+0x6d/0x74)
> [<c001cbdd>] (mcpm_cpu_suspend) from [<c03c6919>] (bl_powerdown_finisher+0x21/0x24)
> [<c03c6919>] (bl_powerdown_finisher) from [<c001218d>] (cpu_suspend_abort+0x1/0x14)
> [<c001218d>] (cpu_suspend_abort) from [<00000000>] (  (null))
> ---[ end trace d098e3fd00000008 ]---
>
> This patch fixes the issue by checking for the availability of MCPM
> before initializing the big.LITTLE cpuidle driver
>
> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> ---
>   drivers/cpuidle/cpuidle-big_little.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/drivers/cpuidle/cpuidle-big_little.c b/drivers/cpuidle/cpuidle-big_little.c
> index e3e225fe6b45..40c34faffe59 100644
> --- a/drivers/cpuidle/cpuidle-big_little.c
> +++ b/drivers/cpuidle/cpuidle-big_little.c
> @@ -182,6 +182,10 @@ static int __init bl_idle_init(void)
>   	 */
>   	if (!of_match_node(compatible_machine_match, root))
>   		return -ENODEV;
> +
> +	if (!mcpm_is_available())
> +		return -EUNATCH;
> +
>   	/*
>   	 * For now the differentiation between little and big cores
>   	 * is based on the part number. A7 cores are considered little
>
Sudeep Holla Jan. 8, 2015, 9:16 a.m. UTC | #2
Hi Daniel,

On Thursday 08 January 2015 02:23 PM, Daniel Lezcano wrote:
> On 01/08/2015 07:29 AM, Sudeep Holla wrote:
>> If big.LITTLE driver is initialized even when MCPM is unavailable,
>> we get the below warning the first time cpu tries to enter deeper
>> C-states.
>
> Can you elaborate why MCPM could be unavailable when the tc2 pm code
> registers the mcpm platform ops before the cpuidle driver ?
>
>
I can think of three possible scenarios. Let me know if these make sense.

1. The firmware settings in the Vexpress configuration files are set to
    boot in legacy mode, but MCPM is enabled in the kernel.
2. Some failure occurs during MCPM initialization.
3. For example, if CCI is not accessible, as on some Exynos boards [1],
    we don't want to wait until mcpm_cpu_suspend?

Regards,
Sudeep

[1] https://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg39624.html
Daniel Lezcano Jan. 8, 2015, 10:02 a.m. UTC | #3
On 01/08/2015 10:16 AM, Sudeep Holla wrote:
> Hi Daniel,
>
> On Thursday 08 January 2015 02:23 PM, Daniel Lezcano wrote:
>> On 01/08/2015 07:29 AM, Sudeep Holla wrote:
>>> If big.LITTLE driver is initialized even when MCPM is unavailable,
>>> we get the below warning the first time cpu tries to enter deeper
>>> C-states.
>>
>> Can you elaborate why MCPM could be unavailable when the tc2 pm code
>> registers the mcpm platform ops before the cpuidle driver ?
>>
>>
> I can think of 3 possible scenarios. Let me know if these make sense.
>
> 1. If the firmware settings in Vexpress configuration files are set to
>     boot in legacy mode, but MCPM is enabled in the kernel.

If I am not wrong, we have a BUG_ON in this path, right?

> 2. If some failure occurs during MCPM initialization
> 3. For example, if CCI is not accessible as in some Exynos boards [1],
>     we don't want to wait till mpcm_cpu_suspend ?

Well, I think that if the firmware is preventing us from touching the CCI
while MCPM is enabled, we should also add a BUG_ON in the same
initialization path. IIRC, Kevin spent some time figuring out what was
happening on his odroid-xu3 board before understanding that MCPM wasn't
able to deal with the CCI because of the broken firmware.

The patch you are proposing is valid. Nevertheless, I would really like
the firmware to be fixed, and your patch hides an incompatibility
between the firmware and the kernel configuration while letting the
kernel continue to work in degraded mode.

IMO, it would be better to be stricter with the MCPM initialization and
not let the system boot if something is wrong with it, which I believe
comes from the firmware. The user can then figure out what is really
happening by disabling MCPM in the kernel configuration (which in turn
will disable cpuidle).

Nico, Kevin, what is your opinion ?



> [1]
> https://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg39624.html
Sudeep Holla Jan. 8, 2015, 10:31 a.m. UTC | #4
Hi Daniel,

On Thursday 08 January 2015 03:32 PM, Daniel Lezcano wrote:
> On 01/08/2015 10:16 AM, Sudeep Holla wrote:
>> Hi Daniel,
>>
>> On Thursday 08 January 2015 02:23 PM, Daniel Lezcano wrote:
>>> On 01/08/2015 07:29 AM, Sudeep Holla wrote:
>>>> If big.LITTLE driver is initialized even when MCPM is
>>>> unavailable, we get the below warning the first time cpu tries
>>>>  to enter deeper C-states.
>>>
>>> Can you elaborate why MCPM could be unavailable when the tc2 pm
>>> code registers the mcpm platform ops before the cpuidle driver ?
>>>
>>>
>> I can think of 3 possible scenarios. Let me know if these make
>> sense.
>>
>> 1. If the firmware settings in Vexpress configuration files are set
>> to boot in legacy mode, but MCPM is enabled in the kernel.
>
> If I am not wrong, we have a BUG_ON in this path, right ?
>

No, we can't do that. E.g. on TC2 we should continue to boot in legacy
mode even though none of the power management features work, which is
fine. One scenario is that I don't want to recompile the kernel, but
want to try legacy boot on TC2 by flipping the firmware setting.

>> 2. If some failure occurs during MCPM initialization 3. For
>> example, if CCI is not accessible as in some Exynos boards [1], we
>>  don't want to wait till mpcm_cpu_suspend ?
>
> Well, I think if the firmware is preventing us to play with the CCI
> but MCPM is enabled. We should add BUG_ON also in the same

Again, why, if an alternate method of booting exists with limited or no
PM features?

> initialization path. IIRC, Kevin spent some time to figure out what
> was happening to its odroid-xu3 board before understanding mcpm
> wasn't able to deal with the CCI due to the broken firmware.
>

I agree, I did follow that thread, though I don't fully understand
whether an alternate boot protocol exists on such boards. TC2, or
Vexpress with any coretile in general, certainly has legacy boot support.

> The patch you are proposing is valid. Nevertheless, I would really
> like to have the firmwares to be fixed and your patch is hiding an
> incompatible firmware with the kernel configuration and letting the
> kernel continue to work in degraded mode.
>

I fully agree, but most of the time this argument is dismissed by
saying the product has shipped and the firmware can't be upgraded.

> IMO, it would be better to be more strict with the mcpm
> initialization and not let the system boot if something is wrong with
> it which I believe is coming from the firmware and let the user to
> figure out what is really happening by letting him to disable mcpm in
> the kernel configuration (which in turn will disable cpuidle).

Again I fully agree, but in this case I manually switched to legacy boot
mode on TC2 and used the same kernel with the MCPM config enabled. Do you
mean to say we should not support that even when the developer
understands the consequences?

Regards,
Sudeep
Daniel Lezcano Jan. 8, 2015, 11:11 a.m. UTC | #5
On 01/08/2015 11:31 AM, Sudeep Holla wrote:
> Hi Daniel,
>
> On Thursday 08 January 2015 03:32 PM, Daniel Lezcano wrote:
>> On 01/08/2015 10:16 AM, Sudeep Holla wrote:
>>> Hi Daniel,
>>>
>>> On Thursday 08 January 2015 02:23 PM, Daniel Lezcano wrote:
>>>> On 01/08/2015 07:29 AM, Sudeep Holla wrote:
>>>>> If big.LITTLE driver is initialized even when MCPM is
>>>>> unavailable, we get the below warning the first time cpu tries
>>>>>  to enter deeper C-states.
>>>>
>>>> Can you elaborate why MCPM could be unavailable when the tc2 pm
>>>> code registers the mcpm platform ops before the cpuidle driver ?
>>>>
>>>>
>>> I can think of 3 possible scenarios. Let me know if these make
>>> sense.
>>>
>>> 1. If the firmware settings in Vexpress configuration files are set
>>> to boot in legacy mode, but MCPM is enabled in the kernel.
>>
>> If I am not wrong, we have a BUG_ON in this path, right ?
>>
>
> No we can't do that. E.g. on TC2 we should continue to boot in legacy
> mode though none of the power management features work which is fine.
> One scenario is I don't want to recompile the kernel, but try legacy
> boot on TC2 flipping the firmware setting.

What I meant is that the BUG_ON is already there, no?

>>> 2. If some failure occurs during MCPM initialization 3. For
>>> example, if CCI is not accessible as in some Exynos boards [1], we
>>>  don't want to wait till mpcm_cpu_suspend ?
>>
>> Well, I think if the firmware is preventing us to play with the CCI
>> but MCPM is enabled. We should add BUG_ON also in the same
>
> Again why if alternate method of booting exists with limited/no PM
> features.
>
>> initialization path. IIRC, Kevin spent some time to figure out what
>> was happening to its odroid-xu3 board before understanding mcpm
>> wasn't able to deal with the CCI due to the broken firmware.
>>
>
> I agree, I did follow that thread, though I don't fully understand if
> there exists alternate boot protocol on such boards. For sure TC2 or
> Vexpress with any coretile in general have legacy boot support.
>
>> The patch you are proposing is valid. Nevertheless, I would really
>> like to have the firmwares to be fixed and your patch is hiding an
>> incompatible firmware with the kernel configuration and letting the
>> kernel continue to work in degraded mode.
>>
>
> I fully agree, but most of the time this argument is suppressed by
> saying the product is shipped and firmware can't be upgraded.

This is an invalid argument. Shipped products have their own kernels, so
they can hack their kernel to remove the limitation.

>> IMO, it would be better to be more strict with the mcpm
>> initialization and not let the system boot if something is wrong with
>> it which I believe is coming from the firmware and let the user to
>> figure out what is really happening by letting him to disable mcpm in
>> the kernel configuration (which in turn will disable cpuidle).
>
> Again I fully agree, but in this case I manually switched to legacy boot
> mode on TC2 and used same kernel with MCPM config enabled. Do you mean
> to say we should not support that even when developer understand the
> consequence of that ?

Well, I see there are the exynos5410/5420/5422. For the 5422 on
chromebook2, MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM
does not work, and hence neither does cpuidle, because of the firmware.

Silently disabling cpuidle because mcpm did not initialize will hide the 
issue.

I understand your point about switching to legacy without recompiling 
the kernel.

I suggest that, along with your patch, we add a big fat WARN_ON when the
MCPM initialization fails.
Lorenzo Pieralisi Jan. 8, 2015, 12:29 p.m. UTC | #6
On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:

[...]

> >> IMO, it would be better to be more strict with the mcpm
> >> initialization and not let the system boot if something is wrong with
> >> it which I believe is coming from the firmware and let the user to
> >> figure out what is really happening by letting him to disable mcpm in
> >> the kernel configuration (which in turn will disable cpuidle).
> >
> > Again I fully agree, but in this case I manually switched to legacy boot
> > mode on TC2 and used same kernel with MCPM config enabled. Do you mean
> > to say we should not support that even when developer understand the
> > consequence of that ?
> 
> Well, I see there are the exynos5410/5420/5422. For the 5422 on 
> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM 
> does not work, hence cpuidle neither because of the firmware.
> 
> Silently disabling cpuidle because mcpm did not initialize will hide the 
> issue.

No. MCPM *will* initialize, Sudeep's patch does not silently disable
CPUidle.
To put it differently, MCPM will initialize if CCI is in the DT and it
is "available", so unless defined differently in the dts, MCPM will be
available and CPUidle will be initialized (and break if there is an issue
with the platform FW/HW).

I agree the mechanism to define if MCPM is available can be improved
but that's what it is at the moment.

The problem here is to boot a platform with different boot methods
and still have a single kernel image.

> I understand your point about switching to legacy without recompiling 
> the kernel.
> 
> I suggest we add a big fat WARN_ON when the mcpm initialization fails 
> with your patch.

I think there are multiple facets we are tackling at once here and they
should not be mixed.

1) We left the static idle states there to cope with legacy DTBs that were
   published before we introduced the idle states bindings. If we want to
   boot e.g. vexpress in legacy mode with a single kernel image that has
   MCPM on, we could remove the idle states from the DT and the problem
   would be solved; we can't do that since we were forced to leave the
   static idle tables. Overall I think this is not the way to fix the issue.
2) The idle driver should be initialized if there is an idle state entry
   method, which in this case is MCPM. If I boot vexpress with MCPM
   enabled but with the legacy boot method (i.e. spin table) and a single
   kernel image, I do not want to warn if the idle states entry method
   (MCPM) can't be initialized (and I do not want to get a warning if the
   idle driver is triggering a mcpm_cpu_suspend), so Sudeep's patch is
   valid and I am against adding a:

   if (WARN_ON(!mcpm_is_available()))

3) Sudeep's patch is not hiding anything. If CCI is in the DT, CCI is
   probed, so mcpm_is_available() == true. If the firmware is borked,
   the idle states will be entered and we will notice there is something
   wrong.

So overall I think Sudeep's patch is sound. I also think we should
improve the way we detect if MCPM is available, and again, I think the
CPU operations on arm64 are a good example that we can and we should
replicate.

Lorenzo
Daniel Lezcano Jan. 8, 2015, 2:01 p.m. UTC | #7
On 01/08/2015 01:29 PM, Lorenzo Pieralisi wrote:
> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>
> [...]
>
>>>> IMO, it would be better to be more strict with the mcpm
>>>> initialization and not let the system boot if something is wrong with
>>>> it which I believe is coming from the firmware and let the user to
>>>> figure out what is really happening by letting him to disable mcpm in
>>>> the kernel configuration (which in turn will disable cpuidle).
>>>
>>> Again I fully agree, but in this case I manually switched to legacy boot
>>> mode on TC2 and used same kernel with MCPM config enabled. Do you mean
>>> to say we should not support that even when developer understand the
>>> consequence of that ?
>>
>> Well, I see there are the exynos5410/5420/5422. For the 5422 on
>> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM
>> does not work, hence cpuidle neither because of the firmware.
>>
>> Silently disabling cpuidle because mcpm did not initialize will hide the
>> issue.
>
> No. MCPM *will* initialize, Sudeep's patch does not silently disable
> CPUidle.
> To put it differently MCPM will initialize if CCI is in the DT and it
> is "available", so unless defined differently in the dts mcpm will be
> available and CPUidle will be initialized (and break if there is an issue
> with the platform FW/HW).
>
> I agree the mechanism to define if MCPM is available can be improved
> but that's what it is at the moment.
>
> The problem here is to boot a platform with different boot methods
> and still have a single kernel image.
>
>> I understand your point about switching to legacy without recompiling
>> the kernel.
>>
>> I suggest we add a big fat WARN_ON when the mcpm initialization fails
>> with your patch.
>
> I think there are multiple facets we are tackling at once here and they
> should not be mixed.

Yes, I agree.

> 1) We left static idle states there to cope with legacy DTBs that were
>     published before we introduced idle states bindings. If we want to
>     boot eg vexpress in legacy mode but single kernel image with MCPM on,
>     we could remove the idle states in DT and the problem would be
>     solved; we can't do that since we were forced to leave the static
>     idle tables. Overall I think this is not the way to fix the issue.
> 2) The idle driver should be initialized if there is an idle state entry
>     method, which in this case is MCPM. If I boot vexpress with MCPM
>     enabled but legacy boot method (ie spin table) with a single kernel image
>     I do not want to warn if the idle states entry method (MCPM) can't be
>     initialized (and I do not want to get a warning if the idle driver is
>     triggering a mcpm_cpu_suspend), so Sudeep's patch is valid and I am
>     against adding a:
>
>     if (WARN_ON(!mcpm_is_available())
>
> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>     probed so mcpm_is_available() == true. If the firmware is borked
>     the idle states will be entered and we will notice there is something
>     wrong
>
> So overall I think Sudeep's patch is sound.

Ok, I will pick Sudeep's patch. Shall I add your Acked-by?

> I also think we should
> improve the way we detect if MCPM is available,

Yes, I agree.

> and again, I think the
> CPU operations on arm64 are a good example that we can and we should
> replicate.

Ok, I think that will raise another thread soon :)

Thanks for your feedback.

   -- Daniel

Lorenzo Pieralisi Jan. 8, 2015, 2:46 p.m. UTC | #8
On Thu, Jan 08, 2015 at 02:01:03PM +0000, Daniel Lezcano wrote:

[...]

> > So overall I think Sudeep's patch is sound.
> 
> Ok, I will pick Sudeep's patch. Shall I add your acked-by ?

Yes please, thanks a lot.

Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Kevin Hilman Jan. 8, 2015, 8:27 p.m. UTC | #9
Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:

> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>
> [...]
>
>> >> IMO, it would be better to be more strict with the mcpm
>> >> initialization and not let the system boot if something is wrong with
>> >> it which I believe is coming from the firmware and let the user to
>> >> figure out what is really happening by letting him to disable mcpm in
>> >> the kernel configuration (which in turn will disable cpuidle).
>> >
>> > Again I fully agree, but in this case I manually switched to legacy boot
>> > mode on TC2 and used same kernel with MCPM config enabled. Do you mean
>> > to say we should not support that even when developer understand the
>> > consequence of that ?
>> 
>> Well, I see there are the exynos5410/5420/5422. For the 5422 on 
>> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM 
>> does not work, hence cpuidle neither because of the firmware.
>> 
>> Silently disabling cpuidle because mcpm did not initialize will hide the 
>> issue.
>
> No. MCPM *will* initialize, Sudeep's patch does not silently disable
> CPUidle.
> To put it differently MCPM will initialize if CCI is in the DT and it
> is "available", so unless defined differently in the dts mcpm will be
> available and CPUidle will be initialized (and break if there is an issue
> with the platform FW/HW).
>
> I agree the mechanism to define if MCPM is available can be improved
> but that's what it is at the moment.
>
> The problem here is to boot a platform with different boot methods
> and still have a single kernel image.
>
>> I understand your point about switching to legacy without recompiling 
>> the kernel.
>> 
>> I suggest we add a big fat WARN_ON when the mcpm initialization fails 
>> with your patch.
>
> I think there are multiple facets we are tackling at once here and they
> should not be mixed.
>
> 1) We left static idle states there to cope with legacy DTBs that were
>    published before we introduced idle states bindings. If we want to
>    boot eg vexpress in legacy mode but single kernel image with MCPM on,
>    we could remove the idle states in DT and the problem would be
>    solved; we can't do that since we were forced to leave the static
>    idle tables. Overall I think this is not the way to fix the issue.
> 2) The idle driver should be initialized if there is an idle state entry
>    method, which in this case is MCPM. If I boot vexpress with MCPM
>    enabled but legacy boot method (ie spin table) with a single kernel image
>    I do not want to warn if the idle states entry method (MCPM) can't be
>    initialized (and I do not want to get a warning if the idle driver is
>    triggering a mcpm_cpu_suspend), so Sudeep's patch is valid and I am
>    against adding a:
>
>    if (WARN_ON(!mcpm_is_available())
>
> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>    probed so mcpm_is_available() == true. If the firmware is borked
>    the idle states will be entered and we will notice there is something
>    wrong
>
> So overall I think Sudeep's patch is sound. I also think we should
> improve the way we detect if MCPM is available, and again, I think the
> CPU operations on arm64 are a good example that we can and we should
> replicate.

This patch disables CPUidle altogether, but shouldn't it just disable
the states that rely on MCPM?  IOW, C1 should still work just fine since
it doesn't use MCPM, right?  So, rather than failing the init, it should
just drop any MCPM states (e.g. set ->state_count = 1).

Kevin
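
For illustration, the alternative suggested above could look roughly like the following user-space sketch. It assumes that the first entry in the state table is the one that does not depend on MCPM; `struct idle_driver_model` and `limit_states_model()` are hypothetical names, not the kernel's `struct cpuidle_driver` or any existing helper.

```c
#include <stdbool.h>

#define MODEL_MAX_STATES 4

/* Hypothetical, simplified model of a cpuidle driver's state table. */
struct idle_driver_model {
	const char *state_names[MODEL_MAX_STATES];
	int state_count;	/* number of usable entries in state_names */
};

/*
 * Instead of failing init, keep only the first state when MCPM is
 * unavailable. Assumption: state 0 never calls into MCPM.
 */
void limit_states_model(struct idle_driver_model *drv, bool mcpm_available)
{
	if (!mcpm_available && drv->state_count > 1)
		drv->state_count = 1;
}
```

The trade-off is that the driver stays registered but exposes a single shallow state, so the governor still runs even though it has nothing to choose between.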
Daniel Lezcano Jan. 8, 2015, 8:51 p.m. UTC | #10
On 01/08/2015 09:27 PM, Kevin Hilman wrote:
> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
>
>> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>>
>> [...]
>>
>>>>> IMO, it would be better to be more strict with the mcpm
>>>>> initialization and not let the system boot if something is wrong with
>>>>> it which I believe is coming from the firmware and let the user to
>>>>> figure out what is really happening by letting him to disable mcpm in
>>>>> the kernel configuration (which in turn will disable cpuidle).
>>>>
>>>> Again I fully agree, but in this case I manually switched to legacy boot
>>>> mode on TC2 and used same kernel with MCPM config enabled. Do you mean
>>>> to say we should not support that even when developer understand the
>>>> consequence of that ?
>>>
>>> Well, I see there are the exynos5410/5420/5422. For the 5422 on
>>> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM
>>> does not work, hence cpuidle neither because of the firmware.
>>>
>>> Silently disabling cpuidle because mcpm did not initialize will hide the
>>> issue.
>>
>> No. MCPM *will* initialize, Sudeep's patch does not silently disable
>> CPUidle.
>> To put it differently MCPM will initialize if CCI is in the DT and it
>> is "available", so unless defined differently in the dts mcpm will be
>> available and CPUidle will be initialized (and break if there is an issue
>> with the platform FW/HW).
>>
>> I agree the mechanism to define if MCPM is available can be improved
>> but that's what it is at the moment.
>>
>> The problem here is to boot a platform with different boot methods
>> and still have a single kernel image.
>>
>>> I understand your point about switching to legacy without recompiling
>>> the kernel.
>>>
>>> I suggest we add a big fat WARN_ON when the mcpm initialization fails
>>> with your patch.
>>
>> I think there are multiple facets we are tackling at once here and they
>> should not be mixed.
>>
>> 1) We left static idle states there to cope with legacy DTBs that were
>>     published before we introduced idle states bindings. If we want to
>>     boot eg vexpress in legacy mode but single kernel image with MCPM on,
>>     we could remove the idle states in DT and the problem would be
>>     solved; we can't do that since we were forced to leave the static
>>     idle tables. Overall I think this is not the way to fix the issue.
>> 2) The idle driver should be initialized if there is an idle state entry
>>     method, which in this case is MCPM. If I boot vexpress with MCPM
>>     enabled but legacy boot method (ie spin table) with a single kernel image
>>     I do not want to warn if the idle states entry method (MCPM) can't be
>>     initialized (and I do not want to get a warning if the idle driver is
>>     triggering a mcpm_cpu_suspend), so Sudeep's patch is valid and I am
>>     against adding a:
>>
>>     if (WARN_ON(!mcpm_is_available())
>>
>> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>>     probed so mcpm_is_available() == true. If the firmware is borked
>>     the idle states will be entered and we will notice there is something
>>     wrong
>>
>> So overall I think Sudeep's patch is sound. I also think we should
>> improve the way we detect if MCPM is available, and again, I think the
>> CPU operations on arm64 are a good example that we can and we should
>> replicate.
>
> This patch disables CPUidle all together, but shouldn't it just disable
> the states that rely on MCPM?  IOW, C1 should still work just fine since
> it doesn't use MCPM, right?  So, rather than fail the init, it should
> just drop any MCPM states (e.g. set ->state_count = 1)

Well, that means we will have a cpuidle driver with only the WFI state,
which is already the default idle function when there is no cpuidle
driver (and without the governor math).
Sudeep Holla Jan. 9, 2015, 4:58 a.m. UTC | #11
Hi Kevin,

On Friday 09 January 2015 01:57 AM, Kevin Hilman wrote:
> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
>
>> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>>

[...]

>> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>> probed so mcpm_is_available() == true. If the firmware is borked
>> the idle states will be entered and we will notice there is
>> something wrong
>>
>> So overall I think Sudeep's patch is sound. I also think we should
>> improve the way we detect if MCPM is available, and again, I think
>> the CPU operations on arm64 are a good example that we can and we
>> should replicate.
>
> This patch disables CPUidle all together, but shouldn't it just
> disable the states that rely on MCPM?  IOW, C1 should still work just
> fine since it doesn't use MCPM, right?  So, rather than fail the
> init, it should just drop any MCPM states (e.g. set ->state_count =
> 1)
>

As Daniel pointed out, if there's no cpuidle driver, or if cpuidle fails
to choose a suitable idle state, we fall back to the default arch idle
method (arch_cpu_idle).

Regards,
Sudeep
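
The fallback described above can be modelled in a few lines. This is a sketch of the decision only, based on the statement in this message; `enum idle_path` and `pick_idle_path()` are hypothetical names and a negative `chosen_state` is an assumed way of representing "no suitable state was chosen".

```c
#include <stdbool.h>

/* Which idle routine ends up being used (hypothetical model). */
enum idle_path {
	IDLE_PATH_ARCH_DEFAULT,	/* models falling back to arch_cpu_idle */
	IDLE_PATH_CPUIDLE	/* models entering a cpuidle driver state */
};

/*
 * No registered driver, or no state chosen (modelled as a negative
 * index), both lead to the default arch idle method.
 */
enum idle_path pick_idle_path(bool driver_registered, int chosen_state)
{
	if (!driver_registered || chosen_state < 0)
		return IDLE_PATH_ARCH_DEFAULT;

	return IDLE_PATH_CPUIDLE;
}
```

Under this model, failing the driver init and keeping a single WFI-only state both end up in a plain WFI; the former simply skips the governor.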
Sudeep Holla Jan. 9, 2015, 5:01 a.m. UTC | #12
Hi Lorenzo,

On Thursday 08 January 2015 05:59 PM, Lorenzo Pieralisi wrote:
> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>
> [...]
>
>>>> IMO, it would be better to be more strict with the mcpm
>>>> initialization and not let the system boot if something is wrong with
>>>> it which I believe is coming from the firmware and let the user to
>>>> figure out what is really happening by letting him to disable mcpm in
>>>> the kernel configuration (which in turn will disable cpuidle).
>>>
>>> Again I fully agree, but in this case I manually switched to legacy boot
>>> mode on TC2 and used same kernel with MCPM config enabled. Do you mean
>>> to say we should not support that even when developer understand the
>>> consequence of that ?
>>
>> Well, I see there are the exynos5410/5420/5422. For the 5422 on
>> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM
>> does not work, hence cpuidle neither because of the firmware.
>>
>> Silently disabling cpuidle because mcpm did not initialize will hide the
>> issue.
>
> No. MCPM *will* initialize, Sudeep's patch does not silently disable
> CPUidle.
> To put it differently MCPM will initialize if CCI is in the DT and it
> is "available", so unless defined differently in the dts mcpm will be
> available and CPUidle will be initialized (and break if there is an issue
> with the platform FW/HW).
>
> I agree the mechanism to define if MCPM is available can be improved
> but that's what it is at the moment.
>
> The problem here is to boot a platform with different boot methods
> and still have a single kernel image.
>
>> I understand your point about switching to legacy without recompiling
>> the kernel.
>>
>> I suggest we add a big fat WARN_ON when the mcpm initialization fails
>> with your patch.
>
> I think there are multiple facets we are tackling at once here and they
> should not be mixed.
>

I agree, I could have been clearer about which problem I was fixing.

> 1) We left static idle states there to cope with legacy DTBs that were
>     published before we introduced idle states bindings. If we want to
>     boot eg vexpress in legacy mode but single kernel image with MCPM on,
>     we could remove the idle states in DT and the problem would be
>     solved; we can't do that since we were forced to leave the static
>     idle tables. Overall I think this is not the way to fix the issue.
> 2) The idle driver should be initialized if there is an idle state entry
>     method, which in this case is MCPM. If I boot vexpress with MCPM
>     enabled but legacy boot method (ie spin table) with a single kernel image
>     I do not want to warn if the idle states entry method (MCPM) can't be
>     initialized (and I do not want to get a warning if the idle driver is
>     triggering a mcpm_cpu_suspend), so Sudeep's patch is valid and I am
>     against adding a:
>
>     if (WARN_ON(!mcpm_is_available()))
>
> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>     probed so mcpm_is_available() == true. If the firmware is borked
>     the idle states will be entered and we will notice there is something
>     wrong
>
> So overall I think Sudeep's patch is sound. I also think we should
> improve the way we detect if MCPM is available, and again, I think the
> CPU operations on arm64 are a good example that we can and we should
> replicate.

Thanks for providing the detailed clarification.

Regards,
Sudeep
Kevin Hilman Jan. 9, 2015, 5:34 p.m. UTC | #13
Daniel Lezcano <daniel.lezcano@linaro.org> writes:

> On 01/08/2015 09:27 PM, Kevin Hilman wrote:
>> Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> writes:
>>
>>> On Thu, Jan 08, 2015 at 11:11:40AM +0000, Daniel Lezcano wrote:
>>>
>>> [...]
>>>
>>>>>> IMO, it would be better to be more strict with the mcpm
>>>>>> initialization and not let the system boot if something is wrong with
>>>>>> it, which I believe comes from the firmware, and let the user
>>>>>> figure out what is really happening by letting him disable mcpm in
>>>>>> the kernel configuration (which in turn will disable cpuidle).
>>>>>
>>>>> Again I fully agree, but in this case I manually switched to legacy boot
>>>>> mode on TC2 and used same kernel with MCPM config enabled. Do you mean
>>>>> to say we should not support that even when the developer understands
>>>>> the consequences?
>>>>
>>>> Well, I see there are the exynos5410/5420/5422. For the 5422 on
>>>> chromebook2 MCPM works well, IIUC. But for the 5422 on odroid-xu3, MCPM
>>>> does not work, and hence neither does cpuidle, because of the firmware.
>>>>
>>>> Silently disabling cpuidle because mcpm did not initialize will hide the
>>>> issue.
>>>
>>> No. MCPM *will* initialize, Sudeep's patch does not silently disable
>>> CPUidle.
>>> To put it differently MCPM will initialize if CCI is in the DT and it
>>> is "available", so unless defined differently in the dts mcpm will be
>>> available and CPUidle will be initialized (and break if there is an issue
>>> with the platform FW/HW).
>>>
>>> I agree the mechanism to define if MCPM is available can be improved
>>> but that's what it is at the moment.
>>>
>>> The problem here is to boot a platform with different boot methods
>>> and still have a single kernel image.
>>>
>>>> I understand your point about switching to legacy without recompiling
>>>> the kernel.
>>>>
>>>> I suggest we add a big fat WARN_ON when the mcpm initialization fails
>>>> with your patch.
>>>
>>> I think there are multiple facets we are tackling at once here and they
>>> should not be mixed.
>>>
>>> 1) We left static idle states there to cope with legacy DTBs that were
>>>     published before we introduced idle states bindings. If we want to
>>>     boot eg vexpress in legacy mode but single kernel image with MCPM on,
>>>     we could remove the idle states in DT and the problem would be
>>>     solved; we can't do that since we were forced to leave the static
>>>     idle tables. Overall I think this is not the way to fix the issue.
>>> 2) The idle driver should be initialized if there is an idle state entry
>>>     method, which in this case is MCPM. If I boot vexpress with MCPM
>>>     enabled but legacy boot method (ie spin table) with a single kernel image
>>>     I do not want to warn if the idle states entry method (MCPM) can't be
>>>     initialized (and I do not want to get a warning if the idle driver is
>>>     triggering a mcpm_cpu_suspend), so Sudeep's patch is valid and I am
>>>     against adding a:
>>>
>>>     if (WARN_ON(!mcpm_is_available()))
>>>
>>> 3) Sudeep's patch is not hiding anything. If CCI is in DT, CCI is
>>>     probed so mcpm_is_available() == true. If the firmware is borked
>>>     the idle states will be entered and we will notice there is something
>>>     wrong
>>>
>>> So overall I think Sudeep's patch is sound. I also think we should
>>> improve the way we detect if MCPM is available, and again, I think the
>>> CPU operations on arm64 are a good example that we can and we should
>>> replicate.
>>
>> This patch disables CPUidle altogether, but shouldn't it just disable
>> the states that rely on MCPM?  IOW, C1 should still work just fine since
>> it doesn't use MCPM, right?  So, rather than fail the init, it should
>> just drop any MCPM states (e.g. set ->state_count = 1)
>
> Well, that means we will have a cpuidle driver with the WFI state only
> which is the default idle function when there is no cpuidle driver (+
> without the governor math).

Ah, OK.  Then it makes more sense to disable the driver altogether.

Feel free to add

Acked-by: Kevin Hilman <khilman@linaro.org>

Thanks,

Kevin
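
For readers comparing the two options discussed above, here is a minimal
user-space sketch, not kernel code. It models the behaviour of the patch
(refuse to initialize when MCPM is unavailable) next to the alternative of
keeping only the WFI state. The struct and function names
(fake_idle_driver, init_strict, init_trimmed) and the assumed count of two
idle states are hypothetical and exist only for illustration; only
state_count, mcpm_is_available() and -EUNATCH come from the thread.

```c
/*
 * User-space model only; NOT the cpuidle-big_little driver.
 * fake_idle_driver, init_strict and init_trimmed are hypothetical names.
 */
#include <errno.h>
#include <stdbool.h>

#ifndef EUNATCH
#define EUNATCH 49	/* Linux value; fallback for hosts that lack it */
#endif

/* Stand-in for the few driver fields that matter in this discussion. */
struct fake_idle_driver {
	int state_count;	/* state 0 is WFI; deeper states need MCPM */
	bool registered;
};

/* Behaviour of the patch: do not register when MCPM is unavailable. */
static int init_strict(struct fake_idle_driver *drv, bool mcpm_available)
{
	if (!mcpm_available)
		return -EUNATCH;

	drv->state_count = 2;	/* assumed: WFI plus one MCPM-backed state */
	drv->registered = true;
	return 0;
}

/* Alternative raised in review: register, but keep only the WFI state. */
static int init_trimmed(struct fake_idle_driver *drv, bool mcpm_available)
{
	drv->state_count = mcpm_available ? 2 : 1;
	drv->registered = true;
	return 0;
}
```

With MCPM unavailable, init_trimmed() leaves a registered driver whose only
state is WFI. As noted above, that matches the default idle function plus
governor overhead, which is the argument for init_strict().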
Patch

diff --git a/drivers/cpuidle/cpuidle-big_little.c b/drivers/cpuidle/cpuidle-big_little.c
index e3e225fe6b45..40c34faffe59 100644
--- a/drivers/cpuidle/cpuidle-big_little.c
+++ b/drivers/cpuidle/cpuidle-big_little.c
@@ -182,6 +182,10 @@  static int __init bl_idle_init(void)
 	 */
 	if (!of_match_node(compatible_machine_match, root))
 		return -ENODEV;
+
+	if (!mcpm_is_available())
+		return -EUNATCH;
+
 	/*
 	 * For now the differentiation between little and big cores
 	 * is based on the part number. A7 cores are considered little