
ARM: OMAP4: cpuidle: fix: call cpu_cluster_pm_exit conditionally

Message ID 1377598250-2355-1-git-send-email-murzin.v@gmail.com (mailing list archive)
State New, archived

Commit Message

Vladimir Murzin Aug. 27, 2013, 10:10 a.m. UTC
We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
cpu_cluster_pm_exit is called without that check.

Because of that, an unhandled page fault may happen:

[    3.803405] Unable to handle kernel paging request at virtual address 00002500
[    3.810974] pgd = c0004000
[    3.813812] [00002500] *pgd=00000000
[    3.817596] Internal error: Oops: 5 [#1] SMP ARM
[    3.822418] Modules linked in:
[    3.825653] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc6+ #21
[    3.832397] task: ed86ef40 ti: ed896000 task.ti: ed896000
[    3.838073] PC is at irq_notifier+0x234/0x25c
[    3.842651] LR is at irq_notifier+0x218/0x25c
[    3.847229] pc : [<c0029ed8>]    lr : [<c0029ebc>]    psr: 80000193
[    3.847229] sp : ed897ee8  ip : 00000005  fp : 00000001
[    3.859283] r10: c0b395f0  r9 : c0b30594  r8 : c0b8c2ac
[    3.864776] r7 : ffffffff  r6 : 00000000  r5 : 00000005  r4 : 00000000
[    3.871643] r3 : 00002500  r2 : 00000000  r1 : 00000005  r0 : 44302244
[    3.878479] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
[    3.886260] Control: 10c5387d  Table: 8000404a  DAC: 00000015
[    3.892272] Process swapper/1 (pid: 0, stack limit = 0xed896240)
[    3.898590] Stack: (0xed897ee8 to 0xed898000)
[    3.903167] 7ee0:                   c0979c3a 00000001 ed897ef8 ed896000 c0014f7c 00000000
[    3.911743] 7f00: 00000005 00000000 ffffffff c0b8c2ac c0b395f0 c077c04c c0c94b48 c0b3953c
[    3.920318] 7f20: c0bcd928 00000002 c0b39524 c00cfad8 00000000 ffffffff 00000000 c00cfb10
[    3.928924] 7f40: c14e62c0 c002c1c8 c002c0ac c14e62c0 00000002 e251c37d 00000000 c0b39548
[    3.937499] 7f60: c0b395f0 c05a1bc4 e251c37d 00000000 00000005 c05a3870 edc90380 edc90380
[    3.946105] 7f80: edc90394 c14e62c0 c0b39548 00000002 c0784064 c05a3c78 c0b395e0 c14e62c0
[    3.954681] 7fa0: 00000002 c0b39548 c0bc9db8 00000000 00000001 c05a1dc0 ed896000 00000015
[    3.963287] 7fc0: c0bc9db8 ed896000 8000406a c0b30594 c0784064 c000e504 00000746 c007a528
[    3.971862] 7fe0: 00000001 0000001d 600001d3 c0bcc004 00000000 800086c4 ee0aa6a7 d2aabaa9
[    3.980499] [<c0029ed8>] (irq_notifier+0x234/0x25c) from [<c077c04c>] (notifier_call_chain+0x38/0x68)
[    3.990173] [<c077c04c>] (notifier_call_chain+0x38/0x68) from [<c00cfad8>] (cpu_pm_notify+0x20/0x38)
[    3.999786] [<c00cfad8>] (cpu_pm_notify+0x20/0x38) from [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50)
[    4.009399] [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50) from [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c)
[    4.020111] [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c) from [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec)
[    4.030822] [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec) from [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240)
[    4.041870] [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240) from [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228)
[    4.052947] [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228) from [<c000e504>] (arch_cpu_idle+0x8/0x38)
[    4.062499] [<c000e504>] (arch_cpu_idle+0x8/0x38) from [<c007a528>] (cpu_startup_entry+0x178/0x1e4)
[    4.071990] [<c007a528>] (cpu_startup_entry+0x178/0x1e4) from [<800086c4>] (0x800086c4)
[    4.080383] Code: e5922288 03a03b0a 13a03c25 e0823003 (e5932000)
[    4.086791] ---[ end trace d83954a84a6fa69e ]---

sar_base is expected to be initialized in irq_save_context, which
is called on the CPU_CLUSTER_PM_ENTER notification. If that notification
is missed and CPU_CLUSTER_PM_EXIT is received, sar_base is still NULL.

Fix it by calling cpu_cluster_pm_{enter,exit} under the same condition.

Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
---
 arch/arm/mach-omap2/cpuidle44xx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
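
The fault address in the oops can be cross-checked against the "Code:" line.
The last two words before the faulting instruction decode as ARM (A32)
instructions: e0823003 is "add r3, r2, r3" and e5932000 is "ldr r2, [r3]".
The two words before them (03a03b0a, 13a03c25) are conditional moves of an
immediate into r3. The sketch below decodes such an immediate; the helper
names are made up for illustration and are not kernel functions. With
r2 = 00000000 in the register dump, a decoded offset of 0x2500 is consistent
with a load from a NULL sar_base plus that offset.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is taken modulo 32). */
static uint32_t ror32(uint32_t v, unsigned n)
{
	n &= 31;
	return n ? (v >> n) | (v << (32 - n)) : v;
}

/*
 * Decode the immediate operand of an A32 data-processing instruction:
 * bits [7:0] hold an 8-bit value and bits [11:8] hold a rotation count
 * that is applied in steps of two bits.
 */
static uint32_t arm_dp_immediate(uint32_t insn)
{
	uint32_t imm8 = insn & 0xff;
	unsigned rot = (insn >> 8) & 0xf;

	return ror32(imm8, 2 * rot);
}
```

Calling arm_dp_immediate(0x13a03c25) yields 0x2500, which matches the
reported virtual address 00002500; the other candidate, 0x03a03b0a,
yields 0x2800.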

Comments

Kevin Hilman Aug. 27, 2013, 3:54 p.m. UTC | #1
Vladimir Murzin <murzin.v@gmail.com> writes:

> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
> cpu_cluster_pm_exit called without that check.
>
> Because of that unhandled page fault may happen:
>
> [    3.803405] Unable to handle kernel paging request at virtual address 00002500
> [    3.810974] pgd = c0004000
> [    3.813812] [00002500] *pgd=00000000
> [    3.817596] Internal error: Oops: 5 [#1] SMP ARM
> [    3.822418] Modules linked in:
> [    3.825653] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc6+ #21
> [    3.832397] task: ed86ef40 ti: ed896000 task.ti: ed896000
> [    3.838073] PC is at irq_notifier+0x234/0x25c
> [    3.842651] LR is at irq_notifier+0x218/0x25c
> [    3.847229] pc : [<c0029ed8>]    lr : [<c0029ebc>]    psr: 80000193
> [    3.847229] sp : ed897ee8  ip : 00000005  fp : 00000001
> [    3.859283] r10: c0b395f0  r9 : c0b30594  r8 : c0b8c2ac
> [    3.864776] r7 : ffffffff  r6 : 00000000  r5 : 00000005  r4 : 00000000
> [    3.871643] r3 : 00002500  r2 : 00000000  r1 : 00000005  r0 : 44302244
> [    3.878479] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> [    3.886260] Control: 10c5387d  Table: 8000404a  DAC: 00000015
> [    3.892272] Process swapper/1 (pid: 0, stack limit = 0xed896240)
> [    3.898590] Stack: (0xed897ee8 to 0xed898000)
> [    3.903167] 7ee0:                   c0979c3a 00000001 ed897ef8 ed896000 c0014f7c 00000000
> [    3.911743] 7f00: 00000005 00000000 ffffffff c0b8c2ac c0b395f0 c077c04c c0c94b48 c0b3953c
> [    3.920318] 7f20: c0bcd928 00000002 c0b39524 c00cfad8 00000000 ffffffff 00000000 c00cfb10
> [    3.928924] 7f40: c14e62c0 c002c1c8 c002c0ac c14e62c0 00000002 e251c37d 00000000 c0b39548
> [    3.937499] 7f60: c0b395f0 c05a1bc4 e251c37d 00000000 00000005 c05a3870 edc90380 edc90380
> [    3.946105] 7f80: edc90394 c14e62c0 c0b39548 00000002 c0784064 c05a3c78 c0b395e0 c14e62c0
> [    3.954681] 7fa0: 00000002 c0b39548 c0bc9db8 00000000 00000001 c05a1dc0 ed896000 00000015
> [    3.963287] 7fc0: c0bc9db8 ed896000 8000406a c0b30594 c0784064 c000e504 00000746 c007a528
> [    3.971862] 7fe0: 00000001 0000001d 600001d3 c0bcc004 00000000 800086c4 ee0aa6a7 d2aabaa9
> [    3.980499] [<c0029ed8>] (irq_notifier+0x234/0x25c) from [<c077c04c>] (notifier_call_chain+0x38/0x68)
> [    3.990173] [<c077c04c>] (notifier_call_chain+0x38/0x68) from [<c00cfad8>] (cpu_pm_notify+0x20/0x38)
> [    3.999786] [<c00cfad8>] (cpu_pm_notify+0x20/0x38) from [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50)
> [    4.009399] [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50) from [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c)
> [    4.020111] [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c) from [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec)
> [    4.030822] [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec) from [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240)
> [    4.041870] [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240) from [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228)
> [    4.052947] [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228) from [<c000e504>] (arch_cpu_idle+0x8/0x38)
> [    4.062499] [<c000e504>] (arch_cpu_idle+0x8/0x38) from [<c007a528>] (cpu_startup_entry+0x178/0x1e4)
> [    4.071990] [<c007a528>] (cpu_startup_entry+0x178/0x1e4) from [<800086c4>] (0x800086c4)
> [    4.080383] Code: e5922288 03a03b0a 13a03c25 e0823003 (e5932000)
> [    4.086791] ---[ end trace d83954a84a6fa69e ]---
>
> It is supposed that sar_base is initialized in irq_save_context, which
> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>
> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
>
> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>

Good catch.

Acked-by: Kevin Hilman <khilman@linaro.org>

> ---
>  arch/arm/mach-omap2/cpuidle44xx.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
> index c443f2e..4c8982a 100644
> --- a/arch/arm/mach-omap2/cpuidle44xx.c
> +++ b/arch/arm/mach-omap2/cpuidle44xx.c
> @@ -143,7 +143,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
>  	 * Call idle CPU cluster PM exit notifier chain
>  	 * to restore GIC and wakeupgen context.
>  	 */
> -	if ((cx->mpu_state == PWRDM_POWER_RET) &&
> +	if (dev->cpu == 0 && (cx->mpu_state == PWRDM_POWER_RET) &&
>  		(cx->mpu_logic_state == PWRDM_POWER_OFF))
>  		cpu_cluster_pm_exit();
Grygorii Strashko Aug. 29, 2013, 3:20 p.m. UTC | #2
Hi Vladimir, Kevin

On 08/27/2013 06:54 PM, Kevin Hilman wrote:
> Vladimir Murzin <murzin.v@gmail.com> writes:
>
>> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
>> cpu_cluster_pm_exit called without that check.
>>
>> Because of that unhandled page fault may happen:
>>

[...]

>>
>> It is supposed that sar_base is initialized in irq_save_context, which
>> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
>> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>>
>> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.

Could you check whether reverting the following commit solves the issue, please?
commit e7457253494fff660a72bc0cedeee97491ccd173
"ARM: OMAP4+: CPUidle: Deprecate use of omap4_mpuss_read_prev_context_state()"

>>
>> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
>
> Good catch.

Yes, but it seems that the CPUIdle logic is unclear for OMAP4.
The above issue may happen if CPU1 enters/exits LP while CPU0:
- does not enter at all (somewhere inside the "coupled" core);
- is still entering LP (somewhere before the call to omap4_enter_lowpower()).

The question is: should the first CPUx that exits from the LP (C3) state
restore the cluster context, or should that be done by CPU0 only?
(On OMAP4, CPUs may return from C3 asynchronously.)


>
> Acked-by: Kevin Hilman <khilman@linaro.org>
>
>> ---
>>   arch/arm/mach-omap2/cpuidle44xx.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
>> index c443f2e..4c8982a 100644
>> --- a/arch/arm/mach-omap2/cpuidle44xx.c
>> +++ b/arch/arm/mach-omap2/cpuidle44xx.c
>> @@ -143,7 +143,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
>>   	 * Call idle CPU cluster PM exit notifier chain
>>   	 * to restore GIC and wakeupgen context.
>>   	 */
>> -	if ((cx->mpu_state == PWRDM_POWER_RET) &&
>> +	if (dev->cpu == 0 && (cx->mpu_state == PWRDM_POWER_RET) &&
>>   		(cx->mpu_logic_state == PWRDM_POWER_OFF))
>>   		cpu_cluster_pm_exit();
Santosh Shilimkar Aug. 29, 2013, 3:26 p.m. UTC | #3
On Thursday 29 August 2013 11:20 AM, Strashko, Grygorii wrote:
> Hi Vladimir, Kevin
> 
> On 08/27/2013 06:54 PM, Kevin Hilman wrote:
>> Vladimir Murzin <murzin.v@gmail.com> writes:
>>
>>> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
>>> cpu_cluster_pm_exit called without that check.
>>>
>>> Because of that unhandled page fault may happen:
>>>
> 
> [...]
> 
>>>
>>> It is supposed that sar_base is initialized in irq_save_context, which
>>> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
>>> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>>>
>>> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
> 
> Could you check, if revert of the following patch will solve the issue, pls?
> commit e7457253494fff660a72bc0cedeee97491ccd173
> "ARM: OMAP4+: CPUidle: Deprecate use of omap4_mpuss_read_prev_context_state()"
> 
>>>
>>> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
>>
>> Good catch.
> 
> Yes, but it seems that the CPUIdle logic is unclear for OMAP4.
> The above issue may happen if CPU1 enter/exit LP while CPU0:
> - not enter at all (somewhere inside "coupled" core);
> - still entering LP (somewhere before call to omap4_enter_lowpower());
> 
> The question is - Should first CPUx, who exited from LP(C3) state,
> restore Cluster context, or it should be done by CPU0 only?
> (on OMAP4 CPUs may return from C3 async).
> 
> 
Looks like I missed this thread completely. The cluster restore was
deliberately not tied to any particular CPU, considering that some OMAPs
may support such a thing in the future. But for coupled idle we synchronize
the CPUs at idle exit anyway, so doing it either way should be fine.

I think the sar_base bug should be fixed regardless of whether we
change the idle code.

Regards,
Santosh
Grygorii Strashko Aug. 29, 2013, 4:03 p.m. UTC | #4
Hi 

On 08/29/2013 06:26 PM, Santosh Shilimkar wrote:
> On Thursday 29 August 2013 11:20 AM, Strashko, Grygorii wrote:
>> Hi Vladimir, Kevin
>>
>> On 08/27/2013 06:54 PM, Kevin Hilman wrote:
>>> Vladimir Murzin <murzin.v@gmail.com> writes:
>>>
>>>> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
>>>> cpu_cluster_pm_exit called without that check.
>>>>
>>>> Because of that unhandled page fault may happen:
>>>>
>>
>> [...]
>>
>>>>
>>>> It is supposed that sar_base is initialized in irq_save_context, which
>>>> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
>>>> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>>>>
>>>> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
>>
>> Could you check, if revert of the following patch will solve the issue, pls?
>> commit e7457253494fff660a72bc0cedeee97491ccd173
>> "ARM: OMAP4+: CPUidle: Deprecate use of omap4_mpuss_read_prev_context_state()"
>>
>>>>
>>>> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
>>>
>>> Good catch.
>>
>> Yes, but it seems that the CPUIdle logic is unclear for OMAP4.
>> The above issue may happen if CPU1 enter/exit LP while CPU0:
>> - not enter at all (somewhere inside "coupled" core);
>> - still entering LP (somewhere before call to omap4_enter_lowpower());
>>
>> The question is - Should first CPUx, who exited from LP(C3) state,
>> restore Cluster context, or it should be done by CPU0 only?
>> (on OMAP4 CPUs may return from C3 async).
>>
>>
> Looks like I missed this thread completely. The cluster restore
> on purpose was not tied to any CPU considering some OMAPs may
> support such thing in future but for coupled idle we anyway
> synchronize the CPUs at the idle exit so doing either way should be fine.
> 
> I think the sar_base bug should be fixed regardless of whether we
> change idle code.

Thanks for the clarification.

I know about the coupled CPU sync, but... CPUIdle is so fragile :)

So wouldn't it be good to fix/protect omap-wakeupgen.c/irq_sar_clear() too?

Like:
	if (!sar_base)
		return;

Also, without this patch cpu_cluster_pm_exit() will be called twice by each CPU.

No more questions about this patch.

> 
> Regards,
> Santosh

Regards,
-grygorii
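
The guard proposed in comment #4 can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: the model_* names,
the single status word, and the return codes are invented for illustration,
and the real irq_sar_clear() in omap-wakeupgen.c accesses SAR RAM through
MMIO helpers rather than a plain array.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's sar_base; NULL until the save path runs. */
static uint32_t *sar_base;

/* Hypothetical index of the backup status word inside the model RAM. */
#define MODEL_STATUS_WORD 0

/* Models the save path (CPU_CLUSTER_PM_ENTER): set up sar_base, mark the backup. */
static void model_irq_save_context(uint32_t *sar_ram)
{
	if (!sar_base)
		sar_base = sar_ram;
	sar_base[MODEL_STATUS_WORD] |= 1u;
}

/*
 * Models the restore path (CPU_CLUSTER_PM_EXIT). Returns -1 when nothing
 * was saved, instead of dereferencing a NULL base pointer.
 */
static int model_irq_sar_clear(void)
{
	if (!sar_base)
		return -1;	/* the proposed guard */
	sar_base[MODEL_STATUS_WORD] &= ~1u;
	return 0;
}
```

In this model, calling model_irq_sar_clear() before model_irq_save_context()
returns -1 without touching memory, which corresponds to the case where the
exit notification arrives without a preceding enter notification.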
Kevin Hilman Aug. 29, 2013, 5:15 p.m. UTC | #5
+Santosh

"Strashko, Grygorii" <grygorii.strashko@ti.com> writes:

> Hi Vladimir, Kevin
>
> On 08/27/2013 06:54 PM, Kevin Hilman wrote:
>> Vladimir Murzin <murzin.v@gmail.com> writes:
>>
>>> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
>>> cpu_cluster_pm_exit called without that check.
>>>
>>> Because of that unhandled page fault may happen:
>>>
>
> [...]
>
>>>
>>> It is supposed that sar_base is initialized in irq_save_context, which
>>> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
>>> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>>>
>>> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
>
> Could you check, if revert of the following patch will solve the issue, pls?
> commit e7457253494fff660a72bc0cedeee97491ccd173
> "ARM: OMAP4+: CPUidle: Deprecate use of omap4_mpuss_read_prev_context_state()"
>
>>>
>>> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
>>
>> Good catch.
>
> Yes, but it seems that the CPUIdle logic is unclear for OMAP4.
> The above issue may happen if CPU1 enter/exit LP while CPU0:
> - not enter at all (somewhere inside "coupled" core);
> - still entering LP (somewhere before call to omap4_enter_lowpower());
>
> The question is - Should first CPUx, who exited from LP(C3) state,
> restore Cluster context, or it should be done by CPU0 only?
> (on OMAP4 CPUs may return from C3 async).

Well, they're not *supposed* to return async on OMAP4.  IIUC, only CPU0
wakes up and then it's CPU0's job to wake up CPU1. However, the crash
reported here certainly suggests that CPU1 is exiting before CPU0, so
one of the possibilities you suggest above is probably happening (I
suspect the latter.)

It looks like we might still need to check the actual hardware state
there to avoid those potential cases.

Alternatively, another solution might be to use
cpuidle_coupled_parallel_barrier() before the cpu_cluster_pm_exit() call.

Kevin
Santosh Shilimkar Aug. 29, 2013, 5:29 p.m. UTC | #6
On Thursday 29 August 2013 01:15 PM, Kevin Hilman wrote:
> +Santosh
> 
> "Strashko, Grygorii" <grygorii.strashko@ti.com> writes:
> 
>> Hi Vladimir, Kevin
>>
>> On 08/27/2013 06:54 PM, Kevin Hilman wrote:
>>> Vladimir Murzin <murzin.v@gmail.com> writes:
>>>
>>>> We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
>>>> cpu_cluster_pm_exit called without that check.
>>>>
>>>> Because of that unhandled page fault may happen:
>>>>
>>
>> [...]
>>
>>>>
>>>> It is supposed that sar_base is initialized in irq_save_context, which
>>>> is called on CPU_CLUSTER_PM_ENTER notification. If this notification
>>>> has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
>>>>
>>>> Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
>>
>> Could you check, if revert of the following patch will solve the issue, pls?
>> commit e7457253494fff660a72bc0cedeee97491ccd173
>> "ARM: OMAP4+: CPUidle: Deprecate use of omap4_mpuss_read_prev_context_state()"
>>
>>>>
>>>> Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
>>>
>>> Good catch.
>>
>> Yes, but it seems that the CPUIdle logic is unclear for OMAP4.
>> The above issue may happen if CPU1 enter/exit LP while CPU0:
>> - not enter at all (somewhere inside "coupled" core);
>> - still entering LP (somewhere before call to omap4_enter_lowpower());
>>
>> The question is - Should first CPUx, who exited from LP(C3) state,
>> restore Cluster context, or it should be done by CPU0 only?
>> (on OMAP4 CPUs may return from C3 async).
> 
> Well, they're not *supposed* to return async on OMAP4.  IIUC, only CPU0
> wakes up and then it's CPU0's job to wake up CPU1. However, the crash
> reported here certainly suggests that CPU1 is exiting before CPU0, so
> one of the possibilities you suggest above is probably happening (I
> suspect the latter.)
> 
> It looks like we might still need to check the actual hardware state
> there to avoid those potential cases.
> 
The subject patch is good enough to avoid the double notifier call chain;
even though that is not harmful, it is unnecessary.

And then the sar_base check should be in place as well to avoid the
reported issue.

Regards,
Santosh
Tony Lindgren Sept. 17, 2013, 11:47 p.m. UTC | #7
* Kevin Hilman <khilman@linaro.org> [130827 09:01]:
> Vladimir Murzin <murzin.v@gmail.com> writes:
> 
> > We call cpu_cluster_pm_enter for dev->cpu == 0 only, but
> > cpu_cluster_pm_exit called without that check.
> >
> > Because of that unhandled page fault may happen:
> >
> > [    3.803405] Unable to handle kernel paging request at virtual address 00002500
> > [    3.810974] pgd = c0004000
> > [    3.813812] [00002500] *pgd=00000000
> > [    3.817596] Internal error: Oops: 5 [#1] SMP ARM
> > [    3.822418] Modules linked in:
> > [    3.825653] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.11.0-rc6+ #21
> > [    3.832397] task: ed86ef40 ti: ed896000 task.ti: ed896000
> > [    3.838073] PC is at irq_notifier+0x234/0x25c
> > [    3.842651] LR is at irq_notifier+0x218/0x25c
> > [    3.847229] pc : [<c0029ed8>]    lr : [<c0029ebc>]    psr: 80000193
> > [    3.847229] sp : ed897ee8  ip : 00000005  fp : 00000001
> > [    3.859283] r10: c0b395f0  r9 : c0b30594  r8 : c0b8c2ac
> > [    3.864776] r7 : ffffffff  r6 : 00000000  r5 : 00000005  r4 : 00000000
> > [    3.871643] r3 : 00002500  r2 : 00000000  r1 : 00000005  r0 : 44302244
> > [    3.878479] Flags: Nzcv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> > [    3.886260] Control: 10c5387d  Table: 8000404a  DAC: 00000015
> > [    3.892272] Process swapper/1 (pid: 0, stack limit = 0xed896240)
> > [    3.898590] Stack: (0xed897ee8 to 0xed898000)
> > [    3.903167] 7ee0:                   c0979c3a 00000001 ed897ef8 ed896000 c0014f7c 00000000
> > [    3.911743] 7f00: 00000005 00000000 ffffffff c0b8c2ac c0b395f0 c077c04c c0c94b48 c0b3953c
> > [    3.920318] 7f20: c0bcd928 00000002 c0b39524 c00cfad8 00000000 ffffffff 00000000 c00cfb10
> > [    3.928924] 7f40: c14e62c0 c002c1c8 c002c0ac c14e62c0 00000002 e251c37d 00000000 c0b39548
> > [    3.937499] 7f60: c0b395f0 c05a1bc4 e251c37d 00000000 00000005 c05a3870 edc90380 edc90380
> > [    3.946105] 7f80: edc90394 c14e62c0 c0b39548 00000002 c0784064 c05a3c78 c0b395e0 c14e62c0
> > [    3.954681] 7fa0: 00000002 c0b39548 c0bc9db8 00000000 00000001 c05a1dc0 ed896000 00000015
> > [    3.963287] 7fc0: c0bc9db8 ed896000 8000406a c0b30594 c0784064 c000e504 00000746 c007a528
> > [    3.971862] 7fe0: 00000001 0000001d 600001d3 c0bcc004 00000000 800086c4 ee0aa6a7 d2aabaa9
> > [    3.980499] [<c0029ed8>] (irq_notifier+0x234/0x25c) from [<c077c04c>] (notifier_call_chain+0x38/0x68)
> > [    3.990173] [<c077c04c>] (notifier_call_chain+0x38/0x68) from [<c00cfad8>] (cpu_pm_notify+0x20/0x38)
> > [    3.999786] [<c00cfad8>] (cpu_pm_notify+0x20/0x38) from [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50)
> > [    4.009399] [<c00cfb10>] (cpu_cluster_pm_exit+0x20/0x50) from [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c)
> > [    4.020111] [<c002c1c8>] (omap_enter_idle_coupled+0x11c/0x14c) from [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec)
> > [    4.030822] [<c05a1bc4>] (cpuidle_enter_state+0x40/0xec) from [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240)
> > [    4.041870] [<c05a3c78>] (cpuidle_enter_state_coupled+0x1f4/0x240) from [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228)
> > [    4.052947] [<c05a1dc0>] (cpuidle_idle_call+0x150/0x228) from [<c000e504>] (arch_cpu_idle+0x8/0x38)
> > [    4.062499] [<c000e504>] (arch_cpu_idle+0x8/0x38) from [<c007a528>] (cpu_startup_entry+0x178/0x1e4)
> > [    4.071990] [<c007a528>] (cpu_startup_entry+0x178/0x1e4) from [<800086c4>] (0x800086c4)
> > [    4.080383] Code: e5922288 03a03b0a 13a03c25 e0823003 (e5932000)
> > [    4.086791] ---[ end trace d83954a84a6fa69e ]---
> >
> > It is supposed that sar_base is initialized in irq_save_context, which
> > is called on CPU_CLUSTER_PM_ENTER notification. If this notification
> > has been missed and CPU_CLUSTER_PM_EXIT is received sar_base is NULL.
> >
> > Fix it by calling CPU_CLUSTER_PM_{ENTER,EXIT} under the same condition.
> >
> > Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
> 
> Good catch.
> 
> Acked-by: Kevin Hilman <khilman@linaro.org>

Thanks, applying into omap-for-v3.12/fixes.

Tony
 
> > ---
> >  arch/arm/mach-omap2/cpuidle44xx.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
> > index c443f2e..4c8982a 100644
> > --- a/arch/arm/mach-omap2/cpuidle44xx.c
> > +++ b/arch/arm/mach-omap2/cpuidle44xx.c
> > @@ -143,7 +143,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
> >  	 * Call idle CPU cluster PM exit notifier chain
> >  	 * to restore GIC and wakeupgen context.
> >  	 */
> > -	if ((cx->mpu_state == PWRDM_POWER_RET) &&
> > +	if (dev->cpu == 0 && (cx->mpu_state == PWRDM_POWER_RET) &&
> >  		(cx->mpu_logic_state == PWRDM_POWER_OFF))
> >  		cpu_cluster_pm_exit();

Patch

diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidle44xx.c
index c443f2e..4c8982a 100644
--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -143,7 +143,7 @@  static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 	 * Call idle CPU cluster PM exit notifier chain
 	 * to restore GIC and wakeupgen context.
 	 */
-	if ((cx->mpu_state == PWRDM_POWER_RET) &&
+	if (dev->cpu == 0 && (cx->mpu_state == PWRDM_POWER_RET) &&
 		(cx->mpu_logic_state == PWRDM_POWER_OFF))
 		cpu_cluster_pm_exit();
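
The effect of the one-line change can be illustrated with a small user-space
model. It is a sketch, not the driver code: the struct, the counters, and the
boolean parameters are hypothetical stand-ins for dev->cpu, cx->mpu_state ==
PWRDM_POWER_RET and cx->mpu_logic_state == PWRDM_POWER_OFF. The point it
shows is that a single predicate decides both the enter and the exit call, so
the two can no longer get out of step.

```c
#include <stdbool.h>

/* Hypothetical counters standing in for the cluster PM notifier chain. */
struct cluster_model {
	int enter_calls;
	int exit_calls;
};

/* One predicate decides both the enter and the exit notification. */
static bool cluster_ctx_needed(int cpu, bool mpu_ret, bool logic_off)
{
	return cpu == 0 && mpu_ret && logic_off;
}

static void model_enter_idle(struct cluster_model *m, int cpu,
			     bool mpu_ret, bool logic_off)
{
	bool cluster = cluster_ctx_needed(cpu, mpu_ret, logic_off);

	if (cluster)
		m->enter_calls++;	/* stands in for cpu_cluster_pm_enter() */

	/* ... the low-power state would be entered and left here ... */

	if (cluster)
		m->exit_calls++;	/* stands in for cpu_cluster_pm_exit() */
}
```

Running the model once for CPU 0 and once for CPU 1 with both flags set
leaves enter_calls and exit_calls equal, whereas dropping the cpu == 0 test
from the exit side only would let CPU 1 produce an exit without an enter.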