exynos5800-peach-pi: suspend/resume (still) broken

Message ID 550C47ED.3010301@collabora.co.uk (mailing list archive)
State New, archived

Commit Message

Javier Martinez Canillas March 20, 2015, 4:16 p.m. UTC
Hello Abhilash,

On 03/20/2015 03:23 PM, Abhilash Kesavan wrote:
>> On 03/17/2015 06:35 PM, Kevin Hilman wrote:
>>>
>>> Anyone else having better luck with suspend/resume on peach-pi?
>>>
>>
>> # echo +2 > /sys/class/rtc/rtc0/wakealarm && echo mem > /sys/power/state
>>
>> Suspend and CPUs shutdown seems to succeed according to [0] but the system
>> never wakes up...
>>
>> I also tried to wakeup the system with the keyboard and the trackpad that is
>> a wake up source but it does not work either.
>>
>> I remember that when the 5420 s2r support series were posted, aclk200_disp1
>> and aclk300_disp1 clocks needed to be marked as CLK_IGNORE_UNUSED but afaiu
>> that was only because display support was not yet merged but it is now.
>>
>> I tried anyways both marking those clocks as CLK_IGNORE_UNUSED and passing
>> the clk_ignore_unused to the kernel command line but did not work either.
>>
>> Abhilash, Vikas, Pankaj,
>>
>> Any ideas of what could be causing this regression? It seems that by the
>> time the Exynos5420 S2R support landed in mainline, it was already not
>> working which makes it hard to bisect what caused the issue.
> 
> I remember the Pi power LED changing color from blue on suspend. Does

Thanks a lot for answering. Who manages that LED? Is it the kernel or the
firmware in the EC? I tried suspend to RAM using the ChromeOS 3.8 kernel and
I see that the blue LED is indeed turned off on suspend, but that does not
happen in mainline.

> that happen ? I'll try reproducing the issue and then probably use an
> old working s2r branch in one of my local repos to track this down.
>

If I check out mainline with HEAD at your commit adc548d77c22
("ARM: EXYNOS: Use MCPM call-backs to support S2R on exynos5420") + the
patch you mentioned back then to keep the aclk200_disp1 and aclk300_disp1
clocks enabled even when not used [0], I have S2R working. But even with
that commit, I don't see the blue LED turn off as it does with
the ChromeOS 3.8 kernel.
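
For readers unfamiliar with the flag: the clock framework gates clocks that
have no users late in boot, and it skips any clock marked CLK_IGNORE_UNUSED,
which is what the patch in [0] relies on. A minimal userspace sketch of that
rule follows. It is not kernel code; every toy_* name and the
TOY_CLK_IGNORE_UNUSED value are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's CLK_IGNORE_UNUSED; the bit value is arbitrary. */
#define TOY_CLK_IGNORE_UNUSED (1u << 0)

/* Hypothetical, simplified view of a gate clock. */
struct toy_gate {
	const char *name;
	unsigned int flags;
	unsigned int enable_count;	/* number of active users */
	bool gate_open;			/* true while the clock is running */
};

/*
 * Close the gate of every clock that has no users, unless the clock is
 * flagged to be left alone. Returns how many gates were closed.
 */
size_t toy_disable_unused(struct toy_gate *clks, size_t n)
{
	size_t closed = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_gate *c = &clks[i];

		if (c->enable_count > 0)
			continue;	/* still in use */
		if (c->flags & TOY_CLK_IGNORE_UNUSED)
			continue;	/* explicitly exempted */
		if (c->gate_open) {
			c->gate_open = false;
			closed++;
		}
	}
	return closed;
}
```

In this model an unused clock carrying the flag keeps running, while an
unused clock without it is gated. Passing clk_ignore_unused on the kernel
command line corresponds to skipping the whole pass.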

So I think you can use that as a base. I tried bisecting but it is tricky
due to other issues masking the S2R regression. I also tried to compare the

Comments

Abhilash Kesavan March 20, 2015, 4:29 p.m. UTC | #1
Hi Javier,

On Fri, Mar 20, 2015 at 9:46 PM, Javier Martinez Canillas
<javier.martinez@collabora.co.uk> wrote:
> Hello Abhilash,
>
> On 03/20/2015 03:23 PM, Abhilash Kesavan wrote:
>>> On 03/17/2015 06:35 PM, Kevin Hilman wrote:
>>>>
>>>> Anyone else having better luck with suspend/resume on peach-pi?
>>>>
>>>
>>> # echo +2 > /sys/class/rtc/rtc0/wakealarm && echo mem > /sys/power/state
>>>
>>> Suspend and CPUs shutdown seems to succeed according to [0] but the system
>>> never wakes up...
>>>
>>> I also tried to wakeup the system with the keyboard and the trackpad that is
>>> a wake up source but it does not work either.
>>>
>>> I remember that when the 5420 s2r support series were posted, aclk200_disp1
>>> and aclk300_disp1 clocks needed to be marked as CLK_IGNORE_UNUSED but afaiu
>>> that was only because display support was not yet merged but it is now.
>>>
>>> I tried anyways both marking those clocks as CLK_IGNORE_UNUSED and passing
>>> the clk_ignore_unused to the kernel command line but did not work either.
>>>
>>> Abhilash, Vikas, Pankaj,
>>>
>>> Any ideas of what could be causing this regression? It seems that by the
>>> time the Exynos5420 S2R support landed in mainline, it was already not
>>> working which makes it hard to bisect what caused the issue.
>>
>> I remember the Pi power LED changing color from blue on suspend. Does
>
> Thanks a lot for answering. Who manages that LED? is the kernel or the
> firwmare in the EC? I tried suspend to ram using ChromeOS 3.8 kernel and
> I see that the blue LED is indeed turned off on suspend but that does not
> happen in mainline.

I am not too sure. It was just something I remembered from earlier.

>
>> that happen ? I'll try reproducing the issue and then probably use an
>> old working s2r branch in one of my local repos to track this down.
>>
>
> If I checkout mainline with HEAD in your commit adc548d77c22
> ("ARM: EXYNOS: Use MCPM call-backs to support S2R on exynos5420") + the
> patch you mentioned back then to keep the aclk200_disp1 and aclk300_disp1
> clocks enabled even when not used [0], I have S2R working. But even with
> that commit, I don't see the blue LED to be turn off like is the case in
> the ChromeOS 3.8 kernel.
>
> So I think you can use that as a base. I tried bisecting but it is tricky
> due other issues masking the S2R regression. I also tried to compare the
> diff between adc548d77c22 and v3.19-rc1 that is the first known bad afaict
> but didn't find any relevant either.
>
> By adding printouts I can tell that all the CPUs enter exynos_power_down()
> in arch/arm/mach-exynos/mcpm-exynos.c and also the last man disables the
> cluster for both Cortex A-15 and A-7 clusters.
>
> So it seems that the problem is on the resume path.

I have made some progress on this. This is the current state:

If I use next-20141114 (which was when the S2R code first appeared in
linux-next), then all is good. next-20141117 is fine too but things
are broken in next-20141118.
I have narrowed it down to the commit: "ae43b32 ARM: 8202/1:
dmaengine: pl330: Add runtime Power Management support v12". The only
way I can see this affecting S2R is that it disables the DMA pclk during
or before suspend.
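
The narrowing above (good at next-20141114 and next-20141117, bad at
next-20141118) is a search for the first bad snapshot in an ordered series.
The message does not say which tool was used; the sketch below only
illustrates the binary-search idea behind that kind of hunt, assuming every
snapshot after the first bad one is also bad. All toy_* names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Caller-supplied test: returns true if the snapshot at this index is broken. */
typedef bool (*toy_is_bad_fn)(size_t index, void *ctx);

/*
 * Given a known-good index and a later known-bad index, return the first
 * bad index. Each iteration halves the remaining range.
 */
size_t toy_first_bad(size_t good, size_t bad, toy_is_bad_fn is_bad, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_bad(mid, ctx))
			bad = mid;	/* breakage is at mid or earlier */
		else
			good = mid;	/* breakage is after mid */
	}
	return bad;
}

/* Demo predicate: ctx points at the index where the breakage was introduced. */
bool toy_bad_from(size_t index, void *ctx)
{
	return index >= *(const size_t *)ctx;
}
```

On real hardware the predicate would be "boot this snapshot and try a
suspend/resume cycle", which is why unrelated breakage in the same range
makes the search unreliable.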

Checking further, will update in a bit.

Regards,
Abhilash
>
>> Regards,
>> Abhilash
>>>
>
> Best regards,
> Javier
>
> [0]:
> diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c
> index 848d602efc06..d8b66339d564 100644
> --- a/drivers/clk/samsung/clk-exynos5420.c
> +++ b/drivers/clk/samsung/clk-exynos5420.c
> @@ -932,14 +932,14 @@ static struct samsung_gate_clock exynos5x_gate_clks[] __initdata = {
>         GATE(0, "aclk400_mscl", "mout_user_aclk400_mscl",
>                         GATE_BUS_TOP, 17, 0, 0),
>         GATE(0, "aclk200_disp1", "mout_user_aclk200_disp1",
> -                       GATE_BUS_TOP, 18, 0, 0),
> +                       GATE_BUS_TOP, 18, CLK_IGNORE_UNUSED, 0),
>         GATE(CLK_SCLK_MPHY_IXTAL24, "sclk_mphy_ixtal24", "mphy_refclk_ixtal24",
>                         GATE_BUS_TOP, 28, 0, 0),
>         GATE(CLK_SCLK_HSIC_12M, "sclk_hsic_12m", "ff_hsic_12m",
>                         GATE_BUS_TOP, 29, 0, 0),
>
>         GATE(0, "aclk300_disp1", "mout_user_aclk300_disp1",
> -                       SRC_MASK_TOP2, 24, 0, 0),
> +                       SRC_MASK_TOP2, 24, CLK_IGNORE_UNUSED, 0),
>
>         GATE(CLK_MAU_EPLL, "mau_epll", "mout_mau_epll_clk",
>                         SRC_MASK_TOP7, 20, 0, 0),
>
Abhilash Kesavan March 20, 2015, 5:40 p.m. UTC | #2
Hi,

On Fri, Mar 20, 2015 at 9:59 PM, Abhilash Kesavan
<kesavan.abhilash@gmail.com> wrote:
> Hi Javier,
>
> On Fri, Mar 20, 2015 at 9:46 PM, Javier Martinez Canillas
> <javier.martinez@collabora.co.uk> wrote:
>> Hello Abhilash,
>>
>> On 03/20/2015 03:23 PM, Abhilash Kesavan wrote:
>>>> On 03/17/2015 06:35 PM, Kevin Hilman wrote:
>>>>>
>>>>> Anyone else having better luck with suspend/resume on peach-pi?
>>>>>
>>>>
>>>> # echo +2 > /sys/class/rtc/rtc0/wakealarm && echo mem > /sys/power/state
>>>>
>>>> Suspend and CPUs shutdown seems to succeed according to [0] but the system
>>>> never wakes up...
>>>>
>>>> I also tried to wakeup the system with the keyboard and the trackpad that is
>>>> a wake up source but it does not work either.
>>>>
>>>> I remember that when the 5420 s2r support series were posted, aclk200_disp1
>>>> and aclk300_disp1 clocks needed to be marked as CLK_IGNORE_UNUSED but afaiu
>>>> that was only because display support was not yet merged but it is now.
>>>>
>>>> I tried anyways both marking those clocks as CLK_IGNORE_UNUSED and passing
>>>> the clk_ignore_unused to the kernel command line but did not work either.
>>>>
>>>> Abhilash, Vikas, Pankaj,
>>>>
>>>> Any ideas of what could be causing this regression? It seems that by the
>>>> time the Exynos5420 S2R support landed in mainline, it was already not
>>>> working which makes it hard to bisect what caused the issue.
>>>
>>> I remember the Pi power LED changing color from blue on suspend. Does
>>
>> Thanks a lot for answering. Who manages that LED? is the kernel or the
>> firwmare in the EC? I tried suspend to ram using ChromeOS 3.8 kernel and
>> I see that the blue LED is indeed turned off on suspend but that does not
>> happen in mainline.
>
> I am not too sure. It was just something I remembered from earlier.
>
>>
>>> that happen ? I'll try reproducing the issue and then probably use an
>>> old working s2r branch in one of my local repos to track this down.
>>>
>>
>> If I checkout mainline with HEAD in your commit adc548d77c22
>> ("ARM: EXYNOS: Use MCPM call-backs to support S2R on exynos5420") + the
>> patch you mentioned back then to keep the aclk200_disp1 and aclk300_disp1
>> clocks enabled even when not used [0], I have S2R working. But even with
>> that commit, I don't see the blue LED to be turn off like is the case in
>> the ChromeOS 3.8 kernel.
>>
>> So I think you can use that as a base. I tried bisecting but it is tricky
>> due other issues masking the S2R regression. I also tried to compare the
>> diff between adc548d77c22 and v3.19-rc1 that is the first known bad afaict
>> but didn't find any relevant either.
>>
>> By adding printouts I can tell that all the CPUs enter exynos_power_down()
>> in arch/arm/mach-exynos/mcpm-exynos.c and also the last man disables the
>> cluster for both Cortex A-15 and A-7 clusters.
>>
>> So it seems that the problem is on the resume path.
>
> I have made some progress on this. This is the current state:
>
> If I use next-20141114 (which was when the S2R code first appeared in
> linux-next), then all is good. next-20141117 is fine too but things
> are broken in next-20141118.
> I have narrowed it down to the commit: "ae43b32 ARM: 8202/1:
> dmaengine: pl330: Add runtime Power Management support v12". The only
> way I see this impacting s2r is because it disables the dma pclk while
> suspending or before.
>
> Checking further, will update in a bit.

OK, so disabling the mdma0 node in "arch/arm/boot/dts/exynos5420.dtsi"
gets things working. As Kevin mentioned in the initial report, I also
need to disable DRM, otherwise there is a crash while suspending. With
these two changes, S2R works fine on Linus' tree and on kgene's for-next.

On linux-next, I need to disable CONFIG_MWIFIEX too.

Also, I observe cros-ec-spi transfer failures during resume, and
sometimes it is unable to re-enable the tps FETs, causing a crash.
However, that would be a driver-specific issue.

Regarding the mdma0 disablement, it looks like for the system to
suspend properly the mdma0 pclk needs to stay on.

Regards,
Abhilash
>>
>>> Regards,
>>> Abhilash
>>>>
>>
>> Best regards,
>> Javier
>>
>> [0]:
>> diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c
>> index 848d602efc06..d8b66339d564 100644
>> --- a/drivers/clk/samsung/clk-exynos5420.c
>> +++ b/drivers/clk/samsung/clk-exynos5420.c
>> @@ -932,14 +932,14 @@ static struct samsung_gate_clock exynos5x_gate_clks[] __initdata = {
>>         GATE(0, "aclk400_mscl", "mout_user_aclk400_mscl",
>>                         GATE_BUS_TOP, 17, 0, 0),
>>         GATE(0, "aclk200_disp1", "mout_user_aclk200_disp1",
>> -                       GATE_BUS_TOP, 18, 0, 0),
>> +                       GATE_BUS_TOP, 18, CLK_IGNORE_UNUSED, 0),
>>         GATE(CLK_SCLK_MPHY_IXTAL24, "sclk_mphy_ixtal24", "mphy_refclk_ixtal24",
>>                         GATE_BUS_TOP, 28, 0, 0),
>>         GATE(CLK_SCLK_HSIC_12M, "sclk_hsic_12m", "ff_hsic_12m",
>>                         GATE_BUS_TOP, 29, 0, 0),
>>
>>         GATE(0, "aclk300_disp1", "mout_user_aclk300_disp1",
>> -                       SRC_MASK_TOP2, 24, 0, 0),
>> +                       SRC_MASK_TOP2, 24, CLK_IGNORE_UNUSED, 0),
>>
>>         GATE(CLK_MAU_EPLL, "mau_epll", "mout_mau_epll_clk",
>>                         SRC_MASK_TOP7, 20, 0, 0),
>>
Javier Martinez Canillas March 20, 2015, 5:52 p.m. UTC | #3
Hello Abhilash,

On 03/20/2015 06:40 PM, Abhilash Kesavan wrote:
>>
>> I have made some progress on this. This is the current state:
>>
>> If I use next-20141114 (which was when the S2R code first appeared in
>> linux-next), then all is good. next-20141117 is fine too but things
>> are broken in next-20141118.
>> I have narrowed it down to the commit: "ae43b32 ARM: 8202/1:
>> dmaengine: pl330: Add runtime Power Management support v12". The only
>> way I see this impacting s2r is because it disables the dma pclk while
>> suspending or before.
>>
>> Checking further, will update in a bit.
> 
> OK, so disabling the mdma0 node in "arch/arm/boot/dts/exynos5420.dtsi"
> gets things working. Like Kevin mentioned in the initial report, I
> need to disable DRM else there is a crash while suspending. With these
> two changes, on linus' tree and kgene's for-next s2r works fine.
>

Awesome, thanks a lot for digging this out!
 
> On linux-next, I need to disable CONFIG_MWIFIEX too.
> 

Yes, I also saw that issue with mwifiex on suspend-to-idle (which works
in -next) due to the MMC_PM_KEEP_POWER flag being set in the host pm caps.
But I don't see that being set in the host driver.

> Also, I observe cros-ec-spi transfer failures during resume and
> sometimes it is unable to re-enable the tps fets causing a crash.
> However, that would be a driver specific issue.
> 

Indeed, I can take a look at that issue as well.

> Regarding the mdma0 disablement, it looks like for the system to
> suspend properly the mdma0 pclk needs to stay on.
>

It seems so. I remember we had other issues with the mentioned commit
due to clocks being gated. For example, the mau_epll clock was needed
to access the audss block registers and caused a boot hang on Exynos5420.

That specific issue was fixed by f1e9203e2366 ("clk: samsung: Fix Exynos
5420 pinctrl setup and clock disable failure due to domain being gated")
which solves it by enabling the mau_epll on probe.
 
> Regards,
> Abhilash
>>>

Best regards,
Javier
Javier Martinez Canillas March 27, 2015, 1:29 p.m. UTC | #4
Hello Abhilash,

On 03/20/2015 06:40 PM, Abhilash Kesavan wrote:
> 
> Regarding the mdma0 disablement, it looks like for the system to
> suspend properly the mdma0 pclk needs to stay on.
> 

I had time today again to work on this issue and the best
place I found to enable and disable the mdma0 clock is in
exynos5420_pm_{prepare,resume}. Please let me know if you
have a better idea of where the clock should be managed.
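
To make the idea concrete, here is a toy userspace model of holding an extra
reference on a clock across system suspend, so that a driver dropping its own
reference cannot gate it. This is not the RFC patch and not kernel code; the
toy_* names are hypothetical, and the two hooks are only loosely named after
exynos5420_pm_{prepare,resume}.

```c
#include <stdbool.h>

/* Toy reference-counted clock: it runs while count is non-zero. */
struct toy_refclk {
	unsigned int count;
};

void toy_refclk_enable(struct toy_refclk *c)
{
	c->count++;
}

void toy_refclk_disable(struct toy_refclk *c)
{
	if (c->count > 0)
		c->count--;
}

bool toy_refclk_running(const struct toy_refclk *c)
{
	return c->count > 0;
}

/* Hypothetical platform hook: take an extra reference before suspend. */
void toy_pm_prepare(struct toy_refclk *mdma0)
{
	toy_refclk_enable(mdma0);
}

/* Hypothetical platform hook: drop that reference again on resume. */
void toy_pm_resume(struct toy_refclk *mdma0)
{
	toy_refclk_disable(mdma0);
}
```

With the extra reference taken in the prepare hook, the DMA driver releasing
its reference during suspend leaves the clock running until the resume hook
drops it.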

I'll send an RFC patch-set soon.

> Regards,
> Abhilash

Best regards,
Javier
Abhilash Kesavan March 27, 2015, 2:06 p.m. UTC | #5
Hello Javier,

On Fri, Mar 27, 2015 at 6:59 PM, Javier Martinez Canillas
<javier.martinez@collabora.co.uk> wrote:
> Hello Abhilash,
>
> On 03/20/2015 06:40 PM, Abhilash Kesavan wrote:
>>
>> Regarding the mdma0 disablement, it looks like for the system to
>> suspend properly the mdma0 pclk needs to stay on.
>>
>
> I had time today again to work on this issue and the best
> place I found to enable and disable the mdma0 clock is in
> exynos5420_pm_{prepare,resume}. Please let me know if you
> have a better idea of where the clock should be managed.
>

Modifying exynos5420_clk_suspend in
drivers/clk/samsung/clk-exynos5420.c would be another way to go,
however I have not tested if this actually works.

> I'll send a RFC patch-set soon.

Thanks for the effort.

Regards,
Abhilash
Javier Martinez Canillas March 27, 2015, 2:30 p.m. UTC | #6
Hello Abhilash,

On 03/27/2015 03:06 PM, Abhilash Kesavan wrote:
> Hello Javier,
> 
> On Fri, Mar 27, 2015 at 6:59 PM, Javier Martinez Canillas
> <javier.martinez@collabora.co.uk> wrote:
>> Hello Abhilash,
>>
>> On 03/20/2015 06:40 PM, Abhilash Kesavan wrote:
>>>
>>> Regarding the mdma0 disablement, it looks like for the system to
>>> suspend properly the mdma0 pclk needs to stay on.
>>>
>>
>> I had time today again to work on this issue and the best
>> place I found to enable and disable the mdma0 clock is in
>> exynos5420_pm_{prepare,resume}. Please let me know if you
>> have a better idea of where the clock should be managed.
>>
> 
> Modifying exynos5420_clk_suspend in
> drivers/clk/samsung/clk-exynos5420.c would be another way to go,
> however I have not tested if this actually works.
> 

Sorry, I just posted the series: "[RFC PATCH 0/2] ARM: EXYNOS: Fix
Suspend-to-RAM on Exynos5420" before reading your email...

Please let me know if you think that is not the best approach and
I can come up with another one to modify the clk-exynos5420 suspend
and resume callbacks.

Another option is to use Lee Jones's clocks "always on" support [0]
that has been posted.

>> I'll send a RFC patch-set soon.
> 
> Thanks for the effort.
>

Thanks to you for all the help.

> Regards,
> Abhilash
>

Best regards,
Javier

[0]: http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/326616.html
Patch

diff between adc548d77c22 and v3.19-rc1, which is the first known bad afaict,
but didn't find anything relevant either.

By adding printouts I can tell that all the CPUs enter exynos_power_down()
in arch/arm/mach-exynos/mcpm-exynos.c and also that the last man disables
the cluster for both the Cortex-A15 and Cortex-A7 clusters.

So it seems that the problem is on the resume path.
 
> Regards,
> Abhilash
>>

Best regards,
Javier

[0]:
diff --git a/drivers/clk/samsung/clk-exynos5420.c b/drivers/clk/samsung/clk-exynos5420.c
index 848d602efc06..d8b66339d564 100644
--- a/drivers/clk/samsung/clk-exynos5420.c
+++ b/drivers/clk/samsung/clk-exynos5420.c
@@ -932,14 +932,14 @@  static struct samsung_gate_clock exynos5x_gate_clks[] __initdata = {
 	GATE(0, "aclk400_mscl", "mout_user_aclk400_mscl",
 			GATE_BUS_TOP, 17, 0, 0),
 	GATE(0, "aclk200_disp1", "mout_user_aclk200_disp1",
-			GATE_BUS_TOP, 18, 0, 0),
+			GATE_BUS_TOP, 18, CLK_IGNORE_UNUSED, 0),
 	GATE(CLK_SCLK_MPHY_IXTAL24, "sclk_mphy_ixtal24", "mphy_refclk_ixtal24",
 			GATE_BUS_TOP, 28, 0, 0),
 	GATE(CLK_SCLK_HSIC_12M, "sclk_hsic_12m", "ff_hsic_12m",
 			GATE_BUS_TOP, 29, 0, 0),
 
 	GATE(0, "aclk300_disp1", "mout_user_aclk300_disp1",
-			SRC_MASK_TOP2, 24, 0, 0),
+			SRC_MASK_TOP2, 24, CLK_IGNORE_UNUSED, 0),
 
 	GATE(CLK_MAU_EPLL, "mau_epll", "mout_mau_epll_clk",
 			SRC_MASK_TOP7, 20, 0, 0),