diff mbox

ravb: add wake-on-lan support via magic packet

Message ID 20170511222735.10706-1-niklas.soderlund+renesas@ragnatech.se (mailing list archive)
State Superseded
Delegated to: Geert Uytterhoeven
Headers show

Commit Message

Niklas Söderlund May 11, 2017, 10:27 p.m. UTC
WoL is enabled in the suspend callback by setting MagicPacket detection
and disabling all interrupts expect MagicPacket. In the resume path the
driver needs to reset the hardware to rearm the WoL logic, this prevents
the driver from simply restoring the registers and to take advantage of
that ravb was not suspended to reduce resume time. To reset the
hardware the driver closes the device, sets it in reset mode and reopens
the device just like it would do in a normal suspend/resume scenario
without WoL enabled, but it both closes and opens the device in the
resume callback since the device needs to be reset for WoL to work.

One quirk needed for WoL is that the module clock needs to be prevented
from being switched off by Runtime PM. To keep the clock alive the
suspend callback need to call clk_enable() directly to increase the
usage count of the clock. Then when Runtime PM decreases the clock usage
count it won't reach 0 and be switched off.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
---

Tested on Salvator-X H3 and M3-W.

 drivers/net/ethernet/renesas/ravb.h      |   2 +
 drivers/net/ethernet/renesas/ravb_main.c | 108 +++++++++++++++++++++++++++++--
 2 files changed, 106 insertions(+), 4 deletions(-)

Comments

Sergei Shtylyov May 12, 2017, 12:06 p.m. UTC | #1
Hello!

On 05/12/2017 01:27 AM, Niklas Söderlund wrote:

> WoL is enabled in the suspend callback by setting MagicPacket detection
> and disabling all interrupts expect MagicPacket. In the resume path the

    Except.

> driver needs to reset the hardware to rearm the WoL logic, this prevents
> the driver from simply restoring the registers and to take advantage of
> that ravb was not suspended to reduce resume time. To reset the
> hardware the driver closes the device, sets it in reset mode and reopens
> the device just like it would do in a normal suspend/resume scenario
> without WoL enabled, but it both closes and opens the device in the
> resume callback since the device needs to be reset for WoL to work.
>
> One quirk needed for WoL is that the module clock needs to be prevented
> from being switched off by Runtime PM. To keep the clock alive the
> suspend callback need to call clk_enable() directly to increase the
> usage count of the clock. Then when Runtime PM decreases the clock usage
> count it won't reach 0 and be switched off.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
[...]

Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>

MBR, Sergei
Geert Uytterhoeven May 12, 2017, 12:47 p.m. UTC | #2
Hi Niklas,

On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
<niklas.soderlund+renesas@ragnatech.se> wrote:
> WoL is enabled in the suspend callback by setting MagicPacket detection
> and disabling all interrupts expect MagicPacket. In the resume path the
> driver needs to reset the hardware to rearm the WoL logic, this prevents
> the driver from simply restoring the registers and to take advantage of
> that ravb was not suspended to reduce resume time. To reset the
> hardware the driver closes the device, sets it in reset mode and reopens
> the device just like it would do in a normal suspend/resume scenario
> without WoL enabled, but it both closes and opens the device in the
> resume callback since the device needs to be reset for WoL to work.
>
> One quirk needed for WoL is that the module clock needs to be prevented
> from being switched off by Runtime PM. To keep the clock alive the
> suspend callback need to call clk_enable() directly to increase the
> usage count of the clock. Then when Runtime PM decreases the clock usage
> count it won't reach 0 and be switched off.
>
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

Thanks for your patch!

Wake-on-LAN now works for me on Salvator-X (both H3 and M3).

However, after a few cycles, Ethernet stopped working, because CMA ran out of
memory:

    cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
    dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
    PM: Device e6800000.ethernet failed to resume: error -12

Are we still missing some ravb patches in net-next, or is this a different
issue?

When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
PSCI firmware issues, Ethernet fails to come up afterwards:

    ravb e6800000.ethernet eth0: failed to switch device to config mode
    ravb e6800000.ethernet eth0: device will be stopped after h/w
processes are done.
    ravb e6800000.ethernet eth0: failed to switch device to config mode
    dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
    PM: Device e6800000.ethernet failed to resume: error -110

Your resume routine cannot assume RAVB is in a sane mode, as it will
have been reset if PSCI suspend was used.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Niklas Söderlund May 12, 2017, 12:58 p.m. UTC | #3
Hi Geert,

Thanks for testing.

On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
> 
> On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
> <niklas.soderlund+renesas@ragnatech.se> wrote:
> > WoL is enabled in the suspend callback by setting MagicPacket detection
> > and disabling all interrupts expect MagicPacket. In the resume path the
> > driver needs to reset the hardware to rearm the WoL logic, this prevents
> > the driver from simply restoring the registers and to take advantage of
> > that ravb was not suspended to reduce resume time. To reset the
> > hardware the driver closes the device, sets it in reset mode and reopens
> > the device just like it would do in a normal suspend/resume scenario
> > without WoL enabled, but it both closes and opens the device in the
> > resume callback since the device needs to be reset for WoL to work.
> >
> > One quirk needed for WoL is that the module clock needs to be prevented
> > from being switched off by Runtime PM. To keep the clock alive the
> > suspend callback need to call clk_enable() directly to increase the
> > usage count of the clock. Then when Runtime PM decreases the clock usage
> > count it won't reach 0 and be switched off.
> >
> > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> 
> Thanks for your patch!
> 
> Wake-on-LAN now works for me on Salvator-X (both H3 and M3).
> 
> However, after a few cycles, Ethernet stopped working, because CMA ran out of
> memory:
> 
>     cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
>     dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
>     PM: Device e6800000.ethernet failed to resume: error -12
> 
> Are we still missing some ravb patches in net-next, or is this a different
> issue?

Interesting, I did a 100 loop test on H3 based on v4.11 without any 
issues. What are you using as a base for your test and I will test if I 
can reproduce it.

> 
> When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
> PSCI firmware issues, Ethernet fails to come up afterwards:
> 
>     ravb e6800000.ethernet eth0: failed to switch device to config mode
>     ravb e6800000.ethernet eth0: device will be stopped after h/w
> processes are done.
>     ravb e6800000.ethernet eth0: failed to switch device to config mode
>     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
>     PM: Device e6800000.ethernet failed to resume: error -110
> 
> Your resume routine cannot assume RAVB is in a sane mode, as it will
> have been reset if PSCI suspend was used.

Ouch, yes this is true thanks for reporting will look in to it.

The problem is that in the resume handler if WoL is enabled it will try 
to close the device before reinitializing it from reset state. If WoL is 
not enabled the device will be closed at suspend time so no need to 
close it before restoring operation from reset in the resume handler.

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Geert Uytterhoeven May 12, 2017, 1:03 p.m. UTC | #4
Hi Niklas,

On Fri, May 12, 2017 at 2:58 PM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
>> On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
>> <niklas.soderlund+renesas@ragnatech.se> wrote:
>> > WoL is enabled in the suspend callback by setting MagicPacket detection
>> > and disabling all interrupts expect MagicPacket. In the resume path the
>> > driver needs to reset the hardware to rearm the WoL logic, this prevents
>> > the driver from simply restoring the registers and to take advantage of
>> > that ravb was not suspended to reduce resume time. To reset the
>> > hardware the driver closes the device, sets it in reset mode and reopens
>> > the device just like it would do in a normal suspend/resume scenario
>> > without WoL enabled, but it both closes and opens the device in the
>> > resume callback since the device needs to be reset for WoL to work.
>> >
>> > One quirk needed for WoL is that the module clock needs to be prevented
>> > from being switched off by Runtime PM. To keep the clock alive the
>> > suspend callback need to call clk_enable() directly to increase the
>> > usage count of the clock. Then when Runtime PM decreases the clock usage
>> > count it won't reach 0 and be switched off.
>> >
>> > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
>>
>> Thanks for your patch!
>>
>> Wake-on-LAN now works for me on Salvator-X (both H3 and M3).
>>
>> However, after a few cycles, Ethernet stopped working, because CMA ran out of
>> memory:
>>
>>     cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
>>     dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
>>     PM: Device e6800000.ethernet failed to resume: error -12
>>
>> Are we still missing some ravb patches in net-next, or is this a different
>> issue?
>
> Interesting, I did a 100 loop test on H3 based on v4.11 without any
> issues. What are you using as a base for your test and I will test if I
> can reproduce it.

Something based on last renesas-drivers.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Niklas Söderlund May 12, 2017, 2:32 p.m. UTC | #5
Hi Geert,

On 2017-05-12 14:58:53 +0200, Niklas Söderlund wrote:
> Hi Geert,
> 
> Thanks for testing.
> 
> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
> > Hi Niklas,
> > 
> > On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
> > <niklas.soderlund+renesas@ragnatech.se> wrote:
> > > WoL is enabled in the suspend callback by setting MagicPacket detection
> > > and disabling all interrupts expect MagicPacket. In the resume path the
> > > driver needs to reset the hardware to rearm the WoL logic, this prevents
> > > the driver from simply restoring the registers and to take advantage of
> > > that ravb was not suspended to reduce resume time. To reset the
> > > hardware the driver closes the device, sets it in reset mode and reopens
> > > the device just like it would do in a normal suspend/resume scenario
> > > without WoL enabled, but it both closes and opens the device in the
> > > resume callback since the device needs to be reset for WoL to work.
> > >
> > > One quirk needed for WoL is that the module clock needs to be prevented
> > > from being switched off by Runtime PM. To keep the clock alive the
> > > suspend callback need to call clk_enable() directly to increase the
> > > usage count of the clock. Then when Runtime PM decreases the clock usage
> > > count it won't reach 0 and be switched off.
> > >
> > > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> > 
> > Thanks for your patch!
> > 
> > Wake-on-LAN now works for me on Salvator-X (both H3 and M3).
> > 
> > However, after a few cycles, Ethernet stopped working, because CMA ran out of
> > memory:
> > 
> >     cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
> >     PM: Device e6800000.ethernet failed to resume: error -12
> > 
> > Are we still missing some ravb patches in net-next, or is this a different
> > issue?
> 
> Interesting, I did a 100 loop test on H3 based on v4.11 without any 
> issues. What are you using as a base for your test and I will test if I 
> can reproduce it.

I re-ran my test based on renesas-drivers-2017-05-02-v4.11 using the 
arm64 defconfig. And was not able to see this :-(

My TC is basically a expect script that do:

ethtool -s eth0 wol g

for (i = 0; i < 100; i++) {
    echo s2idle > /sys/power/mem_sleep
    echo mem > /sys/power/state

    sleep 5

    HOST: etherwake -i net0 2e:09:0a:00:83:90

    WAIT_FOR_CONSOLE
}

I use the default DT and arm64 defconfig and no other modifications. I'm 
testing on H3 ES1.x board. What can be the different for us?

> 
> > 
> > When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
> > PSCI firmware issues, Ethernet fails to come up afterwards:
> > 
> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >     ravb e6800000.ethernet eth0: device will be stopped after h/w
> > processes are done.
> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
> >     PM: Device e6800000.ethernet failed to resume: error -110
> > 
> > Your resume routine cannot assume RAVB is in a sane mode, as it will
> > have been reset if PSCI suspend was used.
> 
> Ouch, yes this is true thanks for reporting will look in to it.
> 
> The problem is that in the resume handler if WoL is enabled it will try 
> to close the device before reinitializing it from reset state. If WoL is 
> not enabled the device will be closed at suspend time so no need to 
> close it before restoring operation from reset in the resume handler.

I was not able to reproduce this fault neither :-( But here I see 
something is wrong, nothing is outputted on the console if WoL is 
enabled:

echo 0 > /sys/module/printk/parameters/console_suspend
i2cset -f -y 7 0x30 0x20 0x0F
<Flip SW23>
echo mem > /sys/power/state
<Flip SW23>
<System wakes up>

While:
ethtool -s eth0 wol g
echo 0 > /sys/module/printk/parameters/console_suspend
i2cset -f -y 7 0x30 0x20 0x0F
<Flip SW23>
echo mem > /sys/power/state
<Flip SW23>
<Nothing happens on the console, not even firmware info is printed>

Are you enabling anything else then console_suspend to be able to 
capture output at this point? I'm using the same setup tag and config as 
I described above.

> 
> > 
> > Gr{oetje,eeting}s,
> > 
> >                         Geert
> > 
> > --
> > Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> > 
> > In personal conversations with technical people, I call myself a hacker. But
> > when I'm talking to journalists I just say "programmer" or something like that.
> >                                 -- Linus Torvalds
> 
> -- 
> Regards,
> Niklas Söderlund
Geert Uytterhoeven May 12, 2017, 2:43 p.m. UTC | #6
Hi Niklas,

On Fri, May 12, 2017 at 4:32 PM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-05-12 14:58:53 +0200, Niklas Söderlund wrote:
>> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
>> > On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
>> > <niklas.soderlund+renesas@ragnatech.se> wrote:
>> > > WoL is enabled in the suspend callback by setting MagicPacket detection
>> > > and disabling all interrupts expect MagicPacket. In the resume path the
>> > > driver needs to reset the hardware to rearm the WoL logic, this prevents
>> > > the driver from simply restoring the registers and to take advantage of
>> > > that ravb was not suspended to reduce resume time. To reset the
>> > > hardware the driver closes the device, sets it in reset mode and reopens
>> > > the device just like it would do in a normal suspend/resume scenario
>> > > without WoL enabled, but it both closes and opens the device in the
>> > > resume callback since the device needs to be reset for WoL to work.
>> > >
>> > > One quirk needed for WoL is that the module clock needs to be prevented
>> > > from being switched off by Runtime PM. To keep the clock alive the
>> > > suspend callback need to call clk_enable() directly to increase the
>> > > usage count of the clock. Then when Runtime PM decreases the clock usage
>> > > count it won't reach 0 and be switched off.
>> > >
>> > > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
>> >
>> > Thanks for your patch!
>> >
>> > Wake-on-LAN now works for me on Salvator-X (both H3 and M3).
>> >
>> > However, after a few cycles, Ethernet stopped working, because CMA ran out of
>> > memory:
>> >
>> >     cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
>> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
>> >     PM: Device e6800000.ethernet failed to resume: error -12
>> >
>> > Are we still missing some ravb patches in net-next, or is this a different
>> > issue?
>>
>> Interesting, I did a 100 loop test on H3 based on v4.11 without any
>> issues. What are you using as a base for your test and I will test if I
>> can reproduce it.
>
> I re-ran my test based on renesas-drivers-2017-05-02-v4.11 using the
> arm64 defconfig. And was not able to see this :-(
>
> My TC is basically a expect script that do:
>
> ethtool -s eth0 wol g
>
> for (i = 0; i < 100; i++) {
>     echo s2idle > /sys/power/mem_sleep
>     echo mem > /sys/power/state
>
>     sleep 5
>
>     HOST: etherwake -i net0 2e:09:0a:00:83:90
>
>     WAIT_FOR_CONSOLE
> }
>
> I use the default DT and arm64 defconfig and no other modifications. I'm
> testing on H3 ES1.x board. What can be the different for us?

I'm not using "echo s2idle > /sys/power/mem_sleep", as renesas-drivers still
has my patches to not use PSCI suspend if any other wake-up sources
are configured.

I'll retry later...

>> > When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
>> > PSCI firmware issues, Ethernet fails to come up afterwards:
>> >
>> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
>> >     ravb e6800000.ethernet eth0: device will be stopped after h/w
>> > processes are done.
>> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
>> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
>> >     PM: Device e6800000.ethernet failed to resume: error -110
>> >
>> > Your resume routine cannot assume RAVB is in a sane mode, as it will
>> > have been reset if PSCI suspend was used.
>>
>> Ouch, yes this is true thanks for reporting will look in to it.
>>
>> The problem is that in the resume handler if WoL is enabled it will try
>> to close the device before reinitializing it from reset state. If WoL is
>> not enabled the device will be closed at suspend time so no need to
>> close it before restoring operation from reset in the resume handler.
>
> I was not able to reproduce this fault neither :-( But here I see
> something is wrong, nothing is outputted on the console if WoL is
> enabled:
>
> echo 0 > /sys/module/printk/parameters/console_suspend
> i2cset -f -y 7 0x30 0x20 0x0F
> <Flip SW23>
> echo mem > /sys/power/state
> <Flip SW23>
> <System wakes up>
>
> While:
> ethtool -s eth0 wol g
> echo 0 > /sys/module/printk/parameters/console_suspend
> i2cset -f -y 7 0x30 0x20 0x0F
> <Flip SW23>
> echo mem > /sys/power/state
> <Flip SW23>
> <Nothing happens on the console, not even firmware info is printed>
>
> Are you enabling anything else then console_suspend to be able to
> capture output at this point? I'm using the same setup tag and config as
> I described above.

With renesas-drivers, it will refuse to use PSCI suspend if any other wake-up
sources are configured, i.e. try

    echo disabled > /sys/devices/platform/soc/e6800000.ethernet/power/wakeup

Good luck, and have a nice weekend!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Niklas Söderlund May 12, 2017, 9:12 p.m. UTC | #7
Hi Geert,

On 2017-05-12 16:43:55 +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
> 
> On Fri, May 12, 2017 at 4:32 PM, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> > On 2017-05-12 14:58:53 +0200, Niklas Söderlund wrote:
> >> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
> >> > On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
> >> > <niklas.soderlund+renesas@ragnatech.se> wrote:
> >> > > WoL is enabled in the suspend callback by setting MagicPacket detection
> >> > > and disabling all interrupts expect MagicPacket. In the resume path the
> >> > > driver needs to reset the hardware to rearm the WoL logic, this prevents
> >> > > the driver from simply restoring the registers and to take advantage of
> >> > > that ravb was not suspended to reduce resume time. To reset the
> >> > > hardware the driver closes the device, sets it in reset mode and reopens
> >> > > the device just like it would do in a normal suspend/resume scenario
> >> > > without WoL enabled, but it both closes and opens the device in the
> >> > > resume callback since the device needs to be reset for WoL to work.
> >> > >
> >> > > One quirk needed for WoL is that the module clock needs to be prevented
> >> > > from being switched off by Runtime PM. To keep the clock alive the
> >> > > suspend callback need to call clk_enable() directly to increase the
> >> > > usage count of the clock. Then when Runtime PM decreases the clock usage
> >> > > count it won't reach 0 and be switched off.
> >> > >
> >> > > Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
> >> >
> >> > Thanks for your patch!
> >> >
> >> > Wake-on-LAN now works for me on Salvator-X (both H3 and M3).
> >> >
> >> > However, after a few cycles, Ethernet stopped working, because CMA ran out of
> >> > memory:
> >> >
> >> >     cma: cma_alloc: alloc failed, req-size: 6 pages, ret: -4
> >> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -12
> >> >     PM: Device e6800000.ethernet failed to resume: error -12
> >> >
> >> > Are we still missing some ravb patches in net-next, or is this a different
> >> > issue?
> >>
> >> Interesting, I did a 100 loop test on H3 based on v4.11 without any
> >> issues. What are you using as a base for your test and I will test if I
> >> can reproduce it.
> >
> > I re-ran my test based on renesas-drivers-2017-05-02-v4.11 using the
> > arm64 defconfig. And was not able to see this :-(
> >
> > My TC is basically a expect script that do:
> >
> > ethtool -s eth0 wol g
> >
> > for (i = 0; i < 100; i++) {
> >     echo s2idle > /sys/power/mem_sleep
> >     echo mem > /sys/power/state
> >
> >     sleep 5
> >
> >     HOST: etherwake -i net0 2e:09:0a:00:83:90
> >
> >     WAIT_FOR_CONSOLE
> > }
> >
> > I use the default DT and arm64 defconfig and no other modifications. I'm
> > testing on H3 ES1.x board. What can be the different for us?
> 
> I'm not using "echo s2idle > /sys/power/mem_sleep", as renesas-drivers still
> has my patches to not use PSCI suspend if any other wake-up sources
> are configured.
> 
> I'll retry later...
> 
> >> > When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
> >> > PSCI firmware issues, Ethernet fails to come up afterwards:
> >> >
> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >> >     ravb e6800000.ethernet eth0: device will be stopped after h/w
> >> > processes are done.
> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
> >> >     PM: Device e6800000.ethernet failed to resume: error -110
> >> >
> >> > Your resume routine cannot assume RAVB is in a sane mode, as it will
> >> > have been reset if PSCI suspend was used.
> >>
> >> Ouch, yes this is true thanks for reporting will look in to it.
> >>
> >> The problem is that in the resume handler if WoL is enabled it will try
> >> to close the device before reinitializing it from reset state. If WoL is
> >> not enabled the device will be closed at suspend time so no need to
> >> close it before restoring operation from reset in the resume handler.
> >
> > I was not able to reproduce this fault neither :-( But here I see
> > something is wrong, nothing is outputted on the console if WoL is
> > enabled:
> >
> > echo 0 > /sys/module/printk/parameters/console_suspend
> > i2cset -f -y 7 0x30 0x20 0x0F
> > <Flip SW23>
> > echo mem > /sys/power/state
> > <Flip SW23>
> > <System wakes up>
> >
> > While:
> > ethtool -s eth0 wol g
> > echo 0 > /sys/module/printk/parameters/console_suspend
> > i2cset -f -y 7 0x30 0x20 0x0F
> > <Flip SW23>
> > echo mem > /sys/power/state
> > <Flip SW23>
> > <Nothing happens on the console, not even firmware info is printed>
> >
> > Are you enabling anything else then console_suspend to be able to
> > capture output at this point? I'm using the same setup tag and config as
> > I described above.
> 
> With renesas-drivers, it will refuse to use PSCI suspend if any other wake-up
> sources are configured, i.e. try
> 
>     echo disabled > /sys/devices/platform/soc/e6800000.ethernet/power/wakeup
> 
> Good luck, and have a nice weekend!

Thanks this allowed me to reproduce the same error as you. And after 
future digging I don't believe the problem being in the logic of the 
ravb suspend/resume functions. The problem is that the module clock is 
never turned on after PCSI system suspend if its usage count is above 0 
at suspend time, so the errors we both now observe are due to the module 
clock being disabled.

If I set DEBUG in renesas-cpg-mssr.c and observe when each module clock 
is turned OFF and ON, the fault is clear. If WoL is enabled the clock is 
never turned on when the system is resuming, while if WoL is disabled it 
is. I verified this by removing the calls to clk_enable() and 
clk_disable() from this patch, and by doing so the PCSI system suspend 
works perfect with WoL enabled and the ravb comes up fine after toggling 
SW23 (while ofc WoL no longer works in s2idle due to the module clock is 
switched off at suspend time).

I checked drivers/clk/renesas and I can't find a suspend/resume handler 
for the clock driver, how is this intended to work? If a clock have a 
usage count higher then 0 when the system is PSCI System Suspended it 
seems like it won't be turned back on when the system is resumed from 
this sleep stage. I might have misunderstood something and I need to 
alter the logic in the ravb driver to let the clock driver know it 
should turn on the clock at resume time?

Whit all this being said I still like to withdraw this patch as I found 
another fault with it, ravb_wol_restore() will unconditionally be called 
while ravb_wol_setup() will only be called if netif_running(ndev). This 
is en easy fix and I will send out a v2 once we figure out what to do 
about the clock.

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Geert Uytterhoeven May 16, 2017, 7:48 a.m. UTC | #8
Hi Niklas,

On Fri, May 12, 2017 at 11:12 PM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-05-12 16:43:55 +0200, Geert Uytterhoeven wrote:
>> On Fri, May 12, 2017 at 4:32 PM, Niklas Söderlund
>> <niklas.soderlund@ragnatech.se> wrote:
>> > On 2017-05-12 14:58:53 +0200, Niklas Söderlund wrote:
>> >> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
>> >> > On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
>> >> > <niklas.soderlund+renesas@ragnatech.se> wrote:
>> >> > When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
>> >> > PSCI firmware issues, Ethernet fails to come up afterwards:
>> >> >
>> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
>> >> >     ravb e6800000.ethernet eth0: device will be stopped after h/w
>> >> > processes are done.
>> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
>> >> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
>> >> >     PM: Device e6800000.ethernet failed to resume: error -110
>> >> >
>> >> > Your resume routine cannot assume RAVB is in a sane mode, as it will
>> >> > have been reset if PSCI suspend was used.
>> >>
>> >> Ouch, yes this is true thanks for reporting will look in to it.
>> >>
>> >> The problem is that in the resume handler if WoL is enabled it will try
>> >> to close the device before reinitializing it from reset state. If WoL is
>> >> not enabled the device will be closed at suspend time so no need to
>> >> close it before restoring operation from reset in the resume handler.

>> With renesas-drivers, it will refuse to use PSCI suspend if any other wake-up
>> sources are configured, i.e. try
>>
>>     echo disabled > /sys/devices/platform/soc/e6800000.ethernet/power/wakeup
>>
>> Good luck, and have a nice weekend!
>
> Thanks this allowed me to reproduce the same error as you. And after
> future digging I don't believe the problem being in the logic of the
> ravb suspend/resume functions. The problem is that the module clock is
> never turned on after PCSI system suspend if its usage count is above 0
> at suspend time, so the errors we both now observe are due to the module
> clock being disabled.
>
> If I set DEBUG in renesas-cpg-mssr.c and observe when each module clock
> is turned OFF and ON, the fault is clear. If WoL is enabled the clock is
> never turned on when the system is resuming, while if WoL is disabled it
> is. I verified this by removing the calls to clk_enable() and
> clk_disable() from this patch, and by doing so the PCSI system suspend
> works perfect with WoL enabled and the ravb comes up fine after toggling
> SW23 (while ofc WoL no longer works in s2idle due to the module clock is
> switched off at suspend time).
>
> I checked drivers/clk/renesas and I can't find a suspend/resume handler
> for the clock driver, how is this intended to work? If a clock have a
> usage count higher then 0 when the system is PSCI System Suspended it
> seems like it won't be turned back on when the system is resumed from
> this sleep stage. I might have misunderstood something and I need to
> alter the logic in the ravb driver to let the clock driver know it
> should turn on the clock at resume time?

Ah, you found a real use case for suspend/resume support in the clock
drivers ;-)

Due to PSCI system suspend powering down the whole SoC, all clock
settings are lost.

Thanks, I will look into this...

> Whit all this being said I still like to withdraw this patch as I found
> another fault with it, ravb_wol_restore() will unconditionally be called
> while ravb_wol_setup() will only be called if netif_running(ndev). This
> is en easy fix and I will send out a v2 once we figure out what to do
> about the clock.

The clock issue is external to the ravb driver. If it works with
s2idle, it should
be OK.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Niklas Söderlund May 16, 2017, 9:02 a.m. UTC | #9
Hi Geert,

Thanks for your feedback.

On 2017-05-16 09:48:51 +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
> 
> On Fri, May 12, 2017 at 11:12 PM, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> > On 2017-05-12 16:43:55 +0200, Geert Uytterhoeven wrote:
> >> On Fri, May 12, 2017 at 4:32 PM, Niklas Söderlund
> >> <niklas.soderlund@ragnatech.se> wrote:
> >> > On 2017-05-12 14:58:53 +0200, Niklas Söderlund wrote:
> >> >> On 2017-05-12 14:47:53 +0200, Geert Uytterhoeven wrote:
> >> >> > On Fri, May 12, 2017 at 12:27 AM, Niklas Söderlund
> >> >> > <niklas.soderlund+renesas@ragnatech.se> wrote:
> >> >> > When using PSCI suspend/resume, and Wake-on-LAN cannot work due to
> >> >> > PSCI firmware issues, Ethernet fails to come up afterwards:
> >> >> >
> >> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >> >> >     ravb e6800000.ethernet eth0: device will be stopped after h/w
> >> >> > processes are done.
> >> >> >     ravb e6800000.ethernet eth0: failed to switch device to config mode
> >> >> >     dpm_run_callback(): ravb_resume+0x0/0x148 returns -110
> >> >> >     PM: Device e6800000.ethernet failed to resume: error -110
> >> >> >
> >> >> > Your resume routine cannot assume RAVB is in a sane mode, as it will
> >> >> > have been reset if PSCI suspend was used.
> >> >>
> >> >> Ouch, yes this is true thanks for reporting will look in to it.
> >> >>
> >> >> The problem is that in the resume handler if WoL is enabled it will try
> >> >> to close the device before reinitializing it from reset state. If WoL is
> >> >> not enabled the device will be closed at suspend time so no need to
> >> >> close it before restoring operation from reset in the resume handler.
> 
> >> With renesas-drivers, it will refuse to use PSCI suspend if any other wake-up
> >> sources are configured, i.e. try
> >>
> >>     echo disabled > /sys/devices/platform/soc/e6800000.ethernet/power/wakeup
> >>
> >> Good luck, and have a nice weekend!
> >
> > Thanks this allowed me to reproduce the same error as you. And after
> > future digging I don't believe the problem being in the logic of the
> > ravb suspend/resume functions. The problem is that the module clock is
> > never turned on after PCSI system suspend if its usage count is above 0
> > at suspend time, so the errors we both now observe are due to the module
> > clock being disabled.
> >
> > If I set DEBUG in renesas-cpg-mssr.c and observe when each module clock
> > is turned OFF and ON, the fault is clear. If WoL is enabled the clock is
> > never turned on when the system is resuming, while if WoL is disabled it
> > is. I verified this by removing the calls to clk_enable() and
> > clk_disable() from this patch, and by doing so the PCSI system suspend
> > works perfect with WoL enabled and the ravb comes up fine after toggling
> > SW23 (while ofc WoL no longer works in s2idle due to the module clock is
> > switched off at suspend time).
> >
> > I checked drivers/clk/renesas and I can't find a suspend/resume handler
> > for the clock driver, how is this intended to work? If a clock have a
> > usage count higher then 0 when the system is PSCI System Suspended it
> > seems like it won't be turned back on when the system is resumed from
> > this sleep stage. I might have misunderstood something and I need to
> > alter the logic in the ravb driver to let the clock driver know it
> > should turn on the clock at resume time?
> 
> Ah, you found a real use case for suspend/resume support in the clock
> drivers ;-)

Happy to be of service ;-)

> 
> Due to PSCI system suspend powering down the whole SoC, all clock
> settings are lost.
> 
> Thanks, I will look into this...

Thanks!

> 
> > Whit all this being said I still like to withdraw this patch as I found
> > another fault with it, ravb_wol_restore() will unconditionally be called
> > while ravb_wol_setup() will only be called if netif_running(ndev). This
> > is en easy fix and I will send out a v2 once we figure out what to do
> > about the clock.
> 
> The clock issue is external to the ravb driver. If it works with
> s2idle, it should
> be OK.

Do you think it's a good idea to move ahead with a v2 of the ravb WoL 
patch to fix the unrelated issue and aim for it to be picked up prior to
suspend/resume support is added to the CPG/MSSR?

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Geert Uytterhoeven May 16, 2017, 9:07 a.m. UTC | #10
Hi Niklas,

On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
>> > Whit all this being said I still like to withdraw this patch as I found
>> > another fault with it, ravb_wol_restore() will unconditionally be called
>> > while ravb_wol_setup() will only be called if netif_running(ndev). This
>> > is en easy fix and I will send out a v2 once we figure out what to do
>> > about the clock.
>>
>> The clock issue is external to the ravb driver. If it works with
>> s2idle, it should
>> be OK.
>
> Do you think it's a good idea to move ahead with a v2 of the ravb WoL
> patch to fix the unrelated issue and aim for it to be picked up prior to
> suspend/resume support is added to the CPG/MSSR?

Sure.

It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Simon Horman May 16, 2017, 11:01 a.m. UTC | #11
On Tue, May 16, 2017 at 11:07:34AM +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
> 
> On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> >> > Whit all this being said I still like to withdraw this patch as I found
> >> > another fault with it, ravb_wol_restore() will unconditionally be called
> >> > while ravb_wol_setup() will only be called if netif_running(ndev). This
> >> > is en easy fix and I will send out a v2 once we figure out what to do
> >> > about the clock.
> >>
> >> The clock issue is external to the ravb driver. If it works with
> >> s2idle, it should
> >> be OK.
> >
> > Do you think it's a good idea to move ahead with a v2 of the ravb WoL
> > patch to fix the unrelated issue and aim for it to be picked up prior to
> > suspend/resume support is added to the CPG/MSSR?
> 
> Sure.
> 
> It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.

Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
until the clock issues is sorted out? I'm quite happy to enable features
where they work; not so much where they don't.
Geert Uytterhoeven May 16, 2017, 11:36 a.m. UTC | #12
Hi Simon,

On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
> On Tue, May 16, 2017 at 11:07:34AM +0200, Geert Uytterhoeven wrote:
>> On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
>> <niklas.soderlund@ragnatech.se> wrote:
>> >> > Whit all this being said I still like to withdraw this patch as I found
>> >> > another fault with it, ravb_wol_restore() will unconditionally be called
>> >> > while ravb_wol_setup() will only be called if netif_running(ndev). This
>> >> > is en easy fix and I will send out a v2 once we figure out what to do
>> >> > about the clock.
>> >>
>> >> The clock issue is external to the ravb driver. If it works with
>> >> s2idle, it should
>> >> be OK.
>> >
>> > Do you think it's a good idea to move ahead with a v2 of the ravb WoL
>> > patch to fix the unrelated issue and aim for it to be picked up prior to
>> > suspend/resume support is added to the CPG/MSSR?
>>
>> Sure.
>>
>> It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.
>
> Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
> until the clock issues is sorted out? I'm quite happy to enable features

"priv->chip_id != RCAR_GEN3". However, as we don't have RAVB enabled
on any R-Car Gen2 board, its use is limited.

> where they work; not so much where they don't.

Agreed.

One workaround could be to disable/enable the module clock in the WoL
resume path, to make sure it is enabled.  Once the enable count reaches
0, CCF will know it's disabled, and will really enable next time.
You may need a double disable/double enable though, without testing I
don't know remember the enable count is 1 or 2 at that point (due to PM
runtime).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Niklas Söderlund May 16, 2017, 12:16 p.m. UTC | #13
Hi,

On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
> Hi Simon,
> 
> On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
> > On Tue, May 16, 2017 at 11:07:34AM +0200, Geert Uytterhoeven wrote:
> >> On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
> >> <niklas.soderlund@ragnatech.se> wrote:
> >> >> > Whit all this being said I still like to withdraw this patch as I found
> >> >> > another fault with it, ravb_wol_restore() will unconditionally be called
> >> >> > while ravb_wol_setup() will only be called if netif_running(ndev). This
> >> >> > is en easy fix and I will send out a v2 once we figure out what to do
> >> >> > about the clock.
> >> >>
> >> >> The clock issue is external to the ravb driver. If it works with
> >> >> s2idle, it should
> >> >> be OK.
> >> >
> >> > Do you think it's a good idea to move ahead with a v2 of the ravb WoL
> >> > patch to fix the unrelated issue and aim for it to be picked up prior to
> >> > suspend/resume support is added to the CPG/MSSR?
> >>
> >> Sure.
> >>
> >> It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.
> >
> > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
> > until the clock issues is sorted out? I'm quite happy to enable features
> 
> "priv->chip_id != RCAR_GEN3". However, as we don't have RAVB enabled
> on any R-Car Gen2 board, its use is limited.

I agree that it's not so useful to enable this on Gen2 only.

> 
> > where they work; not so much where they don't.
> 
> Agreed.
> 
> One workaround could be to disable/enable the module clock in the WoL
> resume path, to make sure it is enabled.  Once the enable count reaches
> 0, CCF will know it's disabled, and will really enable next time.
> You may need a double disable/double enable though, without testing I
> don't know remember the enable count is 1 or 2 at that point (due to PM
> runtime).

I thought about this but it feels like such a hack I did not dare 
suggest it :-) But at the same time it would be nice to enable WoL for 
the s2idle use-case where it works. Only resume from PSCI with WoL 
enabled that is broken, and WoL in PSCI suspend will never work :-)

How about I add another patch in v2 on-top of this that adds the clock 
disable/enable hack? That way it's clear that this is a workaround and 
once we have support for suspend/resume in CPG/MSSR just that patch can 
be reverted? Or is it cleaner to fold it in to this patch with a big 
comment that this is a workaround? Or is it maybe better to hold of on 
this until CPG/MSSR supports suspend/resume?

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Geert Uytterhoeven May 18, 2017, 8:52 a.m. UTC | #14
Hi Niklas,

On Tue, May 16, 2017 at 2:16 PM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
>> On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
>> > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
>> > until the clock issues is sorted out? I'm quite happy to enable features
>> > where they work; not so much where they don't.
>>
>> Agreed.
>>
>> One workaround could be to disable/enable the module clock in the WoL
>> resume path, to make sure it is enabled.  Once the enable count reaches
>> 0, CCF will know it's disabled, and will really enable next time.
>> You may need a double disable/double enable though, without testing I
>> don't know remember the enable count is 1 or 2 at that point (due to PM
>> runtime).
>
> I thought about this but it feels like such a hack I did not dare
> suggest it :-) But at the same time it would be nice to enable WoL for
> the s2idle use-case where it works. Only resume from PSCI with WoL
> enabled that is broken, and WoL in PSCI suspend will never work :-)

Indeed.

> How about I add another patch in v2 on-top of this that adds the clock
> disable/enable hack? That way it's clear that this is a workaround and
> once we have support for suspend/resume in CPG/MSSR just that patch can
> be reverted? Or is it cleaner to fold it in to this patch with a big
> comment that this is a workaround? Or is it maybe better to hold of on
> this until CPG/MSSR supports suspend/resume?

Personally, I would have no problems of having the workaround integrated (and
documented, of course) in the WoL patch, as it avoids having broken PSCI
suspend in between WoL-without-workaround and a separate workaround.

It's not that dissimilar from the initial R-Car Gen3 support patch limiting
ravb to 100 Mbps.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Simon Horman May 18, 2017, 1:40 p.m. UTC | #15
On Tue, May 16, 2017 at 02:16:04PM +0200, Niklas Söderlund wrote:
> Hi,
> 
> On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
> > Hi Simon,
> > 
> > On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
> > > On Tue, May 16, 2017 at 11:07:34AM +0200, Geert Uytterhoeven wrote:
> > >> On Tue, May 16, 2017 at 11:02 AM, Niklas Söderlund
> > >> <niklas.soderlund@ragnatech.se> wrote:
> > >> >> > Whit all this being said I still like to withdraw this patch as I found
> > >> >> > another fault with it, ravb_wol_restore() will unconditionally be called
> > >> >> > while ravb_wol_setup() will only be called if netif_running(ndev). This
> > >> >> > is en easy fix and I will send out a v2 once we figure out what to do
> > >> >> > about the clock.
> > >> >>
> > >> >> The clock issue is external to the ravb driver. If it works with
> > >> >> s2idle, it should
> > >> >> be OK.
> > >> >
> > >> > Do you think it's a good idea to move ahead with a v2 of the ravb WoL
> > >> > patch to fix the unrelated issue and aim for it to be picked up prior to
> > >> > suspend/resume support is added to the CPG/MSSR?
> > >>
> > >> Sure.
> > >>
> > >> It can still be used on R-Car Gen2, where we're not s*d by mandatory PSCI.
> > >
> > > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
> > > until the clock issues is sorted out? I'm quite happy to enable features
> > 
> > "priv->chip_id != RCAR_GEN3". However, as we don't have RAVB enabled
> > on any R-Car Gen2 board, its use is limited.
> 
> I agree that it's not so useful to enable this on Gen2 only.

Yeah, agreed. I had forgotten that RAVB only enabled on Gen3.

...
Niklas Söderlund June 12, 2017, 11:32 a.m. UTC | #16
Hi Geert,

On 2017-05-18 10:52:25 +0200, Geert Uytterhoeven wrote:
> Hi Niklas,
> 
> On Tue, May 16, 2017 at 2:16 PM, Niklas Söderlund
> <niklas.soderlund@ragnatech.se> wrote:
> > On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
> >> On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
> >> > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
> >> > until the clock issues is sorted out? I'm quite happy to enable features
> >> > where they work; not so much where they don't.
> >>
> >> Agreed.
> >>
> >> One workaround could be to disable/enable the module clock in the WoL
> >> resume path, to make sure it is enabled.  Once the enable count reaches
> >> 0, CCF will know it's disabled, and will really enable next time.
> >> You may need a double disable/double enable though, without testing I
> >> don't know remember the enable count is 1 or 2 at that point (due to PM
> >> runtime).
> >
> > I thought about this but it feels like such a hack I did not dare
> > suggest it :-) But at the same time it would be nice to enable WoL for
> > the s2idle use-case where it works. Only resume from PSCI with WoL
> > enabled that is broken, and WoL in PSCI suspend will never work :-)
> 
> Indeed.
> 
> > How about I add another patch in v2 on-top of this that adds the clock
> > disable/enable hack? That way it's clear that this is a workaround and
> > once we have support for suspend/resume in CPG/MSSR just that patch can
> > be reverted? Or is it cleaner to fold it in to this patch with a big
> > comment that this is a workaround? Or is it maybe better to hold of on
> > this until CPG/MSSR supports suspend/resume?
> 
> Personally, I would have no problems of having the workaround integrated (and
> documented, of course) in the WoL patch, as it avoids having broken PSCI
> suspend in between WoL-without-workaround and a separate workaround.

You have now posted '[PATCH/RFC] clk: renesas: cpg-mssr: Restore module 
clock registers during resume' which solves this issue. Do you think it 
would be OK for me to resubmit this patch due to an unrelated fix and 
state that it depends on your patch or do you feel it still would be 
valuable to include a workaround in the RAVB driver as to not make it 
dependent on your patch?

> 
> It's not that dissimilar from the initial R-Car Gen3 support patch limiting
> ravb to 100 Mbps.
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Geert Uytterhoeven June 12, 2017, 11:43 a.m. UTC | #17
Hi Niklas,

On Mon, Jun 12, 2017 at 1:32 PM, Niklas Söderlund
<niklas.soderlund@ragnatech.se> wrote:
> On 2017-05-18 10:52:25 +0200, Geert Uytterhoeven wrote:
>> On Tue, May 16, 2017 at 2:16 PM, Niklas Söderlund
>> <niklas.soderlund@ragnatech.se> wrote:
>> > On 2017-05-16 13:36:21 +0200, Geert Uytterhoeven wrote:
>> >> On Tue, May 16, 2017 at 1:01 PM, Simon Horman <horms@verge.net.au> wrote:
>> >> > Is there some way for - e.g. the driver - to not enable WoL on Gen3 SoCs
>> >> > until the clock issues is sorted out? I'm quite happy to enable features
>> >> > where they work; not so much where they don't.
>> >>
>> >> Agreed.
>> >>
>> >> One workaround could be to disable/enable the module clock in the WoL
>> >> resume path, to make sure it is enabled.  Once the enable count reaches
>> >> 0, CCF will know it's disabled, and will really enable next time.
>> >> You may need a double disable/double enable though, without testing I
>> >> don't know remember the enable count is 1 or 2 at that point (due to PM
>> >> runtime).
>> >
>> > I thought about this but it feels like such a hack I did not dare
>> > suggest it :-) But at the same time it would be nice to enable WoL for
>> > the s2idle use-case where it works. Only resume from PSCI with WoL
>> > enabled that is broken, and WoL in PSCI suspend will never work :-)
>>
>> Indeed.
>>
>> > How about I add another patch in v2 on-top of this that adds the clock
>> > disable/enable hack? That way it's clear that this is a workaround and
>> > once we have support for suspend/resume in CPG/MSSR just that patch can
>> > be reverted? Or is it cleaner to fold it in to this patch with a big
>> > comment that this is a workaround? Or is it maybe better to hold of on
>> > this until CPG/MSSR supports suspend/resume?
>>
>> Personally, I would have no problems of having the workaround integrated (and
>> documented, of course) in the WoL patch, as it avoids having broken PSCI
>> suspend in between WoL-without-workaround and a separate workaround.
>
> You have now posted '[PATCH/RFC] clk: renesas: cpg-mssr: Restore module
> clock registers during resume' which solves this issue. Do you think it
> would be OK for me to resubmit this patch due to an unrelated fix and
> state that it depends on your patch or do you feel it still would be
> valuable to include a workaround in the RAVB driver as to not make it
> dependent on your patch?

The the module clock restore patch fixes the issue, I think it's too immature
to upstream.  Applying your WoL patch as-is does introduce a regression if
WoL is enabled (although you could debate that it's not a regression, as WoL
couldn't be enabled before, but a generic userspace may try to do that anyway).

If the workaround isn't too ugly, I would include it.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
diff mbox

Patch

diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
index 0525bd696d5d02e5..96a27b00c90e212a 100644
--- a/drivers/net/ethernet/renesas/ravb.h
+++ b/drivers/net/ethernet/renesas/ravb.h
@@ -991,6 +991,7 @@  struct ravb_private {
 	struct net_device *ndev;
 	struct platform_device *pdev;
 	void __iomem *addr;
+	struct clk *clk;
 	struct mdiobb_ctrl mdiobb;
 	u32 num_rx_ring[NUM_RX_QUEUE];
 	u32 num_tx_ring[NUM_TX_QUEUE];
@@ -1033,6 +1034,7 @@  struct ravb_private {
 
 	unsigned no_avb_link:1;
 	unsigned avb_link_active_low:1;
+	unsigned wol_enabled:1;
 };
 
 static inline u32 ravb_read(struct net_device *ndev, enum ravb_reg reg)
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 3cd7989c007dfe46..90913cf2477eb25d 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -680,6 +680,9 @@  static void ravb_emac_interrupt_unlocked(struct net_device *ndev)
 
 	ecsr = ravb_read(ndev, ECSR);
 	ravb_write(ndev, ecsr, ECSR);	/* clear interrupt */
+
+	if (ecsr & ECSR_MPD)
+		pm_wakeup_event(&priv->pdev->dev, 0);
 	if (ecsr & ECSR_ICD)
 		ndev->stats.tx_carrier_errors++;
 	if (ecsr & ECSR_LCHNG) {
@@ -1330,6 +1333,33 @@  static int ravb_get_ts_info(struct net_device *ndev,
 	return 0;
 }
 
+static void ravb_get_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
+{
+	struct ravb_private *priv = netdev_priv(ndev);
+
+	wol->supported = 0;
+	wol->wolopts = 0;
+
+	if (priv->clk) {
+		wol->supported = WAKE_MAGIC;
+		wol->wolopts = priv->wol_enabled ? WAKE_MAGIC : 0;
+	}
+}
+
+static int ravb_set_wol(struct net_device *ndev, struct ethtool_wolinfo *wol)
+{
+	struct ravb_private *priv = netdev_priv(ndev);
+
+	if (!priv->clk || wol->wolopts & ~WAKE_MAGIC)
+		return -EOPNOTSUPP;
+
+	priv->wol_enabled = !!(wol->wolopts & WAKE_MAGIC);
+
+	device_set_wakeup_enable(&priv->pdev->dev, priv->wol_enabled);
+
+	return 0;
+}
+
 static const struct ethtool_ops ravb_ethtool_ops = {
 	.nway_reset		= ravb_nway_reset,
 	.get_msglevel		= ravb_get_msglevel,
@@ -1343,6 +1373,8 @@  static const struct ethtool_ops ravb_ethtool_ops = {
 	.get_ts_info		= ravb_get_ts_info,
 	.get_link_ksettings	= ravb_get_link_ksettings,
 	.set_link_ksettings	= ravb_set_link_ksettings,
+	.get_wol		= ravb_get_wol,
+	.set_wol		= ravb_set_wol,
 };
 
 static inline int ravb_hook_irq(unsigned int irq, irq_handler_t handler,
@@ -2041,6 +2073,11 @@  static int ravb_probe(struct platform_device *pdev)
 
 	priv->chip_id = chip_id;
 
+	/* Get clock, if not found that's OK but Wake-On-Lan is unavailable */
+	priv->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(priv->clk))
+		priv->clk = NULL;
+
 	/* Set function */
 	ndev->netdev_ops = &ravb_netdev_ops;
 	ndev->ethtool_ops = &ravb_ethtool_ops;
@@ -2107,6 +2144,9 @@  static int ravb_probe(struct platform_device *pdev)
 	if (error)
 		goto out_napi_del;
 
+	if (priv->clk)
+		device_set_wakeup_capable(&pdev->dev, 1);
+
 	/* Print device information */
 	netdev_info(ndev, "Base address at %#x, %pM, IRQ %d.\n",
 		    (u32)ndev->base_addr, ndev->dev_addr, ndev->irq);
@@ -2160,15 +2200,69 @@  static int ravb_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static int ravb_wol_setup(struct net_device *ndev)
+{
+	struct ravb_private *priv = netdev_priv(ndev);
+
+	/* Disable interrupts by clearing the interrupt masks. */
+	ravb_write(ndev, 0, RIC0);
+	ravb_write(ndev, 0, RIC2);
+	ravb_write(ndev, 0, TIC);
+
+	/* Only allow ECI interrupts */
+	synchronize_irq(priv->emac_irq);
+	napi_disable(&priv->napi[RAVB_NC]);
+	napi_disable(&priv->napi[RAVB_BE]);
+	ravb_write(ndev, ECSIPR_MPDIP, ECSIPR);
+
+	/* Enable MagicPacket */
+	ravb_modify(ndev, ECMR, ECMR_MPDE, ECMR_MPDE);
+
+	/* Increased clock usage so device won't be suspended */
+	clk_enable(priv->clk);
+
+	return enable_irq_wake(priv->emac_irq);
+}
+
+static int ravb_wol_restore(struct net_device *ndev)
+{
+	struct ravb_private *priv = netdev_priv(ndev);
+	int ret;
+
+	napi_enable(&priv->napi[RAVB_NC]);
+	napi_enable(&priv->napi[RAVB_BE]);
+
+	/* Disable MagicPacket */
+	ravb_modify(ndev, ECMR, ECMR_MPDE, 0);
+
+	ret = ravb_close(ndev);
+	if (ret < 0)
+		return ret;
+
+	/* Restore clock usage count */
+	clk_disable(priv->clk);
+
+	/* Set reset mode */
+	ravb_write(ndev, CCC_OPC_RESET, CCC);
+
+	return disable_irq_wake(priv->emac_irq);
+}
+
 static int __maybe_unused ravb_suspend(struct device *dev)
 {
 	struct net_device *ndev = dev_get_drvdata(dev);
-	int ret = 0;
+	struct ravb_private *priv = netdev_priv(ndev);
+	int ret;
 
-	if (netif_running(ndev)) {
-		netif_device_detach(ndev);
+	if (!netif_running(ndev))
+		return 0;
+
+	netif_device_detach(ndev);
+
+	if (priv->wol_enabled)
+		ret = ravb_wol_setup(ndev);
+	else
 		ret = ravb_close(ndev);
-	}
 
 	return ret;
 }
@@ -2179,6 +2273,12 @@  static int __maybe_unused ravb_resume(struct device *dev)
 	struct ravb_private *priv = netdev_priv(ndev);
 	int ret = 0;
 
+	if (priv->wol_enabled) {
+		ret = ravb_wol_restore(ndev);
+		if (ret)
+			return ret;
+	}
+
 	/* All register have been reset to default values.
 	 * Restore all registers which where setup at probe time and
 	 * reopen device if it was running before system suspended.