[v2,00/20] rtc: omap: fixes and power-off feature

Message ID	20141024195532.GF19377@localhost (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> Date: Fri, 24 Oct 2014 21:55:32 +0200 From: Johan Hovold <johan@kernel.org> To: Felipe Balbi <balbi@ti.com> Subject: Re: [PATCH v2 00/20] rtc: omap: fixes and power-off feature Message-ID: <20141024195532.GF19377@localhost> References: <1412881594-25678-1-git-send-email-johan@kernel.org> <1413913086-12730-1-git-send-email-johan@kernel.org> <20141024160845.GM26941@saruman> <20141024190251.GB19377@localhost> <20141024192540.GD11455@saruman> <20141024192948.GE11455@saruman> <20141024193655.GD19377@localhost> <20141024194442.GG11455@saruman> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20141024194442.GG11455@saruman> User-Agent: Mutt/1.5.22 (2013-10-16) Cc: Alessandro Zummo <a.zummo@towertech.it>, rtc-linux@googlegroups.com, devicetree@vger.kernel.org, Tony Lindgren <tony@atomide.com>, j-keerthy@ti.com, nsekhar@ti.com, Johan Hovold <johan@kernel.org>, linux-kernel@vger.kernel.org, t-kristo@ti.com, Lokesh Vutla <lokeshvutla@ti.com>, linux-arm-kernel@lists.infradead.org, =?iso-8859-1?Q?Beno=EEt?= Cousson <bcousson@baylibre.com>, Russell King <rmk+kernel@arm.linux.org.uk>, Andrew Morton <akpm@linux-foundation.org>, linux-omap@vger.kernel.org, Guenter Roeck <linux@roeck-us.net> Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Johan Hovold Oct. 24, 2014, 7:55 p.m. UTC

On Fri, Oct 24, 2014 at 02:44:42PM -0500, Felipe Balbi wrote:
> On Fri, Oct 24, 2014 at 09:36:55PM +0200, Johan Hovold wrote:
> > On Fri, Oct 24, 2014 at 02:29:48PM -0500, Felipe Balbi wrote:
> > > Hi,
> > > 
> > > On Fri, Oct 24, 2014 at 02:25:40PM -0500, Felipe Balbi wrote:

> > > > with this I always get to "Power off failed -- system halted". If I
> > > > switch to v3.18-rc1 vanilla, then it works. So it's definitely caused by
> > > > your rtc-only patches.
> > 
> > That's expected (see below). It works with v3.18-rc1 vanilla because
> > machine_halt is called instead of machine_power_off as there is no
> > registered power-off handler.
> 
> yeah, that much I figured :-)
> 
> > > ok, so it seems like it takes more than 1 second for things to
> > > propagate. If I increase that mdelay() to 3000, then everything works
> > > fine on my end. I think we should get RMK's input on this 3000ms delay
> > > to machine_power_off(). Should it be generic, or should we add it to our
> > > rtc pm_power_off implementation ?
> > 
> > As I wrote above, we still need a 2-second mdelay in rtc-omap, which I
> > intend to add to the pmic_power_en patch.
> 
> oh, alright then. If you can Cc me, I'll make sure to test that too ;-)

I will. :) Just wanted to see whether Andrew preferred I resend the
whole series or just that one patch first.

The diff is minimal:

> > which wouldn't take nearly as long as rtc-omap and hence wouldn't be
> > adding an explicit delay in the driver power-off handler.
> 
> alright.

I'll call it a day now. Will keep you posted.

Thanks,
Johan

Felipe Balbi Oct. 24, 2014, 8:08 p.m. UTC | #1

On Fri, Oct 24, 2014 at 09:55:32PM +0200, Johan Hovold wrote:
> On Fri, Oct 24, 2014 at 02:44:42PM -0500, Felipe Balbi wrote:
> > On Fri, Oct 24, 2014 at 09:36:55PM +0200, Johan Hovold wrote:
> > > On Fri, Oct 24, 2014 at 02:29:48PM -0500, Felipe Balbi wrote:
> > > > Hi,
> > > > 
> > > > On Fri, Oct 24, 2014 at 02:25:40PM -0500, Felipe Balbi wrote:
> 
> > > > > with this I always get to "Power off failed -- system halted". If I
> > > > > switch to v3.18-rc1 vanilla, then it works. So it's definitely caused by
> > > > > your rtc-only patches.
> > > 
> > > That's expected (see below). It works with v3.18-rc1 vanilla because
> > > machine_halt is called instead of machine_power_off as there is no
> > > registered power-off handler.
> > 
> > yeah, that much I figured :-)
> > 
> > > > ok, so it seems like it takes more than 1 second for things to
> > > > propagate. If I increase that mdelay() to 3000, then everything works
> > > > fine on my end. I think we should get RMK's input on this 3000ms delay
> > > > to machine_power_off(). Should it be generic, or should we add it to our
> > > > rtc pm_power_off implementation ?
> > > 
> > > As I wrote above, we still need a 2-second mdelay in rtc-omap, which I
> > > intend to add to the pmic_power_en patch.
> > 
> > oh, alright then. If you can Cc me, I'll make sure to test that too ;-)
> 
> I will. :) Just wanted to see whether Andrew preferred I resend the
> whole series or just that one patch first.
> 
> The diff is minimal:
> 
> diff --git a/drivers/rtc/rtc-omap.c b/drivers/rtc/rtc-omap.c
> index e74750f00b18..e4f97ad9eb21 100644
> --- a/drivers/rtc/rtc-omap.c
> +++ b/drivers/rtc/rtc-omap.c
> @@ -423,6 +423,8 @@ static void omap_rtc_power_off(void)
>         val = rtc_read(rtc, OMAP_RTC_INTERRUPTS_REG);
>         rtc_writel(rtc, OMAP_RTC_INTERRUPTS_REG,
>                         val | OMAP_RTC_INTERRUPTS_IT_ALARM2);
> +
> +       mdelay(2000);
>  }

you can add my:

Tested-by: Felipe Balbi <balbi@ti.com>

for this too. Now it's all good.

Andrew Morton Oct. 27, 2014, 11:22 p.m. UTC | #2

On Fri, 24 Oct 2014 21:55:32 +0200 Johan Hovold <johan@kernel.org> wrote:

> On Fri, Oct 24, 2014 at 02:44:42PM -0500, Felipe Balbi wrote:
> > On Fri, Oct 24, 2014 at 09:36:55PM +0200, Johan Hovold wrote:
> > > On Fri, Oct 24, 2014 at 02:29:48PM -0500, Felipe Balbi wrote:
> > > > Hi,
> > > > 
> > > > On Fri, Oct 24, 2014 at 02:25:40PM -0500, Felipe Balbi wrote:
> 
> > > > > with this I always get to "Power off failed -- system halted". If I
> > > > > switch to v3.18-rc1 vanilla, then it works. So it's definitely caused by
> > > > > your rtc-only patches.
> > > 
> > > That's expected (see below). It works with v3.18-rc1 vanilla because
> > > machine_halt is called instead of machine_power_off as there is no
> > > registered power-off handler.
> > 
> > yeah, that much I figured :-)
> > 
> > > > ok, so it seems like it takes more than 1 second for things to
> > > > propagate. If I increase that mdelay() to 3000, then everything works
> > > > fine on my end. I think we should get RMK's input on this 3000ms delay
> > > > to machine_power_off(). Should it be generic, or should we add it to our
> > > > rtc pm_power_off implementation ?
> > > 
> > > As I wrote above, we still need a 2-second mdelay in rtc-omap, which I
> > > intend to add to the pmic_power_en patch.
> > 
> > oh, alright then. If you can Cc me, I'll make sure to test that too ;-)
> 
> I will. :) Just wanted to see whether Andrew preferred I resend the
> whole series or just that one patch first.
> 
> The diff is minimal:
> 
> diff --git a/drivers/rtc/rtc-omap.c b/drivers/rtc/rtc-omap.c
> index e74750f00b18..e4f97ad9eb21 100644
> --- a/drivers/rtc/rtc-omap.c
> +++ b/drivers/rtc/rtc-omap.c
> @@ -423,6 +423,8 @@ static void omap_rtc_power_off(void)
>         val = rtc_read(rtc, OMAP_RTC_INTERRUPTS_REG);
>         rtc_writel(rtc, OMAP_RTC_INTERRUPTS_REG,
>                         val | OMAP_RTC_INTERRUPTS_IT_ALARM2);
> +
> +       mdelay(2000);
>  }

Yes, having read this threadlet: we need a very good comment in there
explaining what's going on, please.

Do we even need this delay on anything other than arm?  Or even on all arm?

Russell King - ARM Linux Oct. 28, 2014, 12:25 a.m. UTC | #3

On Mon, Oct 27, 2014 at 04:22:51PM -0700, Andrew Morton wrote:
> On Fri, 24 Oct 2014 21:55:32 +0200 Johan Hovold <johan@kernel.org> wrote:
> > I will. :) Just wanted to see whether Andrew preferred I resend the
> > whole series or just that one patch first.
> > 
> > The diff is minimal:
> > 
> > diff --git a/drivers/rtc/rtc-omap.c b/drivers/rtc/rtc-omap.c
> > index e74750f00b18..e4f97ad9eb21 100644
> > --- a/drivers/rtc/rtc-omap.c
> > +++ b/drivers/rtc/rtc-omap.c
> > @@ -423,6 +423,8 @@ static void omap_rtc_power_off(void)
> >         val = rtc_read(rtc, OMAP_RTC_INTERRUPTS_REG);
> >         rtc_writel(rtc, OMAP_RTC_INTERRUPTS_REG,
> >                         val | OMAP_RTC_INTERRUPTS_IT_ALARM2);
> > +
> > +       mdelay(2000);
> >  }
> 
> Yes, having read this threadlet: we need a very good comment in there
> explaining what's going on, please.
> 
> Do we even need this delay on anything other than arm?  Or even on all arm?

I think I've already commented on the behaviour of the reboot syscalls
such as power off which can return to userspace, pointing out that
x86 can return to userspace.

As long as x86 can return to userspace, I see no harm in ARM returning
to userspace.  If a driver which is hooking into the power off stuff
is unable to immediately shut off the power (wtf it can't for 2 sec
I've no idea) then having that driver work around that hardware's
specific brokenness with a delay seems entirely reasonable.

That allows those SoCs which can do the right thing to do the right
thing without being hindered by such silliness.  And it also stops
the next person coming along and bumping the delay from 2 to 3, to 5,
and then what... 10 seconds?

Keeping it in the driver means that the workaround for the broken
hardware is kept with the driver for the broken hardware - exactly
where it should be.

Johan Hovold Oct. 28, 2014, 8:16 a.m. UTC | #4

On Tue, Oct 28, 2014 at 12:25:52AM +0000, Russell King - ARM Linux wrote:
> On Mon, Oct 27, 2014 at 04:22:51PM -0700, Andrew Morton wrote:
> > On Fri, 24 Oct 2014 21:55:32 +0200 Johan Hovold <johan@kernel.org> wrote:
> > > I will. :) Just wanted to see whether Andrew preferred I resend the
> > > whole series or just that one patch first.
> > > 
> > > The diff is minimal:
> > > 
> > > diff --git a/drivers/rtc/rtc-omap.c b/drivers/rtc/rtc-omap.c
> > > index e74750f00b18..e4f97ad9eb21 100644
> > > --- a/drivers/rtc/rtc-omap.c
> > > +++ b/drivers/rtc/rtc-omap.c
> > > @@ -423,6 +423,8 @@ static void omap_rtc_power_off(void)
> > >         val = rtc_read(rtc, OMAP_RTC_INTERRUPTS_REG);
> > >         rtc_writel(rtc, OMAP_RTC_INTERRUPTS_REG,
> > >                         val | OMAP_RTC_INTERRUPTS_IT_ALARM2);
> > > +
> > > +       mdelay(2000);
> > >  }
> > 
> > Yes, having read this threadlet: we need a very good comment in there
> > explaining what's going on, please.
> >
> > Do we even need this delay on anything other than arm?  Or even on all arm?
> 
> I think I've already commented on the behaviour of the reboot syscalls
> such as power off which can return to userspace, pointing out that
> x86 can return to userspace.
> 
> As long as x86 can return to userspace, I see no harm in ARM returning
> to userspace.  If a driver which is hooking into the power off stuff
> is unable to immediately shut off the power (wtf it can't for 2 sec
> I've no idea) then having that driver work around that hardware's
> specific brokenness with a delay seems entirely reasonable.

Yeah, there are two issues here. If a power-off handler is crazy slow
there really should be a delay in the handler. That was just an
oversight on my part. [ In this case it takes between one and two
seconds due to the resolution of the rtc and they way it's alarm events
are triggered. ]

The other issue is whether arch code should inform the user about failed
power-off, in really exactly the same way as it does for failed reboot,
see:

	ac15e00b1efe ("ARM: restart: move reboot failure handing into
	machine_restart()"

by Russell.

It looks like we're soon to be having power-off call chains, with
configurable priorities, to shut of various parts of the hardware and
this is all at least partly configurable through DT. [1] I think it's
reasonable to expect to see more frequent failures to power off either
due to (DT) misconfiguration or broken or flakey hardware.

Having a short delay (I proposed 1s as for reboot) would also prevent
any oopses when returning to user-space for just quite slow devices
(e.g. millisecond range) without requiring explicit delays in these
handlers.

But as Andrew points out above, this really isn't an arm-specific issue,
and currently various arches deal with this differently, where some
return to user-space, some spin indefinitely (without an error message),
and some spin on failed reboot but not power-off (e.g. arm and arm64).

> That allows those SoCs which can do the right thing to do the right
> thing without being hindered by such silliness.  And it also stops
> the next person coming along and bumping the delay from 2 to 3, to 5,
> and then what... 10 seconds?

That wouldn't be an issue then. Arch code would only handle the
non-crazy case and complete power-off failures.

> Keeping it in the driver means that the workaround for the broken
> hardware is kept with the driver for the broken hardware - exactly
> where it should be.

Agreed.

Johan

[1] https://lkml.org/lkml/2014/10/27/506

Russell King - ARM Linux Oct. 28, 2014, 8:47 a.m. UTC | #5

On Tue, Oct 28, 2014 at 09:16:16AM +0100, Johan Hovold wrote:
> It looks like we're soon to be having power-off call chains, with
> configurable priorities, to shut of various parts of the hardware

I really hope that they *don't* get used like that.  I guess this is
what happens when people don't read the code before they decide to
implement something.

All the reboot/power off/halt methods already call into the driver model,
and the driver model then iterates over all bound drivers calling their
"shutdown" method.  So, today, as it has been for years, all device
drivers are notified of system power off.

That's where most device drivers should be cleanly stopping their
hardware.

The only thing which is left is the actual hardware triggering of the
action, and that should be what's done via the notifier.

Johan Hovold Oct. 28, 2014, 1:12 p.m. UTC | #6

On Tue, Oct 28, 2014 at 08:47:46AM +0000, Russell King - ARM Linux wrote:
> On Tue, Oct 28, 2014 at 09:16:16AM +0100, Johan Hovold wrote:
> > It looks like we're soon to be having power-off call chains, with
> > configurable priorities, to shut of various parts of the hardware
> 
> I really hope that they *don't* get used like that.  I guess this is
> what happens when people don't read the code before they decide to
> implement something.
> 
> All the reboot/power off/halt methods already call into the driver model,
> and the driver model then iterates over all bound drivers calling their
> "shutdown" method.  So, today, as it has been for years, all device
> drivers are notified of system power off.
> 
> That's where most device drivers should be cleanly stopping their
> hardware.
> 
> The only thing which is left is the actual hardware triggering of the
> action, and that should be what's done via the notifier.

That's not what I was trying to refer to. But the patch set explicitly
allows for multiple, prioritised power-off handlers, which can power
off a board in different ways and with various degrees of success.
Specifically, it allows for fallback handlers in case one or more
power-off handlers fail.

So if we allow for that, what is to prevent the final power-off handler
from failing? And should this not be logged by arch code in the same way
as failure to restart is?

Johan

Russell King - ARM Linux Oct. 28, 2014, 3:16 p.m. UTC | #7

On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
> That's not what I was trying to refer to. But the patch set explicitly
> allows for multiple, prioritised power-off handlers, which can power
> off a board in different ways and with various degrees of success.
> Specifically, it allows for fallback handlers in case one or more
> power-off handlers fail.
> 
> So if we allow for that, what is to prevent the final power-off handler
> from failing? And should this not be logged by arch code in the same way
> as failure to restart is?

And how is that different from having a set of power-off handlers, and
reporting when each individual one fails?  Don't you want to know if
your primary high priority reboot handler fails, just as much as you
want to know if your final last-resort power-off handler fails?

Or different from having no power-off handlers.

Here's the x86 code:

void machine_power_off(void)
{
        machine_ops.power_off();
}

struct machine_ops machine_ops = {
        .power_off = native_machine_power_off,
...

static void native_machine_power_off(void)
{
        if (pm_power_off) {
                if (!reboot_force)
                        machine_shutdown();
                pm_power_off();
        }
        /* A fallback in case there is no PM info available */
        tboot_shutdown(TB_SHUTDOWN_HALT);
}

void tboot_shutdown(u32 shutdown_type)
{
        void (*shutdown)(void);

        if (!tboot_enabled())
                return;

See - x86 can very well just fall straight back out of machine_power_off()
if there's no pm_power_off() hook and tboot is not enabled.

Johan Hovold Oct. 29, 2014, 12:34 p.m. UTC | #8

On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
> > That's not what I was trying to refer to. But the patch set explicitly
> > allows for multiple, prioritised power-off handlers, which can power
> > off a board in different ways and with various degrees of success.
> > Specifically, it allows for fallback handlers in case one or more
> > power-off handlers fail.
> > 
> > So if we allow for that, what is to prevent the final power-off handler
> > from failing? And should this not be logged by arch code in the same way
> > as failure to restart is?
> 
> And how is that different from having a set of power-off handlers, and
> reporting when each individual one fails?  Don't you want to know if
> your primary high priority reboot handler fails, just as much as you
> want to know if your final last-resort power-off handler fails?

Good point. Failed power-off should probably be logged by the power-off
call chain implementation (which seems to makes notifier chains a bad
fit).

And what about any power-off latencies? Should this always be dealt with
in the power-off handler?

Again, if it's predictable and high, as in the OMAP RTC case, it should
go in the handler. But what if it's just normal bus latencies
(peripheral busses, i2c, or whatever people may come up with)?

Should there always be a short delay before calling the next handler?

> Or different from having no power-off handlers.

That is actually quite different, as in that case we call machine_halt
instead (via kernel_halt).

> Here's the x86 code:
> 
> void machine_power_off(void)
> {
>         machine_ops.power_off();
> }
> 
> struct machine_ops machine_ops = {
>         .power_off = native_machine_power_off,
> ...
> 
> static void native_machine_power_off(void)
> {
>         if (pm_power_off) {
>                 if (!reboot_force)
>                         machine_shutdown();
>                 pm_power_off();
>         }
>         /* A fallback in case there is no PM info available */
>         tboot_shutdown(TB_SHUTDOWN_HALT);
> }
> 
> void tboot_shutdown(u32 shutdown_type)
> {
>         void (*shutdown)(void);
> 
>         if (!tboot_enabled())
>                 return;
> 
> See - x86 can very well just fall straight back out of machine_power_off()
> if there's no pm_power_off() hook and tboot is not enabled.

I never doubted that, but is the right thing to do? Not all arches do it
that way.

And what about the killing of init? Shall we simply consider that a
systemd bug? 

	case LINUX_REBOOT_CMD_POWER_OFF:
		kernel_power_off();
		do_exit(0);
		break;

If power-off fails (for whatever reason), do_exit(0) will trigger a
panic when called from PID 1.

Johan

Romain Perier Oct. 29, 2014, 12:55 p.m. UTC | #9

Johan:. do you really plan to use this "poweroff-source" property ? As
you proposed a renaming few days ago...
I don't really want to waste time to propose patches to fix things
incrementally and rename it if the old one is used...

Romain

2014-10-29 13:34 GMT+01:00 Johan Hovold <johan@kernel.org>:
> On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
>> On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
>> > That's not what I was trying to refer to. But the patch set explicitly
>> > allows for multiple, prioritised power-off handlers, which can power
>> > off a board in different ways and with various degrees of success.
>> > Specifically, it allows for fallback handlers in case one or more
>> > power-off handlers fail.
>> >
>> > So if we allow for that, what is to prevent the final power-off handler
>> > from failing? And should this not be logged by arch code in the same way
>> > as failure to restart is?
>>
>> And how is that different from having a set of power-off handlers, and
>> reporting when each individual one fails?  Don't you want to know if
>> your primary high priority reboot handler fails, just as much as you
>> want to know if your final last-resort power-off handler fails?
>
> Good point. Failed power-off should probably be logged by the power-off
> call chain implementation (which seems to makes notifier chains a bad
> fit).
>
> And what about any power-off latencies? Should this always be dealt with
> in the power-off handler?
>
> Again, if it's predictable and high, as in the OMAP RTC case, it should
> go in the handler. But what if it's just normal bus latencies
> (peripheral busses, i2c, or whatever people may come up with)?
>
> Should there always be a short delay before calling the next handler?
>
>> Or different from having no power-off handlers.
>
> That is actually quite different, as in that case we call machine_halt
> instead (via kernel_halt).
>
>> Here's the x86 code:
>>
>> void machine_power_off(void)
>> {
>>         machine_ops.power_off();
>> }
>>
>> struct machine_ops machine_ops = {
>>         .power_off = native_machine_power_off,
>> ...
>>
>> static void native_machine_power_off(void)
>> {
>>         if (pm_power_off) {
>>                 if (!reboot_force)
>>                         machine_shutdown();
>>                 pm_power_off();
>>         }
>>         /* A fallback in case there is no PM info available */
>>         tboot_shutdown(TB_SHUTDOWN_HALT);
>> }
>>
>> void tboot_shutdown(u32 shutdown_type)
>> {
>>         void (*shutdown)(void);
>>
>>         if (!tboot_enabled())
>>                 return;
>>
>> See - x86 can very well just fall straight back out of machine_power_off()
>> if there's no pm_power_off() hook and tboot is not enabled.
>
> I never doubted that, but is the right thing to do? Not all arches do it
> that way.
>
> And what about the killing of init? Shall we simply consider that a
> systemd bug?
>
>         case LINUX_REBOOT_CMD_POWER_OFF:
>                 kernel_power_off();
>                 do_exit(0);
>                 break;
>
> If power-off fails (for whatever reason), do_exit(0) will trigger a
> panic when called from PID 1.
>
> Johan
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Johan Hovold Oct. 29, 2014, 1 p.m. UTC | #10

[ Please do not top-post. ]

On Wed, Oct 29, 2014 at 01:55:49PM +0100, Romain Perier wrote:
> Johan:. do you really plan to use this "poweroff-source" property ? As
> you proposed a renaming few days ago...
> I don't really want to waste time to propose patches to fix things
> incrementally and rename it if the old one is used...

Why would I want to use your retracted renaming proposal for this
property (i.e. "poweroff-source")?

These patches use "ti,system-power-controller" and the vendor prefix
will be dropped when your patches have been merged.

Johan

Russell King - ARM Linux Oct. 29, 2014, 1:10 p.m. UTC | #11

On Wed, Oct 29, 2014 at 01:34:18PM +0100, Johan Hovold wrote:
> On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > And how is that different from having a set of power-off handlers, and
> > reporting when each individual one fails?  Don't you want to know if
> > your primary high priority reboot handler fails, just as much as you
> > want to know if your final last-resort power-off handler fails?
> 
> Good point. Failed power-off should probably be logged by the power-off
> call chain implementation (which seems to makes notifier chains a bad
> fit).
> 
> And what about any power-off latencies? Should this always be dealt with
> in the power-off handler?
> 
> Again, if it's predictable and high, as in the OMAP RTC case, it should
> go in the handler. But what if it's just normal bus latencies
> (peripheral busses, i2c, or whatever people may come up with)?
> 
> Should there always be a short delay before calling the next handler?

If the handler has determined that it has failed, then why delay before
trying the next handler?  At the point it has decided it has failed,
surely that's after it has waited sufficient time to determine that
failure?

> > Or different from having no power-off handlers.
> 
> That is actually quite different, as in that case we call machine_halt
> instead (via kernel_halt).

Today, ARM does exactly what x86 does.  If there's no power off handler
registered, machine_power_off() shuts down other CPUs and returns.

> > Here's the x86 code:
> > 
> > void machine_power_off(void)
> > {
> >         machine_ops.power_off();
> > }
> > 
> > struct machine_ops machine_ops = {
> >         .power_off = native_machine_power_off,
> > ...
> > 
> > static void native_machine_power_off(void)
> > {
> >         if (pm_power_off) {
> >                 if (!reboot_force)
> >                         machine_shutdown();
> >                 pm_power_off();
> >         }
> >         /* A fallback in case there is no PM info available */
> >         tboot_shutdown(TB_SHUTDOWN_HALT);
> > }
> > 
> > void tboot_shutdown(u32 shutdown_type)
> > {
> >         void (*shutdown)(void);
> > 
> >         if (!tboot_enabled())
> >                 return;
> > 
> > See - x86 can very well just fall straight back out of machine_power_off()
> > if there's no pm_power_off() hook and tboot is not enabled.
> 
> I never doubted that, but is the right thing to do? Not all arches do it
> that way.

Well, the biggest question there is: if the power off or restart syscall
fails, what is the _generic_ non-architecture action which is supposed to
happen?

Whatever the answer is to that question, that action should be performed
by the _generic_ non-architecture code, rather than having the same
implementation spread across all 30 architectures which the kernel
supports today.

> And what about the killing of init? Shall we simply consider that a
> systemd bug? 
> 
> 	case LINUX_REBOOT_CMD_POWER_OFF:
> 		kernel_power_off();
> 		do_exit(0);
> 		break;
> 
> If power-off fails (for whatever reason), do_exit(0) will trigger a
> panic when called from PID 1.

Oh, systemd calls this from PID1?  I guess that's another reason to hate
systemd with vitriol.  :)  SysVinit and upstart implementations call it
from the "halt" command, which is itself normally run from a script,
which totally avoids that problem.

I'm quite sure the insane systemd lobby will scream that this is a kernel
bug and will want to change it somehow, just like they want to change the
kernel in soo many other silly ways.

Romain Perier Oct. 29, 2014, 1:11 p.m. UTC | #12

This just to be sure, nothing more. I did read that you mentionned
"poweroff-source" earlier. However If I am wrong, my bad, everything
is fine.

2014-10-29 14:00 GMT+01:00 Johan Hovold <johan@kernel.org>:
> [ Please do not top-post. ]
>
> On Wed, Oct 29, 2014 at 01:55:49PM +0100, Romain Perier wrote:
>> Johan:. do you really plan to use this "poweroff-source" property ? As
>> you proposed a renaming few days ago...
>> I don't really want to waste time to propose patches to fix things
>> incrementally and rename it if the old one is used...
>
> Why would I want to use your retracted renaming proposal for this
> property (i.e. "poweroff-source")?
>
> These patches use "ti,system-power-controller" and the vendor prefix
> will be dropped when your patches have been merged.
>
> Johan

Guenter Roeck Oct. 29, 2014, 1:20 p.m. UTC | #13

On 10/29/2014 05:34 AM, Johan Hovold wrote:
> On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
>> On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
>>> That's not what I was trying to refer to. But the patch set explicitly
>>> allows for multiple, prioritised power-off handlers, which can power
>>> off a board in different ways and with various degrees of success.
>>> Specifically, it allows for fallback handlers in case one or more
>>> power-off handlers fail.
>>>
>>> So if we allow for that, what is to prevent the final power-off handler
>>> from failing? And should this not be logged by arch code in the same way
>>> as failure to restart is?
>>
>> And how is that different from having a set of power-off handlers, and
>> reporting when each individual one fails?  Don't you want to know if
>> your primary high priority reboot handler fails, just as much as you
>> want to know if your final last-resort power-off handler fails?
>
> Good point. Failed power-off should probably be logged by the power-off
> call chain implementation (which seems to makes notifier chains a bad
> fit).
>
Good that I just replaced notifier chain with an open coded implementation.
Sure, that is possible, but I would prefer to do that as a follow-up commit,
and it should be discussed in the context of the power-off handler patch set.

> And what about any power-off latencies? Should this always be dealt with
> in the power-off handler?
>
> Again, if it's predictable and high, as in the OMAP RTC case, it should
> go in the handler. But what if it's just normal bus latencies
> (peripheral busses, i2c, or whatever people may come up with)?
>
> Should there always be a short delay before calling the next handler?
>
That delay would depend on the individual power-off handler, so I think
the current implementation works just fine (where power-off handlers
implement the delay).

We could move the delay into the infrastructure, but it would have
to be configurable. I would prefer to consider that as a follow-up patch
to not overload the power-off handler patch set with too many changes
at the same time.

>> Or different from having no power-off handlers.
>
> That is actually quite different, as in that case we call machine_halt
> instead (via kernel_halt).
>
>> Here's the x86 code:
>>
>> void machine_power_off(void)
>> {
>>          machine_ops.power_off();
>> }
>>
>> struct machine_ops machine_ops = {
>>          .power_off = native_machine_power_off,
>> ...
>>
>> static void native_machine_power_off(void)
>> {
>>          if (pm_power_off) {
>>                  if (!reboot_force)
>>                          machine_shutdown();
>>                  pm_power_off();
>>          }
>>          /* A fallback in case there is no PM info available */
>>          tboot_shutdown(TB_SHUTDOWN_HALT);
>> }
>>
>> void tboot_shutdown(u32 shutdown_type)
>> {
>>          void (*shutdown)(void);
>>
>>          if (!tboot_enabled())
>>                  return;
>>
>> See - x86 can very well just fall straight back out of machine_power_off()
>> if there's no pm_power_off() hook and tboot is not enabled.
>
> I never doubted that, but is the right thing to do? Not all arches do it
> that way.
>
> And what about the killing of init? Shall we simply consider that a
> systemd bug?
>
> 	case LINUX_REBOOT_CMD_POWER_OFF:
> 		kernel_power_off();
> 		do_exit(0);
> 		break;
>
> If power-off fails (for whatever reason), do_exit(0) will trigger a
> panic when called from PID 1.
>
Common handling of that condition - eg to call machine_halt() - might be
an option. Separate patch, though.

Guenter

Johan Hovold Oct. 29, 2014, 1:22 p.m. UTC | #14

On Wed, Oct 29, 2014 at 01:10:20PM +0000, Russell King - ARM Linux wrote:
> On Wed, Oct 29, 2014 at 01:34:18PM +0100, Johan Hovold wrote:
> > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > > And how is that different from having a set of power-off handlers, and
> > > reporting when each individual one fails?  Don't you want to know if
> > > your primary high priority reboot handler fails, just as much as you
> > > want to know if your final last-resort power-off handler fails?
> > 
> > Good point. Failed power-off should probably be logged by the power-off
> > call chain implementation (which seems to makes notifier chains a bad
> > fit).
> > 
> > And what about any power-off latencies? Should this always be dealt with
> > in the power-off handler?
> > 
> > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > go in the handler. But what if it's just normal bus latencies
> > (peripheral busses, i2c, or whatever people may come up with)?
> > 
> > Should there always be a short delay before calling the next handler?
> 
> If the handler has determined that it has failed, then why delay before
> trying the next handler?  At the point it has decided it has failed,
> surely that's after it has waited sufficient time to determine that
> failure?

The current handlers we have are not expecting any other handler to be
run after they return. My question was whether all these handlers should
get a short mdelay added to them (e.g. to compensate for bus latencies)
or if this could be done in the power-off handler (e.g. before printing
the error message).

> > > Or different from having no power-off handlers.
> > 
> > That is actually quite different, as in that case we call machine_halt
> > instead (via kernel_halt).
> 
> Today, ARM does exactly what x86 does.  If there's no power off handler
> registered, machine_power_off() shuts down other CPUs and returns.

No, if there are no power-off handlers registered, kernel/reboot.c will
never call machine_power_off:

	/* Instead of trying to make the power_off code look like
	 * halt when pm_power_off is not set do it the easy way.
	 */
	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
		cmd = LINUX_REBOOT_CMD_HALT;

So in that case on arm, a system-halted message is printed, and we never
return to user-space.

> > > Here's the x86 code:
> > > 
> > > void machine_power_off(void)
> > > {
> > >         machine_ops.power_off();
> > > }
> > > 
> > > struct machine_ops machine_ops = {
> > >         .power_off = native_machine_power_off,
> > > ...
> > > 
> > > static void native_machine_power_off(void)
> > > {
> > >         if (pm_power_off) {
> > >                 if (!reboot_force)
> > >                         machine_shutdown();
> > >                 pm_power_off();
> > >         }
> > >         /* A fallback in case there is no PM info available */
> > >         tboot_shutdown(TB_SHUTDOWN_HALT);
> > > }
> > > 
> > > void tboot_shutdown(u32 shutdown_type)
> > > {
> > >         void (*shutdown)(void);
> > > 
> > >         if (!tboot_enabled())
> > >                 return;
> > > 
> > > See - x86 can very well just fall straight back out of machine_power_off()
> > > if there's no pm_power_off() hook and tboot is not enabled.
> > 
> > I never doubted that, but is the right thing to do? Not all arches do it
> > that way.
> 
> Well, the biggest question there is: if the power off or restart syscall
> fails, what is the _generic_ non-architecture action which is supposed to
> happen?
> 
> Whatever the answer is to that question, that action should be performed
> by the _generic_ non-architecture code, rather than having the same
> implementation spread across all 30 architectures which the kernel
> supports today.

I fully agree.

> > And what about the killing of init? Shall we simply consider that a
> > systemd bug? 
> > 
> > 	case LINUX_REBOOT_CMD_POWER_OFF:
> > 		kernel_power_off();
> > 		do_exit(0);
> > 		break;
> > 
> > If power-off fails (for whatever reason), do_exit(0) will trigger a
> > panic when called from PID 1.
> 
> Oh, systemd calls this from PID1?  I guess that's another reason to hate
> systemd with vitriol.  :)  SysVinit and upstart implementations call it
> from the "halt" command, which is itself normally run from a script,
> which totally avoids that problem.

Yeah, that's why I never noticed the missing mdelay.

> I'm quite sure the insane systemd lobby will scream that this is a kernel
> bug and will want to change it somehow, just like they want to change the
> kernel in soo many other silly ways.

Will be interesting to follow. :)

Johan

Johan Hovold Oct. 29, 2014, 1:35 p.m. UTC | #15

On Wed, Oct 29, 2014 at 06:20:40AM -0700, Guenter Roeck wrote:
> On 10/29/2014 05:34 AM, Johan Hovold wrote:
> > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> >> On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
> >>> That's not what I was trying to refer to. But the patch set explicitly
> >>> allows for multiple, prioritised power-off handlers, which can power
> >>> off a board in different ways and with various degrees of success.
> >>> Specifically, it allows for fallback handlers in case one or more
> >>> power-off handlers fail.
> >>>
> >>> So if we allow for that, what is to prevent the final power-off handler
> >>> from failing? And should this not be logged by arch code in the same way
> >>> as failure to restart is?
> >>
> >> And how is that different from having a set of power-off handlers, and
> >> reporting when each individual one fails?  Don't you want to know if
> >> your primary high priority reboot handler fails, just as much as you
> >> want to know if your final last-resort power-off handler fails?
> >
> > Good point. Failed power-off should probably be logged by the power-off
> > call chain implementation (which seems to makes notifier chains a bad
> > fit).
>
> Good that I just replaced notifier chain with an open coded implementation.

Good to hear.

> Sure, that is possible, but I would prefer to do that as a follow-up commit,
> and it should be discussed in the context of the power-off handler patch set.

Fine with me.

> > And what about any power-off latencies? Should this always be dealt with
> > in the power-off handler?
> >
> > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > go in the handler. But what if it's just normal bus latencies
> > (peripheral busses, i2c, or whatever people may come up with)?
> >
> > Should there always be a short delay before calling the next handler?
> 
> That delay would depend on the individual power-off handler, so I think
> the current implementation works just fine (where power-off handlers
> implement the delay).

Some don't, and could possibly unknowingly have been relying on the fact
that they could return to user space and be powered off at some later
time. With systemd that would have caused a panic.

Also consider generic power-off handlers such as gpio-poweroff. It
currently hard-codes a three-second delay but the actual delay would
really be board specific.

> We could move the delay into the infrastructure, but it would have
> to be configurable. I would prefer to consider that as a follow-up patch
> to not overload the power-off handler patch set with too many changes
> at the same time.

Sure.

Johan

Johan Hovold Oct. 29, 2014, 1:44 p.m. UTC | #16

[ Again, please stop with the top-posting.

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing? ]

On Wed, Oct 29, 2014 at 02:11:02PM +0100, Romain Perier wrote:
> This just to be sure, nothing more. I did read that you mentionned
> "poweroff-source" earlier. However If I am wrong, my bad, everything
> is fine.

I only mentioned that you had proposed that name in my cover letter.

Johan

Guenter Roeck Oct. 29, 2014, 3:25 p.m. UTC | #17

On Wed, Oct 29, 2014 at 02:22:44PM +0100, Johan Hovold wrote:
> On Wed, Oct 29, 2014 at 01:10:20PM +0000, Russell King - ARM Linux wrote:
> > On Wed, Oct 29, 2014 at 01:34:18PM +0100, Johan Hovold wrote:
> > > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > > > And how is that different from having a set of power-off handlers, and
> > > > reporting when each individual one fails?  Don't you want to know if
> > > > your primary high priority reboot handler fails, just as much as you
> > > > want to know if your final last-resort power-off handler fails?
> > > 
> > > Good point. Failed power-off should probably be logged by the power-off
> > > call chain implementation (which seems to makes notifier chains a bad
> > > fit).
> > > 
> > > And what about any power-off latencies? Should this always be dealt with
> > > in the power-off handler?
> > > 
> > > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > > go in the handler. But what if it's just normal bus latencies
> > > (peripheral busses, i2c, or whatever people may come up with)?
> > > 
> > > Should there always be a short delay before calling the next handler?
> > 
> > If the handler has determined that it has failed, then why delay before
> > trying the next handler?  At the point it has decided it has failed,
> > surely that's after it has waited sufficient time to determine that
> > failure?
> 
> The current handlers we have are not expecting any other handler to be
> run after they return. My question was whether all these handlers should
> get a short mdelay added to them (e.g. to compensate for bus latencies)

Some of them do add a delay.

> or if this could be done in the power-off handler (e.g. before printing
> the error message).
> 
That might make sense, but it would have to be configurable, since the delay
is platform specific and power-off handler does not know how long to wait
(the longest delay I have seen is 10 seconds).

> > > > Or different from having no power-off handlers.
> > > 
> > > That is actually quite different, as in that case we call machine_halt
> > > instead (via kernel_halt).
> > 
> > Today, ARM does exactly what x86 does.  If there's no power off handler
> > registered, machine_power_off() shuts down other CPUs and returns.
> 
> No, if there are no power-off handlers registered, kernel/reboot.c will
> never call machine_power_off:
> 
> 	/* Instead of trying to make the power_off code look like
> 	 * halt when pm_power_off is not set do it the easy way.
> 	 */
> 	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
> 		cmd = LINUX_REBOOT_CMD_HALT;
> 
> So in that case on arm, a system-halted message is printed, and we never
> return to user-space.
> 
Some architectures do that, or go into an endless loop. Others do return
from machine_power_off. Having a well defined behavior would be nice
(such as dumping an error mesasge and calling machine_halt if
machine_power_off returns). Only question would be where to put it.
kernel_power_off() might be a good place; only problem is that there
are direct callers of machine_power_off().

Guenter

Guenter Roeck Oct. 29, 2014, 3:36 p.m. UTC | #18

On Wed, Oct 29, 2014 at 02:35:26PM +0100, Johan Hovold wrote:
> On Wed, Oct 29, 2014 at 06:20:40AM -0700, Guenter Roeck wrote:
> > On 10/29/2014 05:34 AM, Johan Hovold wrote:
> > > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > >> On Tue, Oct 28, 2014 at 02:12:57PM +0100, Johan Hovold wrote:
> > >>> That's not what I was trying to refer to. But the patch set explicitly
> > >>> allows for multiple, prioritised power-off handlers, which can power
> > >>> off a board in different ways and with various degrees of success.
> > >>> Specifically, it allows for fallback handlers in case one or more
> > >>> power-off handlers fail.
> > >>>
> > >>> So if we allow for that, what is to prevent the final power-off handler
> > >>> from failing? And should this not be logged by arch code in the same way
> > >>> as failure to restart is?
> > >>
> > >> And how is that different from having a set of power-off handlers, and
> > >> reporting when each individual one fails?  Don't you want to know if
> > >> your primary high priority reboot handler fails, just as much as you
> > >> want to know if your final last-resort power-off handler fails?
> > >
> > > Good point. Failed power-off should probably be logged by the power-off
> > > call chain implementation (which seems to makes notifier chains a bad
> > > fit).
> >
> > Good that I just replaced notifier chain with an open coded implementation.
> 
> Good to hear.
> 
> > Sure, that is possible, but I would prefer to do that as a follow-up commit,
> > and it should be discussed in the context of the power-off handler patch set.
> 
> Fine with me.
> 
> > > And what about any power-off latencies? Should this always be dealt with
> > > in the power-off handler?
> > >
> > > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > > go in the handler. But what if it's just normal bus latencies
> > > (peripheral busses, i2c, or whatever people may come up with)?
> > >
> > > Should there always be a short delay before calling the next handler?
> > 
> > That delay would depend on the individual power-off handler, so I think
> > the current implementation works just fine (where power-off handlers
> > implement the delay).
> 
> Some don't, and could possibly unknowingly have been relying on the fact
> that they could return to user space and be powered off at some later
> time. With systemd that would have caused a panic.
> 
Agreed, but there are two cases to consider: What should be the delay
before the next power-off handler is called, and what should the system
do if all power-off handlers fail (or if there are none). The current
behavior isn't exactly well defined. Ok, with systemd that results in
a crash, but I am not really sure if one can or should blame systemd
for that. The discussion about systemd and its philosophy should not
cloud the fact that power-off behavior isn't exactly well defined.

> Also consider generic power-off handlers such as gpio-poweroff. It
> currently hard-codes a three-second delay but the actual delay would
> really be board specific.
> 
A configurable delay would address that. The actually required delay
could be provided in platform data or as devicetree property.

Thanks,
Guenter

Johan Hovold Oct. 29, 2014, 3:51 p.m. UTC | #19

On Wed, Oct 29, 2014 at 08:25:02AM -0700, Guenter Roeck wrote:
> On Wed, Oct 29, 2014 at 02:22:44PM +0100, Johan Hovold wrote:
> > On Wed, Oct 29, 2014 at 01:10:20PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Oct 29, 2014 at 01:34:18PM +0100, Johan Hovold wrote:
> > > > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > > > > And how is that different from having a set of power-off handlers, and
> > > > > reporting when each individual one fails?  Don't you want to know if
> > > > > your primary high priority reboot handler fails, just as much as you
> > > > > want to know if your final last-resort power-off handler fails?
> > > > 
> > > > Good point. Failed power-off should probably be logged by the power-off
> > > > call chain implementation (which seems to makes notifier chains a bad
> > > > fit).
> > > > 
> > > > And what about any power-off latencies? Should this always be dealt with
> > > > in the power-off handler?
> > > > 
> > > > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > > > go in the handler. But what if it's just normal bus latencies
> > > > (peripheral busses, i2c, or whatever people may come up with)?
> > > > 
> > > > Should there always be a short delay before calling the next handler?
> > > 
> > > If the handler has determined that it has failed, then why delay before
> > > trying the next handler?  At the point it has decided it has failed,
> > > surely that's after it has waited sufficient time to determine that
> > > failure?
> > 
> > The current handlers we have are not expecting any other handler to be
> > run after they return. My question was whether all these handlers should
> > get a short mdelay added to them (e.g. to compensate for bus latencies)
> 
> Some of them do add a delay.

Yes, but not all.

> > or if this could be done in the power-off handler (e.g. before printing
> > the error message).
> > 
> That might make sense, but it would have to be configurable, since the delay
> is platform specific and power-off handler does not know how long to wait
> (the longest delay I have seen is 10 seconds).

We've already concluded in this thread that such delays must be encoded
in the actual handler (even if it's an argument to the power-off call
chain code). The only exception should be generic handlers such as
gpio-poweroff, which may need to define different delays depending on
board. This could of course just be an argument of the corresponding DT
node.

My question above was if it was reasonable to add some generic short
delay after calling each power-off handler to handle short power-off
latencies (e.g. bus latencies) so that not every handler needs an
explicit delay.

> > > > > Or different from having no power-off handlers.
> > > > 
> > > > That is actually quite different, as in that case we call machine_halt
> > > > instead (via kernel_halt).
> > > 
> > > Today, ARM does exactly what x86 does.  If there's no power off handler
> > > registered, machine_power_off() shuts down other CPUs and returns.
> > 
> > No, if there are no power-off handlers registered, kernel/reboot.c will
> > never call machine_power_off:
> > 
> > 	/* Instead of trying to make the power_off code look like
> > 	 * halt when pm_power_off is not set do it the easy way.
> > 	 */
> > 	if ((cmd == LINUX_REBOOT_CMD_POWER_OFF) && !pm_power_off)
> > 		cmd = LINUX_REBOOT_CMD_HALT;
> > 
> > So in that case on arm, a system-halted message is printed, and we never
> > return to user-space.
> > 
> Some architectures do that, or go into an endless loop. Others do return
> from machine_power_off.

Please re-read my comment and the code above. machine_power_off is never
called if there's no handler registered.

Some archs machine_power_off to spin on failed power-off (i.e. when
there is a handler), something which I've mentioned a few times already
in this thread.

> Having a well defined behavior would be nice
> (such as dumping an error mesasge and calling machine_halt if
> machine_power_off returns). Only question would be where to put it.
> kernel_power_off() might be a good place; only problem is that there
> are direct callers of machine_power_off().

Indeed. Adding an error message to the power-off handler call chain code
would solve the first problem as I mentioned before.

Then it's mostly a matter of whether we care about consistency, and
either remove the indefinite spins from those non-x86/non-arm arches, or
prevent the latter (and some others) from returning to user-space.

I'm inclined to having all arches return to user space on failed
power-off, even if it means systemd cannot call reboot() from PID 1.

Johan

Johan Hovold Oct. 29, 2014, 3:54 p.m. UTC | #20

On Wed, Oct 29, 2014 at 08:36:41AM -0700, Guenter Roeck wrote:
> On Wed, Oct 29, 2014 at 02:35:26PM +0100, Johan Hovold wrote:
> > On Wed, Oct 29, 2014 at 06:20:40AM -0700, Guenter Roeck wrote:
> > > On 10/29/2014 05:34 AM, Johan Hovold wrote:

> > > > And what about any power-off latencies? Should this always be dealt with
> > > > in the power-off handler?
> > > >
> > > > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > > > go in the handler. But what if it's just normal bus latencies
> > > > (peripheral busses, i2c, or whatever people may come up with)?
> > > >
> > > > Should there always be a short delay before calling the next handler?
> > > 
> > > That delay would depend on the individual power-off handler, so I think
> > > the current implementation works just fine (where power-off handlers
> > > implement the delay).
> > 
> > Some don't, and could possibly unknowingly have been relying on the fact
> > that they could return to user space and be powered off at some later
> > time. With systemd that would have caused a panic.
>
> Agreed, but there are two cases to consider: What should be the delay
> before the next power-off handler is called, and what should the system
> do if all power-off handlers fail (or if there are none). The current
> behavior isn't exactly well defined. Ok, with systemd that results in
> a crash, but I am not really sure if one can or should blame systemd
> for that. The discussion about systemd and its philosophy should not
> cloud the fact that power-off behavior isn't exactly well defined.

Sounds like we pretty much agree. See my response to your last mail.

> > Also consider generic power-off handlers such as gpio-poweroff. It
> > currently hard-codes a three-second delay but the actual delay would
> > really be board specific.
> > 
> A configurable delay would address that. The actually required delay
> could be provided in platform data or as devicetree property.

Yep, see mail mentioned above.

Johan

Johan Hovold Oct. 30, 2014, 10:01 a.m. UTC | #21

On Wed, Oct 29, 2014 at 04:51:09PM +0100, Johan Hovold wrote:
> On Wed, Oct 29, 2014 at 08:25:02AM -0700, Guenter Roeck wrote:
> > On Wed, Oct 29, 2014 at 02:22:44PM +0100, Johan Hovold wrote:
> > > On Wed, Oct 29, 2014 at 01:10:20PM +0000, Russell King - ARM Linux wrote:
> > > > On Wed, Oct 29, 2014 at 01:34:18PM +0100, Johan Hovold wrote:
> > > > > On Tue, Oct 28, 2014 at 03:16:10PM +0000, Russell King - ARM Linux wrote:
> > > > > > And how is that different from having a set of power-off handlers, and
> > > > > > reporting when each individual one fails?  Don't you want to know if
> > > > > > your primary high priority reboot handler fails, just as much as you
> > > > > > want to know if your final last-resort power-off handler fails?
> > > > > 
> > > > > Good point. Failed power-off should probably be logged by the power-off
> > > > > call chain implementation (which seems to makes notifier chains a bad
> > > > > fit).
> > > > > 
> > > > > And what about any power-off latencies? Should this always be dealt with
> > > > > in the power-off handler?
> > > > > 
> > > > > Again, if it's predictable and high, as in the OMAP RTC case, it should
> > > > > go in the handler. But what if it's just normal bus latencies
> > > > > (peripheral busses, i2c, or whatever people may come up with)?
> > > > > 
> > > > > Should there always be a short delay before calling the next handler?
> > > > 
> > > > If the handler has determined that it has failed, then why delay before
> > > > trying the next handler?  At the point it has decided it has failed,
> > > > surely that's after it has waited sufficient time to determine that
> > > > failure?
> > > 
> > > The current handlers we have are not expecting any other handler to be
> > > run after they return. My question was whether all these handlers should
> > > get a short mdelay added to them (e.g. to compensate for bus latencies)
> > 
> > Some of them do add a delay.
> 
> Yes, but not all.
> 
> > > or if this could be done in the power-off handler (e.g. before printing
> > > the error message).
> > > 
> > That might make sense, but it would have to be configurable, since the delay
> > is platform specific and power-off handler does not know how long to wait
> > (the longest delay I have seen is 10 seconds).
> 
> We've already concluded in this thread that such delays must be encoded
> in the actual handler (even if it's an argument to the power-off call
> chain code). The only exception should be generic handlers such as
> gpio-poweroff, which may need to define different delays depending on
> board. This could of course just be an argument of the corresponding DT
> node.

There probably should be a generic device-tree property for this, even
if some delays would be device specific (e.g. the two second delay for
omap rtc).

But on top of that, external latencies would be quite board specific.
The PMIC on the Beaglebone black adds a 50ms deglitch time, for
instance.

> My question above was if it was reasonable to add some generic short
> delay after calling each power-off handler to handle short power-off
> latencies (e.g. bus latencies) so that not every handler needs an
> explicit delay.

I just added a 500ms margin to the OMAP rtc power-off handler. That
should be more than enough, and perhaps something like that could be a
default in the power-off call chain. Consider that arch-arm has a 1s
grace period before reporting failed reboot already today.

Johan

[v2,00/20] rtc: omap: fixes and power-off feature

Commit Message

Comments

Patch