diff mbox series

[RFC,v3,5/5] arm64: Use SYSTEM_OFF2 PSCI call to power off for hibernate

Message ID 20240319130957.1050637-6-dwmw2@infradead.org (mailing list archive)
State RFC, archived
Headers show
Series Add PSCI v1.3 SYSTEM_OFF2 support for hibernation | expand

Commit Message

David Woodhouse March 19, 2024, 12:59 p.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk>

The PSCI v1.3 specification (alpha) adds support for a SYSTEM_OFF2
function which is analogous to ACPI S4 state. This will allow hosting
environments to determine that a guest is hibernated rather than just
powered off, and handle that state appropriately on subsequent launches.

Since commit 60c0d45a7f7a ("efi/arm64: use UEFI for system reset and
poweroff") the EFI shutdown method is deliberately preferred over PSCI
or other methods. So register a SYS_OFF_MODE_POWER_OFF handler which
*only* handles the hibernation, leaving the original PSCI SYSTEM_OFF as
a last resort via the legacy pm_power_off function pointer.

The hibernation code already exports a system_entering_hibernation()
function which is be used by the higher-priority handler to check for
hibernation. That existing function just returns the value of a static
boolean variable from hibernate.c, which was previously only set in the
hibernation_platform_enter() code path. Set the same flag in the simpler
code path around the call to kernel_power_off() too.

An alternative way to hook SYSTEM_OFF2 into the hibernation code would
be to register a platform_hibernation_ops structure with an ->enter()
method which makes the new SYSTEM_OFF2 call. But that would have the
unwanted side-effect of making hibernation take a completely different
code path in hibernation_platform_enter(), invoking a lot of special dpm
callbacks.

Another option might be to add a new SYS_OFF_MODE_HIBERNATE mode, with
fallback to SYS_OFF_MODE_POWER_OFF. Or to use the sys_off_data to
indicate whether the power off is for hibernation.

But this version works and is relatively simple.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 drivers/firmware/psci/psci.c | 35 +++++++++++++++++++++++++++++++++++
 kernel/power/hibernate.c     |  5 ++++-
 2 files changed, 39 insertions(+), 1 deletion(-)

Comments

Marc Zyngier March 22, 2024, 4:02 p.m. UTC | #1
On Tue, 19 Mar 2024 12:59:06 +0000,
David Woodhouse <dwmw2@infradead.org> wrote:

[...]

> +static void __init psci_init_system_off2(void)
> +{
> +	int ret;
> +
> +	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
> +
> +	if (ret != PSCI_RET_NOT_SUPPORTED)
> +		psci_system_off2_supported = true;

It'd be worth considering the (slightly broken) case where SYSTEM_OFF2
is supported, but HIBERNATE_OFF is not set in the response, as the
spec doesn't say that this bit is mandatory (it seems legal to
implement SYSTEM_OFF2 without any hibernate type, making it similar to
SYSTEM_OFF).

Thanks,

	M.
David Woodhouse March 22, 2024, 4:12 p.m. UTC | #2
On Fri, 2024-03-22 at 16:02 +0000, Marc Zyngier wrote:
> On Tue, 19 Mar 2024 12:59:06 +0000,
> David Woodhouse <dwmw2@infradead.org> wrote:
> 
> [...]
> 
> > +static void __init psci_init_system_off2(void)
> > +{
> > +       int ret;
> > +
> > +       ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
> > +
> > +       if (ret != PSCI_RET_NOT_SUPPORTED)
> > +               psci_system_off2_supported = true;
> 
> It'd be worth considering the (slightly broken) case where SYSTEM_OFF2
> is supported, but HIBERNATE_OFF is not set in the response, as the
> spec doesn't say that this bit is mandatory (it seems legal to
> implement SYSTEM_OFF2 without any hibernate type, making it similar to
> SYSTEM_OFF).

Such is not my understanding. If SYSTEM_OFF2 is supported, then
HIBERNATE_OFF *is* mandatory.

The next update to the spec is turning the PSCI_FEATURES response into
a *bitmap* of the available features, and I believe it will mandate
that bit zero is set.

And if for whatever reason that SYSTEM_OFF2/HIBERNATE_OFF call
*doesn't* work, Linux will end up doing a 'real' poweroff, first
through EFI and then finally as a last resort with a PSCI SYSTEM_OFF.
So it would be OK to have false positives in the detection.
Marc Zyngier March 22, 2024, 4:37 p.m. UTC | #3
On Fri, 22 Mar 2024 16:12:44 +0000,
David Woodhouse <dwmw2@infradead.org> wrote:
> 
> On Fri, 2024-03-22 at 16:02 +0000, Marc Zyngier wrote:
> > On Tue, 19 Mar 2024 12:59:06 +0000,
> > David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > [...]
> > 
> > > +static void __init psci_init_system_off2(void)
> > > +{
> > > +       int ret;
> > > +
> > > +       ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
> > > +
> > > +       if (ret != PSCI_RET_NOT_SUPPORTED)
> > > +               psci_system_off2_supported = true;
> > 
> > It'd be worth considering the (slightly broken) case where SYSTEM_OFF2
> > is supported, but HIBERNATE_OFF is not set in the response, as the
> > spec doesn't say that this bit is mandatory (it seems legal to
> > implement SYSTEM_OFF2 without any hibernate type, making it similar to
> > SYSTEM_OFF).
> 
> Such is not my understanding. If SYSTEM_OFF2 is supported, then
> HIBERNATE_OFF *is* mandatory.
> 
> The next update to the spec is turning the PSCI_FEATURES response into
> a *bitmap* of the available features, and I believe it will mandate
> that bit zero is set.

The bitmap is already present in the current Alpha spec:

<quote>
5.16.2 Implementation responsibilities

[...]

Bits[31] Reserved, must be zero.

Bits[30:0] Hibernate types supported.
	- 0x0 - HIBERNATE_OFF

All other values are reserved for future use.
</quote>

and doesn't say (yet) that HIBERNATE_OFF is mandatory. Furthermore,

<quote>
5.11.2 Caller responsibilities

The calling OS uses the PSCI_FEATURES API, with the SYSTEM_OFF2
function ID, to discover whether the function is present:

- If the function is implemented, PSCI_FEATURES returns the hibernate
  types supported.

- If the function is not implemented, PSCI_FEATURES returns
  NOT_SUPPORTED.
</quote>

which doesn't say anything about which hibernate type must be
implemented. Which makes sense, as I expect it to, in the fine ARM
tradition, grow things such as "HIBERNATE_WITH_ROT13_ENCRYPTION" and
even "HIBERNATE_WITH_ERRATA_XYZ", because firmware is where people
dump their crap. And we will need some special handling for these
tasty variants.

> And if for whatever reason that SYSTEM_OFF2/HIBERNATE_OFF call
> *doesn't* work, Linux will end up doing a 'real' poweroff, first
> through EFI and then finally as a last resort with a PSCI SYSTEM_OFF.
> So it would be OK to have false positives in the detection.

I agree that nothing really breaks, but I also hold the view that
broken firmware implementations should be given the finger, specially
given that you have done this work *ahead* of the spec. I would really
like this to fail immediately on these and not even try to suspend.

With that in mind, if doesn't really matter whether HIBERNATE_OFF is
mandatory or not. We really should check for it and pretend it doesn't
exist if the correct flag isn't set.

	M.
David Woodhouse March 22, 2024, 4:55 p.m. UTC | #4
On Fri, 2024-03-22 at 16:37 +0000, Marc Zyngier wrote:
> 
> I agree that nothing really breaks, but I also hold the view that
> broken firmware implementations should be given the finger, specially
> given that you have done this work *ahead* of the spec. I would really
> like this to fail immediately on these and not even try to suspend.
> 
> With that in mind, if doesn't really matter whether HIBERNATE_OFF is
> mandatory or not. We really should check for it and pretend it doesn't
> exist if the correct flag isn't set.

Ack.

I'll rename that variable to 'psci_system_off2_hibernate_supported' then.

static void __init psci_init_system_off2(void)
{
	int ret;

	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
	if (ret < 0)
		return;

	if (ret & (1 << PSCI_1_3_HIBERNATE_TYPE_OFF))
		psci_system_off2_hibernate_supported = true;
}
Sudeep Holla March 22, 2024, 5:05 p.m. UTC | #5
On Fri, Mar 22, 2024 at 04:37:19PM +0000, Marc Zyngier wrote:
> On Fri, 22 Mar 2024 16:12:44 +0000,
> David Woodhouse <dwmw2@infradead.org> wrote:
> > 
> > On Fri, 2024-03-22 at 16:02 +0000, Marc Zyngier wrote:
> > > On Tue, 19 Mar 2024 12:59:06 +0000,
> > > David Woodhouse <dwmw2@infradead.org> wrote:
> > > 
> > > [...]
> > > 
> > > > +static void __init psci_init_system_off2(void)
> > > > +{
> > > > +       int ret;
> > > > +
> > > > +       ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
> > > > +
> > > > +       if (ret != PSCI_RET_NOT_SUPPORTED)
> > > > +               psci_system_off2_supported = true;
> > > 
> > > It'd be worth considering the (slightly broken) case where SYSTEM_OFF2
> > > is supported, but HIBERNATE_OFF is not set in the response, as the
> > > spec doesn't say that this bit is mandatory (it seems legal to
> > > implement SYSTEM_OFF2 without any hibernate type, making it similar to
> > > SYSTEM_OFF).
> > 
> > Such is not my understanding. If SYSTEM_OFF2 is supported, then
> > HIBERNATE_OFF *is* mandatory.
> > 
> > The next update to the spec is turning the PSCI_FEATURES response into
> > a *bitmap* of the available features, and I believe it will mandate
> > that bit zero is set.

Correct, but we add a extra check as well to be sure even if it is mandated
unless the spec relaxes in a way that psci_features(SYSTEM_OFF2) need not
return the mandatory types in the bitmask which I doubt.

Something like:
	if (ret != PSCI_RET_NOT_SUPPORTED &&
		(ret & BIT(PSCI_1_3_HIBERNATE_TYPE_OFF)))
		psci_system_off2_supported = true;

This will ensure the firmware will not randomly set bit[0]=0 if in the
future it support some newer types as well.

I understand the kernel is not conformance test for the spec but in
practice especially for such features and PSCI spec in particular, kernel
has become defacto conformance for firmware developers which is sad.
It some feature works in the kernel, the firmware is assumed to be
conformant to the spec w.r.t the feature.

--
Regards,
Sudeep
Sudeep Holla March 22, 2024, 5:08 p.m. UTC | #6
On Fri, Mar 22, 2024 at 04:55:04PM +0000, David Woodhouse wrote:
> On Fri, 2024-03-22 at 16:37 +0000, Marc Zyngier wrote:
> > 
> > I agree that nothing really breaks, but I also hold the view that
> > broken firmware implementations should be given the finger, specially
> > given that you have done this work *ahead* of the spec. I would really
> > like this to fail immediately on these and not even try to suspend.
> > 
> > With that in mind, if doesn't really matter whether HIBERNATE_OFF is
> > mandatory or not. We really should check for it and pretend it doesn't
> > exist if the correct flag isn't set.
> 
> Ack.
> 
> I'll rename that variable to 'psci_system_off2_hibernate_supported' then.
> 
> static void __init psci_init_system_off2(void)
> {
> 	int ret;
> 
> 	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
> 	if (ret < 0)
> 		return;
>
> 	if (ret & (1 << PSCI_1_3_HIBERNATE_TYPE_OFF))
> 		psci_system_off2_hibernate_supported = true;
>

Ah OK, you have already agreed to do this, please ignore my response then.

--
Regards,
Sudeep
diff mbox series

Patch

diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index d9629ff87861..69d2f6969438 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -78,6 +78,7 @@  struct psci_0_1_function_ids get_psci_0_1_function_ids(void)
 
 static u32 psci_cpu_suspend_feature;
 static bool psci_system_reset2_supported;
+static bool psci_system_off2_supported;
 
 static inline bool psci_has_ext_power_state(void)
 {
@@ -333,6 +334,28 @@  static void psci_sys_poweroff(void)
 	invoke_psci_fn(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 }
 
+#ifdef CONFIG_HIBERNATION
+static int psci_sys_hibernate(struct sys_off_data *data)
+{
+	if (system_entering_hibernation())
+		invoke_psci_fn(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2),
+			       PSCI_1_3_HIBERNATE_TYPE_OFF, 0, 0);
+	return NOTIFY_DONE;
+}
+
+static int __init psci_hibernate_init(void)
+{
+	if (psci_system_off2_supported) {
+		/* Higher priority than EFI shutdown, but only for hibernate */
+		register_sys_off_handler(SYS_OFF_MODE_POWER_OFF,
+					 SYS_OFF_PRIO_FIRMWARE + 2,
+					 psci_sys_hibernate, NULL);
+	}
+	return 0;
+}
+subsys_initcall(psci_hibernate_init);
+#endif
+
 static int psci_features(u32 psci_func_id)
 {
 	return invoke_psci_fn(PSCI_1_0_FN_PSCI_FEATURES,
@@ -364,6 +387,7 @@  static const struct {
 	PSCI_ID_NATIVE(1_1, SYSTEM_RESET2),
 	PSCI_ID(1_1, MEM_PROTECT),
 	PSCI_ID_NATIVE(1_1, MEM_PROTECT_CHECK_RANGE),
+	PSCI_ID_NATIVE(1_3, SYSTEM_OFF2),
 };
 
 static int psci_debugfs_read(struct seq_file *s, void *data)
@@ -523,6 +547,16 @@  static void __init psci_init_system_reset2(void)
 		psci_system_reset2_supported = true;
 }
 
+static void __init psci_init_system_off2(void)
+{
+	int ret;
+
+	ret = psci_features(PSCI_FN_NATIVE(1_3, SYSTEM_OFF2));
+
+	if (ret != PSCI_RET_NOT_SUPPORTED)
+		psci_system_off2_supported = true;
+}
+
 static void __init psci_init_system_suspend(void)
 {
 	int ret;
@@ -653,6 +687,7 @@  static int __init psci_probe(void)
 		psci_init_cpu_suspend();
 		psci_init_system_suspend();
 		psci_init_system_reset2();
+		psci_init_system_off2();
 		kvm_init_hyp_services();
 	}
 
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 4b0b7cf2e019..ac87b3cb670c 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -676,8 +676,11 @@  static void power_down(void)
 		}
 		fallthrough;
 	case HIBERNATION_SHUTDOWN:
-		if (kernel_can_power_off())
+		if (kernel_can_power_off()) {
+			entering_platform_hibernation = true;
 			kernel_power_off();
+			entering_platform_hibernation = false;
+		}
 		break;
 	}
 	kernel_halt();