[RFC,0/2] Add PSCI v1.3 SYSTEM_OFF2 support for hibernation

Message ID 20240312135958.727765-1-dwmw2@infradead.org (mailing list archive)

Message

David Woodhouse March 12, 2024, 1:51 p.m. UTC
The upcoming PSCI v1.3 specification adds support for a SYSTEM_OFF2 
function which is analogous to ACPI S4 state. This will allow hosting 
environments to determine that a guest is hibernated rather than just 
powered off, and ensure that they preserve the virtual environment 
appropriately to allow the guest to resume safely (or bump the 
hardware_signature in the FACS to trigger a clean reboot instead).

This adds support for it to KVM, and to the guest hibernate code.

Strictly, we should perhaps also allow the guest to detect PSCI v1.3, 
but when v1.1 was added in commit 512865d83fd9 it was done 
unconditionally, which seems wrong. Shouldn't we have a way for 
userspace to control what gets exposed, rather than silently changing 
the guest behaviour with newer host kernels? Should I add a 
KVM_CAP_ARM_PSCI_VERSION?

For the guest side, this adds a new SYS_OFF_MODE_POWER_OFF with higher 
priority than the EFI one, but which *only* triggers when there's a 
hibernation in progress. That seemed like the simplest option, but see 
the commit message for alternative possibilities. I told Rafael I'd post a 
straw man for bikeshedding, and here it is.
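
As a rough illustration of the guest-side idea described above, the
following userspace sketch models a priority-ordered power-off chain in
which the higher-priority handler only claims the request while a
hibernation is in progress. It is not the kernel's sys-off API: the
struct, the callback names, and the priority values are all hypothetical
and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a power-off handler chain; not kernel code. */
enum sysoff_result { SYSOFF_NOT_HANDLED, SYSOFF_HANDLED };

struct sysoff_handler {
	int priority;                          /* higher runs first */
	enum sysoff_result (*cb)(void *data);
	void *data;
};

/* Stands in for the SYSTEM_OFF2 handler: declines unless hibernating. */
static enum sysoff_result psci_off2_cb(void *data)
{
	const bool *hibernating = data;

	return *hibernating ? SYSOFF_HANDLED : SYSOFF_NOT_HANDLED;
}

/* Stands in for the EFI power-off handler: always takes the request. */
static enum sysoff_result efi_poweroff_cb(void *data)
{
	(void)data;
	return SYSOFF_HANDLED;
}

/* qsort comparator: descending priority. */
static int by_priority_desc(const void *a, const void *b)
{
	const struct sysoff_handler *ha = a, *hb = b;

	return (hb->priority > ha->priority) - (hb->priority < ha->priority);
}

/*
 * Sort the handlers in place, call them from highest to lowest priority,
 * and return the first one that handles the request (NULL if none does).
 */
static const struct sysoff_handler *
run_power_off(struct sysoff_handler *h, size_t n)
{
	qsort(h, n, sizeof(*h), by_priority_desc);
	for (size_t i = 0; i < n; i++) {
		assert(h[i].cb != NULL);
		if (h[i].cb(h[i].data) == SYSOFF_HANDLED)
			return &h[i];
	}
	return NULL;
}
```

With the hibernation flag clear, the higher-priority handler declines and
the request falls through to the EFI stand-in; with the flag set, the
SYSTEM_OFF2 stand-in wins because it runs first.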

 Documentation/virt/kvm/api.rst       | 11 +++++++++++
 arch/arm64/include/asm/kvm_host.h    |  2 ++
 arch/arm64/include/uapi/asm/kvm.h    |  6 ++++++
 arch/arm64/kvm/arm.c                 |  5 +++++
 arch/arm64/kvm/hyp/nvhe/psci-relay.c |  2 ++
 arch/arm64/kvm/psci.c                | 37 ++++++++++++++++++++++++++++++++++++
 drivers/firmware/psci/psci.c         | 35 ++++++++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h             |  1 +
 include/uapi/linux/psci.h            |  5 +++++
 kernel/power/hibernate.c             |  5 ++++-
 10 files changed, 108 insertions(+), 1 deletion(-)

Comments

Marc Zyngier March 12, 2024, 3:24 p.m. UTC | #1
On Tue, 12 Mar 2024 13:51:27 +0000,
David Woodhouse <dwmw2@infradead.org> wrote:
> 
> The upcoming PSCI v1.3 specification adds support for a SYSTEM_OFF2

Pointer to the spec? Crucially, this is in the Alpha state, meaning
that it is still subject to change [1].

> function which is analogous to ACPI S4 state. This will allow hosting 
> environments to determine that a guest is hibernated rather than just 
> powered off, and ensure that they preserve the virtual environment 
> appropriately to allow the guest to resume safely (or bump the 
> hardware_signature in the FACS to trigger a clean reboot instead).
> 
> This adds support for it to KVM, and to the guest hibernate code.
> 
> Strictly, we should perhaps also allow the guest to detect PSCI v1.3, 
> but when v1.1 was added in commit 512865d83fd9 it was done 
> unconditionally, which seems wrong. Shouldn't we have a way for 
> userspace to control what gets exposed, rather than silently changing 
> the guest behaviour with newer host kernels? Should I add a 
> KVM_CAP_ARM_PSCI_VERSION?

Do you mean something like 85bd0ba1ff98?

	M.

[1] https://documentation-service.arm.com/static/65e59325837c4d065f6556a6

David Woodhouse March 12, 2024, 5:01 p.m. UTC | #2
On Tue, 2024-03-12 at 15:24 +0000, Marc Zyngier wrote:
> 
> > Strictly, we should perhaps also allow the guest to detect PSCI v1.3, 
> > but when v1.1 was added in commit 512865d83fd9 it was done 
> > unconditionally, which seems wrong. Shouldn't we have a way for 
> > userspace to control what gets exposed, rather than silently changing 
> > the guest behaviour with newer host kernels? Should I add a 
> > KVM_CAP_ARM_PSCI_VERSION?
> 
> Do you mean something like 85bd0ba1ff98?

Ew :)

That isn't quite what I was thinking, no. I wasn't thinking of
something that would default to the latest, and would have a per-vCPU
way of setting what's essentially a KVM-wide configuration.

So if current userspace doesn't want the environment it exposes to
guests to be randomly changed by a kernel upgrade in the future, it
needs to explicitly use KVM_SET_ONE_REG on any one of the vCPUs, to set
KVM_REG_ARM_PSCI_VERSION to KVM_ARM_PSCI_1_1?

It isn't just new optional features; PSCI v1.2 added new error returns
from CPU_ON for example. Should guests start to see those, just because
the host kernel got upgraded? 

Now I see it, I suppose we can extend it to v1.2 (and v1.3 when that's
eventually published for real). Should we really continue to increment
the *default* though?
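
To make the two policies in this exchange concrete, here is a small
userspace sketch, not KVM code. The version encoding (major in the upper
16 bits, minor in the lower 16) follows PSCI_VERSION() in
include/uapi/linux/psci.h; the struct and both function names are
hypothetical. The first policy is the existing behaviour as I understand
it (default to the host's latest unless userspace pins a version); the
second is the alternative being asked about (a fixed baseline unless
userspace opts in).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same layout as PSCI_VERSION(): major << 16 | minor. */
static inline uint32_t psci_version(uint32_t major, uint32_t minor)
{
	assert(major <= 0xffff && minor <= 0xffff);
	return (major << 16) | minor;
}

/* Hypothetical per-VM setting; not a real KVM structure. */
struct vm_psci_config {
	bool     pinned;          /* did userspace choose a version? */
	uint32_t pinned_version;  /* only meaningful if pinned */
};

/* Never expose more than the host implements. */
static uint32_t clamp_to_host(uint32_t v, uint32_t host_latest)
{
	return v < host_latest ? v : host_latest;
}

/*
 * Policy A: unpinned guests track the host's latest version, so a host
 * kernel upgrade can change what the guest sees.
 */
static uint32_t exposed_default_latest(const struct vm_psci_config *cfg,
				       uint32_t host_latest)
{
	if (cfg->pinned)
		return clamp_to_host(cfg->pinned_version, host_latest);
	return host_latest;
}

/*
 * Policy B: unpinned guests stay on a fixed baseline; newer versions
 * appear only when userspace asks for them.
 */
static uint32_t exposed_opt_in(const struct vm_psci_config *cfg,
			       uint32_t baseline, uint32_t host_latest)
{
	if (cfg->pinned)
		return clamp_to_host(cfg->pinned_version, host_latest);
	return clamp_to_host(baseline, host_latest);
}
```

Under policy A, an unpinned guest moves from v1.1 to v1.3 the moment the
host kernel learns v1.3; under policy B it stays at the baseline until
userspace opts in.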