
[RFC,v2,0/6] KVM: arm64: Userspace SMCCC call filtering

Message ID 20230211013759.3556016-1-oliver.upton@linux.dev (mailing list archive)

Message

Oliver Upton Feb. 11, 2023, 1:37 a.m. UTC
The Arm SMCCC is rather prescriptive in regards to the allocation of
SMCCC function ID ranges. Many of the hypercall ranges have an
associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
room for vendor-specific implementations.
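
As a reading aid, the function ID layout that the SMCCC prescribes can be sketched as below. The field positions follow the SMCCC specification; the struct and helper names are illustrative assumptions and are not taken from this series or from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative decoding of a 32-bit SMCCC function ID (names are hypothetical). */
struct smccc_func_id {
	bool fast_call;     /* bit 31: fast call (1) or yielding call (0) */
	bool smc64;         /* bit 30: SMC64/HVC64 (1) or SMC32/HVC32 (0) */
	uint8_t owner;      /* bits 29:24: owning entity (service) number */
	uint16_t func_num;  /* bits 15:0: function number within the service */
};

static struct smccc_func_id smccc_decode(uint32_t id)
{
	struct smccc_func_id f = {
		.fast_call = (id >> 31) & 1,
		.smc64     = (id >> 30) & 1,
		.owner     = (id >> 24) & 0x3f,
		.func_num  = id & 0xffff,
	};

	return f;
}
```

For example, 0x84000000 (PSCI_VERSION) decodes to a fast SMC32 call owned by service 4 (standard secure services), while an ID with owner 6 falls in the vendor-specific hypervisor range.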

The ever-expanding SMCCC surface leaves a lot of work within KVM for
providing new features. Furthermore, KVM implements its own
vendor-specific ABI, with little room for other implementations (like
Hyper-V, for example).

Not only that, it would appear that vCPU hotplug [1] has a legitimate
use case for something like this, sending PSCI calls to userspace (where
they should have gone in the first place).

=> We have these new hypercall bitmap registers, why not use that?

The hypercall bitmap registers aren't necessarily aimed at the same
problem. The bitmap registers allow a VMM to preserve the ABI the guest
gets from KVM by default when migrating between hosts. By default KVM
exposes the entire feature set to the guest, whereas user SMCCC calls
need explicit opt-in from userspace.

Applies to 6.2-rc3.

TODO:
 - Reject the ranges of hypercalls we don't want userspace to handle.
   Spectre crud mainly, any others?

   I plan on using the invariant of the maple tree to reject filters
   that intersect with a reserved range.

 - Should exits for SMC calls have the PC pre-incremented to align with
   HVC? Go read the comment in handle_smc() if you aren't following.

   I think the answer is 'yes', but opinions welcome as always :)

 - This series unifies the SMCCC space for HVCs and SMCs but this
   requires a lot more thought. Otherwise, we can add support for two
   separate namespaces.

 - Testing! I only got as far as compiling this on my machine. At
   minimum a decent selftest is required considering the UAPI here is
   rather involved.

RFC v1 -> v2:
 - Use a range-based interface instead of filtering entire services
 - Stop using the braindead term of 'trapping' in relation to userspace.

Oliver Upton (6):
  KVM: arm64: Add a helper to check if a VM has ran once
  KVM: arm64: Add vm fd device attribute accessors
  KVM: arm64: Refactor hvc filtering to support different actions
  KVM: arm64: Use a maple tree to represent the SMCCC filter
  KVM: arm64: Add support for KVM_EXIT_HYPERCALL
  KVM: arm64: Introduce support for userspace SMCCC filtering

 Documentation/virt/kvm/api.rst        |  24 +++-
 Documentation/virt/kvm/devices/vm.rst |  67 ++++++++++
 arch/arm64/include/asm/kvm_host.h     |   8 +-
 arch/arm64/include/uapi/asm/kvm.h     |  31 +++++
 arch/arm64/kvm/arm.c                  |  35 +++++
 arch/arm64/kvm/handle_exit.c          |  12 +-
 arch/arm64/kvm/hypercalls.c           | 176 +++++++++++++++++++++++++-
 arch/arm64/kvm/pmu-emul.c             |   4 +-
 include/kvm/arm_hypercalls.h          |   5 +
 include/uapi/linux/kvm.h              |   2 +-
 10 files changed, 350 insertions(+), 14 deletions(-)


base-commit: b7bfaa761d760e72a969d116517eaa12e404c262

Comments

James Morse Feb. 24, 2023, 3:12 p.m. UTC | #1
Hi Oliver,

On 11/02/2023 01:37, Oliver Upton wrote:
> The Arm SMCCC is rather prescriptive in regards to the allocation of
> SMCCC function ID ranges. Many of the hypercall ranges have an
> associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
> room for vendor-specific implementations.
> 
> The ever-expanding SMCCC surface leaves a lot of work within KVM for
> providing new features. Furthermore, KVM implements its own
> vendor-specific ABI, with little room for other implementations (like
> Hyper-V, for example).
> 
> Not only that, it would appear that vCPU hotplug [1] has a legitimate
> use case for something like this, sending PSCI calls to userspace (where
> they should have gone in the first place).
> 
> => We have these new hypercall bitmap registers, why not use that?
> 
> The hypercall bitmap registers aren't necessarily aimed at the same
> problem. The bitmap registers allow a VMM to preserve the ABI the guest
> gets from KVM by default when migrating between hosts. By default KVM
> exposes the entire feature set to the guest, whereas user SMCCC calls
> need explicit opt-in from userspace.
> 
> Applies to 6.2-rc3.


> TODO:
>  - Reject the ranges of hypercalls we don't want userspace to handle.
>    Spectre crud mainly, any others?

We can predict what future 'ARCH_WORKAROUND_foo' values will be, as they
have to be generated by a single instruction. I think it's worth preventing user-space from
using any of those.
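
A minimal sketch of that kind of rejection, assuming a fixed table of reserved ranges and an inclusive [start, end] filter range. The series plans to use a maple tree for this; the plain array, the struct and the function names here are hypothetical stand-ins. The three IDs are the existing ARCH_WORKAROUND_{1,2,3} values; any future ones would need to be added.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of SMCCC function IDs (hypothetical type). */
struct fn_range {
	uint32_t start;
	uint32_t end;
};

/* Ranges that userspace is never allowed to claim in a filter. */
static const struct fn_range reserved[] = {
	{ 0x80003fffu, 0x80003fffu }, /* ARCH_WORKAROUND_3 */
	{ 0x80007fffu, 0x80007fffu }, /* ARCH_WORKAROUND_2 */
	{ 0x80008000u, 0x80008000u }, /* ARCH_WORKAROUND_1 */
};

static bool ranges_overlap(const struct fn_range *a, const struct fn_range *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* Returns false for malformed ranges or ranges touching a reserved ID. */
static bool filter_range_allowed(uint32_t start, uint32_t end)
{
	struct fn_range r = { start, end };
	size_t i;

	if (start > end)
		return false;

	for (i = 0; i < sizeof(reserved) / sizeof(reserved[0]); i++) {
		if (ranges_overlap(&r, &reserved[i]))
			return false;
	}

	return true;
}
```

With this table, a filter covering the PSCI range 0x84000000-0x8400001f is accepted, while one covering the whole Arm architecture range 0x80000000-0x8000ffff is rejected because it contains the workaround IDs.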

I don't see how user-space could possibly implement stolen time correctly ... but I don't
think we should prevent it trying.

The 'features' calls are going to be a headache, especially when the features call in one
range gives results about calls in a different range. (e.g. you query PSCI_FEATURES to
find if SMCCC_VERSION is supported). I'm working on a reference implementation for kvmtool
to show we don't regress any of the existing SMC-CC support.
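
To illustrate the cross-range problem: a VMM that handles PSCI_FEATURES itself has to answer queries about function IDs outside the PSCI range, such as SMCCC_VERSION. The sketch below assumes a hypothetical VMM with a fixed list of supported calls. The function ID and return values are the ones from the PSCI and SMCCC specifications; the table contents and function name are made up for illustration, and returning 0 for every supported call ignores the feature flags some calls report.

```c
#include <stddef.h>
#include <stdint.h>

#define SMCCC_VERSION_FUNC_ID   0x80000000u /* Arm architecture range */
#define PSCI_FN_PSCI_VERSION    0x84000000u /* standard service range */
#define PSCI_FN_CPU_OFF         0x84000002u
#define PSCI_FN_CPU_ON          0x84000003u
#define PSCI_FN_PSCI_FEATURES   0x8400000au

#define PSCI_RET_SUCCESS        0
#define PSCI_RET_NOT_SUPPORTED  (-1)

/* Hypothetical set of calls this VMM chooses to expose to its guest. */
static const uint32_t vmm_supported[] = {
	SMCCC_VERSION_FUNC_ID,
	PSCI_FN_PSCI_VERSION,
	PSCI_FN_CPU_OFF,
	PSCI_FN_CPU_ON,
	PSCI_FN_PSCI_FEATURES,
};

/* PSCI_FEATURES handler: the queried ID may live in a different range. */
static int32_t vmm_psci_features(uint32_t queried_id)
{
	size_t i;

	for (i = 0; i < sizeof(vmm_supported) / sizeof(vmm_supported[0]); i++) {
		if (vmm_supported[i] == queried_id)
			return PSCI_RET_SUCCESS;
	}

	return PSCI_RET_NOT_SUPPORTED;
}
```

The answer is only consistent if whoever handles the 0x80000000 range (KVM or the VMM) actually implements SMCCC_VERSION, which is the coordination headache described above.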


>    I plan on using the invariant of the maple tree to reject filters
>    that intersect with a reserved range.
> 
>  - Should exits for SMC calls have the PC pre-incremented to align with
>    HVC? Go read the comment in handle_smc() if you aren't following.
> 
>    I think the answer is 'yes', but opinions welcome as always :)

I don't think there is a compelling argument either way. But please document whether
user-space must increment the PC, or must not!


>  - This series unifies the SMCCC space for HVCs and SMCs but this
>    requires a lot more thought. Otherwise, we can add support for two
>    separate namespaces.

I checked with ATG, they think the function IDs are one space, and have no intention of
having different APIs for the same function-id behind HVC/SMC.

They pointed to 'SMCCC issue E, appendix D' which says hypervisors are expected to trap
SMC, both conduits go to the same 'managing EL'.


>  - Testing! I only got as far as compiling this on my machine. At
>    minimum a decent selftest is required considering the UAPI here is
>    rather involved.

I've got PSCI support in kvmtool (including cpu-suspend), and I intend to try and test as
much of SMC-CC as I can.

I'll rebase the virtual-cpu hotplug stuff onto this, Salil should be able to give some
feedback from the Qemu side.


Thanks,

James

Oliver Upton Feb. 24, 2023, 9:32 p.m. UTC | #2
Hi James,

On Fri, Feb 24, 2023 at 03:12:08PM +0000, James Morse wrote:
> Hi Oliver,
> 
> On 11/02/2023 01:37, Oliver Upton wrote:
> > The Arm SMCCC is rather prescriptive in regards to the allocation of
> > SMCCC function ID ranges. Many of the hypercall ranges have an
> > associated specification from Arm (FF-A, PSCI, SDEI, etc.) with some
> > room for vendor-specific implementations.
> > 
> > The ever-expanding SMCCC surface leaves a lot of work within KVM for
> > providing new features. Furthermore, KVM implements its own
> > vendor-specific ABI, with little room for other implementations (like
> > Hyper-V, for example).
> > 
> > Not only that, it would appear that vCPU hotplug [1] has a legitimate
> > use case for something like this, sending PSCI calls to userspace (where
> > they should have gone in the first place).
> > 
> > => We have these new hypercall bitmap registers, why not use that?
> > 
> > The hypercall bitmap registers aren't necessarily aimed at the same
> > problem. The bitmap registers allow a VMM to preserve the ABI the guest
> > gets from KVM by default when migrating between hosts. By default KVM
> > exposes the entire feature set to the guest, whereas user SMCCC calls
> > need explicit opt-in from userspace.
> > 
> > Applies to 6.2-rc3.
> 
> 
> > TODO:
> >  - Reject the ranges of hypercalls we don't want userspace to handle.
> >    Spectre crud mainly, any others?
> 
> We can predict what future 'ARCH_WORKAROUND_foo' values will be, as they
> have to be generated by a single instruction. I think it's worth preventing user-space from
> using any of those.

Agreed.

> I don't see how user-space could possibly implement stolen time correctly ... but I don't
> think we should prevent it trying.

Yeah, unless there's a real hazard to getting userspace involved (like
above) then I see no issue in letting the VMM have at it.

> The 'features' calls are going to be a headache, especially when the features call in one
> range gives results about calls in a different range. (e.g. you query PSCI_FEATURES to
> find if SMCCC_VERSION is supported). I'm working on a reference implementation for kvmtool
> to show we don't regress any of the existing SMC-CC support.

Goodness, thanks for taking a stab at the userspace angle of this.

I've been mulling on this issue for a while. I originally wanted the
UAPI to consume the whole set of subranges described in the filter at
once to impose some degree of validation, but I worry that'll become
unnecessarily complex (so I will definitely get it wrong).

Short of a better solution for the problem I'm fine saying it is
entirely a userspace problem to present a sensible feature set. No
matter what we do the problem of incorrectly advertising to the guest
will exist with user hypercalls, possibly in a range KVM doesn't
know/care about.

> 
> >    I plan on using the invariant of the maple tree to reject filters
> >    that intersect with a reserved range.
> > 
> >  - Should exits for SMC calls have the PC pre-incremented to align with
> >    HVC? Go read the comment in handle_smc() if you aren't following.
> > 
> >    I think the answer is 'yes', but opinions welcome as always :)
> 
> I don't think there is a compelling argument either way. But please document whether
> user-space must increment the PC, or must not!

I'm going forward with pre-increment, both for consistency with HVCs and
avoiding additional ioctls to manually increment PC from userspace.
Documentation to boot!
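
The practical difference for a VMM can be sketched as follows. This is a model only: struct guest_regs and both helpers are hypothetical and do not correspond to any KVM structure or ioctl. The one hard fact used is that every A64 instruction, HVC and SMC included, is 4 bytes long.

```c
#include <stdint.h>

#define A64_INSN_SIZE 4u /* fixed A64 instruction width in bytes */

/* Hypothetical VMM-side copy of the guest registers; not a KVM type. */
struct guest_regs {
	uint64_t x[31];
	uint64_t pc;
};

/* Kernel already advanced PC past the HVC/SMC: only set the result. */
static void complete_call_preincremented(struct guest_regs *regs, uint64_t ret)
{
	regs->x[0] = ret;
}

/* PC still points at the HVC/SMC: the VMM must also step over it. */
static void complete_call_not_incremented(struct guest_regs *regs, uint64_t ret)
{
	regs->x[0] = ret;
	regs->pc += A64_INSN_SIZE;
}
```

In the second model the PC update would have to be written back to the kernel before resuming the vCPU, which is the extra ioctl traffic that pre-increment avoids.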

> >  - This series unifies the SMCCC space for HVCs and SMCs but this
> >    requires a lot more thought. Otherwise, we can add support for two
> >    separate namespaces.
> 
> I checked with ATG, they think the function IDs are one space, and have no intention of
> having different APIs for the same function-id behind HVC/SMC.
> 
> They pointed to 'SMCCC issue E, appendix D' which says hypervisors are expected to trap
> SMC, both conduits go to the same 'managing EL'.

Excellent, thank you for getting clarification on that. What I'm also
reading here is that KVM is indeed wrong in its unwillingness to handle
SMCCC calls from EL1 :)

> 
> >  - Testing! I only got as far as compiling this on my machine. At
> >    minimum a decent selftest is required considering the UAPI here is
> >    rather involved.
> 
> I've got PSCI support in kvmtool (including cpu-suspend), and I intend to try and test as
> much of SMC-CC as I can.

Thanks, that is tremendously helpful. Bonus points if userspace PSCI
handling is enabled unconditionally and not just for hotplug :)