mbox series

[00/11] KVM: arm64: Move PC/ELR/SPSR/PSTATE updatess to EL2

Message ID 20201026133450.73304-1-maz@kernel.org (mailing list archive)
Headers show
Series KVM: arm64: Move PC/ELR/SPSR/PSTATE updatess to EL2 | expand

Message

Marc Zyngier Oct. 26, 2020, 1:34 p.m. UTC
As we progress towards being able to keep the guest state private to
the nVHE hypervisor, this series aims at moving anything that touches
the registers involved into an exception to EL2.

The general idea is that any update to these registers is driven by a
set of flags passed from EL1 to EL2, and EL2 will deal with the
register update itself, removing the need for EL1 to see the guest
state. It also results in a bunch of cleanup, mostly in the 32bit
department (negative diffstat, yay!).

Of course, none of that has any real effect on security yet. It is
only once we start having a private VCPU structure at EL2 that we can
enforce the isolation. Similarly, there is no policy enforcement, and
a malicious EL1 can still inject exceptions at random points. It can
also give bogus ESR values to the guest. Baby steps.

        M.

Marc Zyngier (11):
  KVM: arm64: Don't adjust PC on SError during SMC trap
  KVM: arm64: Move kvm_vcpu_trap_il_is32bit into kvm_skip_instr32()
  KVM: arm64: Make kvm_skip_instr() and co private to HYP
  KVM: arm64: Move PC rollback on SError to HYP
  KVM: arm64: Move VHE direct sysreg accessors into kvm_host.h
  KVM: arm64: Add basic hooks for injecting exceptions from EL2
  KVM: arm64: Inject AArch64 exceptions from HYP
  KVM: arm64: Inject AArch32 exceptions from HYP
  KVM: arm64: Remove SPSR manipulation primitives
  KVM: arm64: Consolidate exception injection
  KVM: arm64: Get rid of the AArch32 register mapping code

 arch/arm64/include/asm/kvm_emulate.h       |  70 +---
 arch/arm64/include/asm/kvm_host.h          | 115 ++++++-
 arch/arm64/kvm/Makefile                    |   4 +-
 arch/arm64/kvm/aarch32.c                   | 232 -------------
 arch/arm64/kvm/guest.c                     |  28 +-
 arch/arm64/kvm/handle_exit.c               |  23 +-
 arch/arm64/kvm/hyp/aarch32.c               |   4 +-
 arch/arm64/kvm/hyp/exception.c             | 368 +++++++++++++++++++++
 arch/arm64/kvm/hyp/include/hyp/adjust_pc.h |  62 ++++
 arch/arm64/kvm/hyp/include/hyp/switch.h    |  17 +
 arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
 arch/arm64/kvm/hyp/nvhe/switch.c           |   3 +
 arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c   |   2 +
 arch/arm64/kvm/hyp/vgic-v3-sr.c            |   2 +
 arch/arm64/kvm/hyp/vhe/Makefile            |   2 +-
 arch/arm64/kvm/hyp/vhe/switch.c            |   3 +
 arch/arm64/kvm/inject_fault.c              | 187 +++++------
 arch/arm64/kvm/mmio.c                      |   2 +-
 arch/arm64/kvm/mmu.c                       |   2 +-
 arch/arm64/kvm/regmap.c                    | 224 -------------
 arch/arm64/kvm/sys_regs.c                  |  83 +----
 21 files changed, 698 insertions(+), 737 deletions(-)
 delete mode 100644 arch/arm64/kvm/aarch32.c
 create mode 100644 arch/arm64/kvm/hyp/exception.c
 create mode 100644 arch/arm64/kvm/hyp/include/hyp/adjust_pc.h
 delete mode 100644 arch/arm64/kvm/regmap.c

Comments

Mark Rutland Oct. 26, 2020, 2:04 p.m. UTC | #1
On Mon, Oct 26, 2020 at 01:34:42PM +0000, Marc Zyngier wrote:
> In an effort to remove the vcpu PC manipulations from EL1 on nVHE
> systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
> to increment PC post emulation is now signalled via a flag in the
> vcpu structure.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>

[...]

> +/*
> + * Adjust the guest PC on entry, depending on flags provided by EL1
> + * for the purpose of emulation (MMIO, sysreg).
> + */
> +static inline void __adjust_pc(struct kvm_vcpu *vcpu)
> +{
> +	if (vcpu->arch.flags & KVM_ARM64_INCREMENT_PC) {
> +		kvm_skip_instr(vcpu);
> +		vcpu->arch.flags &= ~KVM_ARM64_INCREMENT_PC;
> +	}
> +}

What's your plan for restricting *when* EL1 can ask for the PC to be
adjusted?

I'm assuming that either:

1. You have EL2 sanity-check all responses from EL1 are permitted for
   the current state. e.g. if EL1 asks to increment the PC, EL2 must
   check that that was a sane response for the current state.

2. You raise the level of abstraction at the EL2/EL1 boundary, such that
   EL2 simply knows. e.g. if emulating a memory access, EL1 can either
   provide the response or signal an abort, but doesn't choose to
   manipulate the PC as EL2 will infer the right thing to do.

I know that either are tricky in practice, so I'm curious what your view
is. Generally option #2 is easier to fortify, but I guess we might have
to do #1 since we also have to support unprotected VMs?

Thanks,
Mark.
Marc Zyngier Oct. 27, 2020, 4:17 p.m. UTC | #2
On 2020-10-26 14:04, Mark Rutland wrote:
> On Mon, Oct 26, 2020 at 01:34:42PM +0000, Marc Zyngier wrote:
>> In an effort to remove the vcpu PC manipulations from EL1 on nVHE
>> systems, move kvm_skip_instr() to be HYP-specific. EL1's intent
>> to increment PC post emulation is now signalled via a flag in the
>> vcpu structure.
>> 
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> 
> [...]
> 
>> +/*
>> + * Adjust the guest PC on entry, depending on flags provided by EL1
>> + * for the purpose of emulation (MMIO, sysreg).
>> + */
>> +static inline void __adjust_pc(struct kvm_vcpu *vcpu)
>> +{
>> +	if (vcpu->arch.flags & KVM_ARM64_INCREMENT_PC) {
>> +		kvm_skip_instr(vcpu);
>> +		vcpu->arch.flags &= ~KVM_ARM64_INCREMENT_PC;
>> +	}
>> +}
> 
> What's your plan for restricting *when* EL1 can ask for the PC to be
> adjusted?
> 
> I'm assuming that either:
> 
> 1. You have EL2 sanity-check all responses from EL1 are permitted for
>    the current state. e.g. if EL1 asks to increment the PC, EL2 must
>    check that that was a sane response for the current state.
> 
> 2. You raise the level of abstraction at the EL2/EL1 boundary, such 
> that
>    EL2 simply knows. e.g. if emulating a memory access, EL1 can either
>    provide the response or signal an abort, but doesn't choose to
>    manipulate the PC as EL2 will infer the right thing to do.
> 
> I know that either are tricky in practice, so I'm curious what your 
> view
> is. Generally option #2 is easier to fortify, but I guess we might have
> to do #1 since we also have to support unprotected VMs?

To be honest, I'm still in two minds about it, which is why I have
gone with this "middle of the road" option (moving the PC update
to EL2, but leave the control at EL1).

I guess the answer is "it depends". MMIO is easy to put in the #2 model,
while things like WFI/WFE really need #1. sysregs are yet another can of
worm.

         M.