mbox series

[00/12] KVM: arm64: nv: Add support for address translation instructions

Message ID 20240625133508.259829-1-maz@kernel.org (mailing list archive)
Headers show
Series KVM: arm64: nv: Add support for address translation instructions | expand

Message

Marc Zyngier June 25, 2024, 1:34 p.m. UTC
Another task that a hypervisor supporting NV on arm64 has to deal with
is to emulate the AT instruction, because we multiplex all the S1
translations on a single set of registers, and the guest S2 is never
truly resident on the CPU.

So given that we lie about page tables, we also have to lie about
translation instructions, hence the emulation. Things are made
complicated by the fact that guest S1 page tables can be swapped out,
and that our shadow S2 is likely to be incomplete. So while using AT
to emulate AT is tempting (and useful), it is not going to always
work, and we thus need a fallback in the shape of a SW S1 walker.

This series is built in 4 basic blocks:

- Add missing definition and basic reworking

- Dumb emulation of all relevant AT instructions using AT instructions

- Add a SW S1 walker that is using our S2 walker

- Add FEAT_ATS1A support, which is almost trivial

This has been tested by comparing the output of a HW walker with the
output of the SW one. Obviously, this isn't bullet proof, and I'm
pretty sure there are some nasties in there.

In a departure from my usual habit, this series is on top of
kvmarm/next, as it depends on the NV S2 shadow code.

Joey Gouly (1):
  KVM: arm64: make kvm_at() take an OP_AT_*

Marc Zyngier (11):
  arm64: Add missing APTable and TCR_ELx.HPD masks
  arm64: Add PAR_EL1 field description
  KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
  KVM: arm64: nv: Honor absence of FEAT_PAN2
  KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}[P]
  KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
  KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
  KVM: arm64: nv: Make ps_to_output_size() generally available
  KVM: arm64: nv: Add SW walker for AT S1 emulation
  KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
  KVM: arm64: nv: Add support for FEAT_ATS1A

 arch/arm64/include/asm/kvm_arm.h       |    1 +
 arch/arm64/include/asm/kvm_asm.h       |    6 +-
 arch/arm64/include/asm/kvm_nested.h    |   18 +-
 arch/arm64/include/asm/pgtable-hwdef.h |    7 +
 arch/arm64/include/asm/sysreg.h        |   19 +
 arch/arm64/kvm/Makefile                |    2 +-
 arch/arm64/kvm/at.c                    | 1007 ++++++++++++++++++++++++
 arch/arm64/kvm/emulate-nested.c        |    2 +
 arch/arm64/kvm/hyp/include/hyp/fault.h |    2 +-
 arch/arm64/kvm/nested.c                |   26 +-
 arch/arm64/kvm/sys_regs.c              |   60 ++
 11 files changed, 1125 insertions(+), 25 deletions(-)
 create mode 100644 arch/arm64/kvm/at.c

Comments

Alexandru Elisei July 8, 2024, 4:28 p.m. UTC | #1
Hi Marc,

On Tue, Jun 25, 2024 at 02:34:59PM +0100, Marc Zyngier wrote:
> Another task that a hypervisor supporting NV on arm64 has to deal with
> is to emulate the AT instruction, because we multiplex all the S1
> translations on a single set of registers, and the guest S2 is never
> truly resident on the CPU.
> 
> So given that we lie about page tables, we also have to lie about
> translation instructions, hence the emulation. Things are made
> complicated by the fact that guest S1 page tables can be swapped out,
> and that our shadow S2 is likely to be incomplete. So while using AT
> to emulate AT is tempting (and useful), it is not going to always
> work, and we thus need a fallback in the shape of a SW S1 walker.
> 
> This series is built in 4 basic blocks:
> 
> - Add missing definition and basic reworking
> 
> - Dumb emulation of all relevant AT instructions using AT instructions
> 
> - Add a SW S1 walker that is using our S2 walker

I wanted to have a look at the S1 walker, and in my inbox I only have
patches #1 to #9 ("KVM: arm64: nv: Make ps_to_output_size() generally
available"). Checked on the kvm mailing list archive [1], same thing; a
google search for the string "KVM: arm64: nv: Add SW walker for AT S1
emulation" (quotes included) turns up the cover letter.

Am I looking in the wrong places?

[1] https://www.spinics.net/lists/kvm/msg351826.html

Thanks,
Alex

> 
> - Add FEAT_ATS1A support, which is almost trivial
> 
> This has been tested by comparing the output of a HW walker with the
> output of the SW one. Obviously, this isn't bullet proof, and I'm
> pretty sure there are some nasties in there.
> 
> In a departure from my usual habit, this series is on top of
> kvmarm/next, as it depends on the NV S2 shadow code.
> 
> Joey Gouly (1):
>   KVM: arm64: make kvm_at() take an OP_AT_*
> 
> Marc Zyngier (11):
>   arm64: Add missing APTable and TCR_ELx.HPD masks
>   arm64: Add PAR_EL1 field description
>   KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
>   KVM: arm64: nv: Honor absence of FEAT_PAN2
>   KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}[P]
>   KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
>   KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
>   KVM: arm64: nv: Make ps_to_output_size() generally available
>   KVM: arm64: nv: Add SW walker for AT S1 emulation
>   KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
>   KVM: arm64: nv: Add support for FEAT_ATS1A
> 
>  arch/arm64/include/asm/kvm_arm.h       |    1 +
>  arch/arm64/include/asm/kvm_asm.h       |    6 +-
>  arch/arm64/include/asm/kvm_nested.h    |   18 +-
>  arch/arm64/include/asm/pgtable-hwdef.h |    7 +
>  arch/arm64/include/asm/sysreg.h        |   19 +
>  arch/arm64/kvm/Makefile                |    2 +-
>  arch/arm64/kvm/at.c                    | 1007 ++++++++++++++++++++++++
>  arch/arm64/kvm/emulate-nested.c        |    2 +
>  arch/arm64/kvm/hyp/include/hyp/fault.h |    2 +-
>  arch/arm64/kvm/nested.c                |   26 +-
>  arch/arm64/kvm/sys_regs.c              |   60 ++
>  11 files changed, 1125 insertions(+), 25 deletions(-)
>  create mode 100644 arch/arm64/kvm/at.c
> 
> -- 
> 2.39.2
> 
>
Marc Zyngier July 8, 2024, 5 p.m. UTC | #2
Hi Alex,

On Mon, 08 Jul 2024 17:28:11 +0100,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Jun 25, 2024 at 02:34:59PM +0100, Marc Zyngier wrote:
> > Another task that a hypervisor supporting NV on arm64 has to deal with
> > is to emulate the AT instruction, because we multiplex all the S1
> > translations on a single set of registers, and the guest S2 is never
> > truly resident on the CPU.
> > 
> > So given that we lie about page tables, we also have to lie about
> > translation instructions, hence the emulation. Things are made
> > complicated by the fact that guest S1 page tables can be swapped out,
> > and that our shadow S2 is likely to be incomplete. So while using AT
> > to emulate AT is tempting (and useful), it is not going to always
> > work, and we thus need a fallback in the shape of a SW S1 walker.
> > 
> > This series is built in 4 basic blocks:
> > 
> > - Add missing definition and basic reworking
> > 
> > - Dumb emulation of all relevant AT instructions using AT instructions
> > 
> > - Add a SW S1 walker that is using our S2 walker
> 
> I wanted to have a look at the S1 walker, and in my inbox I only have
> patches #1 to #9 ("KVM: arm64: nv: Make ps_to_output_size() generally
> available"). Checked on the kvm mailing list archive [1], same thing; a
> google search for the string "KVM: arm64: nv: Add SW walker for AT S1
> emulation" (quotes included) turns up the cover letter.
> 
> Am I looking in the wrong places?
> 
> [1] https://www.spinics.net/lists/kvm/msg351826.html

This is very odd. I probably have sent them by specifying 000*patch
instead of 00*patch, hence the truncation to 9 patches.

Let me try and send the delta. With a bit of luck, it won't make a
mess in the archive[1].

Thanks for the heads up,

	M.

[1] https://lore.kernel.org/all/20240625133508.259829-1-maz@kernel.or
Alexandru Elisei July 31, 2024, 10:05 a.m. UTC | #3
Hi Marc,

On Tue, Jun 25, 2024 at 02:34:59PM +0100, Marc Zyngier wrote:
> Another task that a hypervisor supporting NV on arm64 has to deal with
> is to emulate the AT instruction, because we multiplex all the S1
> translations on a single set of registers, and the guest S2 is never
> truly resident on the CPU.

I'm unfamiliar with the state of NV support in KVM, but I thought I would have a
look at when AT trapping is enabled. As far as I can tell, it's only enabled in
vhe/switch.c::__activate_traps() -> compute_hcr() if is_hyp_ctct(vcpu). Found
this by grep'ing for HCR_AT.

Assuming the above is correct, I am curious about the following:

- The above paragraph mentions guest's stage 2 (and the code takes that into
  consideration), yet when is_hyp_ctxt() is true it is likely that the guest
  stage 2 is not enabled. Are you planning to enable the AT trap based on
  virtual HCR_EL2.VM being set in a later series?

- A guest might also set the HCR_EL2.AT bit in the virtual HCR_EL2 register. I
  suppose I have the same question, injecting the exception back into the guest
  is going to be handled in another series?

Thanks,
Alex

> 
> So given that we lie about page tables, we also have to lie about
> translation instructions, hence the emulation. Things are made
> complicated by the fact that guest S1 page tables can be swapped out,
> and that our shadow S2 is likely to be incomplete. So while using AT
> to emulate AT is tempting (and useful), it is not going to always
> work, and we thus need a fallback in the shape of a SW S1 walker.
> 
> This series is built in 4 basic blocks:
> 
> - Add missing definition and basic reworking
> 
> - Dumb emulation of all relevant AT instructions using AT instructions
> 
> - Add a SW S1 walker that is using our S2 walker
> 
> - Add FEAT_ATS1A support, which is almost trivial
> 
> This has been tested by comparing the output of a HW walker with the
> output of the SW one. Obviously, this isn't bullet proof, and I'm
> pretty sure there are some nasties in there.
> 
> In a departure from my usual habit, this series is on top of
> kvmarm/next, as it depends on the NV S2 shadow code.
> 
> Joey Gouly (1):
>   KVM: arm64: make kvm_at() take an OP_AT_*
> 
> Marc Zyngier (11):
>   arm64: Add missing APTable and TCR_ELx.HPD masks
>   arm64: Add PAR_EL1 field description
>   KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
>   KVM: arm64: nv: Honor absence of FEAT_PAN2
>   KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}[P]
>   KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
>   KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
>   KVM: arm64: nv: Make ps_to_output_size() generally available
>   KVM: arm64: nv: Add SW walker for AT S1 emulation
>   KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
>   KVM: arm64: nv: Add support for FEAT_ATS1A
> 
>  arch/arm64/include/asm/kvm_arm.h       |    1 +
>  arch/arm64/include/asm/kvm_asm.h       |    6 +-
>  arch/arm64/include/asm/kvm_nested.h    |   18 +-
>  arch/arm64/include/asm/pgtable-hwdef.h |    7 +
>  arch/arm64/include/asm/sysreg.h        |   19 +
>  arch/arm64/kvm/Makefile                |    2 +-
>  arch/arm64/kvm/at.c                    | 1007 ++++++++++++++++++++++++
>  arch/arm64/kvm/emulate-nested.c        |    2 +
>  arch/arm64/kvm/hyp/include/hyp/fault.h |    2 +-
>  arch/arm64/kvm/nested.c                |   26 +-
>  arch/arm64/kvm/sys_regs.c              |   60 ++
>  11 files changed, 1125 insertions(+), 25 deletions(-)
>  create mode 100644 arch/arm64/kvm/at.c
> 
> -- 
> 2.39.2
> 
>
Marc Zyngier July 31, 2024, 11:02 a.m. UTC | #4
On Wed, 31 Jul 2024 11:05:05 +0100,
Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> 
> Hi Marc,
> 
> On Tue, Jun 25, 2024 at 02:34:59PM +0100, Marc Zyngier wrote:
> > Another task that a hypervisor supporting NV on arm64 has to deal with
> > is to emulate the AT instruction, because we multiplex all the S1
> > translations on a single set of registers, and the guest S2 is never
> > truly resident on the CPU.
> 
> I'm unfamiliar with the state of NV support in KVM, but I thought I would have a
> look at when AT trapping is enabled. As far as I can tell, it's only enabled in
> vhe/switch.c::__activate_traps() -> compute_hcr() if is_hyp_ctct(vcpu). Found
> this by grep'ing for HCR_AT.
> 
> Assuming the above is correct, I am curious about the following:
> 
> - The above paragraph mentions guest's stage 2 (and the code takes that into
>   consideration), yet when is_hyp_ctxt() is true it is likely that the guest
>   stage 2 is not enabled. Are you planning to enable the AT trap based on
>   virtual HCR_EL2.VM being set in a later series?

I don't understand what you are referring to. AT traps and the guest's
HCR_EL2.VM are totally orthogonal, and are (or at least should be)
treated independently.

But more importantly, there are a bunch of cases where you have no
other choice but trap, and that what I allude to when I say "because
we multiplex all the S1 translations on a single set of register".

If I'm running the EL2 part of the guest, and that guest executes an
AT S1E1R while HCR_EL2.{E2H,TGE}={1,0}, it refers to the guest's EL1&0
translation regime. I can't let the guest execute it, because it would
walk its view of the EL2&0 regime. So we need to trap, evaluate what
the guest is trying to do, and do the walk in the correct context (by
using the instructions or the SW walk).

> 
> - A guest might also set the HCR_EL2.AT bit in the virtual HCR_EL2 register. I
>   suppose I have the same question, injecting the exception back into the guest
>   is going to be handled in another series?

This is already handled. The guest's HCR_EL2 is always folded into the
runtime configuration, and the resulting trap handled through the
existing trap routing infrastructure (see d0fc0a2519a6d, which added
the triaging of most traps resulting from HCR_EL2).

Thanks,

	M.
Alexandru Elisei July 31, 2024, 2:19 p.m. UTC | #5
Hi Marc,

On Wed, Jul 31, 2024 at 12:02:24PM +0100, Marc Zyngier wrote:
> On Wed, 31 Jul 2024 11:05:05 +0100,
> Alexandru Elisei <alexandru.elisei@arm.com> wrote:
> > 
> > Hi Marc,
> > 
> > On Tue, Jun 25, 2024 at 02:34:59PM +0100, Marc Zyngier wrote:
> > > Another task that a hypervisor supporting NV on arm64 has to deal with
> > > is to emulate the AT instruction, because we multiplex all the S1
> > > translations on a single set of registers, and the guest S2 is never
> > > truly resident on the CPU.
> > 
> > I'm unfamiliar with the state of NV support in KVM, but I thought I would have a
> > look at when AT trapping is enabled. As far as I can tell, it's only enabled in
> > vhe/switch.c::__activate_traps() -> compute_hcr() if is_hyp_ctct(vcpu). Found
> > this by grep'ing for HCR_AT.
> > 
> > Assuming the above is correct, I am curious about the following:
> > 
> > - The above paragraph mentions guest's stage 2 (and the code takes that into
> >   consideration), yet when is_hyp_ctxt() is true it is likely that the guest
> >   stage 2 is not enabled. Are you planning to enable the AT trap based on
> >   virtual HCR_EL2.VM being set in a later series?
> 
> I don't understand what you are referring to. AT traps and the guest's
> HCR_EL2.VM are totally orthogonal, and are (or at least should be)
> treated independently.

I was referring to what happens when a guest is running at EL1 with virtual
stage 2 enabled and that guest performs an AT instruction. If the stage 1
translation tables are not mapped at virtual stage 2, then KVM should inject a
data abort in the guest hypervisor.

But after thinking about it some more, I guess that's not something that needs
AT trapping: if the stage 1 tables are not mapped in the physical stage 2
(because the level 1 hypervisor unmapped them from the virtual stage 2), then
KVM will get a data abort, and then inject that back into the guest hypervisor.

And as far as I can tell, KVM tracks IPAs becoming unmapped from virtual stage 2
by trapping TLBIs.

So everything looks correct to me, sorry for the noise.

> 
> But more importantly, there are a bunch of cases where you have no
> other choice but trap, and that what I allude to when I say "because
> we multiplex all the S1 translations on a single set of register".
> 
> If I'm running the EL2 part of the guest, and that guest executes an
> AT S1E1R while HCR_EL2.{E2H,TGE}={1,0}, it refers to the guest's EL1&0
> translation regime. I can't let the guest execute it, because it would
> walk its view of the EL2&0 regime. So we need to trap, evaluate what
> the guest is trying to do, and do the walk in the correct context (by
> using the instructions or the SW walk).

Yes, that looks correct to me.

> 
> > 
> > - A guest might also set the HCR_EL2.AT bit in the virtual HCR_EL2 register. I
> >   suppose I have the same question, injecting the exception back into the guest
> >   is going to be handled in another series?
> 
> This is already handled. The guest's HCR_EL2 is always folded into the
> runtime configuration, and the resulting trap handled through the
> existing trap routing infrastructure (see d0fc0a2519a6d, which added
> the triaging of most traps resulting from HCR_EL2).

That explains it then, thanks for digging out the commit id!

Alex