Message ID | 20250110172411.39845-3-miko.lenczewski@arm.com (mailing list archive) |
---|---|
State | New |
Series | [v1] arm64: Add TLB Conflict Abort Exception handler to KVM |
On Fri, Jan 10, 2025 at 05:24:07PM +0000, Mikołaj Lenczewski wrote:
> Currently, KVM does not handle the case of a stage 2 TLB conflict abort
> exception. This can legitimately occurs when the guest is eliding full
> BBM semantics as permitted by BBM level 2. In this case it is possible
> for a confclit abort to be delivered to EL2. We handle that by
> invalidating the full TLB.

typo: conflict

Also, a bit of a nitpick, but mentioning that TLB conflict abort routing
is implementation defined when S2 is enabled is valuable information.

> @@ -1756,6 +1756,19 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>  	ipa = fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
>  	is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
>  
> +	if (esr_fsc_is_tlb_conflict_abort(esr)) {
> +
> +		/* Architecturely, at this stage 2 tlb conflict abort, we must
> +		 * either perform a `tlbi vmalls12e1`, or a `tlbi alle1`. Due
> +		 * to nesting of VMs, we would have to iterate all flattened
> +		 * VMIDs to clean out a single guest, so we perform a `tlbi alle1`
> +		 * instead to save time.
> +		 */

I'm not sure I follow this. At this point we've taken a TLB conflict
abort out of a specific hardware MMU context, and it's unclear to me why
a conflict abort in one stage-2 MMU has any bearing on the other stage-2
MMUs that could be associated with this guest.

Even in NV, KVM is always responsible for the maintenance of hardware
stage-2 MMUs. So stage-2 TLBI elision in the guest hypervisor should not
lead to a stage-2 TLB conflict abort.

TLBI ALLE1 is a larger hammer than what's actually necessary here. Could
you perhaps introduce a new invalidation routine,
__kvm_tlb_flush_vmid_nsh(), that does a TLBI VMALLS12E1 behind the
scenes?

If an NV guest is playing games at stage-1 across VMIDs then it gets to
suffer the consequences (additional TLB conflict aborts).
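For illustration only, the routine suggested above might look roughly like the sketch below. The name __kvm_tlb_flush_vmid_nsh() comes from the review; the body is an assumption, modelled on the enter/exit VMID context pattern visible around __kvm_flush_cpu_context() in the patch. So that the snippet builds outside the kernel, enter_vmid_context(), exit_vmid_context(), __tlbi(), dsb() and isb() are user-space stand-ins that only record the order of operations, and the two structs are simplified placeholders. The real enter_vmid_context() signatures may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-space stand-ins for kernel primitives (assumptions, not kernel code). */
#define MAX_OPS 8
static const char *op_log[MAX_OPS];	/* sequence of recorded operations */
static size_t op_count;

static void record(const char *op)
{
	assert(op_count < MAX_OPS);
	op_log[op_count++] = op;
}

struct kvm_s2_mmu { unsigned int vmid; };		/* simplified placeholder */
struct tlb_inv_context { unsigned int saved_vmid; };	/* simplified placeholder */

static unsigned int current_vmid;	/* models the VMID currently loaded on this CPU */

static void enter_vmid_context(struct kvm_s2_mmu *mmu,
			       struct tlb_inv_context *cxt)
{
	cxt->saved_vmid = current_vmid;	/* remember what was loaded */
	current_vmid = mmu->vmid;	/* switch to the faulting stage-2 MMU */
	record("enter_vmid_context");
}

static void exit_vmid_context(struct tlb_inv_context *cxt)
{
	current_vmid = cxt->saved_vmid;	/* restore the previous VMID */
	record("exit_vmid_context");
}

#define __tlbi(op)	record("tlbi " #op)
#define dsb(opt)	record("dsb " #opt)
#define isb()		record("isb")

/*
 * Hypothetical routine named in the review. It invalidates stage-1 and
 * stage-2 entries for the given VMID on the local CPU only, instead of
 * every EL1 entry as `tlbi alle1` would.
 */
void __kvm_tlb_flush_vmid_nsh(struct kvm_s2_mmu *mmu)
{
	struct tlb_inv_context cxt;

	enter_vmid_context(mmu, &cxt);
	__tlbi(vmalls12e1);	/* non-broadcast form: this CPU, current VMID */
	dsb(nsh);		/* wait for completion on this CPU only */
	isb();
	exit_vmid_context(&cxt);
}
```

With something along these lines, kvm_handle_guest_abort() could invalidate only the hardware MMU context that took the conflict abort, leaving TLB entries for other VMIDs on the CPU intact.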
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index d1b1a33f9a8b..8a66f81ca291 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -121,6 +121,7 @@
 #define ESR_ELx_FSC_SEA_TTW(n)	(0x14 + (n))
 #define ESR_ELx_FSC_SECC	(0x18)
 #define ESR_ELx_FSC_SECC_TTW(n)	(0x1c + (n))
+#define ESR_ELx_FSC_TLBABT	(0x30)
 
 /* Status codes for individual page table levels */
 #define ESR_ELx_FSC_ACCESS_L(n)	(ESR_ELx_FSC_ACCESS + (n))
@@ -464,6 +465,13 @@ static inline bool esr_fsc_is_access_flag_fault(unsigned long esr)
 	       (esr == ESR_ELx_FSC_ACCESS_L(0));
 }
 
+static inline bool esr_fsc_is_tlb_conflict_abort(unsigned long esr)
+{
+	esr = esr & ESR_ELx_FSC;
+
+	return esr == ESR_ELx_FSC_TLBABT;
+}
+
 /* Indicate whether ESR.EC==0x1A is for an ERETAx instruction */
 static inline bool esr_iss_is_eretax(unsigned long esr)
 {
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ca2590344313..095872af764a 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -222,7 +222,7 @@ DECLARE_KVM_NVHE_SYM(__per_cpu_end);
 DECLARE_KVM_HYP_SYM(__bp_harden_hyp_vecs);
 #define __bp_harden_hyp_vecs	CHOOSE_HYP_SYM(__bp_harden_hyp_vecs)
 
-extern void __kvm_flush_vm_context(void);
+extern void __kvm_flush_vm_context(bool cpu_local);
 extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
 				     int level);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 6aa0b13d86e5..f44a7550f4a7 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -195,7 +195,7 @@ static void handle___kvm_adjust_pc(struct kvm_cpu_context *host_ctxt)
 
 static void handle___kvm_flush_vm_context(struct kvm_cpu_context *host_ctxt)
 {
-	__kvm_flush_vm_context();
+	__kvm_flush_vm_context(false);
 }
 
 static void handle___kvm_tlb_flush_vmid_ipa(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index 48da9ca9763f..97f749ad63cc 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -261,10 +261,15 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
 	exit_vmid_context(&cxt);
 }
 
-void __kvm_flush_vm_context(void)
+void __kvm_flush_vm_context(bool cpu_local)
 {
 	/* Same remark as in enter_vmid_context() */
 	dsb(ish);
-	__tlbi(alle1is);
+
+	if (cpu_local)
+		__tlbi(alle1);
+	else
+		__tlbi(alle1is);
+
 	dsb(ish);
 }
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index 3d50a1bd2bdb..564602fa4d62 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -213,10 +213,15 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
 	exit_vmid_context(&cxt);
 }
 
-void __kvm_flush_vm_context(void)
+void __kvm_flush_vm_context(bool cpu_local)
 {
 	dsb(ishst);
-	__tlbi(alle1is);
+
+	if (cpu_local)
+		__tlbi(alle1);
+	else
+		__tlbi(alle1is);
+
 	dsb(ish);
 }
 
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c9d46ad57e52..7c0d97449d23 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1756,6 +1756,19 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	ipa = fault_ipa = kvm_vcpu_get_fault_ipa(vcpu);
 	is_iabt = kvm_vcpu_trap_is_iabt(vcpu);
 
+	if (esr_fsc_is_tlb_conflict_abort(esr)) {
+
+		/* Architecturely, at this stage 2 tlb conflict abort, we must
+		 * either perform a `tlbi vmalls12e1`, or a `tlbi alle1`. Due
+		 * to nesting of VMs, we would have to iterate all flattened
+		 * VMIDs to clean out a single guest, so we perform a `tlbi alle1`
+		 * instead to save time.
+		 */
+		__kvm_flush_vm_context(true);
+
+		return 1;
+	}
+
 	if (esr_fsc_is_translation_fault(esr)) {
 		/* Beyond sanitised PARange (which is the IPA limit) */
 		if (fault_ipa >= BIT_ULL(get_kvm_ipa_limit())) {
diff --git a/arch/arm64/kvm/vmid.c b/arch/arm64/kvm/vmid.c
index 806223b7022a..d558428fcfed 100644
--- a/arch/arm64/kvm/vmid.c
+++ b/arch/arm64/kvm/vmid.c
@@ -66,7 +66,7 @@ static void flush_context(void)
 	 * the next context-switch, we broadcast TLB flush + I-cache
 	 * invalidation over the inner shareable domain on rollover.
 	 */
-	kvm_call_hyp(__kvm_flush_vm_context);
+	kvm_call_hyp(__kvm_flush_vm_context, false);
 }
 
 static bool check_update_reserved_vmid(u64 vmid, u64 newvmid)
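As a quick way to sanity-check the fault status decode added to esr.h, the helper can be exercised in user space as sketched below. ESR_ELx_FSC_TLBABT (0x30) is taken from the patch; the ESR_ELx_FSC mask value (0x3F, the low six bits of the ESR) and the data-abort exception class 0x24 in bits [31:26] used to build sample syndromes are restated here as assumptions, since neither appears in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed mask: the fault status code occupies ESR bits [5:0]. */
#define ESR_ELx_FSC		(0x3FUL)
/* TLB conflict abort status code, as added by the patch. */
#define ESR_ELx_FSC_TLBABT	(0x30UL)

/* Same logic as the helper in the patch, folded into one expression. */
static inline bool esr_fsc_is_tlb_conflict_abort(unsigned long esr)
{
	return (esr & ESR_ELx_FSC) == ESR_ELx_FSC_TLBABT;
}

/*
 * Illustrative helper (not in the patch): build a syndrome value with
 * the given exception class in bits [31:26] and fault status code in
 * bits [5:0], leaving all other fields zero.
 */
static inline unsigned long make_esr(unsigned long ec, unsigned long fsc)
{
	return (ec << 26) | (fsc & ESR_ELx_FSC);
}
```

For example, make_esr(0x24, 0x30) is classified as a TLB conflict abort, while make_esr(0x24, 0x04), a level 0 translation fault, is not.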
Currently, KVM does not handle the case of a stage 2 TLB conflict abort
exception. This can legitimately occur when the guest is eliding full
BBM semantics as permitted by BBM level 2. In this case it is possible
for a conflict abort to be delivered to EL2. We handle that by
invalidating the full TLB.

The Arm ARM specifies that the worst-case invalidation is either a
`tlbi vmalls12e1` or a `tlbi alle1` (as per DDI0487K section D8.16.3).
We implement `tlbi alle1` by extending the existing
__kvm_flush_vm_context() helper to allow for differentiating between
inner-shareable and cpu-local invalidations.

This commit applies on top of v6.13-rc2 (fac04efc5c79).

Signed-off-by: Mikołaj Lenczewski <miko.lenczewski@arm.com>
---
 arch/arm64/include/asm/esr.h       |  8 ++++++++
 arch/arm64/include/asm/kvm_asm.h   |  2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c |  2 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c      |  9 +++++++--
 arch/arm64/kvm/hyp/vhe/tlb.c       |  9 +++++++--
 arch/arm64/kvm/mmu.c               | 13 +++++++++++++
 arch/arm64/kvm/vmid.c              |  2 +-
 7 files changed, 38 insertions(+), 7 deletions(-)