Message ID | 20200327143941.195626-1-ascull@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] arm64: unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE} | expand |
On 27/03/2020 14:39, Andrew Scull wrote:
> Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> allocated as a result of a speculative AT instruction. In order to
> avoid mandating VHE on certain affected CPUs, apply the workaround to
> both the nVHE and the VHE case for all affected CPUs.
>
> Signed-off-by: Andrew Scull <ascull@google.com>
> CC: Marc Zyngier <maz@kernel.org>
> CC: James Morse <james.morse@arm.com>
> CC: Suzuki K Poulose <suzuki.poulose@arm.com>
> CC: Will Deacon <will@kernel.org>
> CC: Steven Price <steven.price@arm.com>
>
> ---
>
> I'm not able to test the workarounds properly for the affected CPUs but
> have built and booted under qemu configs with and without VHE as well as
> the workaround being enabled and disabled.
>
> As there exist work arounds for nVHE and VHE, it doesn't appear to be a
> technical limitation that meant VHE was being mandated. Please correct
> me if this understanding is inaccurate. Thanks!

I proposed something similar a while ago[1], but Marc was concerned about
the microarch detail[2] and hence I split the workaround into VHE/non-VHE.

That said, I'm not saying this is necessarily wrong, just that we'd need
some more information on whether the non-VHE workaround is suitable for
the CPUs we're currently forcing VHE on.
Steve

[1] https://lore.kernel.org/kvmarm/20191113114118.2427-2-steven.price@arm.com/
[2] https://lore.kernel.org/kvmarm/566ecd45c8bf07b3cb5d75a10c9413a8@www.loen.fr/

> ---
>  arch/arm64/Kconfig                | 39 ++++++++++++++-----------------
>  arch/arm64/include/asm/cpucaps.h  |  9 ++++---
>  arch/arm64/include/asm/kvm_host.h |  4 ----
>  arch/arm64/include/asm/kvm_hyp.h  |  2 +-
>  arch/arm64/kernel/cpu_errata.c    | 25 +++++++++-----------
>  arch/arm64/kvm/hyp/switch.c       |  6 ++---
>  arch/arm64/kvm/hyp/sysreg-sr.c    |  4 ++--
>  arch/arm64/kvm/hyp/tlb.c          |  8 +++----
>  8 files changed, 43 insertions(+), 54 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0b30e884e088..7492b929cb12 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -521,13 +521,13 @@ config ARM64_ERRATUM_1418040
>
>  	  If unsure, say Y.
>
> -config ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +config ARM64_WORKAROUND_SPECULATIVE_AT
>  	bool
>
>  config ARM64_ERRATUM_1165522
> -	bool "Cortex-A76: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	bool "Cortex-A76: 1165522: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
>  	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
>  	help
>  	  This option adds a workaround for ARM Cortex-A76 erratum 1165522.
>
> @@ -537,10 +537,23 @@ config ARM64_ERRATUM_1165522
>
>  	  If unsure, say Y.
>
> +config ARM64_ERRATUM_1319367
> +	bool "Cortex-A57/A72: 1319537: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	default y
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
> +	help
> +	  This option adds work arounds for ARM Cortex-A57 erratum 1319537
> +	  and A72 erratum 1319367
> +
> +	  Cortex-A57 and A72 cores could end-up with corrupted TLBs by
> +	  speculating an AT instruction during a guest context switch.
> +
> +	  If unsure, say Y.
> +
>  config ARM64_ERRATUM_1530923
> -	bool "Cortex-A55: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	bool "Cortex-A55: 1530923: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
>  	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
>  	help
>  	  This option adds a workaround for ARM Cortex-A55 erratum 1530923.
>
> @@ -566,22 +579,6 @@ config ARM64_ERRATUM_1286807
>  	  invalidated has been observed by other observers. The
>  	  workaround repeats the TLBI+DSB operation.
>
> -config ARM64_WORKAROUND_SPECULATIVE_AT_NVHE
> -	bool
> -
> -config ARM64_ERRATUM_1319367
> -	bool "Cortex-A57/A72: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> -	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_NVHE
> -	help
> -	  This option adds work arounds for ARM Cortex-A57 erratum 1319537
> -	  and A72 erratum 1319367
> -
> -	  Cortex-A57 and A72 cores could end-up with corrupted TLBs by
> -	  speculating an AT instruction during a guest context switch.
> -
> -	  If unsure, say Y.
> -
>  config ARM64_ERRATUM_1463225
>  	bool "Cortex-A76: Software Step might prevent interrupt recognition"
>  	default y
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 865e0253fc1e..e396e357f5b2 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -44,7 +44,7 @@
>  #define ARM64_SSBS				34
>  #define ARM64_WORKAROUND_1418040		35
>  #define ARM64_HAS_SB				36
> -#define ARM64_WORKAROUND_SPECULATIVE_AT_VHE	37
> +#define ARM64_WORKAROUND_SPECULATIVE_AT		37
>  #define ARM64_HAS_ADDRESS_AUTH_ARCH		38
>  #define ARM64_HAS_ADDRESS_AUTH_IMP_DEF		39
>  #define ARM64_HAS_GENERIC_AUTH_ARCH		40
> @@ -55,10 +55,9 @@
>  #define ARM64_WORKAROUND_CAVIUM_TX2_219_TVM	45
>  #define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM	46
>  #define ARM64_WORKAROUND_1542419		47
> -#define ARM64_WORKAROUND_SPECULATIVE_AT_NVHE	48
> -#define ARM64_HAS_E0PD				49
> -#define ARM64_HAS_RNG				50
> +#define ARM64_HAS_E0PD				48
> +#define ARM64_HAS_RNG				49
>
> -#define ARM64_NCAPS				51
> +#define ARM64_NCAPS				50
>
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 57fd46acd058..4c8acbb949ce 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -572,10 +572,6 @@ static inline bool kvm_arch_requires_vhe(void)
>  	if (system_supports_sve())
>  		return true;
>
> -	/* Some implementations have defects that confine them to VHE */
> -	if (cpus_have_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE))
> -		return true;
> -
>  	return false;
>  }
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index fe57f60f06a8..238d2e049694 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -102,7 +102,7 @@ static __always_inline void __hyp_text __load_guest_stage2(struct kvm *kvm)
>  	 * above before we can switch to the EL1/EL0 translation regime used by
>  	 * the guest.
>  	 */
> -	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT_VHE));
> +	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
>  }
>
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 703ad0a84f99..fe3bbdb82c70 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -637,7 +637,7 @@ has_neoverse_n1_erratum_1542419(const struct arm64_cpu_capabilities *entry,
>  	return is_midr_in_range(midr, &range) && has_dic;
>  }
>
> -#if defined(CONFIG_HARDEN_EL2_VECTORS) || defined(CONFIG_ARM64_ERRATUM_1319367)
> +#if defined(CONFIG_HARDEN_EL2_VECTORS)
>
>  static const struct midr_range ca57_a72[] = {
>  	MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> @@ -759,12 +759,16 @@ static const struct arm64_cpu_capabilities erratum_843419_list[] = {
>  };
>  #endif
>
> -#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> -static const struct midr_range erratum_speculative_at_vhe_list[] = {
> +#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
> +static const struct midr_range erratum_speculative_at_list[] = {
>  #ifdef CONFIG_ARM64_ERRATUM_1165522
>  	/* Cortex A76 r0p0 to r2p0 */
>  	MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 2, 0),
>  #endif
> +#ifdef CONFIG_ARM64_ERRATUM_1319367
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> +#endif
>  #ifdef CONFIG_ARM64_ERRATUM_1530923
>  	/* Cortex A55 r0p0 to r2p0 */
>  	MIDR_RANGE(MIDR_CORTEX_A55, 0, 0, 2, 0),
> @@ -899,11 +903,11 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		ERRATA_MIDR_RANGE_LIST(erratum_1418040_list),
>  	},
>  #endif
> -#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
>  	{
> -		.desc = "ARM errata 1165522, 1530923",
> -		.capability = ARM64_WORKAROUND_SPECULATIVE_AT_VHE,
> -		ERRATA_MIDR_RANGE_LIST(erratum_speculative_at_vhe_list),
> +		.desc = "ARM errata 1165522, 1319367, 1530923",
> +		.capability = ARM64_WORKAROUND_SPECULATIVE_AT,
> +		ERRATA_MIDR_RANGE_LIST(erratum_speculative_at_list),
>  	},
>  #endif
>  #ifdef CONFIG_ARM64_ERRATUM_1463225
> @@ -936,13 +940,6 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		.matches = has_neoverse_n1_erratum_1542419,
>  		.cpu_enable = cpu_enable_trap_ctr_access,
>  	},
> -#endif
> -#ifdef CONFIG_ARM64_ERRATUM_1319367
> -	{
> -		.desc = "ARM erratum 1319367",
> -		.capability = ARM64_WORKAROUND_SPECULATIVE_AT_NVHE,
> -		ERRATA_MIDR_RANGE_LIST(ca57_a72),
> -	},
>  #endif
>  	{
>  	}
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 925086b46136..71d4a7250421 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -127,7 +127,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
>
>  	write_sysreg(val, cptr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
>
>  		isb();
> @@ -170,7 +170,7 @@ static void deactivate_traps_vhe(void)
>  	 * above before we can switch to the EL2/EL0 translation regime used by
>  	 * the host.
>  	 */
> -	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT_VHE));
> +	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
>
>  	write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
>  	write_sysreg(vectors, vbar_el1);
> @@ -181,7 +181,7 @@ static void __hyp_text __deactivate_traps_nvhe(void)
>  {
>  	u64 mdcr_el2 = read_sysreg(mdcr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		u64 val;
>
>  		/*
> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 7672a978926c..2c1436fc0830 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -118,7 +118,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
>  	write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
>
> -	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
>  		write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
>  	} else if (!ctxt->__hyp_running_vcpu) {
> @@ -149,7 +149,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[PAR_EL1], par_el1);
>  	write_sysreg(ctxt->sys_regs[TPIDR_EL1], tpidr_el1);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE) &&
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
>  	    ctxt->__hyp_running_vcpu) {
>  		/*
>  		 * Must only be done for host registers, hence the context
> diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
> index 92f560e3e1aa..6b6a139ad29a 100644
> --- a/arch/arm64/kvm/hyp/tlb.c
> +++ b/arch/arm64/kvm/hyp/tlb.c
> @@ -23,7 +23,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
>
>  	local_irq_save(cxt->flags);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/*
>  		 * For CPUs that are affected by ARM errata 1165522 or 1530923,
>  		 * we cannot trust stage-1 to be in a correct state at that
> @@ -63,7 +63,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
>  static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm,
>  						  struct tlb_inv_context *cxt)
>  {
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		u64 val;
>
>  		/*
> @@ -103,7 +103,7 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
>  	write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
>  	isb();
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/* Restore the registers to what they were */
>  		write_sysreg_el1(cxt->tcr, SYS_TCR);
>  		write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
> @@ -117,7 +117,7 @@ static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm,
>  {
>  	write_sysreg(0, vttbr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/* Ensure write of the host VMID */
>  		isb();
>  		/* Restore the host's TCR_EL1 */
Hi Andrew,

On 3/27/20 2:39 PM, Andrew Scull wrote:
> Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> allocated as a result of a speculative AT instruction. In order to
> avoid mandating VHE on certain affected CPUs, apply the workaround to
> both the nVHE and the VHE case for all affected CPUs.

You're booting a VHE capable system without VHE, and need KVM? Do tell!

Would enabling the nVHE workaround on a VHE capable part solve your
problem? Merging the errata has some side effects...

> ---
> I'm not able to test the workarounds properly for the affected CPUs but
> have built and booted under qemu configs with and without VHE as well as
> the workaround being enabled and disabled.
>
> As there exist work arounds for nVHE and VHE, it doesn't appear to be a
> technical limitation that meant VHE was being mandated. Please correct
> me if this understanding is inaccurate. Thanks!

The affected VHE parts came first. Then came those that didn't have VHE
at all.

> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 7672a978926c..2c1436fc0830 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -118,7 +118,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
>  	write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
>
> -	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
>  		write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
>  	} else if (!ctxt->__hyp_running_vcpu) {

The comment just below here:

|	/*
|	 * Must only be done for guest registers, hence the context
|	 * test. We're coming from the host, so SCTLR.M is already
|	 * set. Pairs with __activate_traps_nvhe().
|	 */

The VHE parts aren't going to run __activate_traps_nvhe(), so you skip
restoring the guest's SCTLR_EL1 and TCR_EL1...

Thanks,

James
Hi James,

On Fri, Mar 27, 2020 at 04:11:56PM +0000, James Morse wrote:
> On 3/27/20 2:39 PM, Andrew Scull wrote:
> > Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> > allocated as a result of a speculative AT instruction. In order to
> > avoid mandating VHE on certain affected CPUs, apply the workaround to
> > both the nVHE and the VHE case for all affected CPUs.
>
> You're booting a VHE capable system without VHE, and need KVM?
> Do tell!

I'll stick my neck out for this part... Basically, we're looking at
enabling virtualisation on Android through KVM and we're going to be
upstreaming more of this as it gets going.

One of the goals of this work is to provide an environment where virtual
machines can run in isolation from the KVM host; in such a design, the
guest memory must not be accessible to the host, so this doesn't lend
itself at all to VHE. Our current work is focussing on extending the
nVHE __hyp_text so that things like the stage-2 page tables and vCPU
state are all handled there, with a stage-2 put over the host kernel to
prevent access to guest memory. The host is still responsible for
scheduling vCPUs, so a compromised host can cause a DoS, but it's the
data integrity that we really care about.

We're looking at implementing this on top of the SPCI spec from Arm,
where the VMs are a lot less dynamic than a traditional KVM invocation
but implement message passing and shared memory interfaces so that we
can still use things like virtio. We're also aiming to support SPCI
partitions alongside traditional VMs, although the stage-2 must still be
handled at EL2 for both types of guest. It's a bit like a Type-1.5
hypervisor ;)

Anyway, there's plenty to do, but one of the exercises is removing some
of the artificial Kconfig dependencies we have on VHE. This erratum
workaround is one of them, but there are others such as SVE and Pointer
Auth.

> Would enabling the nVHE workaround on a VHE capable part solve your
> problem?

Could do, but it seems a bit weird to enable the workarounds
independently once both VHE and nVHE are supported. We probably need to
use has_vhe() to distinguish the two paths, rather than rely on the
presence of the workaround as a proxy for that.

Will
Thanks, Steven. Could we look into the EPD* caching microarch details Marc was referring to for these A55 and A76 cores? If the behaviour is the same as A57/A72 then it sounds safe. Otherwise, there'll need to be some invalidation to avoid that issue. And thanks for the pointer to your previous patch, it's a very helpful reference.
Hi Andrew,

On 2020-03-27 17:48, Andrew Scull wrote:
> Thanks, Steven. Could we look into the EPD* caching microarch details
> Marc was referring to for these A55 and A76 cores?

Only the folks that have access to the RTL can tell you, unfortunately.

Thankfully, there is a bunch of people on Cc that should be able to ping
the relevant teams and report back...

Thanks,

M.
On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> On 2020-03-27 17:48, Andrew Scull wrote:
> > Thanks, Steven. Could we look into the EPD* caching microarch details
> > Marc was referring to for these A55 and A76 cores?
>
> Only the folks that have access to the RTL can tell you, unfortunately.
>
> Thankfully, there is a bunch of people on Cc that should be able to ping
> the relevant teams and report back...

Yup, otherwise I guess we can throw in the TLB invalidation after
setting the EPDx bits until we have clarity from Arm. We wouldn't need
to broadcast it, so it might not be too bad on performance. If it is, I
suppose we could switch to a reserved VMID?

Will
On Fri, 27 Mar 2020 18:09:14 +0000
Will Deacon <will@kernel.org> wrote:

> On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> > On 2020-03-27 17:48, Andrew Scull wrote:
> > > Thanks, Steven. Could we look into the EPD* caching microarch details
> > > Marc was referring to for these A55 and A76 cores?
> >
> > Only the folks that have access to the RTL can tell you, unfortunately.
> >
> > Thankfully, there is a bunch of people on Cc that should be able to ping
> > the relevant teams and report back...
>
> Yup, otherwise I guess we can throw in the TLB invalidation after setting
> the EPDx bits until we have clarity from Arm. We wouldn't need to broadcast
> it, so it might not be too bad on performance. If it is, I suppose we could
> switch to a reserved VMID?

I'd like to avoid the reserved VMID if at all possible -- we already
have one for the host on nVHE, and I bet your patches are going to
create some interesting havoc^Wchanges in that area, as the host is now
a guest from the mm perspective.

It isn't clear whether a reserved VMID really solves the problem either,
as you could still end up with speculative PTWs. Can it be harmful to
create conflicting TLB entries if you never hit them?

But if we bring in TLB invalidation, I wonder whether the problem can
be mitigated solely with just that on the affected CPUs, and what the
impact would be.

/me eyes the D05 running its 32 guests...

M.
On Sat, Mar 28, 2020 at 11:23:51AM +0000, Marc Zyngier wrote:
> On Fri, 27 Mar 2020 18:09:14 +0000
> Will Deacon <will@kernel.org> wrote:
>
> > On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> > > On 2020-03-27 17:48, Andrew Scull wrote:
> > > > Thanks, Steven. Could we look into the EPD* caching microarch details
> > > > Marc was referring to for these A55 and A76 cores?
> > >
> > > Only the folks that have access to the RTL can tell you, unfortunately.
> > >
> > > Thankfully, there is a bunch of people on Cc that should be able to ping
> > > the relevant teams and report back...
> >
> > Yup, otherwise I guess we can throw in the TLB invalidation after setting
> > the EPDx bits until we have clarity from Arm. We wouldn't need to broadcast
> > it, so it might not be too bad on performance. If it is, I suppose we could
> > switch to a reserved VMID?
>
> I'd like to avoid the reserved VMID if at all possible -- we already
> have one for the host on nVHE, and I bet your patches are going to
> create some interesting havoc^Wchanges in that area, as the host is now
> a guest from the mm perspective. It isn't clear whether a reserved VMID
> really solves the problem either, as you could still end up with
> speculative PTWs. Can it be harmful to create conflicting TLB entries
> if you never hit them?

I think you'd have to ensure that the reserved VMID is only ever used
when the EPDx bits are set, which is easy to say but harder to enforce!

> But if we bring in TLB invalidation, I wonder whether the problem can
> be mitigated solely with just that on the affected CPUs, and what the
> impact would be.

It seems as though this erratum is quietly cropping up for other CPUs
(e.g. A53 and A77) as well, so I'd be inclined to add the local TLBI and
then Arm can do the uarch work to disable it if it's worth it.

Interestingly, I think you only need the invalidation on the
__deactivate_traps_nvhe() path, because the CPU is only permitted to
cache those bits when they are 0 (strange but true!).

Will
On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> I proposed something similar a while ago[1], but Marc was concerned about
> the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
>
> That said I'm not saying this is necessarily wrong, just that we'd need some
> more information on whether the non-VHE workaround is suitable for the CPUs
> we're currently forcing VHE on.

We noticed that both the nVHE and VHE workarounds share the same
assumption that the EPDx bits are not being cached in the TLB.

`__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
setting EPDx as part of the workaround. However, neither handles the
possibility of a speculative AT being able to make use of a cached EPD=0
value in the TLB in order to allocate bad TLB entries.

If this is correct, the microarch concern appears to have been solved
already. Otherwise, or if we are unsure, we should go ahead and add the
TLB flushes to keep this safe.
On Fri, Apr 03, 2020 at 01:57:26PM +0100, Andrew Scull wrote:
> On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> > I proposed something similar a while ago[1], but Marc was concerned about
> > the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
> >
> > That said I'm not saying this is necessarily wrong, just that we'd need some
> > more information on whether the non-VHE workaround is suitable for the CPUs
> > we're currently forcing VHE on.
>
> We noticed that both the nVHE and VHE workarounds share the same
> assumption that the EPDx bits are not being cached in the TLB.
>
> `__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
> setting EPDx as part of the workaround. However, neither handles the
> possibility of a speculative AT being able to make use of a cached EPD=0
> value in the TLB in order to allocate bad TLB entries.
>
> If this is correct, the microarch concern appears to have been solved
> already. Otherwise, or if we are unsure, we should go ahead and add the
> TLB flushes to keep this safe.

I think Andrew's right here. Can we go ahead with the original approach
of combining the workarounds, or is there something we've missed?

Cheers,

Will
On 17/04/2020 17:41, Will Deacon wrote:
> On Fri, Apr 03, 2020 at 01:57:26PM +0100, Andrew Scull wrote:
> > On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> > > I proposed something similar a while ago[1], but Marc was concerned about
> > > the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
> > >
> > > That said I'm not saying this is necessarily wrong, just that we'd need some
> > > more information on whether the non-VHE workaround is suitable for the CPUs
> > > we're currently forcing VHE on.
> >
> > We noticed that both the nVHE and VHE workarounds share the same
> > assumption that the EPDx bits are not being cached in the TLB.
> >
> > `__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
> > setting EPDx as part of the workaround. However, neither handles the
> > possibility of a speculative AT being able to make use of a cached EPD=0
> > value in the TLB in order to allocate bad TLB entries.
> >
> > If this is correct, the microarch concern appears to have been solved
> > already. Otherwise, or if we are unsure, we should go ahead and add the
> > TLB flushes to keep this safe.
>
> I think Andrew's right here. Can we go ahead with the original approach of
> combining the workarounds, or is there something we've missed?

As far as I know it is safe to combine the workarounds: I did post my
own patch earlier. But I don't have the deep understanding of the
microarch - so I accepted Marc's concerns, dropped that, and was simply
linking up the discussions.

The assumption before was that booting a VHE capable system without VHE
wasn't useful, so there wasn't a pressing need to unify the workarounds,
and the easiest option for me was to keep them separate. Since there is
apparently a desire to do such a thing, unifying the workarounds seems
reasonable.

Steve
affected by ARM errata 1165522 or 1530923, * we cannot trust stage-1 to be in a correct state at that @@ -63,7 +63,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm, static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm, struct tlb_inv_context *cxt) { - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { u64 val; /* @@ -103,7 +103,7 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm, write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); isb(); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { /* Restore the registers to what they were */ write_sysreg_el1(cxt->tcr, SYS_TCR); write_sysreg_el1(cxt->sctlr, SYS_SCTLR); @@ -117,7 +117,7 @@ static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm, { write_sysreg(0, vttbr_el2); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { /* Ensure write of the host VMID */ isb(); /* Restore the host's TCR_EL1 */
Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
allocated as a result of a speculative AT instruction. In order to
avoid mandating VHE on certain affected CPUs, apply the workaround to
both the nVHE and the VHE case for all affected CPUs.

Signed-off-by: Andrew Scull <ascull@google.com>
CC: Marc Zyngier <maz@kernel.org>
CC: James Morse <james.morse@arm.com>
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Will Deacon <will@kernel.org>
CC: Steven Price <steven.price@arm.com>

---

I'm not able to test the workarounds properly for the affected CPUs but
have built and booted under qemu configs with and without VHE as well as
the workaround being enabled and disabled.

As there exist work arounds for nVHE and VHE, it doesn't appear to be a
technical limitation that meant VHE was being mandated. Please correct
me if this understanding is inaccurate. Thanks!

---
 arch/arm64/Kconfig                | 39 ++++++++++++++-----------------
 arch/arm64/include/asm/cpucaps.h  |  9 ++++---
 arch/arm64/include/asm/kvm_host.h |  4 ----
 arch/arm64/include/asm/kvm_hyp.h  |  2 +-
 arch/arm64/kernel/cpu_errata.c    | 25 +++++++++----------
 arch/arm64/kvm/hyp/switch.c       |  6 ++---
 arch/arm64/kvm/hyp/sysreg-sr.c    |  4 ++--
 arch/arm64/kvm/hyp/tlb.c          |  8 +++----
 8 files changed, 43 insertions(+), 54 deletions(-)