Message ID | 20200327143941.195626-1-ascull@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] arm64: unify WORKAROUND_SPECULATIVE_AT_{NVHE,VHE} | expand |
On 27/03/2020 14:39, Andrew Scull wrote:
> Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> allocated as a result of a speculative AT instruction. In order to
> avoid mandating VHE on certain affected CPUs, apply the workaround to
> both the nVHE and the VHE case for all affected CPUs.
>
> Signed-off-by: Andrew Scull <ascull@google.com>
> CC: Marc Zyngier <maz@kernel.org>
> CC: James Morse <james.morse@arm.com>
> CC: Suzuki K Poulose <suzuki.poulose@arm.com>
> CC: Will Deacon <will@kernel.org>
> CC: Steven Price <steven.price@arm.com>
>
> ---
>
> I'm not able to test the workarounds properly for the affected CPUs but
> have built and booted under qemu configs with and without VHE as well as
> the workaround being enabled and disabled.
>
> As there exist work arounds for nVHE and VHE, it doesn't appear to be a
> technical limitation that meant VHE was being mandated. Please correct
> me if this understanding is inaccurate. Thanks!

I proposed something similar a while ago[1], but Marc was concerned about
the microarch detail[2] and hence I split the workaround into VHE/non-VHE.

That said, I'm not saying this is necessarily wrong, just that we'd need
some more information on whether the non-VHE workaround is suitable for
the CPUs we're currently forcing VHE on.
Steve

[1] https://lore.kernel.org/kvmarm/20191113114118.2427-2-steven.price@arm.com/
[2] https://lore.kernel.org/kvmarm/566ecd45c8bf07b3cb5d75a10c9413a8@www.loen.fr/

> ---
>  arch/arm64/Kconfig                | 39 ++++++++++++++-----------------
>  arch/arm64/include/asm/cpucaps.h  |  9 ++++---
>  arch/arm64/include/asm/kvm_host.h |  4 ----
>  arch/arm64/include/asm/kvm_hyp.h  |  2 +-
>  arch/arm64/kernel/cpu_errata.c    | 25 +++++++++-----------
>  arch/arm64/kvm/hyp/switch.c       |  6 ++---
>  arch/arm64/kvm/hyp/sysreg-sr.c    |  4 ++--
>  arch/arm64/kvm/hyp/tlb.c          |  8 +++----
>  8 files changed, 43 insertions(+), 54 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 0b30e884e088..7492b929cb12 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -521,13 +521,13 @@ config ARM64_ERRATUM_1418040
>
>  	  If unsure, say Y.
>
> -config ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +config ARM64_WORKAROUND_SPECULATIVE_AT
>  	bool
>
>  config ARM64_ERRATUM_1165522
> -	bool "Cortex-A76: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	bool "Cortex-A76: 1165522: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
>  	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
>  	help
>  	  This option adds a workaround for ARM Cortex-A76 erratum 1165522.
>
> @@ -537,10 +537,23 @@ config ARM64_ERRATUM_1165522
>
>  	  If unsure, say Y.
>
> +config ARM64_ERRATUM_1319367
> +	bool "Cortex-A57/A72: 1319537: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	default y
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
> +	help
> +	  This option adds work arounds for ARM Cortex-A57 erratum 1319537
> +	  and A72 erratum 1319367
> +
> +	  Cortex-A57 and A72 cores could end-up with corrupted TLBs by
> +	  speculating an AT instruction during a guest context switch.
> +
> +	  If unsure, say Y.
> +
>  config ARM64_ERRATUM_1530923
> -	bool "Cortex-A55: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> +	bool "Cortex-A55: 1530923: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
>  	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +	select ARM64_WORKAROUND_SPECULATIVE_AT
>  	help
>  	  This option adds a workaround for ARM Cortex-A55 erratum 1530923.
>
> @@ -566,22 +579,6 @@ config ARM64_ERRATUM_1286807
>  	  invalidated has been observed by other observers. The
>  	  workaround repeats the TLBI+DSB operation.
>
> -config ARM64_WORKAROUND_SPECULATIVE_AT_NVHE
> -	bool
> -
> -config ARM64_ERRATUM_1319367
> -	bool "Cortex-A57/A72: Speculative AT instruction using out-of-context translation regime could cause subsequent request to generate an incorrect translation"
> -	default y
> -	select ARM64_WORKAROUND_SPECULATIVE_AT_NVHE
> -	help
> -	  This option adds work arounds for ARM Cortex-A57 erratum 1319537
> -	  and A72 erratum 1319367
> -
> -	  Cortex-A57 and A72 cores could end-up with corrupted TLBs by
> -	  speculating an AT instruction during a guest context switch.
> -
> -	  If unsure, say Y.
> -
>  config ARM64_ERRATUM_1463225
>  	bool "Cortex-A76: Software Step might prevent interrupt recognition"
>  	default y
> diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
> index 865e0253fc1e..e396e357f5b2 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -44,7 +44,7 @@
>  #define ARM64_SSBS				34
>  #define ARM64_WORKAROUND_1418040		35
>  #define ARM64_HAS_SB				36
> -#define ARM64_WORKAROUND_SPECULATIVE_AT_VHE	37
> +#define ARM64_WORKAROUND_SPECULATIVE_AT		37
>  #define ARM64_HAS_ADDRESS_AUTH_ARCH		38
>  #define ARM64_HAS_ADDRESS_AUTH_IMP_DEF		39
>  #define ARM64_HAS_GENERIC_AUTH_ARCH		40
> @@ -55,10 +55,9 @@
>  #define ARM64_WORKAROUND_CAVIUM_TX2_219_TVM	45
>  #define ARM64_WORKAROUND_CAVIUM_TX2_219_PRFM	46
>  #define ARM64_WORKAROUND_1542419		47
> -#define ARM64_WORKAROUND_SPECULATIVE_AT_NVHE	48
> -#define ARM64_HAS_E0PD				49
> -#define ARM64_HAS_RNG				50
> +#define ARM64_HAS_E0PD				48
> +#define ARM64_HAS_RNG				49
>
> -#define ARM64_NCAPS				51
> +#define ARM64_NCAPS				50
>
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 57fd46acd058..4c8acbb949ce 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -572,10 +572,6 @@ static inline bool kvm_arch_requires_vhe(void)
>  	if (system_supports_sve())
>  		return true;
>
> -	/* Some implementations have defects that confine them to VHE */
> -	if (cpus_have_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE))
> -		return true;
> -
>  	return false;
>  }
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index fe57f60f06a8..238d2e049694 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -102,7 +102,7 @@ static __always_inline void __hyp_text __load_guest_stage2(struct kvm *kvm)
>  	 * above before we can switch to the EL1/EL0 translation regime used by
>  	 * the guest.
>  	 */
> -	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT_VHE));
> +	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
>  }
>
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 703ad0a84f99..fe3bbdb82c70 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -637,7 +637,7 @@ has_neoverse_n1_erratum_1542419(const struct arm64_cpu_capabilities *entry,
>  	return is_midr_in_range(midr, &range) && has_dic;
>  }
>
> -#if defined(CONFIG_HARDEN_EL2_VECTORS) || defined(CONFIG_ARM64_ERRATUM_1319367)
> +#if defined(CONFIG_HARDEN_EL2_VECTORS)
>
>  static const struct midr_range ca57_a72[] = {
>  	MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> @@ -759,12 +759,16 @@ static const struct arm64_cpu_capabilities erratum_843419_list[] = {
>  };
>  #endif
>
> -#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> -static const struct midr_range erratum_speculative_at_vhe_list[] = {
> +#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
> +static const struct midr_range erratum_speculative_at_list[] = {
>  #ifdef CONFIG_ARM64_ERRATUM_1165522
>  	/* Cortex A76 r0p0 to r2p0 */
>  	MIDR_RANGE(MIDR_CORTEX_A76, 0, 0, 2, 0),
>  #endif
> +#ifdef CONFIG_ARM64_ERRATUM_1319367
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> +#endif
>  #ifdef CONFIG_ARM64_ERRATUM_1530923
>  	/* Cortex A55 r0p0 to r2p0 */
>  	MIDR_RANGE(MIDR_CORTEX_A55, 0, 0, 2, 0),
> @@ -899,11 +903,11 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		ERRATA_MIDR_RANGE_LIST(erratum_1418040_list),
>  	},
>  #endif
> -#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT_VHE
> +#ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
>  	{
> -		.desc = "ARM errata 1165522, 1530923",
> -		.capability = ARM64_WORKAROUND_SPECULATIVE_AT_VHE,
> -		ERRATA_MIDR_RANGE_LIST(erratum_speculative_at_vhe_list),
> +		.desc = "ARM errata 1165522, 1319367, 1530923",
> +		.capability = ARM64_WORKAROUND_SPECULATIVE_AT,
> +		ERRATA_MIDR_RANGE_LIST(erratum_speculative_at_list),
>  	},
>  #endif
>  #ifdef CONFIG_ARM64_ERRATUM_1463225
> @@ -936,13 +940,6 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		.matches = has_neoverse_n1_erratum_1542419,
>  		.cpu_enable = cpu_enable_trap_ctr_access,
>  	},
> -#endif
> -#ifdef CONFIG_ARM64_ERRATUM_1319367
> -	{
> -		.desc = "ARM erratum 1319367",
> -		.capability = ARM64_WORKAROUND_SPECULATIVE_AT_NVHE,
> -		ERRATA_MIDR_RANGE_LIST(ca57_a72),
> -	},
>  #endif
>  	{
>  	}
> diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
> index 925086b46136..71d4a7250421 100644
> --- a/arch/arm64/kvm/hyp/switch.c
> +++ b/arch/arm64/kvm/hyp/switch.c
> @@ -127,7 +127,7 @@ static void __hyp_text __activate_traps_nvhe(struct kvm_vcpu *vcpu)
>
>  	write_sysreg(val, cptr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		struct kvm_cpu_context *ctxt = &vcpu->arch.ctxt;
>
>  		isb();
> @@ -170,7 +170,7 @@ static void deactivate_traps_vhe(void)
>  	 * above before we can switch to the EL2/EL0 translation regime used by
>  	 * the host.
>  	 */
> -	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT_VHE));
> +	asm(ALTERNATIVE("nop", "isb", ARM64_WORKAROUND_SPECULATIVE_AT));
>
>  	write_sysreg(CPACR_EL1_DEFAULT, cpacr_el1);
>  	write_sysreg(vectors, vbar_el1);
> @@ -181,7 +181,7 @@ static void __hyp_text __deactivate_traps_nvhe(void)
>  {
>  	u64 mdcr_el2 = read_sysreg(mdcr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		u64 val;
>
>  		/*
> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 7672a978926c..2c1436fc0830 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -118,7 +118,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
>  	write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
>
> -	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
>  		write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
>  	} else if (!ctxt->__hyp_running_vcpu) {
> @@ -149,7 +149,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[PAR_EL1], par_el1);
>  	write_sysreg(ctxt->sys_regs[TPIDR_EL1], tpidr_el1);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE) &&
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
>  	    ctxt->__hyp_running_vcpu) {
>  		/*
>  		 * Must only be done for host registers, hence the context
> diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
> index 92f560e3e1aa..6b6a139ad29a 100644
> --- a/arch/arm64/kvm/hyp/tlb.c
> +++ b/arch/arm64/kvm/hyp/tlb.c
> @@ -23,7 +23,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
>
>  	local_irq_save(cxt->flags);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/*
>  		 * For CPUs that are affected by ARM errata 1165522 or 1530923,
>  		 * we cannot trust stage-1 to be in a correct state at that
> @@ -63,7 +63,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
>  static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm,
>  						  struct tlb_inv_context *cxt)
>  {
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		u64 val;
>
>  		/*
> @@ -103,7 +103,7 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
>  	write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2);
>  	isb();
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/* Restore the registers to what they were */
>  		write_sysreg_el1(cxt->tcr, SYS_TCR);
>  		write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
> @@ -117,7 +117,7 @@ static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm,
>  {
>  	write_sysreg(0, vttbr_el2);
>
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		/* Ensure write of the host VMID */
>  		isb();
>  		/* Restore the host's TCR_EL1 */
Hi Andrew,

On 3/27/20 2:39 PM, Andrew Scull wrote:
> Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> allocated as a result of a speculative AT instruction. In order to
> avoid mandating VHE on certain affected CPUs, apply the workaround to
> both the nVHE and the VHE case for all affected CPUs.

You're booting a VHE capable system without VHE, and need KVM? Do tell!

Would enabling the nVHE workaround on a VHE capable part solve your
problem? Merging the errata has some side effects...

> ---
> I'm not able to test the workarounds properly for the affected CPUs but
> have built and booted under qemu configs with and without VHE as well as
> the workaround being enabled and disabled.
>
> As there exist work arounds for nVHE and VHE, it doesn't appear to be a
> technical limitation that meant VHE was being mandated. Please correct
> me if this understanding is inaccurate. Thanks!

The affected VHE parts came first. Then came those that didn't have VHE
at all.

> diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
> index 7672a978926c..2c1436fc0830 100644
> --- a/arch/arm64/kvm/hyp/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/sysreg-sr.c
> @@ -118,7 +118,7 @@ static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
>  	write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
>  	write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
>
> -	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) {
> +	if (!cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
>  		write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
>  	} else if (!ctxt->__hyp_running_vcpu) {

The comment just below here:

|	/*
|	 * Must only be done for guest registers, hence the context
|	 * test. We're coming from the host, so SCTLR.M is already
|	 * set. Pairs with __activate_traps_nvhe().
|	 */

The VHE parts aren't going to run __activate_traps_nvhe(), so you skip
restoring the guest's SCTLR_EL1 and TCR_EL1...

Thanks,

James
Hi James,

On Fri, Mar 27, 2020 at 04:11:56PM +0000, James Morse wrote:
> On 3/27/20 2:39 PM, Andrew Scull wrote:
> > Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
> > allocated as a result of a speculative AT instruction. In order to
> > avoid mandating VHE on certain affected CPUs, apply the workaround to
> > both the nVHE and the VHE case for all affected CPUs.
>
> You're booting a VHE capable system without VHE, and need KVM?
> Do tell!

I'll stick my neck out for this part... Basically, we're looking at
enabling virtualisation on Android through KVM and we're going to be
upstreaming more of this as it gets going.

One of the goals of this work is to provide an environment where virtual
machines can run in isolation from the KVM host; in such a design, the
guest memory must not be accessible to the host, so this doesn't lend
itself at all to VHE. Our current work is focussing on extending the
nVHE __hyp_text so that things like the stage-2 page tables and vCPU
state are all handled there, with a stage-2 put over the host kernel to
prevent access to guest memory. The host is still responsible for
scheduling vCPUs, so a compromised host can cause a DoS, but it's the
data integrity that we really care about.

We're looking at implementing this on top of the SPCI spec from Arm,
where the VMs are a lot less dynamic than a traditional KVM invocation
but implement message passing and shared memory interfaces so that we
can still use things like virtio. We're also aiming to support SPCI
partitions alongside traditional VMs, although the stage-2 must still be
handled at EL2 for both types of guest. It's a bit like a Type-1.5
hypervisor ;)

Anyway, there's plenty to do, but one of the exercises is removing some
of the artificial Kconfig dependencies we have on VHE. This erratum
workaround is one of them, but there are others such as SVE and Pointer
Auth.

> Would enabling the nVHE workaround on a VHE capable part solve your
> problem?

Could do, but it seems a bit weird to enable the workarounds
independently once both VHE and nVHE are supported. We probably need to
use has_vhe() to distinguish the two paths, rather than rely on the
presence of the workaround as a proxy for that.

Will
Thanks, Steven. Could we look into the EPD* caching microarch details Marc was referring to for these A55 and A76 cores? If the behaviour is the same as A57/A72 then it sounds safe. Otherwise, there'll need to be some invalidation to avoid that issue. And thanks for the pointer to your previous patch, it's a very helpful reference.
Hi Andrew,

On 2020-03-27 17:48, Andrew Scull wrote:
> Thanks, Steven. Could we look into the EPD* caching microarch details
> Marc was referring to for these A55 and A76 cores?

Only the folks that have access to the RTL can tell you, unfortunately.

Thankfully, there is a bunch of people on Cc that should be able to ping
the relevant teams and report back...

Thanks,

M.
On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> On 2020-03-27 17:48, Andrew Scull wrote:
> > Thanks, Steven. Could we look into the EPD* caching microarch details
> > Marc was referring to for these A55 and A76 cores?
>
> Only the folks that have access to the RTL can tell you, unfortunately.
>
> Thankfully, there is a bunch of people on Cc that should be able to ping
> the relevant teams and report back...

Yup, otherwise I guess we can throw in the TLB invalidation after
setting the EPDx bits until we have clarity from Arm. We wouldn't need
to broadcast it, so it might not be too bad on performance. If it is, I
suppose we could switch to a reserved VMID?

Will
On Fri, 27 Mar 2020 18:09:14 +0000
Will Deacon <will@kernel.org> wrote:

> On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> > On 2020-03-27 17:48, Andrew Scull wrote:
> > > Thanks, Steven. Could we look into the EPD* caching microarch details
> > > Marc was referring to for these A55 and A76 cores?
> >
> > Only the folks that have access to the RTL can tell you, unfortunately.
> >
> > Thankfully, there is a bunch of people on Cc that should be able to ping
> > the relevant teams and report back...
>
> Yup, otherwise I guess we can throw in the TLB invalidation after setting
> the EPDx bits until we have clarity from Arm. We wouldn't need to broadcast
> it, so it might not be too bad on performance. If it is, I suppose we could
> switch to a reserved VMID?

I'd like to avoid the reserved VMID if at all possible -- we already
have one for the host on nVHE, and I bet your patches are going to
create some interesting havoc^Wchanges in that area, as the host is now
a guest from the mm perspective.

It isn't clear whether a reserved VMID really solves the problem either,
as you could still end up with speculative PTWs. Can it be harmful to
create conflicting TLB entries if you never hit them?

But if we bring in TLB invalidation, I wonder whether the problem can
be mitigated solely with just that on the affected CPUs, and what the
impact would be.

/me eyes the D05 running its 32 guests...

M.
On Sat, Mar 28, 2020 at 11:23:51AM +0000, Marc Zyngier wrote:
> On Fri, 27 Mar 2020 18:09:14 +0000
> Will Deacon <will@kernel.org> wrote:
>
> > On Fri, Mar 27, 2020 at 05:52:59PM +0000, Marc Zyngier wrote:
> > > On 2020-03-27 17:48, Andrew Scull wrote:
> > > > Thanks, Steven. Could we look into the EPD* caching microarch details
> > > > Marc was referring to for these A55 and A76 cores?
> > >
> > > Only the folks that have access to the RTL can tell you, unfortunately.
> > >
> > > Thankfully, there is a bunch of people on Cc that should be able to ping
> > > the relevant teams and report back...
> >
> > Yup, otherwise I guess we can throw in the TLB invalidation after setting
> > the EPDx bits until we have clarity from Arm. We wouldn't need to broadcast
> > it, so it might not be too bad on performance. If it is, I suppose we could
> > switch to a reserved VMID?
>
> I'd like to avoid the reserved VMID if at all possible -- we already
> have one for the host on nVHE, and I bet your patches are going to
> create some interesting havoc^Wchanges in that area, as the host is now
> a guest from the mm perspective. It isn't clear whether a reserved VMID
> really solves the problem either, as you could still end up with
> speculative PTWs. Can it be harmful to create conflicting TLB entries
> if you never hit them?

I think you'd have to ensure that the reserved VMID is only ever used
when the EPDx bits are set, which is easy to say but harder to enforce!

> But if we bring in TLB invalidation, I wonder whether the problem can
> be mitigated solely with just that on the affected CPUs, and what the
> impact would be.

It seems as though this erratum is quietly cropping up for other CPUs
(e.g. A53 and A77) as well, so I'd be inclined to add the local TLBI and
then Arm can do the uarch work to disable it if it's worth it.

Interestingly, I think you only need the invalidation on the
__deactivate_traps_nvhe() path, because the CPU is only permitted to
cache those bits when they are 0 (strange but true!).

Will
On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> I proposed something similar a while ago[1], but Marc was concerned about
> the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
>
> That said I'm not saying this is necessarily wrong, just that we'd need some
> more information on whether the non-VHE workaround is suitable for the CPUs
> we're currently forcing VHE on.

We noticed that both the nVHE and VHE workarounds share the same
assumption that the EPDx bits are not being cached in the TLB.

`__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
setting EPDx as part of the workaround. However, neither handles the
possibility of a speculative AT being able to make use of a cached EPD=0
value in the TLB in order to allocate bad TLB entries.

If this is correct, the microarch concern appears to have been solved
already. Otherwise, or if we are unsure, we should go ahead and add the
TLB flushes to keep this safe.
On Fri, Apr 03, 2020 at 01:57:26PM +0100, Andrew Scull wrote:
> On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> > I proposed something similar a while ago[1], but Marc was concerned about
> > the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
> >
> > That said I'm not saying this is necessarily wrong, just that we'd need some
> > more information on whether the non-VHE workaround is suitable for the CPUs
> > we're currently forcing VHE on.
>
> We noticed that both the nVHE and VHE workarounds share the same
> assumption that the EPDx bits are not being cached in the TLB.
>
> `__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
> setting EPDx as part of the workaround. However, neither handles the
> possibility of a speculative AT being able to make use of a cached EPD=0
> value in the TLB in order to allocate bad TLB entries.
>
> If this is correct, the microarch concern appears to have been solved
> already. Otherwise, or if we are unsure, we should go ahead and add the
> TLB flushes to keep this safe.

I think Andrew's right here. Can we go ahead with the original approach
of combining the workarounds, or is there something we've missed?

Cheers,

Will
On 17/04/2020 17:41, Will Deacon wrote:
> On Fri, Apr 03, 2020 at 01:57:26PM +0100, Andrew Scull wrote:
> > On Fri, Mar 27, 2020 at 02:59:47PM +0000, Steven Price wrote:
> > > I proposed something similar a while ago[1], but Marc was concerned about
> > > the microarch detail[2] and hence I split the workaround into VHE/non-VHE.
> > >
> > > That said I'm not saying this is necessarily wrong, just that we'd need some
> > > more information on whether the non-VHE workaround is suitable for the CPUs
> > > we're currently forcing VHE on.
> >
> > We noticed that both the nVHE and VHE workarounds share the same
> > assumption that the EPDx bits are not being cached in the TLB.
> >
> > `__tlb_switch_to_guest_vhe` and `__tlb_switch_to_guest_nvhe` are both
> > setting EPDx as part of the workaround. However, neither handles the
> > possibility of a speculative AT being able to make use of a cached EPD=0
> > value in the TLB in order to allocate bad TLB entries.
> >
> > If this is correct, the microarch concern appears to have been solved
> > already. Otherwise, or if we are unsure, we should go ahead and add the
> > TLB flushes to keep this safe.
>
> I think Andrew's right here. Can we go ahead with the original approach of
> combining the workarounds, or is there something we've missed?

As far as I know it is safe to combine the workarounds: I did post my
own patch earlier. But I don't have the deep understanding of the
microarch - so I accepted Marc's concerns, dropped that, and was simply
linking up the discussions.

The assumption before was that booting a VHE capable system without VHE
wasn't useful, so there wasn't a pressing need to unify the workarounds,
and the easiest option for me was to keep them separate. Since there is
apparently a desire to do such a thing, unifying the workarounds seems
reasonable.

Steve
affected by ARM errata 1165522 or 1530923, * we cannot trust stage-1 to be in a correct state at that @@ -63,7 +63,7 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm, static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm, struct tlb_inv_context *cxt) { - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { u64 val; /* @@ -103,7 +103,7 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm, write_sysreg(HCR_HOST_VHE_FLAGS, hcr_el2); isb(); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_VHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { /* Restore the registers to what they were */ write_sysreg_el1(cxt->tcr, SYS_TCR); write_sysreg_el1(cxt->sctlr, SYS_SCTLR); @@ -117,7 +117,7 @@ static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm, { write_sysreg(0, vttbr_el2); - if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT_NVHE)) { + if (cpus_have_const_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) { /* Ensure write of the host VMID */ isb(); /* Restore the host's TCR_EL1 */
Errata 1165522, 1319367 and 1530923 each allow TLB entries to be
allocated as a result of a speculative AT instruction. In order to
avoid mandating VHE on certain affected CPUs, apply the workaround to
both the nVHE and the VHE case for all affected CPUs.

Signed-off-by: Andrew Scull <ascull@google.com>
CC: Marc Zyngier <maz@kernel.org>
CC: James Morse <james.morse@arm.com>
CC: Suzuki K Poulose <suzuki.poulose@arm.com>
CC: Will Deacon <will@kernel.org>
CC: Steven Price <steven.price@arm.com>

---

I'm not able to test the workarounds properly for the affected CPUs but
have built and booted under qemu configs with and without VHE as well as
the workaround being enabled and disabled.

As there exist work arounds for nVHE and VHE, it doesn't appear to be a
technical limitation that meant VHE was being mandated. Please correct
me if this understanding is inaccurate. Thanks!

---
 arch/arm64/Kconfig                | 39 ++++++++++++++-----------------
 arch/arm64/include/asm/cpucaps.h  |  9 ++++---
 arch/arm64/include/asm/kvm_host.h |  4 ----
 arch/arm64/include/asm/kvm_hyp.h  |  2 +-
 arch/arm64/kernel/cpu_errata.c    | 25 +++++++++----------
 arch/arm64/kvm/hyp/switch.c       |  6 ++---
 arch/arm64/kvm/hyp/sysreg-sr.c    |  4 ++--
 arch/arm64/kvm/hyp/tlb.c          |  8 +++----
 8 files changed, 43 insertions(+), 54 deletions(-)