
[v2] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue

Message ID 20230420112952.28340-1-iivanov@suse.de (mailing list archive)
State New, archived
Series [v2] arm64: errata: Add NXP iMX8QM workaround for A53 Cache coherency issue

Commit Message

Ivan T. Ivanov April 20, 2023, 11:29 a.m. UTC
According to the NXP errata document[1], the i.MX8QuadMax SoC suffers
from a serious cache coherence issue. It was also mentioned in the
initial support[2] for the imx8qm mek machine.

I chose to use the ALTERNATIVE() framework, instead of the downstream
solution[3], for this issue in the hope of reducing the effect of this
fix on unaffected platforms.
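
For illustration, ALTERNATIVE(default, replacement, cpucap) emits the
default instruction sequence and lets the boot-time alternatives code
live-patch in the replacement on systems where the given capability was
detected. A minimal sketch using the capability added by this series
(illustrative only, not a hunk from the patch; barriers are omitted):

	#include <asm/alternative.h>

	/*
	 * Runs "nop" by default; patched at boot to "tlbi vmalle1is" only
	 * on CPUs where ARM64_WORKAROUND_NXP_ERR050104 is set.
	 */
	static inline void err050104_example(void)
	{
		asm volatile(ALTERNATIVE("nop",
					 "tlbi vmalle1is",
					 ARM64_WORKAROUND_NXP_ERR050104));
	}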

Unfortunately, I was unable to find a way to identify the SoC using ID
registers. The boot CPU MIDR_EL1 is equal to 0x410fd034 and AIDR_EL1 is
equal to 0, so I fall back to using devicetree compatible strings for
this. And because we cannot guarantee that VMs on top of KVM will get a
correct devicetree, KVM is disabled.

Also make sure that SMMU "Broadcast TLB Maintenance" is not used, even
though the SMMU device built into this SoC supports it.

I know this fix is a suboptimal solution for the affected machines, but I
haven't been able to come up with a less intrusive fix. And I hope that
once the TLBs are invalidated, any immediate attempt to invalidate them
again will be close to a NOP operation (flush_tlb_kernel_range()).

I have run a few simple benchmarks and perf tests on affected and
unaffected machines and was not able to see any obvious issues. iMX8QM
"performance" nearly doubled with the 2 A72 cores brought online.

The following is an excerpt from the NXP IMX8_1N94W "Mask Set Errata"
document, Rev. 5, 3/2023, just in case it gets lost somehow.

--
"ERR050104: Arm/A53: Cache coherency issue"

Description

Some maintenance operations exchanged between the A53 and A72
core clusters, involving some Translation Look-aside Buffer
Invalidate (TLBI) and Instruction Cache (IC) instructions can
be corrupted. The upper bits, above bit-35, of ARADDR and ACADDR
buses within the Arm A53 sub-system have been incorrectly connected.
Therefore ARADDR and ACADDR address bits above bit-35 should not
be used.

Workaround

The following software instructions are required to be downgraded
to TLBI VMALLE1IS:  TLBI ASIDE1, TLBI ASIDE1IS, TLBI VAAE1,
TLBI VAAE1IS, TLBI VAALE1, TLBI VAALE1IS, TLBI VAE1, TLBI VAE1IS,
TLBI VALE1, TLBI VALE1IS

The following software instructions are required to be downgraded
to TLBI VMALLS12E1IS: TLBI IPAS2E1IS, TLBI IPAS2LE1IS

The following software instructions are required to be downgraded
to TLBI ALLE2IS: TLBI VAE2IS, TLBI VALE2IS.

The following software instructions are required to be downgraded
to TLBI ALLE3IS: TLBI VAE3IS, TLBI VALE3IS.

The following software instructions are required to be downgraded
to TLBI VMALLE1IS when the Force Broadcast (FB) bit [9] of the
Hypervisor Configuration Register (HCR_EL2) is set:
TLBI ASIDE1, TLBI VAAE1, TLBI VAALE1, TLBI VAE1, TLBI VALE1

The following software instruction is required to be downgraded
to IC IALLUIS: IC IVAU, Xt

Specifically for the IC IVAU, Xt downgrade, setting SCTLR_EL1.UCI
to 0 will disable EL0 access to this instruction. Any attempt to
execute from EL0 will generate an EL1 trap, where the downgrade to
IC IALLUIS can be implemented.
--
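
The SCTLR_EL1.UCI downgrade described at the end of the excerpt is what
the patch hooks into via cpu_enable_cache_maint_trap(). A minimal sketch
of the idea, modelled on the existing helper in arch/arm64/kernel/traps.c
(the in-tree version may differ in detail):

	#include <asm/cpufeature.h>
	#include <asm/sysreg.h>

	/*
	 * Clearing SCTLR_EL1.UCI makes EL0 cache maintenance such as
	 * IC IVAU trap to EL1, where it can be emulated or upgraded to
	 * IC IALLUIS as the erratum text suggests.
	 */
	void cpu_enable_cache_maint_trap(const struct arm64_cpu_capabilities *unused)
	{
		sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCI, 0);
	}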

[1] https://www.nxp.com/docs/en/errata/IMX8_1N94W.pdf
[2] commit 307fd14d4b14c ("arm64: dts: imx: add imx8qm mek support")
[3] https://github.com/nxp-imx/linux-imx/blob/lf-6.1.y/arch/arm64/include/asm/tlbflush.h#L19

Signed-off-by: Ivan T. Ivanov <iivanov@suse.de>
---

Changes since v1:
https://lore.kernel.org/linux-arm-kernel/20230412125506.21634-1-iivanov@suse.de/

 - Disable KVM on the affected SoC and add a note in the commit message
   explaining why this is done.
 - Updated the Kconfig help message and made the option enabled by default,
   like the rest of the workarounds.
 - Removed unnecessary DSB ISH + ISB instructions from the ALTERNATIVES.
 - Reworked handling of user-space IC IVAU.
 - Adjusted the position of the workaround definition in the cpucaps file.
 - Make sure that "Broadcast TLB Maintenance" is not used even if the SMMU
   device built into this SoC supports it, and note this in the commit message.

 Documentation/arm64/silicon-errata.rst |  2 ++
 arch/arm64/Kconfig                     | 12 ++++++++++++
 arch/arm64/include/asm/tlbflush.h      |  6 +++++-
 arch/arm64/kernel/cpu_errata.c         | 18 ++++++++++++++++++
 arch/arm64/kernel/traps.c              |  5 +++++
 arch/arm64/kvm/arm.c                   |  5 +++++
 arch/arm64/tools/cpucaps               |  1 +
 drivers/iommu/arm/arm-smmu/arm-smmu.c  |  4 ++++
 drivers/iommu/arm/arm-smmu/arm-smmu.h  |  1 +
 9 files changed, 53 insertions(+), 1 deletion(-)

Comments

Ivan T. Ivanov May 18, 2023, 11:59 a.m. UTC | #1
Hi Catalin, Will,

Is there something more that I could do for this patch, or
could it be merged as is?

Regards,
Ivan


Will Deacon June 2, 2023, 10:34 a.m. UTC | #2
Hi Ivan,

Sorry for the delay in getting to this.

On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> According to the NXP errata document[1], the i.MX8QuadMax SoC suffers
> from a serious cache coherence issue. It was also mentioned in the
> initial support[2] for the imx8qm mek machine.
> 
> I chose to use the ALTERNATIVE() framework, instead of the downstream
> solution[3], for this issue in the hope of reducing the effect of this
> fix on unaffected platforms.
> 
> Unfortunately, I was unable to find a way to identify the SoC using ID
> registers. The boot CPU MIDR_EL1 is equal to 0x410fd034 and AIDR_EL1 is
> equal to 0, so I fall back to using devicetree compatible strings for
> this. And because we cannot guarantee that VMs on top of KVM will get a
> correct devicetree, KVM is disabled.
> 
> Also make sure that SMMU "Broadcast TLB Maintenance" is not used, even
> though the SMMU device built into this SoC supports it.
> 
> I know this fix is a suboptimal solution for the affected machines, but I
> haven't been able to come up with a less intrusive fix. And I hope that
> once the TLBs are invalidated, any immediate attempt to invalidate them
> again will be close to a NOP operation (flush_tlb_kernel_range()).

[...]

> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 1023e896d46b..a5fe6ffb8150 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1159,6 +1159,18 @@ config SOCIONEXT_SYNQUACER_PREITS
>  
>  	  If unsure, say Y.
>  
> +config NXP_IMX8QM_ERRATUM_ERR050104
> +	bool "NXP iMX8QM ERR050104: broken cache/tlb invalidation"
> +	default y
> +	help
> +	  On iMX8QM, addresses above bit 35 are not broadcast correctly for
> +	  TLBI or IC operations, making TLBI and IC unreliable.
> +
> +	  Work around this erratum by using TLBI *ALL*IS and IC IALLUIS
> +	  operations. EL0 use of IC IVAU is trapped and upgraded to IC IALLUIS.
> +
> +	  If unsure, say Y.
> +
>  endmenu # "ARM errata workarounds via the alternatives framework"
>  
>  choice
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index 412a3b9a3c25..e3bad2298ea5 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -37,7 +37,11 @@
>  			    : : )
>  
>  #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE			       \
> -			       "tlbi " #op ", %0\n"			       \
> +		   ALTERNATIVE("tlbi " #op ", %0\n",			       \
> +			       "tlbi vmalle1is\n",			       \
> +			       ARM64_WORKAROUND_NXP_ERR050104)		       \
> +			    : : "r" (arg));				       \
> +			  asm (ARM64_ASM_PREAMBLE			       \
>  		   ALTERNATIVE("nop\n			nop",		       \
>  			       "dsb ish\n		tlbi " #op ", %0",     \
>  			       ARM64_WORKAROUND_REPEAT_TLBI,		       \

I can see two problems with this:

  1. It doesn't interact properly with the ARM64_WORKAROUND_REPEAT_TLBI
     workaround

  2. You're nuking the entire TLB in the low-level helper, so things like
     flush_tlb_range() are going to over-invalidate excessively

Granted, you may not care about (1) for your SoC, but I would prefer not
to introduce artificial constraints on the workaround. I think both of
the issues above stem from plumbing this in too low in the stack.

Can you instead use a static key to redirect the higher-level control flow
to flush_tlb_mm() or flush_tlb_all() for user and kernel respectively? I'm
assuming the ASID _is_ carried on the interconnect?
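
A rough sketch of the shape being suggested here; the key name and the
hook point are illustrative assumptions, not code from this series:

	#include <linux/jump_label.h>
	#include <linux/mm_types.h>
	#include <asm/tlbflush.h>

	/* Hypothetical key, enabled once at boot when the erratum is found. */
	static DEFINE_STATIC_KEY_FALSE(nxp_err050104_key);

	static inline void example_flush_user_range(struct vm_area_struct *vma,
						    unsigned long start,
						    unsigned long end)
	{
		if (static_branch_unlikely(&nxp_err050104_key)) {
			/* Over-invalidates, but keeps the high-level semantics. */
			flush_tlb_mm(vma->vm_mm);
			return;
		}
		/* ... the regular by-VA invalidation loop ... */
	}

Unaffected systems would then see a patched-out branch (a nop) and keep
the existing by-VA paths untouched in the low-level helpers.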

> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 307faa2b4395..b0647b64dbb8 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -8,6 +8,7 @@
>  #include <linux/arm-smccc.h>
>  #include <linux/types.h>
>  #include <linux/cpu.h>
> +#include <linux/of.h>
>  #include <asm/cpu.h>
>  #include <asm/cputype.h>
>  #include <asm/cpufeature.h>
> @@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
>  	return model == entry->midr_range.model;
>  }
>  
> +static bool __maybe_unused
> +is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
> +{
> +	WARN_ON(preemptible());
> +
> +	return of_machine_is_compatible("fsl,imx8qm");
> +}
> +
>  static bool
>  has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
>  			  int scope)
> @@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
>  		.cpu_enable = cpu_clear_bf16_from_user_emulation,
>  	},
> +#endif
> +#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
> +	{
> +		.desc = "NXP erratum ERR050104",
> +		.capability = ARM64_WORKAROUND_NXP_ERR050104,
> +		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> +		.matches = is_imx8qm_soc,
> +		.cpu_enable = cpu_enable_cache_maint_trap,
> +	},
>  #endif
>  	{
>  	}
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 4a79ba100799..265b6334291b 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
>  		__user_cache_maint("dc civac", address, ret);
>  		break;
>  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> +			asm volatile("ic ialluis");

Hmm, one oddity here is that you can pass a faulting address and not see
the fault. It looks like that's already IMP DEF, so it's probably ok, but
might be worth a comment.

> +			ret = 0;

'ret' is initialised to 0 already?

Finally, how come you don't need to upgrade I-cache invalidation by-VA
in the kernel? It looks like you're only handling operations trapped
from EL0.

Will
Ivan T. Ivanov June 8, 2023, 1:39 p.m. UTC | #3
On 06-02 11:34, Will Deacon wrote:
> 
> Hi Ivan,
> 
> Sorry for the delay in getting to this.
> 
> On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> > According to the NXP errata document[1], the i.MX8QuadMax SoC suffers
> > from a serious cache coherence issue. It was also mentioned in the
> > initial support[2] for the imx8qm mek machine.
> > 
> > I chose to use the ALTERNATIVE() framework, instead of the downstream
> > solution[3], for this issue in the hope of reducing the effect of this
> > fix on unaffected platforms.
> > 
> > Unfortunately, I was unable to find a way to identify the SoC using ID
> > registers. The boot CPU MIDR_EL1 is equal to 0x410fd034 and AIDR_EL1 is
> > equal to 0, so I fall back to using devicetree compatible strings for
> > this. And because we cannot guarantee that VMs on top of KVM will get a
> > correct devicetree, KVM is disabled.
> > 
> > Also make sure that SMMU "Broadcast TLB Maintenance" is not used, even
> > though the SMMU device built into this SoC supports it.
> > 
> > I know this fix is a suboptimal solution for the affected machines, but I
> > haven't been able to come up with a less intrusive fix. And I hope that
> > once the TLBs are invalidated, any immediate attempt to invalidate them
> > again will be close to a NOP operation (flush_tlb_kernel_range()).
> 
> [...]
> 
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 1023e896d46b..a5fe6ffb8150 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -1159,6 +1159,18 @@ config SOCIONEXT_SYNQUACER_PREITS
> >  
> >  	  If unsure, say Y.
> >  
> > +config NXP_IMX8QM_ERRATUM_ERR050104
> > +	bool "NXP iMX8QM ERR050104: broken cache/tlb invalidation"
> > +	default y
> > +	help
> > +	  On iMX8QM, addresses above bit 35 are not broadcast correctly for
> > +	  TLBI or IC operations, making TLBI and IC unreliable.
> > +
> > +	  Work around this erratum by using TLBI *ALL*IS and IC IALLUIS
> > +	  operations. EL0 use of IC IVAU is trapped and upgraded to IC IALLUIS.
> > +
> > +	  If unsure, say Y.
> > +
> >  endmenu # "ARM errata workarounds via the alternatives framework"
> >  
> >  choice
> > diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> > index 412a3b9a3c25..e3bad2298ea5 100644
> > --- a/arch/arm64/include/asm/tlbflush.h
> > +++ b/arch/arm64/include/asm/tlbflush.h
> > @@ -37,7 +37,11 @@
> >  			    : : )
> >  
> >  #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE			       \
> > -			       "tlbi " #op ", %0\n"			       \
> > +		   ALTERNATIVE("tlbi " #op ", %0\n",			       \
> > +			       "tlbi vmalle1is\n",			       \
> > +			       ARM64_WORKAROUND_NXP_ERR050104)		       \
> > +			    : : "r" (arg));				       \
> > +			  asm (ARM64_ASM_PREAMBLE			       \
> >  		   ALTERNATIVE("nop\n			nop",		       \
> >  			       "dsb ish\n		tlbi " #op ", %0",     \
> >  			       ARM64_WORKAROUND_REPEAT_TLBI,		       \
> 
> I can see two problems with this:
> 
>   1. It doesn't interact properly with the ARM64_WORKAROUND_REPEAT_TLBI
>      workaround

From what I see, they cannot happen at the same time on the same SoC,
so this should not be an issue, but I could be missing something.

> 
>   2. You're nuking the entire TLB in the low-level helper, so things like
>      flush_tlb_range() are going to over-invalidate excessively

Right, this is noted in the commit message.

> 
> Granted, you may not care about (1) for your SoC, but I would prefer not
> to introduce artificial constraints on the workaround. I think both of
> the issues above stem from plumbing this in too low in the stack.
> 
> Can you instead use a static key to redirect the higher-level control flow
> to flush_tlb_mm() or flush_tlb_all() for user and kernel respectively?

This was the other option considered, but it looked more intrusive.

> I'm
> assuming the ASID _is_ carried on the interconnect?

Looking at the history of tlbflush in the vendor tree[4] I think the ASID
is not properly wired either[5]. But I am failing to see why this
should matter.

[4] https://github.com/nxp-imx/linux-imx/commits/lf-6.1.y/arch/arm64/include/asm/tlbflush.h
[5] https://github.com/nxp-imx/linux-imx/commit/db6c157fff2a39c378e41e63f78d6fba18578758

> 
> > diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> > index 307faa2b4395..b0647b64dbb8 100644
> > --- a/arch/arm64/kernel/cpu_errata.c
> > +++ b/arch/arm64/kernel/cpu_errata.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/arm-smccc.h>
> >  #include <linux/types.h>
> >  #include <linux/cpu.h>
> > +#include <linux/of.h>
> >  #include <asm/cpu.h>
> >  #include <asm/cputype.h>
> >  #include <asm/cpufeature.h>
> > @@ -55,6 +56,14 @@ is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
> >  	return model == entry->midr_range.model;
> >  }
> >  
> > +static bool __maybe_unused
> > +is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
> > +{
> > +	WARN_ON(preemptible());
> > +
> > +	return of_machine_is_compatible("fsl,imx8qm");
> > +}
> > +
> >  static bool
> >  has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
> >  			  int scope)
> > @@ -729,6 +738,15 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> >  		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
> >  		.cpu_enable = cpu_clear_bf16_from_user_emulation,
> >  	},
> > +#endif
> > +#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
> > +	{
> > +		.desc = "NXP erratum ERR050104",
> > +		.capability = ARM64_WORKAROUND_NXP_ERR050104,
> > +		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
> > +		.matches = is_imx8qm_soc,
> > +		.cpu_enable = cpu_enable_cache_maint_trap,
> > +	},
> >  #endif
> >  	{
> >  	}
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index 4a79ba100799..265b6334291b 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> >  		__user_cache_maint("dc civac", address, ret);
> >  		break;
> >  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > +			asm volatile("ic ialluis");
> 
> Hmm, one oddity here is that you can pass a faulting address and not see
> the fault. It looks like that's already IMP DEF, so it's probably ok, but
> might be worth a comment.

I am not sure what the expected behavior should be, but I could
add a comment, sure.

> 
> > +			ret = 0;
> 
> 'ret' is initialised to 0 already?

Oh, yes.

> 
> Finally, how come you don't need to upgrade I-cache invalidation by-VA
> in the kernel? It looks like you're only handling operations trapped
> from EL0.

Hm, I was thinking that __tlbi() takes care of this, or do you mean
something else, like the locations in assembler.h?

Regards,
Ivan
Mark Rutland June 8, 2023, 2:16 p.m. UTC | #4
On Thu, Jun 08, 2023 at 04:39:29PM +0300, Ivan T. Ivanov wrote:
> On 06-02 11:34, Will Deacon wrote:
> > On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> > > According to the NXP errata document[1], the i.MX8QuadMax SoC suffers
> > > from a serious cache coherence issue. It was also mentioned in the
> > > initial support[2] for the imx8qm mek machine.
> > > 
> > > I chose to use the ALTERNATIVE() framework, instead of the downstream
> > > solution[3], for this issue in the hope of reducing the effect of this
> > > fix on unaffected platforms.

> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index 4a79ba100799..265b6334291b 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> > >  		__user_cache_maint("dc civac", address, ret);
> > >  		break;
> > >  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > > +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > > +			asm volatile("ic ialluis");
> > 
> > Hmm, one oddity here is that you can pass a faulting address and not see
> > the fault. It looks like that's already IMP DEF, so it's probably ok, but
> > might be worth a comment.
> 
> I am not sure what the expected behavior should be, but I could
> add a comment, sure.

Another option is to make this:

	 case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
		__user_cache_maint("ic ivau", address, ret);
		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104) && !ret)
			asm volatile("ic ialluis");
		break;

Which'll ensure that if the regular IC IVAU faults we'll handle that, and if
not we'll do the IC IALLUIS.

I think that looks a bit cleaner, too.

> > Finally, how come you don't need to upgrade I-cache invalidation by-VA
> > in the kernel? It looks like you're only handling operations trapped
> > from EL0.
> 
> Hm, I was thinking that __tlbi() takes care of this, or do you mean
> something else, like the locations in assembler.h?

The __tlbi macro handles only TLBI instructions.

The trap handler above *only* handles IC instructions trapped from userspace;
we have IC IVAU instructions elsewhere in the kernel (e.g.
arch/arm64/mm/cache.S).

Thanks,
Mark.
Ivan T. Ivanov June 8, 2023, 3:05 p.m. UTC | #5
On 06-08 15:16, Mark Rutland wrote:
> 
> On Thu, Jun 08, 2023 at 04:39:29PM +0300, Ivan T. Ivanov wrote:
> > On 06-02 11:34, Will Deacon wrote:
> > > On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> > > > According to the NXP errata document[1], the i.MX8QuadMax SoC suffers
> > > > from a serious cache coherence issue. It was also mentioned in the
> > > > initial support[2] for the imx8qm mek machine.
> > > > 
> > > > I chose to use the ALTERNATIVE() framework, instead of the downstream
> > > > solution[3], for this issue in the hope of reducing the effect of this
> > > > fix on unaffected platforms.
> 
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index 4a79ba100799..265b6334291b 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> > > >  		__user_cache_maint("dc civac", address, ret);
> > > >  		break;
> > > >  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > > > +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > > > +			asm volatile("ic ialluis");
> > > 
> > > Hmm, one oddity here is that you can pass a faulting address and not see
> > > the fault. It looks like that's already IMP DEF, so it's probably ok, but
> > > might be worth a comment.
> > 
> > I am not sure what the expected behavior should be, but I could
> > add a comment, sure.
> 
> Another option is to make this:
> 
> 	 case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
> 		__user_cache_maint("ic ivau", address, ret);
> 		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104) && !ret)
> 			asm volatile("ic ialluis");
> 		break;
> 
> Which'll ensure that if the regular IC IVAU faults we'll handle that, and if
> not we'll do the IC IALLUIS.
> 
> I think that looks a bit cleaner, too.

What I am afraid of is that "ic ivau address" will do a cache invalidation of a
random address, because of the wrong address wiring. So the end result will not
be what is expected.

> 
> > > Finally, how come you don't need to upgrade I-cache invalidation by-VA
> > > in the kernel? It looks like you're only handling operations trapped
> > > from EL0.
> > 
> > Hm, I was thinking that __tlbi() takes care of this, or do you mean
> > something else, like the locations in assembler.h?
> 
> The __tlbi macro handles only TLBI instructions.
> 
> The trap handler above *only* handles IC instructions trapped from userspace;

Yep, I get this.

> we have IC IVAU instructions elsewhere in the kernel (e.g.
> arch/arm64/mm/cache.S).

But I have missed this one :-)

I think that this is working because these are used only
in operations which work up to the Point of Unification,
thus not messing up the caches of the rest of the PEs.

Regards,
Ivan
Mark Rutland June 8, 2023, 3:32 p.m. UTC | #6
On Thu, Jun 08, 2023 at 06:05:54PM +0300, Ivan T. Ivanov wrote:
> On 06-08 15:16, Mark Rutland wrote:
> > On Thu, Jun 08, 2023 at 04:39:29PM +0300, Ivan T. Ivanov wrote:
> > > On 06-02 11:34, Will Deacon wrote:
> > > > On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:

> > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > > index 4a79ba100799..265b6334291b 100644
> > > > > --- a/arch/arm64/kernel/traps.c
> > > > > +++ b/arch/arm64/kernel/traps.c
> > > > > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> > > > >  		__user_cache_maint("dc civac", address, ret);
> > > > >  		break;
> > > > >  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > > > > +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > > > > +			asm volatile("ic ialluis");
> > > > 
> > > > Hmm, one oddity here is that you can pass a faulting address and not see
> > > > the fault. It looks like that's already IMP DEF, so it's probably ok, but
> > > > might be worth a comment.
> > > 
> > > I am not sure what the expected behavior should be, but I could
> > > add a comment, sure.
> > 
> > Another option is to make this:
> > 
> > 	 case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
> > 		__user_cache_maint("ic ivau", address, ret);
> > 		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104) && !ret)
> > 			asm volatile("ic ialluis");
> > 		break;
> > 
> > Which'll ensure that if the regular IC IVAU faults we'll handle that, and if
> > not we'll do the IC IALLUIS.
> > 
> > I think that looks a bit cleaner, too.
> 
> What I am afraid of is that "ic ivau address" will do a cache invalidation of a
> random address, because of the wrong address wiring. So the end result will not
> be what is expected.

I don't follow:

 - The faulting logic is entirely local (and has to happen before the
   broadcast), so the faulting logic shouldn't be affected by the erratum and
   will use the correct address.

 - We're going to do an IC IALLUIS anyway, so we're going to do invalidation of
   *every* address. So randomly invalidating a random address at the same time
   isn't going to cause a functional problem because we're going to invalidate
   it anyway.

> > > > Finally, how come you don't need to upgrade I-cache invalidation by-VA
> > > > in the kernel? It looks like you're only handling operations trapped
> > > > from EL0.
> > > 
> > > Hm, I was thinking that __tlbi() takes care of this, or do you mean
> > > something else, like the locations in assembler.h?
> > 
> > The __tlbi macro handles only TLBI instructions.
> > 
> > The trap handler above *only* handles IC instructions trapped from userspace;
> 
> Yep, I get this.
> 
> > we have IC IVAU instructions elsewhere in the kernel (e.g.
> > arch/arm64/mm/cache.S).
> 
> But I have missed this one :-)

Looking again, those all use the common invalidate_icache_by_line macro in
arch/arm64/include/asm/assembler.h

Luckily that seems to be all:

| [mark@lakrids:~/src/linux]% git grep -iw ivau -- arch/arm64 
| arch/arm64/include/asm/assembler.h:     ic      ivau, \tmp2                     // invalidate I line PoU
| arch/arm64/kernel/traps.c:      case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
| arch/arm64/kernel/traps.c:              __user_cache_maint("ic ivau", address, ret);

> I think that this is working because these are used only in operations which
> work up to the Point of Unification, thus not messing up the caches of the
> rest of the PEs.

I think you have a misunderstanding of the architecture, because that doesn't
make sense. All IC IVAU operations operate to the Point of Unification. That's
what the 'U' in 'IVAU' stands for. The Point of Unification is the point in the
memory system where instruction fetches and data fetches see the same thing.
 
If the VA passed to IC IVAU isn't correctly broadcast, it's broken regardless
of where that IC IVAU is executed from, because it will leave stale
instructions in the I-caches of other PEs.

Thanks,
Mark.
Ivan T. Ivanov June 8, 2023, 6:22 p.m. UTC | #7
On 06-08 16:32, Mark Rutland wrote:
> On Thu, Jun 08, 2023 at 06:05:54PM +0300, Ivan T. Ivanov wrote:
> > On 06-08 15:16, Mark Rutland wrote:
> > > On Thu, Jun 08, 2023 at 04:39:29PM +0300, Ivan T. Ivanov wrote:
> > > > On 06-02 11:34, Will Deacon wrote:
> > > > > On Thu, Apr 20, 2023 at 02:29:52PM +0300, Ivan T. Ivanov wrote:
> 
> > > > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > > > index 4a79ba100799..265b6334291b 100644
> > > > > > --- a/arch/arm64/kernel/traps.c
> > > > > > +++ b/arch/arm64/kernel/traps.c
> > > > > > @@ -556,6 +556,11 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
> > > > > >  		__user_cache_maint("dc civac", address, ret);
> > > > > >  		break;
> > > > > >  	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
> > > > > > +		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
> > > > > > +			asm volatile("ic ialluis");
> > > > > 
> > > > > Hmm, one oddity here is that you can pass a faulting address and not see
> > > > > the fault. It looks like that's already IMP DEF, so it's probably ok, but
> > > > > might be worth a comment.
> > > > 
> > > > I am not sure what the expected behavior should be, but I could
> > > > add a comment, sure.
> > > 
> > > Another option is to make this:
> > > 
> > > 	 case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
> > > 		__user_cache_maint("ic ivau", address, ret);
> > > 		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104) && !ret)
> > > 			asm volatile("ic ialluis");
> > > 		break;
> > > 
> > > Which'll ensure that if the regular IC IVAU faults we'll handle that, and if
> > > not we'll do the IC IALLUIS.
> > > 
> > > I think that looks a bit cleaner, too.
> > 
> > What I am afraid of is that "ic ivau address" will do a cache invalidation of a
> > random address, because of the wrong address wiring. So the end result will not
> > be what is expected.
> 
> I don't follow:
> 
>  - The faulting logic is entirely local (and has to happen before the
>    broadcast), so the faulting logic shouldn't be affected by the erratum and
>    will use the correct address.

Perhaps it is all me really not understanding PoU, so please
bear with me :-). This SoC has two CPU clusters.

4x Cortex A53
• 32KB-I and 32KB-D cache
• 1MB Shared L2 Cache

2x Cortex A72
• 48KB-I and 32KB-D cache
• 1MB Shared L2 Cache

If it is all local, then how far will this broadcast go? Up to the
L2 cache? But then the errata says that ".. maintenance operations
exchanged between the A53 and A72 core clusters.. " are affected.
So in the case of IC IVAU is there no interaction between clusters?

> 
>  - We're going to do an IC IALLUIS anyway, so we're going to do invalidation of
>    *every* address. So randomly invalidating a random address at the same time
>    isn't going to cause a functional problem because we're going to invalidate
>    it anyway.

Yes, but my point was that this random address might generate a translation
error where there shouldn't be one.

> 
> > > > > Finally, how come you don't need to upgrade I-cache invalidation by-VA
> > > > > in the kernel? It looks like you're only handling operations trapped
> > > > > from EL0.
> > > > 
> > > > Hm, I was thinking that __tlbi() takes care of this, or do you mean
> > > > something else, like the locations in assembler.h?
> > > 
> > > The __tlbi macro handles only TLBI instructions.
> > > 
> > > The trap handler above *only* handles IC instructions trapped from userspace;
> > 
> > Yep, I get this.
> > 
> > > we have IC IVAU instructions elsewhere in the kernel (e.g.
> > > arch/arm64/mm/cache.S).
> > 
> > But I have missed this one :-)
> 
> Looking again, those all use the common invalidate_icache_by_line macro in
> arch/arm64/include/asm/assembler.h
> 
> Lukily that seems to be all:
> 
> | [mark@lakrids:~/src/linux]% git grep -iw ivau -- arch/arm64 
> | arch/arm64/include/asm/assembler.h:     ic      ivau, \tmp2                     // invalidate I line PoU
> | arch/arm64/kernel/traps.c:      case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:     /* IC IVAU */
> | arch/arm64/kernel/traps.c:              __user_cache_maint("ic ivau", address, ret);
> 
> > I think that this is working because these are used only in operations which
> > work up to the Point of Unification, thus not messing up the caches of the
> > rest of the PEs.
> 
> I think you have a misunderstanding of the architecture, because that doesn't
> make sense. All IC IVAU operations operate to the Point of Unification. That's
> what the 'U' in 'IVAU' stands for. The Point of Unification is the point in the
> memory system where instruction fetches and data fetches see the same thing.
>  
> If the VA passed to IC IVAU isn't correctly broadcast, it's broken regardless
> of where that IC IVAU is executed from, because it will leave stale
> instructions in the I-caches of other PEs.

I am totally confused :-) If this is true, and I am not saying it is not,
then we can't expect that IC IVAU will generate a correct translation
fault, no?

Regards,
Ivan
Will Deacon June 26, 2023, 12:15 p.m. UTC | #8
On Mon, Jun 12, 2023 at 01:22:37PM +0300, Ivan T. Ivanov wrote:
> On 06-12 10:33, Mark Rutland wrote:
> > I'm saying that the *faulting logic* is local, not the broadcast.
> > 
> > IIUC the erratum is due to the wiring between clusters losing some bits. The
> > *recipient* of a broadcast DVM message (which is what TLBI or IC instructions
> > will generate) will receive a bogus address (and other context) to use for the
> > invalidate.
> > 
> > The CPU which executes the `IC IVAU <Xt>` instruction will check the address in
> > `<Xt>` using its local MMU to determine whether the access should fault
> > *before* sending the broadcast DVM message, and the recipients will not perform
> > any MMU check (since e.g. they could be running a different process with a
> > different VA space anyway).
> > 
> > The MMU check is local to the CPU, and doesn't depend on any broadcast; that
> > should be unaffected by the erratum. If that is affected then the erratum
> > description is wrong and we have a bigger set of problems.
> > 
> 
> Thanks, I think I get it. Now, what will be the preferred way to fix the IC
> and TLBI instructions for this erratum? Using a static_key for TLBI and
> alternatives for the IC instruction, or something else?

As I mentioned in a previous comment, I think this should use a static_key
to drive the high-level behaviour instead of patching the low-level code
with alternatives.

> Keeping the trapping of the userspace IC instruction seems ok to me. Perhaps
> an alternative for IC in the invalidate_icache_by_line macro? As for
> TLBI, it will be really invasive if a static_key is used, I think.
> Yes, there is over-invalidation, but the overall performance of the
> SoC is still almost doubled because of the 2 enabled CA72 cores :-)
> 
> Another point is, should I keep the hunk in arm-smmu.c, because
> cpus_have_final_cap() is not generally available to drivers?

What do you mean here? We obviously can't break the build, but I think
the easiest thing to do is add a clause to `arm_smmu_sva_supported()`
to return false if this erratum is present.
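
A rough sketch of how such a clause could be guarded so that non-arm64
builds of the driver keep compiling; the helper name is made up for
illustration:

	#ifdef CONFIG_ARM64
	#include <asm/cpufeature.h>
	#endif

	/* True if SVA must be refused because broadcast maintenance is broken. */
	static bool err050104_blocks_sva(void)
	{
	#ifdef CONFIG_ARM64
		return cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104);
	#else
		return false;
	#endif
	}

arm_smmu_sva_supported() could then simply return false when this helper
returns true.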

Will
Ivan T. Ivanov July 13, 2023, 2:29 p.m. UTC | #9
On 06-26 13:15, Will Deacon wrote:
> > 
> > Another point is, should I keep the hunk in arm-smmu.c, because
> > cpus_have_final_cap() is not generally available to drivers?
> 
> What do you mean here? We obviously can't break the build,

The build breaks on x86_64, for example. I will do what Robin suggests
in the other email.

Regards,
Ivan

> but I think
> the easiest thing to do is add a clause to `arm_smmu_sva_supported()`
> to return false if this erratum is present.
>

Patch

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index ec5f889d7681..fce231797184 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -175,6 +175,8 @@  stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | Freescale/NXP  | LS2080A/LS1043A | A-008585        | FSL_ERRATUM_A008585         |
 +----------------+-----------------+-----------------+-----------------------------+
+| Freescale/NXP  | i.MX 8QuadMax   | ERR050104       | NXP_IMX8QM_ERRATUM_ERR050104|
++----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
 | Hisilicon      | Hip0{5,6,7}     | #161010101      | HISILICON_ERRATUM_161010101 |
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1023e896d46b..a5fe6ffb8150 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1159,6 +1159,18 @@  config SOCIONEXT_SYNQUACER_PREITS
 
 	  If unsure, say Y.
 
+config NXP_IMX8QM_ERRATUM_ERR050104
+	bool "NXP iMX8QM ERR050104: broken cache/tlb invalidation"
+	default y
+	help
+	  On iMX8QM, addresses above bit 35 are not broadcast correctly for
+	  TLBI or IC operations, making TLBI and IC unreliable.
+
+	  Work around this erratum by using TLBI *ALL*IS and IC IALLUIS
+	  operations. EL0 use of IC IVAU is trapped and upgraded to IC IALLUIS.
+
+	  If unsure, say Y.
+
 endmenu # "ARM errata workarounds via the alternatives framework"
 
 choice
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..e3bad2298ea5 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -37,7 +37,11 @@ 
 			    : : )
 
 #define __TLBI_1(op, arg) asm (ARM64_ASM_PREAMBLE			       \
-			       "tlbi " #op ", %0\n"			       \
+		   ALTERNATIVE("tlbi " #op ", %0\n",			       \
+			       "tlbi vmalle1is\n",			       \
+			       ARM64_WORKAROUND_NXP_ERR050104)		       \
+			    : : "r" (arg));				       \
+			  asm (ARM64_ASM_PREAMBLE			       \
 		   ALTERNATIVE("nop\n			nop",		       \
 			       "dsb ish\n		tlbi " #op ", %0",     \
 			       ARM64_WORKAROUND_REPEAT_TLBI,		       \
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 307faa2b4395..b0647b64dbb8 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -8,6 +8,7 @@ 
 #include <linux/arm-smccc.h>
 #include <linux/types.h>
 #include <linux/cpu.h>
+#include <linux/of.h>
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/cpufeature.h>
@@ -55,6 +56,14 @@  is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
 	return model == entry->midr_range.model;
 }
 
+static bool __maybe_unused
+is_imx8qm_soc(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	WARN_ON(preemptible());
+
+	return of_machine_is_compatible("fsl,imx8qm");
+}
+
 static bool
 has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 			  int scope)
@@ -729,6 +738,15 @@  const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_FIXED(MIDR_CPU_VAR_REV(1,1), BIT(25)),
 		.cpu_enable = cpu_clear_bf16_from_user_emulation,
 	},
+#endif
+#ifdef CONFIG_NXP_IMX8QM_ERRATUM_ERR050104
+	{
+		.desc = "NXP erratum ERR050104",
+		.capability = ARM64_WORKAROUND_NXP_ERR050104,
+		.type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+		.matches = is_imx8qm_soc,
+		.cpu_enable = cpu_enable_cache_maint_trap,
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 4a79ba100799..265b6334291b 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -556,6 +556,11 @@  static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
 		__user_cache_maint("dc civac", address, ret);
 		break;
 	case ESR_ELx_SYS64_ISS_CRM_IC_IVAU:	/* IC IVAU */
+		if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
+			asm volatile("ic ialluis");
+			ret = 0;
+			break;
+		}
 		__user_cache_maint("ic ivau", address, ret);
 		break;
 	default:
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 4b2e16e696a8..5066332302d2 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -2239,6 +2239,11 @@  static __init int kvm_arm_init(void)
 		return -ENODEV;
 	}
 
+	if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104)) {
+		kvm_info("KVM is not supported on this system\n");
+		return -ENODEV;
+	}
+
 	err = kvm_sys_reg_table_init();
 	if (err) {
 		kvm_info("Error initializing system register tables");
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 37b1340e9646..c1de9235f922 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -87,6 +87,7 @@  WORKAROUND_CAVIUM_TX2_219_TVM
 WORKAROUND_CLEAN_CACHE
 WORKAROUND_DEVICE_LOAD_ACQUIRE
 WORKAROUND_NVIDIA_CARMEL_CNP
+WORKAROUND_NXP_ERR050104
 WORKAROUND_QCOM_FALKOR_E1003
 WORKAROUND_REPEAT_TLBI
 WORKAROUND_SPECULATIVE_AT
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 2ff7a72cf377..29bbb16bae6e 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1680,6 +1680,10 @@  static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
 	/* ID0 */
 	id = arm_smmu_gr0_read(smmu, ARM_SMMU_GR0_ID0);
 
+	/* Do not use Broadcast TLB Maintenance on affected platforms */
+	if (cpus_have_final_cap(ARM64_WORKAROUND_NXP_ERR050104))
+		id &= ~ARM_SMMU_ID0_BTM;
+
 	/* Restrict available stages based on module parameter */
 	if (force_stage == 1)
 		id &= ~(ARM_SMMU_ID0_S2TS | ARM_SMMU_ID0_NTS);
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index 703fd5817ec1..1589acfdbed9 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -52,6 +52,7 @@ 
 #define ARM_SMMU_ID0_PTFS_NO_AARCH32S	BIT(24)
 #define ARM_SMMU_ID0_NUMIRPT		GENMASK(23, 16)
 #define ARM_SMMU_ID0_CTTW		BIT(14)
+#define ARM_SMMU_ID0_BTM		BIT(13)
 #define ARM_SMMU_ID0_NUMSIDB		GENMASK(12, 9)
 #define ARM_SMMU_ID0_EXIDS		BIT(8)
 #define ARM_SMMU_ID0_NUMSMRG		GENMASK(7, 0)