arm64: Restore forced disabling of KPTI on ThunderX

Message ID 20210922135924.3109291-1-dann.frazier@canonical.com (mailing list archive)
State New, archived

Commit Message

dann frazier Sept. 22, 2021, 1:59 p.m. UTC
A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
is that cpucaps are now sorted, changing the enumeration order. This
assumed no dependencies between cpucaps, which turned out not to be true
in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
when unmap_kernel_at_el0() checks for it, so the kernel tries to
run w/ KPTI - and quickly falls over.

Because all ThunderX implementations have homogeneous CPUs, we can remove
this dependency by just checking the current CPU for the erratum.

Fixes: 0c6c2d3615ef ("arm64: Generate cpucaps.h")
Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable@vger.kernel.org # 5.13+
Signed-off-by: dann frazier <dann.frazier@canonical.com>
---
 arch/arm64/kernel/cpufeature.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
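
The ordering problem described in the commit message can be sketched in plain userspace C. This is only an illustration under loud assumptions, not the kernel's code: the enum, `system_caps`, `cpu_is_thunderx`, and `kpti_enabled()` are hypothetical stand-ins, and the sketch assumes KPTI is wanted whenever the erratum is not seen. A system-wide lookup only sees capabilities that were processed earlier in the enumeration, while a per-CPU check does not depend on that order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability IDs, named after the two cpucaps in the patch. */
enum cap { CAP_UNMAP_KERNEL_AT_EL0, CAP_WORKAROUND_CAVIUM_27456, NCAPS };

/* System-wide state, filled in as each capability is processed in turn. */
static bool system_caps[NCAPS];

/* Assumption: stand-in for a per-CPU erratum match; needs no global state. */
static bool cpu_is_thunderx = true;

static bool this_cpu_has_erratum(void)
{
	return cpu_is_thunderx;
}

static bool system_has_erratum(void)
{
	return system_caps[CAP_WORKAROUND_CAVIUM_27456];
}

/*
 * Process capabilities in the given order and report whether KPTI ends up
 * enabled. use_this_cpu_check selects the per-CPU check (the approach the
 * patch takes) instead of the system-wide lookup.
 */
static bool kpti_enabled(const enum cap *order, size_t n, bool use_this_cpu_check)
{
	bool kpti = false;

	assert(order != NULL);
	for (size_t i = 0; i < NCAPS; i++)
		system_caps[i] = false;

	for (size_t i = 0; i < n; i++) {
		switch (order[i]) {
		case CAP_WORKAROUND_CAVIUM_27456:
			system_caps[CAP_WORKAROUND_CAVIUM_27456] = this_cpu_has_erratum();
			break;
		case CAP_UNMAP_KERNEL_AT_EL0: {
			bool erratum = use_this_cpu_check ? this_cpu_has_erratum()
							  : system_has_erratum();
			/* Simplification: KPTI is enabled unless the erratum is seen. */
			kpti = !erratum;
			system_caps[CAP_UNMAP_KERNEL_AT_EL0] = kpti;
			break;
		}
		default:
			break;
		}
	}
	return kpti;
}
```

With the sorted order (UNMAP_KERNEL_AT_EL0 first), the system-wide lookup in this sketch misses the erratum and leaves KPTI on, whereas the per-CPU check turns it off regardless of order.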

Comments

Mark Brown Sept. 22, 2021, 2:11 p.m. UTC | #1
On Wed, Sep 22, 2021 at 07:59:24AM -0600, dann frazier wrote:
> A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> is that cpucaps are now sorted, changing the enumeration order. This
> assumed no dependencies between cpucaps, which turned out not to be true

Reviewed-by: Mark Brown <broonie@kernel.org>
Marc Zyngier Sept. 22, 2021, 6:59 p.m. UTC | #2
On Wed, 22 Sep 2021 14:59:24 +0100,
dann frazier <dann.frazier@canonical.com> wrote:
> 
> A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> is that cpucaps are now sorted, changing the enumeration order. This
> assumed no dependencies between cpucaps, which turned out not to be true
> in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
> WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
> unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
> of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
> when unmap_kernel_at_el0() checks for it, so the kernel tries to
> run w/ KPTI - and quickly falls over.
> 
> Because all ThunderX implementations have homogeneous CPUs, we can remove
> this dependency by just checking the current CPU for the erratum.
> 
> Fixes: 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: stable@vger.kernel.org # 5.13+
> Signed-off-by: dann frazier <dann.frazier@canonical.com>
> ---
>  arch/arm64/kernel/cpufeature.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index f8a3067d10c6..7275b49034f3 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1528,7 +1528,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>  	 * ThunderX leads to apparent I-cache corruption of kernel text, which
>  	 * ends as well as you might imagine. Don't even try.
>  	 */
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
> +	if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
>  		str = "ARM64_WORKAROUND_CAVIUM_27456";
>  		__kpti_forced = -1;
>  	}

Ouch, nice catch. Hopefully, nobody will build a big-little system
using TX1 in this instance of the universe.

Acked-by: Marc Zyngier <maz@kernel.org>

	M.
Suzuki K Poulose Sept. 23, 2021, 9:41 a.m. UTC | #3
On 22/09/2021 14:59, dann frazier wrote:
> A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> is that cpucaps are now sorted, changing the enumeration order. This
> assumed no dependencies between cpucaps, which turned out not to be true
> in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
> WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
> unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
> of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
> when unmap_kernel_at_el0() checks for it, so the kernel tries to
> run w/ KPTI - and quickly falls over.
> 
> Because all ThunderX implementations have homogeneous CPUs, we can remove
> this dependency by just checking the current CPU for the erratum.
> 
> Fixes: 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: stable@vger.kernel.org # 5.13+
> Signed-off-by: dann frazier <dann.frazier@canonical.com>
> ---
>   arch/arm64/kernel/cpufeature.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index f8a3067d10c6..7275b49034f3 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1528,7 +1528,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
>   	 * ThunderX leads to apparent I-cache corruption of kernel text, which
>   	 * ends as well as you might imagine. Don't even try.
>   	 */
> -	if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
> +	if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_27456)) {

Could you please also update the comment right above this line to
explain why we do this and why it is fine (just as you have in the
description)? Something like:

	 * Since we cannot rely on the order of the cpucaps,
	 * we cannot rely on the cpus_have_*cap() helpers to
	 * detect the erratum on the system. However, since
	 * affected CPUs are always in a homogeneous configuration,
	 * we can rely on this_cpu_has_cap().
	 */

That way, anyone reading the code can follow what we figured out
on the mailing list (and in the description).

With that:

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Suzuki
dann frazier Sept. 23, 2021, 2:45 p.m. UTC | #4
On Thu, Sep 23, 2021 at 10:41:00AM +0100, Suzuki K Poulose wrote:
> On 22/09/2021 14:59, dann frazier wrote:
> > A noted side-effect of commit 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> > is that cpucaps are now sorted, changing the enumeration order. This
> > assumed no dependencies between cpucaps, which turned out not to be true
> > in one case. UNMAP_KERNEL_AT_EL0 currently needs to be processed after
> > WORKAROUND_CAVIUM_27456. ThunderX systems are incompatible with KPTI, so
> > unmap_kernel_at_el0() bails if WORKAROUND_CAVIUM_27456 is set. But because
> > of the sorting, WORKAROUND_CAVIUM_27456 will not yet have been considered
> > when unmap_kernel_at_el0() checks for it, so the kernel tries to
> > run w/ KPTI - and quickly falls over.
> > 
> > Because all ThunderX implementations have homogeneous CPUs, we can remove
> > this dependency by just checking the current CPU for the erratum.
> > 
> > Fixes: 0c6c2d3615ef ("arm64: Generate cpucaps.h")
> > Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Cc: stable@vger.kernel.org # 5.13+
> > Signed-off-by: dann frazier <dann.frazier@canonical.com>
> > ---
> >   arch/arm64/kernel/cpufeature.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> > index f8a3067d10c6..7275b49034f3 100644
> > --- a/arch/arm64/kernel/cpufeature.c
> > +++ b/arch/arm64/kernel/cpufeature.c
> > @@ -1528,7 +1528,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
> >   	 * ThunderX leads to apparent I-cache corruption of kernel text, which
> >   	 * ends as well as you might imagine. Don't even try.
> >   	 */
> > -	if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
> > +	if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
> 
> Could you please also update the comment right above this line to
> explain why we do this and why it is fine (just as you have in the
> description)? Something like:
> 
> 	 * Since we cannot rely on the order of the cpucaps,
> 	 * we cannot rely on the cpus_have_*cap() helpers to
> 	 * detect the erratum on the system. However, since
> 	 * affected CPUs are always in a homogeneous configuration,
> 	 * we can rely on this_cpu_has_cap().
> 	 */
> 
> That way, anyone reading the code can follow what we figured out
> on the mailing list (and in the description).

Sure thing, v2 coming shortly.

  -dann

> With that:
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> 
> Suzuki

Patch

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f8a3067d10c6..7275b49034f3 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1528,7 +1528,7 @@  static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
 	 * ThunderX leads to apparent I-cache corruption of kernel text, which
 	 * ends as well as you might imagine. Don't even try.
 	 */
-	if (cpus_have_const_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
+	if (this_cpu_has_cap(ARM64_WORKAROUND_CAVIUM_27456)) {
 		str = "ARM64_WORKAROUND_CAVIUM_27456";
 		__kpti_forced = -1;
 	}