
arm64: errata: Fix handling of 1418040 with late CPU onlining

Message ID 20201106114952.10032-1-will@kernel.org (mailing list archive)
State New, archived
Series arm64: errata: Fix handling of 1418040 with late CPU onlining

Commit Message

Will Deacon Nov. 6, 2020, 11:49 a.m. UTC
In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.

In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().

Cc: Marc Zyngier <maz@kernel.org>
Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---

Found by code inspection and compile-tested only, so I would really
appreciate a second look.

 arch/arm64/include/asm/cpufeature.h | 2 ++
 arch/arm64/kernel/process.c         | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)
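
For reference, erratum_1418040_thread_switch() ends up reading as follows
with the patch applied (reconstructed from the hunks below; the tail of the
function, which updates CNTKCTL_EL1, is untouched by the patch and elided
here):

static void erratum_1418040_thread_switch(struct task_struct *prev,
					  struct task_struct *next)
{
	bool prev32, next32;
	u64 val;

	/* Compile-time gate only; the capability check moves below */
	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
		return;

	prev32 = is_compat_thread(task_thread_info(prev));
	next32 = is_compat_thread(task_thread_info(next));

	/*
	 * Only act on a 32<->64-bit transition, and only if *this* CPU
	 * is affected: a system-wide capability check would miss CPUs
	 * onlined after the boot caps were finalised.
	 */
	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
		return;

	val = read_sysreg(cntkctl_el1);
	/* ... EL0 virtual counter access update elided ... */
}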

Comments

Marc Zyngier Nov. 6, 2020, 12:15 p.m. UTC | #1
On 2020-11-06 11:49, Will Deacon wrote:
> In a surprising turn of events, it transpires that CPU capabilities
> configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> result of late-onlining. Therefore our handling of erratum 1418040 does
> not get activated if it is not required by any of the boot CPUs, even
> though we allow late-onlining of an affected CPU.
> 
> In order to get things working again, replace the cpus_have_const_cap()
> invocation with an explicit check for the current CPU using
> this_cpu_has_cap().
> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
> 
> Found by code inspection and compile-tested only, so I would really
> appreciate a second look.
> 
>  arch/arm64/include/asm/cpufeature.h | 2 ++
>  arch/arm64/kernel/process.c         | 5 ++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f7e7144af174..c59c16a6ea8b 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>  /*
>   * CPU feature detected at boot time based on feature of one or more CPUs.
>   * All possible conflicts for a late CPU are ignored.
> + * NOTE: this means that a late CPU with the feature will *not* cause the
> + * capability to be advertised by cpus_have_*cap()!
>   */
>  #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
>  	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..a47a40ec6ad9 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>  	bool prev32, next32;
>  	u64 val;
> 
> -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
>  		return;
> 
>  	prev32 = is_compat_thread(task_thread_info(prev));
>  	next32 = is_compat_thread(task_thread_info(next));
> 
> -	if (prev32 == next32)
> +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
>  		return;

It is not going to be cheap to switch between 32-bit and 64-bit tasks.
Hopefully it doesn't happen too often... :-(.

Acked-by: Marc Zyngier <maz@kernel.org>

         M.
Suzuki K Poulose Nov. 6, 2020, 12:18 p.m. UTC | #2
Hi Will,

On 11/6/20 11:49 AM, Will Deacon wrote:
> In a surprising turn of events, it transpires that CPU capabilities
> configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> result of late-onlining. Therefore our handling of erratum 1418040 does
> not get activated if it is not required by any of the boot CPUs, even
> though we allow late-onlining of an affected CPU.

The capability state is not altered after SMP boot for any of the cap
types. The weak caps are there to allow a late CPU to come online
without getting "banned". This may be something we could relax with
a new flag in the scope.
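
For reference, the weak type is just a combination of the scope and
late-CPU flags; quoting the mainline definition here for illustration:

#define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
	 ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	|	\
	 ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU)

i.e. an affected late CPU is permitted to come online, but nothing ever
calls cpus_set_cap() for it once the system caps are finalised.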

> 
> In order to get things working again, replace the cpus_have_const_cap()
> invocation with an explicit check for the current CPU using
> this_cpu_has_cap().
> 
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
> 
> Found by code inspection and compile-tested only, so I would really
> appreciate a second look.
> 
>   arch/arm64/include/asm/cpufeature.h | 2 ++
>   arch/arm64/kernel/process.c         | 5 ++---
>   2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f7e7144af174..c59c16a6ea8b 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>   /*
>    * CPU feature detected at boot time based on feature of one or more CPUs.
>    * All possible conflicts for a late CPU are ignored.
> + * NOTE: this means that a late CPU with the feature will *not* cause the
> + * capability to be advertised by cpus_have_*cap()!

This comment applies to all the types, so it may be confusing. And the
comment already mentions that the feature is detected at boot time.

>    */
>   #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
>   	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..a47a40ec6ad9 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>   	bool prev32, next32;
>   	u64 val;
>   
> -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
>   		return;
>   
>   	prev32 = is_compat_thread(task_thread_info(prev));
>   	next32 = is_compat_thread(task_thread_info(next));
>   
> -	if (prev32 == next32)
> +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
>   		return;
>   
>   	val = read_sysreg(cntkctl_el1);
> 


This change as such looks good to me.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Will Deacon Nov. 6, 2020, 12:28 p.m. UTC | #3
On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> On 11/6/20 11:49 AM, Will Deacon wrote:
> > In a surprising turn of events, it transpires that CPU capabilities
> > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > result of late-onlining. Therefore our handling of erratum 1418040 does
> > not get activated if it is not required by any of the boot CPUs, even
> > though we allow late-onlining of an affected CPU.
> 
> The capability state is not altered after SMP boot for any of the cap
> types. The weak caps are there to allow a late CPU to come online
> without getting "banned". This may be something we could relax with
> a new flag in the scope.

I did look briefly into that, but it's really difficult to enable the
static key late: that operation can block, so you end up having to defer
it past the point where completion is signalled back to the CPU doing
cpu_up().

So I figured I'd focus on fixing the current problem for now.
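
To make the blocking constraint concrete (a sketch, not code from this
thread): the incoming CPU verifies its capabilities during early
bring-up, before it signals completion to the CPU running cpu_up(), so
something like the following cannot run at that point:

/*
 * Sketch only: static_branch_enable() serialises on a mutex and
 * patches kernel text, so it may sleep. It would have to be deferred
 * until after the onlining handshake has completed.
 */
static_branch_enable(&cpu_hwcap_keys[caps->capability]);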

> > In order to get things working again, replace the cpus_have_const_cap()
> > invocation with an explicit check for the current CPU using
> > this_cpu_has_cap().
> > 
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> > Cc: Stephen Boyd <swboyd@chromium.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> > 
> > Found by code inspection and compile-tested only, so I would really
> > appreciate a second look.
> > 
> >   arch/arm64/include/asm/cpufeature.h | 2 ++
> >   arch/arm64/kernel/process.c         | 5 ++---
> >   2 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index f7e7144af174..c59c16a6ea8b 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
> >   /*
> >    * CPU feature detected at boot time based on feature of one or more CPUs.
> >    * All possible conflicts for a late CPU are ignored.
> > + * NOTE: this means that a late CPU with the feature will *not* cause the
> > + * capability to be advertised by cpus_have_*cap()!
> 
> This comment applies to all the types, so it may be confusing. And the
> comment already mentions that the feature is detected at boot time.

The other types don't allow a late CPU to come online with the feature
though, so I don't think it's quite the same. Given that we made this
mistake for this erratum workaround and previously for SSBS, I wanted to add
something to the comment to try to avoid others falling into this trap.

Ideally, we'd warn when calling cpus_have_*cap() for one of these things,
but that adds runtime cost that we'd probably have to gate behind a DEBUG
option.
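
Such a warning might look something like the sketch below; both the
config option and the helper are hypothetical, made up here for
illustration:

static __always_inline bool cpus_have_const_cap(int num)
{
#ifdef CONFIG_ARM64_DEBUG_CPUCAPS	/* hypothetical option */
	/*
	 * Weak caps can be set by a late CPU, so a "const" answer may
	 * be stale; arm64_cpucap_settable_late() is a made-up helper
	 * standing in for a lookup of the cap's type flags.
	 */
	WARN_ON_ONCE(arm64_cpucap_settable_late(num));
#endif
	if (system_capabilities_finalized())
		return __cpus_have_const_cap(num);
	else
		return cpus_have_cap(num);
}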

> >    */
> >   #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
> >   	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 4784011cecac..a47a40ec6ad9 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
> >   	bool prev32, next32;
> >   	u64 val;
> > -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> > -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> > +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
> >   		return;
> >   	prev32 = is_compat_thread(task_thread_info(prev));
> >   	next32 = is_compat_thread(task_thread_info(next));
> > -	if (prev32 == next32)
> > +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
> >   		return;
> >   	val = read_sysreg(cntkctl_el1);
> > 
> 
> 
> This change as such looks good to me.
> 
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Thanks!

Will
Catalin Marinas Nov. 6, 2020, 12:44 p.m. UTC | #4
On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> On 11/6/20 11:49 AM, Will Deacon wrote:
> > In a surprising turn of events, it transpires that CPU capabilities
> > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > result of late-onlining. Therefore our handling of erratum 1418040 does
> > not get activated if it is not required by any of the boot CPUs, even
> > though we allow late-onlining of an affected CPU.
> 
> The capability state is not altered after SMP boot for any of the cap
> types. The weak caps are there to allow a late CPU to come online
> without getting "banned". This may be something we could relax with
> a new flag in the scope.

Like this? Of course, it needs some testing.

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 97244d4feca9..b896e72131d7 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -246,6 +246,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
 #define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	((u16)BIT(5))
 /* Panic when a conflict is detected */
 #define ARM64_CPUCAP_PANIC_ON_CONFLICT		((u16)BIT(6))
+/* Together with PERMITTED_FOR_LATE_CPU, set the corresponding cpu_hwcaps bit */
+#define ARM64_CPUCAP_SET_FOR_LATE_CPU		((u16)BIT(7))
 
 /*
  * CPU errata workarounds that need to be enabled at boot time if one or
@@ -481,6 +483,16 @@ static __always_inline bool cpus_have_const_cap(int num)
 		return cpus_have_cap(num);
 }
 
+/*
+ * Test for a capability with a runtime check. This is an alias for
+ * cpus_have_cap() but with the name chosen to emphasize the applicability to
+ * late capability setting.
+ */
+static __always_inline bool cpus_have_late_cap(int num)
+{
+	return cpus_have_cap(num);
+}
+
 static inline void cpus_set_cap(unsigned int num)
 {
 	if (num >= ARM64_NCAPS) {
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 61314fd70f13..6b7de7292e8c 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -481,7 +481,8 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		 * also need the non-affected CPUs to be able to come
 		 * in at any point in time. Wonderful.
 		 */
-		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE |
+			ARM64_CPUCAP_SET_FOR_LATE_CPU,
 	},
 #endif
 #ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index dcc165b3fc04..51e63be41ea5 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1720,6 +1720,12 @@ cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap)
 	return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU);
 }
 
+static bool
+cpucap_set_for_late_cpu(const struct arm64_cpu_capabilities *cap)
+{
+	return !!(cap->type & ARM64_CPUCAP_SET_FOR_LATE_CPU);
+}
+
 static bool
 cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap)
 {
@@ -2489,6 +2495,11 @@ static void verify_local_cpu_caps(u16 scope_mask)
 			 */
 			if (cpu_has_cap && !cpucap_late_cpu_permitted(caps))
 				break;
+			/*
+			 * Set the capability bit if it allows late setting.
+			 */
+			if (cpucap_set_for_late_cpu(caps))
+				cpus_set_cap(caps->capability);
 		}
 	}
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4784011cecac..152639962845 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -523,7 +523,7 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
 	u64 val;
 
 	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
-	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
+	      cpus_have_late_cap(ARM64_WORKAROUND_1418040)))
 		return;
 
 	prev32 = is_compat_thread(task_thread_info(prev));
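
Note the trade-off in this sketch: cpus_have_late_cap() bypasses the
static key entirely, so a bit set late via cpus_set_cap() is observed
immediately, at the cost of a bitmap test on every call.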
Suzuki K Poulose Nov. 10, 2020, 10:39 a.m. UTC | #5
Catalin

On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
> On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> > On 11/6/20 11:49 AM, Will Deacon wrote:
> > > In a surprising turn of events, it transpires that CPU capabilities
> > > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > > result of late-onlining. Therefore our handling of erratum 1418040 does
> > > not get activated if it is not required by any of the boot CPUs, even
> > > though we allow late-onlining of an affected CPU.
> > 
> > The capability state is not altered after SMP boot for any of the cap
> > types. The weak caps are there to allow a late CPU to come online
> > without getting "banned". This may be something we could relax with
> > a new flag in the scope.
> 
> Like this? Of course, it needs some testing.

Yes, this is exactly what I had in mind. However, there is some concern
over how we expose it.

> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 97244d4feca9..b896e72131d7 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -246,6 +246,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>  #define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	((u16)BIT(5))
>  /* Panic when a conflict is detected */
>  #define ARM64_CPUCAP_PANIC_ON_CONFLICT		((u16)BIT(6))
> +/* Together with PERMITTED_FOR_LATE_CPU, set the corresponding cpu_hwcaps bit */
> +#define ARM64_CPUCAP_SET_FOR_LATE_CPU		((u16)BIT(7))
>  
>  /*
>   * CPU errata workarounds that need to be enabled at boot time if one or
> @@ -481,6 +483,16 @@ static __always_inline bool cpus_have_const_cap(int num)
>  		return cpus_have_cap(num);
>  }
>  
> +/*
> + * Test for a capability with a runtime check. This is an alias for
> + * cpus_have_cap() but with the name chosen to emphasize the applicability to
> + * late capability setting.
> + */
> +static __always_inline bool cpus_have_late_cap(int num)
> +{
> +	return cpus_have_cap(num);
> +}
> +

It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
So unless we fix it to handle these "caps", we might see subtle bugs.
How about the following fix on top of your patch, to add another (sigh!)
set of keys to detect whether we can use the const caps or not.

Untested...

---8>---

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 849b2691fadd..84dd43f1f322 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
 
 extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
 extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
+extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
 extern struct static_key_false arm64_const_caps_ready;
 
 /* ARM64 CAPS + alternative_cb */
@@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
 	return test_bit(num, cpu_hwcaps);
 }
 
+static __always_inline bool cpucap_has_const_key(int num)
+{
+	if (num >= ARM64_NCAPS)
+		return false;
+
+	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
+}
+
 /*
  * Test for a capability without a runtime check.
  *
@@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
  */
 static __always_inline bool __cpus_have_const_cap(int num)
 {
-	if (num >= ARM64_NCAPS)
+	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
 		return false;
 	return static_branch_unlikely(&cpu_hwcap_keys[num]);
 }
@@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
  */
 static __always_inline bool cpus_have_const_cap(int num)
 {
-	if (system_capabilities_finalized())
+	if (cpucap_has_const_key(num) && system_capabilities_finalized())
 		return __cpus_have_const_cap(num);
 	else
 		return cpus_have_cap(num);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 51e63be41ea5..a6c3e4e102c9 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -127,7 +127,9 @@ void dump_cpu_features(void)
 }
 
 DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_keys, ARM64_NCAPS);
+DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_const_key, ARM64_NCAPS);
 EXPORT_SYMBOL(cpu_hwcap_keys);
+EXPORT_SYMBOL(cpu_hwcap_const_key);
 
 #define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
 	{						\
@@ -2426,7 +2428,10 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
 			continue;
 
 		/* Ensure cpus_have_const_cap(num) works */
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		if (!cpucap_set_for_late_cpu(caps)) {
+			static_branch_enable(&cpu_hwcap_const_key[num]);
+			static_branch_enable(&cpu_hwcap_keys[num]);
+		}
 
 		if (boot_scope && caps->cpu_enable)
 			/*
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 843ecfb16a69..0ff9329670b7 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -90,6 +90,7 @@ KVM_NVHE_ALIAS(__icache_flags);
 /* Kernel symbols needed for cpus_have_final/const_caps checks. */
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
+KVM_NVHE_ALIAS(cpu_hwcap_const_key);
 KVM_NVHE_ALIAS(cpu_hwcaps);
 
 /* Static keys which are set if a vGIC trap should be handled in hyp. */

--


>  static inline void cpus_set_cap(unsigned int num)
>  {
>  	if (num >= ARM64_NCAPS) {
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 61314fd70f13..6b7de7292e8c 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -481,7 +481,8 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		 * also need the non-affected CPUs to be able to come
>  		 * in at any point in time. Wonderful.
>  		 */
> -		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE |
> +			ARM64_CPUCAP_SET_FOR_LATE_CPU,
>  	},
>  #endif
>  #ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index dcc165b3fc04..51e63be41ea5 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1720,6 +1720,12 @@ cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap)
>  	return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU);
>  }
>  
> +static bool
> +cpucap_set_for_late_cpu(const struct arm64_cpu_capabilities *cap)
> +{
> +	return !!(cap->type & ARM64_CPUCAP_SET_FOR_LATE_CPU);
> +}
> +
>  static bool
>  cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap)
>  {
> @@ -2489,6 +2495,11 @@ static void verify_local_cpu_caps(u16 scope_mask)
>  			 */
>  			if (cpu_has_cap && !cpucap_late_cpu_permitted(caps))
>  				break;
> +			/*
> +			 * Set the capability bit if it allows late setting.
> +			 */
> +			if (cpucap_set_for_late_cpu(caps))
> +				cpus_set_cap(caps->capability);
>  		}
>  	}
>  
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..152639962845 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -523,7 +523,7 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>  	u64 val;
>  
>  	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	      cpus_have_late_cap(ARM64_WORKAROUND_1418040)))
>  		return;
>  
>  	prev32 = is_compat_thread(task_thread_info(prev));
Catalin Marinas Nov. 10, 2020, 12:05 p.m. UTC | #6
On Tue, Nov 10, 2020 at 10:39:57AM +0000, Suzuki K Poulose wrote:
> Catalin
> 
> On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
> > On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> > > On 11/6/20 11:49 AM, Will Deacon wrote:
> > +/*
> > + * Test for a capability with a runtime check. This is an alias for
> > + * cpus_have_cap() but with the name chosen to emphasize the applicability to
> > + * late capability setting.
> > + */
> > +static __always_inline bool cpus_have_late_cap(int num)
> > +{
> > +	return cpus_have_cap(num);
> > +}
> > +
> 
> It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
> So unless we fix it to handle these "caps", we might see subtle bugs.
> How about the following fix on top of your patch, to add another (sigh!)
> set of keys to detect whether we can use the const caps or not.

I thought about this as well but decided it's not worth the additional
code.

> Untested...
> 
> ---8>---
> 
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 849b2691fadd..84dd43f1f322 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
>  
>  extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>  extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> +extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
>  extern struct static_key_false arm64_const_caps_ready;
>  
>  /* ARM64 CAPS + alternative_cb */
> @@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
>  	return test_bit(num, cpu_hwcaps);
>  }
>  
> +static __always_inline bool cpucap_has_const_key(int num)
> +{
> +	if (num >= ARM64_NCAPS)
> +		return false;
> +
> +	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
> +}
> +
>  /*
>   * Test for a capability without a runtime check.
>   *
> @@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
>   */
>  static __always_inline bool __cpus_have_const_cap(int num)
>  {
> -	if (num >= ARM64_NCAPS)
> +	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
>  		return false;
>  	return static_branch_unlikely(&cpu_hwcap_keys[num]);
>  }
> @@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
>   */
>  static __always_inline bool cpus_have_const_cap(int num)
>  {
> -	if (system_capabilities_finalized())
> +	if (cpucap_has_const_key(num) && system_capabilities_finalized())
>  		return __cpus_have_const_cap(num);
>  	else
>  		return cpus_have_cap(num);

I don't particularly like that we end up with three static key checks
here but, OTOH, this has the advantage that we don't need a new
cpus_have_late_cap().

Going back to the original problem, we want to allow late cpucap setting
after the system caps have been finalised and, in addition, we want to
detect API misuse. We solve the former by setting the corresponding
cpu_hwcaps bit late, based on a new capability flag.

For the API, we can use your approach with another static key and that's
pretty foolproof: it gives us warnings. Alternatively (not as
foolproof), we could skip the static key initialisation entirely for a
late feature even if we detect it early, which would allow one to
(hopefully) figure out the wrong API usage (cpus_have_const_cap() never
true):

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e346938e9a37..531b7787ee8e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2436,7 +2436,8 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
 			continue;
 
 		/* Ensure cpus_have_const_cap(num) works */
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		if (!cpucap_set_for_late_cpu(caps))
+			static_branch_enable(&cpu_hwcap_keys[num]);
 
 		if (boot_scope && caps->cpu_enable)
 			/*

Anyway, either option is fine by me.
Suzuki K Poulose Nov. 10, 2020, 12:14 p.m. UTC | #7
On 11/10/20 12:05 PM, Catalin Marinas wrote:
> On Tue, Nov 10, 2020 at 10:39:57AM +0000, Suzuki K Poulose wrote:
>> Catalin
>>
>> On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
>>> On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
>>>> On 11/6/20 11:49 AM, Will Deacon wrote:
>>> +/*
>>> + * Test for a capability with a runtime check. This is an alias for
>>> + * cpus_have_cap() but with the name chosen to emphasize the applicability to
>>> + * late capability setting.
>>> + */
>>> +static __always_inline bool cpus_have_late_cap(int num)
>>> +{
>>> +	return cpus_have_cap(num);
>>> +}
>>> +
>>
>> It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
>> So unless we fix it to handle these "caps", we might see subtle bugs.
>> How about the following fix on top of your patch, to add another (sigh!)
>> set of keys to detect whether we can use the const caps or not.
> 
> I thought about this as well but decided it's not worth the additional
> code.
> 
>> Untested...
>>
>> ---8>---
>>
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 849b2691fadd..84dd43f1f322 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
>>   
>>   extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>>   extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
>> +extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
>>   extern struct static_key_false arm64_const_caps_ready;
>>   
>>   /* ARM64 CAPS + alternative_cb */
>> @@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
>>   	return test_bit(num, cpu_hwcaps);
>>   }
>>   
>> +static __always_inline bool cpucap_has_const_key(int num)
>> +{
>> +	if (num >= ARM64_NCAPS)
>> +		return false;
>> +
>> +	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
>> +}
>> +
>>   /*
>>    * Test for a capability without a runtime check.
>>    *
>> @@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
>>    */
>>   static __always_inline bool __cpus_have_const_cap(int num)
>>   {
>> -	if (num >= ARM64_NCAPS)
>> +	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
>>   		return false;
>>   	return static_branch_unlikely(&cpu_hwcap_keys[num]);
>>   }
>> @@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
>>    */
>>   static __always_inline bool cpus_have_const_cap(int num)
>>   {
>> -	if (system_capabilities_finalized())
>> +	if (cpucap_has_const_key(num) && system_capabilities_finalized())
>>   		return __cpus_have_const_cap(num);
>>   	else
>>   		return cpus_have_cap(num);
> 
> I don't particularly like that we end up with three static key checks
> here but, OTOH, this has the advantage that we don't need a new
> cpus_have_late_cap().
> 
> Going back to the original problem, we want to allow late cpucap setting
> after the system caps have been finalised and, in addition, we want to
> detect API misuse. We solve the former by setting the corresponding
> cpu_hwcaps bit late, based on a new capability flag.
> 
> For the API, we can use your approach with another static key and that's
> pretty foolproof: it gives us warnings. Alternatively (not as
> foolproof), we could skip the static key initialisation entirely for a
> late feature even if we detect it early, which would allow one to
> (hopefully) figure out the wrong API usage (cpus_have_const_cap() never
> true):
> 
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e346938e9a37..531b7787ee8e 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2436,7 +2436,8 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
>   			continue;
>   
>   		/* Ensure cpus_have_const_cap(num) works */
> -		static_branch_enable(&cpu_hwcap_keys[num]);
> +		if (!cpucap_set_for_late_cpu(caps))
> +			static_branch_enable(&cpu_hwcap_keys[num]);

True, I think this would do the trick, provided proper testing is
performed ;-). The issue is that we would advertise the "erratum
workaround" (in dmesg) but might not actually apply it where it matters.
So, if the testing doesn't really verify this, but instead looks for the
message, then we are back to the problem the original patch was trying
to solve.

Cheers
Suzuki

Patch

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f7e7144af174..c59c16a6ea8b 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -268,6 +268,8 @@  extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
 /*
  * CPU feature detected at boot time based on feature of one or more CPUs.
  * All possible conflicts for a late CPU are ignored.
+ * NOTE: this means that a late CPU with the feature will *not* cause the
+ * capability to be advertised by cpus_have_*cap()!
  */
 #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
 	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4784011cecac..a47a40ec6ad9 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -522,14 +522,13 @@  static void erratum_1418040_thread_switch(struct task_struct *prev,
 	bool prev32, next32;
 	u64 val;
 
-	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
-	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
+	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
 		return;
 
 	prev32 = is_compat_thread(task_thread_info(prev));
 	next32 = is_compat_thread(task_thread_info(next));
 
-	if (prev32 == next32)
+	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
 		return;
 
 	val = read_sysreg(cntkctl_el1);