Message ID: 20201106114952.10032-1-will@kernel.org (mailing list archive)
State: New, archived
Series: arm64: errata: Fix handling of 1418040 with late CPU onlining
On 2020-11-06 11:49, Will Deacon wrote:
> In a surprising turn of events, it transpires that CPU capabilities
> configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> result of late-onlining. Therefore our handling of erratum 1418040 does
> not get activated if it is not required by any of the boot CPUs, even
> though we allow late-onlining of an affected CPU.
>
> In order to get things working again, replace the cpus_have_const_cap()
> invocation with an explicit check for the current CPU using
> this_cpu_has_cap().
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>
> Found by code inspection and compile-tested only, so I would really
> appreciate a second look.
>
>  arch/arm64/include/asm/cpufeature.h | 2 ++
>  arch/arm64/kernel/process.c         | 5 ++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f7e7144af174..c59c16a6ea8b 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>  /*
>   * CPU feature detected at boot time based on feature of one or more CPUs.
>   * All possible conflicts for a late CPU are ignored.
> + * NOTE: this means that a late CPU with the feature will *not* cause the
> + * capability to be advertised by cpus_have_*cap()!
>   */
>  #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
>  	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..a47a40ec6ad9 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>  	bool prev32, next32;
>  	u64 val;
>
> -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
>  		return;
>
>  	prev32 = is_compat_thread(task_thread_info(prev));
>  	next32 = is_compat_thread(task_thread_info(next));
>
> -	if (prev32 == next32)
> +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
>  		return;

It is not going to be cheap to switch between 32 and 64bit. Hopefully
it doesn't happen too often... :-(

Acked-by: Marc Zyngier <maz@kernel.org>

M.
Hi Will,

On 11/6/20 11:49 AM, Will Deacon wrote:
> In a surprising turn of events, it transpires that CPU capabilities
> configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> result of late-onlining. Therefore our handling of erratum 1418040 does
> not get activated if it is not required by any of the boot CPUs, even
> though we allow late-onlining of an affected CPU.

The capability state is not altered after the SMP boot for all types
of caps. The weak caps are there to allow a late CPU to turn online
without getting "banned". This may be something we could relax with
a new flag in the scope.

>
> In order to get things working again, replace the cpus_have_const_cap()
> invocation with an explicit check for the current CPU using
> this_cpu_has_cap().
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
>
> Found by code inspection and compile-tested only, so I would really
> appreciate a second look.
>
>  arch/arm64/include/asm/cpufeature.h | 2 ++
>  arch/arm64/kernel/process.c         | 5 ++---
>  2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index f7e7144af174..c59c16a6ea8b 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>  /*
>   * CPU feature detected at boot time based on feature of one or more CPUs.
>   * All possible conflicts for a late CPU are ignored.
> + * NOTE: this means that a late CPU with the feature will *not* cause the
> + * capability to be advertised by cpus_have_*cap()!

This comment applies to all the types, so it may be confusing. And the
comment already mentions that the feature is detected at boot time.

>   */
>  #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
>  	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..a47a40ec6ad9 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>  	bool prev32, next32;
>  	u64 val;
>
> -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
>  		return;
>
>  	prev32 = is_compat_thread(task_thread_info(prev));
>  	next32 = is_compat_thread(task_thread_info(next));
>
> -	if (prev32 == next32)
> +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
>  		return;
>
>  	val = read_sysreg(cntkctl_el1);
>

This change as such looks good to me.

Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> On 11/6/20 11:49 AM, Will Deacon wrote:
> > In a surprising turn of events, it transpires that CPU capabilities
> > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > result of late-onlining. Therefore our handling of erratum 1418040 does
> > not get activated if it is not required by any of the boot CPUs, even
> > though we allow late-onlining of an affected CPU.
>
> The capability state is not altered after the SMP boot for all types
> of caps. The weak caps are there to allow a late CPU to turn online
> without getting "banned". This may be something we could relax with
> a new flag in the scope.

I did look briefly into that, but it's really difficult to enable the
static key as this operation can block so you end up having to defer it
past the point of signalling the completion back to the CPU doing
cpu_up(). So I figured I'd focus on fixing the current problem for now.

> > In order to get things working again, replace the cpus_have_const_cap()
> > invocation with an explicit check for the current CPU using
> > this_cpu_has_cap().
> >
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> > Cc: Stephen Boyd <swboyd@chromium.org>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
> > ---
> >
> > Found by code inspection and compile-tested only, so I would really
> > appreciate a second look.
> >
> >  arch/arm64/include/asm/cpufeature.h | 2 ++
> >  arch/arm64/kernel/process.c         | 5 ++---
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> > index f7e7144af174..c59c16a6ea8b 100644
> > --- a/arch/arm64/include/asm/cpufeature.h
> > +++ b/arch/arm64/include/asm/cpufeature.h
> > @@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
> >  /*
> >   * CPU feature detected at boot time based on feature of one or more CPUs.
> >   * All possible conflicts for a late CPU are ignored.
> > + * NOTE: this means that a late CPU with the feature will *not* cause the
> > + * capability to be advertised by cpus_have_*cap()!
>
> This comment applies to all the types, so it may be confusing. And the
> comment already mentions that the feature is detected at boot time.

The other types don't allow a late CPU to come online with the feature
though, so I don't think it's quite the same. Given that we made this
mistake for this erratum workaround and previously for SSBS, I wanted
to add something to the comment to try to avoid others falling into
this trap. Ideally, we'd warn when calling cpus_have_*cap() for one of
these things, but that adds runtime cost that we'd probably have to
gate behind a DEBUG option.

> >   */
> >  #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
> >  	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 4784011cecac..a47a40ec6ad9 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
> >  	bool prev32, next32;
> >  	u64 val;
> >
> > -	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> > -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> > +	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
> >  		return;
> >
> >  	prev32 = is_compat_thread(task_thread_info(prev));
> >  	next32 = is_compat_thread(task_thread_info(next));
> >
> > -	if (prev32 == next32)
> > +	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
> >  		return;
> >
> >  	val = read_sysreg(cntkctl_el1);
> >
>
> This change as such looks good to me.
>
> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Thanks!

Will
On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> On 11/6/20 11:49 AM, Will Deacon wrote:
> > In a surprising turn of events, it transpires that CPU capabilities
> > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > result of late-onlining. Therefore our handling of erratum 1418040 does
> > not get activated if it is not required by any of the boot CPUs, even
> > though we allow late-onlining of an affected CPU.
>
> The capability state is not altered after the SMP boot for all types
> of caps. The weak caps are there to allow a late CPU to turn online
> without getting "banned". This may be something we could relax with
> a new flag in the scope.

Like this? Of course, it needs some testing.

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 97244d4feca9..b896e72131d7 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -246,6 +246,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
 #define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	((u16)BIT(5))
 /* Panic when a conflict is detected */
 #define ARM64_CPUCAP_PANIC_ON_CONFLICT		((u16)BIT(6))
+/* Together with PERMITTED_FOR_LATE_CPU, set the corresponding cpu_hwcaps bit */
+#define ARM64_CPUCAP_SET_FOR_LATE_CPU		((u16)BIT(7))
 
 /*
  * CPU errata workarounds that need to be enabled at boot time if one or
@@ -481,6 +483,16 @@ static __always_inline bool cpus_have_const_cap(int num)
 	return cpus_have_cap(num);
 }
 
+/*
+ * Test for a capability with a runtime check. This is an alias for
+ * cpus_have_cap() but with the name chosen to emphasize the applicability to
+ * late capability setting.
+ */
+static __always_inline bool cpus_have_late_cap(int num)
+{
+	return cpus_have_cap(num);
+}
+
 static inline void cpus_set_cap(unsigned int num)
 {
 	if (num >= ARM64_NCAPS) {
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 61314fd70f13..6b7de7292e8c 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -481,7 +481,8 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		 * also need the non-affected CPUs to be able to come
 		 * in at any point in time. Wonderful.
 		 */
-		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE |
+			ARM64_CPUCAP_SET_FOR_LATE_CPU,
 	},
 #endif
 #ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index dcc165b3fc04..51e63be41ea5 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1720,6 +1720,12 @@ cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap)
 	return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU);
 }
 
+static bool
+cpucap_set_for_late_cpu(const struct arm64_cpu_capabilities *cap)
+{
+	return !!(cap->type & ARM64_CPUCAP_SET_FOR_LATE_CPU);
+}
+
 static bool
 cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap)
 {
@@ -2489,6 +2495,11 @@ static void verify_local_cpu_caps(u16 scope_mask)
 			 */
 			if (cpu_has_cap && !cpucap_late_cpu_permitted(caps))
 				break;
+			/*
+			 * Set the capability bit if it allows late setting.
+			 */
+			if (cpucap_set_for_late_cpu(caps))
+				cpus_set_cap(caps->capability);
 		}
 	}
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4784011cecac..152639962845 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -523,7 +523,7 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
 	u64 val;
 
 	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
-	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
+	      cpus_have_late_cap(ARM64_WORKAROUND_1418040)))
 		return;
 
 	prev32 = is_compat_thread(task_thread_info(prev));
Catalin

On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
> On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> > On 11/6/20 11:49 AM, Will Deacon wrote:
> > > In a surprising turn of events, it transpires that CPU capabilities
> > > configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
> > > result of late-onlining. Therefore our handling of erratum 1418040 does
> > > not get activated if it is not required by any of the boot CPUs, even
> > > though we allow late-onlining of an affected CPU.
> >
> > The capability state is not altered after the SMP boot for all types
> > of caps. The weak caps are there to allow a late CPU to turn online
> > without getting "banned". This may be something we could relax with
> > a new flag in the scope.
>
> Like this? Of course, it needs some testing.

Yes, this is exactly what I had in mind. However, there is some concern
over how we expose it.

>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 97244d4feca9..b896e72131d7 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -246,6 +246,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
>  #define ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU	((u16)BIT(5))
>  /* Panic when a conflict is detected */
>  #define ARM64_CPUCAP_PANIC_ON_CONFLICT		((u16)BIT(6))
> +/* Together with PERMITTED_FOR_LATE_CPU, set the corresponding cpu_hwcaps bit */
> +#define ARM64_CPUCAP_SET_FOR_LATE_CPU		((u16)BIT(7))
>
>  /*
>   * CPU errata workarounds that need to be enabled at boot time if one or
> @@ -481,6 +483,16 @@ static __always_inline bool cpus_have_const_cap(int num)
>  	return cpus_have_cap(num);
>  }
>
> +/*
> + * Test for a capability with a runtime check. This is an alias for
> + * cpus_have_cap() but with the name chosen to emphasize the applicability to
> + * late capability setting.
> + */
> +static __always_inline bool cpus_have_late_cap(int num)
> +{
> +	return cpus_have_cap(num);
> +}
> +

It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
So, unless we make sure that we fix that to handle these "caps" we might
see subtle bugs. How about the following fix on top of your patch, to
add another (sigh!) set of keys to detect if we can use the const caps
or not. Untested...

---8>---

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 849b2691fadd..84dd43f1f322 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
 
 extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
 extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
+extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
 extern struct static_key_false arm64_const_caps_ready;
 
 /* ARM64 CAPS + alternative_cb */
@@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
 	return test_bit(num, cpu_hwcaps);
 }
 
+static __always_inline bool cpucap_has_const_key(int num)
+{
+	if (num >= ARM64_NCAPS)
+		return false;
+
+	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
+}
+
 /*
  * Test for a capability without a runtime check.
  *
@@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
  */
 static __always_inline bool __cpus_have_const_cap(int num)
 {
-	if (num >= ARM64_NCAPS)
+	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
 		return false;
 	return static_branch_unlikely(&cpu_hwcap_keys[num]);
 }
@@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
  */
 static __always_inline bool cpus_have_const_cap(int num)
 {
-	if (system_capabilities_finalized())
+	if (cpucap_has_const_key(num) && system_capabilities_finalized())
 		return __cpus_have_const_cap(num);
 	else
 		return cpus_have_cap(num);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 51e63be41ea5..a6c3e4e102c9 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -127,7 +127,9 @@ void dump_cpu_features(void)
 }
 
 DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_keys, ARM64_NCAPS);
+DEFINE_STATIC_KEY_ARRAY_FALSE(cpu_hwcap_const_key, ARM64_NCAPS);
 EXPORT_SYMBOL(cpu_hwcap_keys);
+EXPORT_SYMBOL(cpu_hwcap_const_key);
 
 #define __ARM64_FTR_BITS(SIGNED, VISIBLE, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
 	{						\
@@ -2426,7 +2428,10 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
 			continue;
 
 		/* Ensure cpus_have_const_cap(num) works */
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		if (!cpucap_set_for_late_cpu(caps)) {
+			static_branch_enable(&cpu_hwcap_const_key[num]);
+			static_branch_enable(&cpu_hwcap_keys[num]);
+		}
 
 		if (boot_scope && caps->cpu_enable)
 			/*
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 843ecfb16a69..0ff9329670b7 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -90,6 +90,7 @@ KVM_NVHE_ALIAS(__icache_flags);
 /* Kernel symbols needed for cpus_have_final/const_caps checks. */
 KVM_NVHE_ALIAS(arm64_const_caps_ready);
 KVM_NVHE_ALIAS(cpu_hwcap_keys);
+KVM_NVHE_ALIAS(cpu_hwcap_const_key);
 KVM_NVHE_ALIAS(cpu_hwcaps);
 
 /* Static keys which are set if a vGIC trap should be handled in hyp. */
--

>  static inline void cpus_set_cap(unsigned int num)
>  {
>  	if (num >= ARM64_NCAPS) {
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index 61314fd70f13..6b7de7292e8c 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -481,7 +481,8 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		 * also need the non-affected CPUs to be able to come
>  		 * in at any point in time. Wonderful.
>  		 */
> -		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE |
> +			ARM64_CPUCAP_SET_FOR_LATE_CPU,
>  	},
>  #endif
>  #ifdef CONFIG_ARM64_WORKAROUND_SPECULATIVE_AT
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index dcc165b3fc04..51e63be41ea5 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -1720,6 +1720,12 @@ cpucap_late_cpu_permitted(const struct arm64_cpu_capabilities *cap)
>  	return !!(cap->type & ARM64_CPUCAP_PERMITTED_FOR_LATE_CPU);
>  }
>
> +static bool
> +cpucap_set_for_late_cpu(const struct arm64_cpu_capabilities *cap)
> +{
> +	return !!(cap->type & ARM64_CPUCAP_SET_FOR_LATE_CPU);
> +}
> +
>  static bool
>  cpucap_panic_on_conflict(const struct arm64_cpu_capabilities *cap)
>  {
> @@ -2489,6 +2495,11 @@ static void verify_local_cpu_caps(u16 scope_mask)
>  			 */
>  			if (cpu_has_cap && !cpucap_late_cpu_permitted(caps))
>  				break;
> +			/*
> +			 * Set the capability bit if it allows late setting.
> +			 */
> +			if (cpucap_set_for_late_cpu(caps))
> +				cpus_set_cap(caps->capability);
>  		}
>  	}
>
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 4784011cecac..152639962845 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -523,7 +523,7 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
>  	u64 val;
>
>  	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
> -	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
> +	      cpus_have_late_cap(ARM64_WORKAROUND_1418040)))
>  		return;
>
>  	prev32 = is_compat_thread(task_thread_info(prev));
On Tue, Nov 10, 2020 at 10:39:57AM +0000, Suzuki K Poulose wrote:
> Catalin
>
> On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
> > On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
> > > On 11/6/20 11:49 AM, Will Deacon wrote:
> > +/*
> > + * Test for a capability with a runtime check. This is an alias for
> > + * cpus_have_cap() but with the name chosen to emphasize the applicability to
> > + * late capability setting.
> > + */
> > +static __always_inline bool cpus_have_late_cap(int num)
> > +{
> > +	return cpus_have_cap(num);
> > +}
> > +
>
> It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
> So, unless we make sure that we fix that to handle these "caps" we might
> see subtle bugs. How about the following fix on top of your patch, to
> add another (sigh!) set of keys to detect if we can use the const caps
> or not.

I thought about this as well but decided it's not worth the additional
code.

> Untested...
>
> ---8>---
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 849b2691fadd..84dd43f1f322 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
>
>  extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>  extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
> +extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
>  extern struct static_key_false arm64_const_caps_ready;
>
>  /* ARM64 CAPS + alternative_cb */
> @@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
>  	return test_bit(num, cpu_hwcaps);
>  }
>
> +static __always_inline bool cpucap_has_const_key(int num)
> +{
> +	if (num >= ARM64_NCAPS)
> +		return false;
> +
> +	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
> +}
> +
>  /*
>   * Test for a capability without a runtime check.
>   *
> @@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
>   */
>  static __always_inline bool __cpus_have_const_cap(int num)
>  {
> -	if (num >= ARM64_NCAPS)
> +	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
>  		return false;
>  	return static_branch_unlikely(&cpu_hwcap_keys[num]);
>  }
> @@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
>   */
>  static __always_inline bool cpus_have_const_cap(int num)
>  {
> -	if (system_capabilities_finalized())
> +	if (cpucap_has_const_key(num) && system_capabilities_finalized())
>  		return __cpus_have_const_cap(num);
>  	else
>  		return cpus_have_cap(num);

I don't particularly like that we end up with three static key checks
here but, OTOH, this has the advantage that we don't need a new
cpus_have_late_cap().

Going back to the original problem, we want to allow late cpucap setting
after the system caps have been finalised and, in addition, we want to
detect API misuse. We solve the former by setting the corresponding
cpu_hwcaps bit late, based on a new capability flag.

For the API, we can use your approach with another static key and that's
pretty foolproof, it gives us warnings. Alternatively (not as
foolproof), we could skip the static key initialisation entirely for a
late feature even if we detect it early and that would allow one to
(hopefully) figure out the wrong API (cpus_have_const_cap() never true):

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e346938e9a37..531b7787ee8e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2436,7 +2436,8 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
 			continue;
 
 		/* Ensure cpus_have_const_cap(num) works */
-		static_branch_enable(&cpu_hwcap_keys[num]);
+		if (!cpucap_set_for_late_cpu(caps))
+			static_branch_enable(&cpu_hwcap_keys[num]);
 
 		if (boot_scope && caps->cpu_enable)
 			/*

Anyway, either option is fine by me.
On 11/10/20 12:05 PM, Catalin Marinas wrote:
> On Tue, Nov 10, 2020 at 10:39:57AM +0000, Suzuki K Poulose wrote:
>> Catalin
>>
>> On Fri, Nov 06, 2020 at 12:44:00PM +0000, Catalin Marinas wrote:
>>> On Fri, Nov 06, 2020 at 12:18:32PM +0000, Suzuki K Poulose wrote:
>>>> On 11/6/20 11:49 AM, Will Deacon wrote:
>>> +/*
>>> + * Test for a capability with a runtime check. This is an alias for
>>> + * cpus_have_cap() but with the name chosen to emphasize the applicability to
>>> + * late capability setting.
>>> + */
>>> +static __always_inline bool cpus_have_late_cap(int num)
>>> +{
>>> +	return cpus_have_cap(num);
>>> +}
>>> +
>>
>> It is quite easy for people to misuse the ubiquitous cpus_have_const_cap().
>> So, unless we make sure that we fix that to handle these "caps" we might
>> see subtle bugs. How about the following fix on top of your patch, to
>> add another (sigh!) set of keys to detect if we can use the const caps
>> or not.
>
> I thought about this as well but decided it's not worth the additional
> code.
>
>> Untested...
>>
>> ---8>---
>>
>> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
>> index 849b2691fadd..84dd43f1f322 100644
>> --- a/arch/arm64/include/asm/cpufeature.h
>> +++ b/arch/arm64/include/asm/cpufeature.h
>> @@ -379,6 +379,7 @@ cpucap_multi_entry_cap_matches(const struct arm64_cpu_capabilities *entry,
>>
>>  extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
>>  extern struct static_key_false cpu_hwcap_keys[ARM64_NCAPS];
>> +extern struct static_key_false cpu_hwcap_const_key[ARM64_NCAPS];
>>  extern struct static_key_false arm64_const_caps_ready;
>>
>>  /* ARM64 CAPS + alternative_cb */
>> @@ -414,6 +415,14 @@ static inline bool cpus_have_cap(unsigned int num)
>>  	return test_bit(num, cpu_hwcaps);
>>  }
>>
>> +static __always_inline bool cpucap_has_const_key(int num)
>> +{
>> +	if (num >= ARM64_NCAPS)
>> +		return false;
>> +
>> +	return static_branch_unlikely(&cpu_hwcap_const_key[num]);
>> +}
>> +
>>  /*
>>   * Test for a capability without a runtime check.
>>   *
>> @@ -424,7 +433,7 @@ static inline bool cpus_have_cap(unsigned int num)
>>   */
>>  static __always_inline bool __cpus_have_const_cap(int num)
>>  {
>> -	if (num >= ARM64_NCAPS)
>> +	if (num >= ARM64_NCAPS && WARN_ON(!cpucap_has_const_key(num)))
>>  		return false;
>>  	return static_branch_unlikely(&cpu_hwcap_keys[num]);
>>  }
>> @@ -439,7 +448,7 @@ static __always_inline bool __cpus_have_const_cap(int num)
>>   */
>>  static __always_inline bool cpus_have_const_cap(int num)
>>  {
>> -	if (system_capabilities_finalized())
>> +	if (cpucap_has_const_key(num) && system_capabilities_finalized())
>>  		return __cpus_have_const_cap(num);
>>  	else
>>  		return cpus_have_cap(num);
>
> I don't particularly like that we end up with three static key checks
> here but, OTOH, this has the advantage that we don't need a new
> cpus_have_late_cap().
>
> Going back to the original problem, we want to allow late cpucap setting
> after the system caps have been finalised and, in addition, we want to
> detect API misuse. We solve the former by setting the corresponding
> cpu_hwcaps bit late, based on a new capability flag.
>
> For the API, we can use your approach with another static key and that's
> pretty foolproof, it gives us warnings. Alternatively (not as
> foolproof), we could skip the static key initialisation entirely for a
> late feature even if we detect it early and that would allow one to
> (hopefully) figure out the wrong API (cpus_have_const_cap() never true):
>
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index e346938e9a37..531b7787ee8e 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -2436,7 +2436,8 @@ static void __init enable_cpu_capabilities(u16 scope_mask)
>  			continue;
>
>  		/* Ensure cpus_have_const_cap(num) works */
> -		static_branch_enable(&cpu_hwcap_keys[num]);
> +		if (!cpucap_set_for_late_cpu(caps))
> +			static_branch_enable(&cpu_hwcap_keys[num]);

True, I think this would do the trick, provided proper testing is
performed ;-). The issue is, we would advertise the "errata work around"
(in dmesg) and may not really apply it where it matters. So, if the
testing doesn't really verify this, but instead looks for the message,
then we are back to the problem the original patch was trying to solve.

Cheers
Suzuki
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index f7e7144af174..c59c16a6ea8b 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -268,6 +268,8 @@ extern struct arm64_ftr_reg arm64_ftr_reg_ctrel0;
 /*
  * CPU feature detected at boot time based on feature of one or more CPUs.
  * All possible conflicts for a late CPU are ignored.
+ * NOTE: this means that a late CPU with the feature will *not* cause the
+ * capability to be advertised by cpus_have_*cap()!
  */
 #define ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE		\
 	(ARM64_CPUCAP_SCOPE_LOCAL_CPU		|	\
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 4784011cecac..a47a40ec6ad9 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -522,14 +522,13 @@ static void erratum_1418040_thread_switch(struct task_struct *prev,
 	bool prev32, next32;
 	u64 val;
 
-	if (!(IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
-	      cpus_have_const_cap(ARM64_WORKAROUND_1418040)))
+	if (!IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))
 		return;
 
 	prev32 = is_compat_thread(task_thread_info(prev));
 	next32 = is_compat_thread(task_thread_info(next));
 
-	if (prev32 == next32)
+	if (prev32 == next32 || !this_cpu_has_cap(ARM64_WORKAROUND_1418040))
 		return;
 
 	val = read_sysreg(cntkctl_el1);
In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.

In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().

Cc: Marc Zyngier <maz@kernel.org>
Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
---

Found by code inspection and compile-tested only, so I would really
appreciate a second look.

 arch/arm64/include/asm/cpufeature.h | 2 ++
 arch/arm64/kernel/process.c         | 5 ++---
 2 files changed, 4 insertions(+), 3 deletions(-)