diff mbox series

[v2] arm64: errata: Fix exec handling in erratum 1418040 workaround

Message ID 20211216191618.972956-1-scott@os.amperecomputing.com (mailing list archive)
State New, archived
Headers show
Series [v2] arm64: errata: Fix exec handling in erratum 1418040 workaround | expand

Commit Message

D Scott Phillips Dec. 16, 2021, 7:16 p.m. UTC
The erratum 1418040 workaround changes vct access trapping when switching
between compat and non-compat threads. The workaround logic assumes that
the hardware vct trapping state matches the previous task's compat-ness.
However, when a non-compat task execs a compat binary or vice versa, the
cntkctl state and task compat-ness get out of sync. Keep the hardware
trapping state in sync with the task personality.

Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C code")
Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: <stable@vger.kernel.org> # 5.4.x
---

 v2: - Use sysreg_clear_set instead of open coding (Marc)
     - guard this_cpu_has_cap() check under IS_ENABLED() to avoid tons of
       WARN_ON(preemptible()) when built with !CONFIG_ARM64_ERRATUM_1418040

 arch/arm64/include/asm/elf.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Marc Zyngier Dec. 16, 2021, 7:52 p.m. UTC | #1
On 2021-12-16 19:16, D Scott Phillips wrote:
> The erratum 1418040 workaround changes vct access trapping when 
> switching
> between compat and non-compat threads. The workaround logic assumes 
> that
> the hardware vct trapping state matches the previous task's 
> compat-ness.
> However, when a non-compat task execs a compat binary or vice versa, 
> the
> cntkctl state and task compat-ness get out of sync. Keep the hardware
> trapping state in sync with the task personality.
> 
> Fixes: d49f7d7376d0 ("arm64: Move handling of erratum 1418040 into C 
> code")
> Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com>
> Reviewed-by: Marc Zyngier <maz@kernel.org>
> Cc: <stable@vger.kernel.org> # 5.4.x
> ---
> 
>  v2: - Use sysreg_clear_set instead of open coding (Marc)
>      - guard this_cpu_has_cap() check under IS_ENABLED() to avoid tons 
> of
>        WARN_ON(preemptible()) when built with 
> !CONFIG_ARM64_ERRATUM_1418040

Indeed.

>  arch/arm64/include/asm/elf.h | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/elf.h 
> b/arch/arm64/include/asm/elf.h
> index 97932fbf973d..24036b914226 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -160,8 +160,16 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> 
>  #define SET_PERSONALITY(ex)						\
>  ({									\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
> +		preempt_disable();					\
>  	clear_thread_flag(TIF_32BIT);					\
>  	current->personality &= ~READ_IMPLIES_EXEC;			\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\

Probably better written as:

         if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040) &&
             this_cpu_has_cap(ARM64_WORKAROUND_1418040))
                 sysreg_clear_set(...);

Thanks,

         M.
Catalin Marinas Dec. 17, 2021, 2:41 p.m. UTC | #2
On Thu, Dec 16, 2021 at 11:16:18AM -0800, D Scott Phillips wrote:
> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> index 97932fbf973d..24036b914226 100644
> --- a/arch/arm64/include/asm/elf.h
> +++ b/arch/arm64/include/asm/elf.h
> @@ -160,8 +160,16 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>  
>  #define SET_PERSONALITY(ex)						\
>  ({									\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
> +		preempt_disable();					\
>  	clear_thread_flag(TIF_32BIT);					\
>  	current->personality &= ~READ_IMPLIES_EXEC;			\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
> +			sysreg_clear_set(cntkctl_el1, 0,		\
> +					 ARCH_TIMER_USR_VCT_ACCESS_EN);	\
> +		preempt_enable();					\
> +	}								\
>  })
>  
>  /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
> @@ -223,7 +231,16 @@ int compat_elf_check_arch(const struct elf32_hdr *);
>   */
>  #define COMPAT_SET_PERSONALITY(ex)					\
>  ({									\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
> +		preempt_disable();					\
>  	set_thread_flag(TIF_32BIT);					\
> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
> +			sysreg_clear_set(cntkctl_el1,			\
> +					 ARCH_TIMER_USR_VCT_ACCESS_EN,	\
> +					 0);				\
> +		preempt_enable();					\
> +	}								\

I don't particularly like adding more to these macros. There's
arch_setup_new_exec() that gets called after SET_PERSONALITY, so you can
check whether the task is compat or not.

Thanks.
D Scott Phillips Dec. 17, 2021, 4:42 p.m. UTC | #3
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Thu, Dec 16, 2021 at 11:16:18AM -0800, D Scott Phillips wrote:
>> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
>> index 97932fbf973d..24036b914226 100644
>> --- a/arch/arm64/include/asm/elf.h
>> +++ b/arch/arm64/include/asm/elf.h
>> @@ -160,8 +160,16 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>>  
>>  #define SET_PERSONALITY(ex)						\
>>  ({									\
>> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
>> +		preempt_disable();					\
>>  	clear_thread_flag(TIF_32BIT);					\
>>  	current->personality &= ~READ_IMPLIES_EXEC;			\
>> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
>> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
>> +			sysreg_clear_set(cntkctl_el1, 0,		\
>> +					 ARCH_TIMER_USR_VCT_ACCESS_EN);	\
>> +		preempt_enable();					\
>> +	}								\
>>  })
>>  
>>  /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
>> @@ -223,7 +231,16 @@ int compat_elf_check_arch(const struct elf32_hdr *);
>>   */
>>  #define COMPAT_SET_PERSONALITY(ex)					\
>>  ({									\
>> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
>> +		preempt_disable();					\
>>  	set_thread_flag(TIF_32BIT);					\
>> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
>> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
>> +			sysreg_clear_set(cntkctl_el1,			\
>> +					 ARCH_TIMER_USR_VCT_ACCESS_EN,	\
>> +					 0);				\
>> +		preempt_enable();					\
>> +	}								\
>
> I don't particularly like adding more to these macros. There's
> arch_setup_new_exec() that gets called after SET_PERSONALITY, so you can
> check whether the task is compat or not.

If the task is preemptible between the update to TIF_32BIT and the
update to cntkctl then a window exists where preemption will confuse the
workaround logic we have in switch_to, causing the desync between compat
state and counter access trapping to propagate into other tasks.

Agreed that this change gunks up a previously simple macro though. Maybe
we could have SET_PERSONALITY only set a flag in arch_elf_state and
arch_setup_new_exec() take over setting TIF_32BIT? Or move
set_personality's implementation to a helper? Or something else?

> Thanks.
>
> -- 
> Catalin
Catalin Marinas Dec. 17, 2021, 6:38 p.m. UTC | #4
On Fri, Dec 17, 2021 at 08:42:01AM -0800, D Scott Phillips wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
> > On Thu, Dec 16, 2021 at 11:16:18AM -0800, D Scott Phillips wrote:
> >> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
> >> index 97932fbf973d..24036b914226 100644
> >> --- a/arch/arm64/include/asm/elf.h
> >> +++ b/arch/arm64/include/asm/elf.h
> >> @@ -160,8 +160,16 @@ typedef struct user_fpsimd_state elf_fpregset_t;
> >>  
> >>  #define SET_PERSONALITY(ex)						\
> >>  ({									\
> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
> >> +		preempt_disable();					\
> >>  	clear_thread_flag(TIF_32BIT);					\
> >>  	current->personality &= ~READ_IMPLIES_EXEC;			\
> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
> >> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
> >> +			sysreg_clear_set(cntkctl_el1, 0,		\
> >> +					 ARCH_TIMER_USR_VCT_ACCESS_EN);	\
> >> +		preempt_enable();					\
> >> +	}								\
> >>  })
> >>  
> >>  /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
> >> @@ -223,7 +231,16 @@ int compat_elf_check_arch(const struct elf32_hdr *);
> >>   */
> >>  #define COMPAT_SET_PERSONALITY(ex)					\
> >>  ({									\
> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
> >> +		preempt_disable();					\
> >>  	set_thread_flag(TIF_32BIT);					\
> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
> >> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
> >> +			sysreg_clear_set(cntkctl_el1,			\
> >> +					 ARCH_TIMER_USR_VCT_ACCESS_EN,	\
> >> +					 0);				\
> >> +		preempt_enable();					\
> >> +	}								\
> >
> > I don't particularly like adding more to these macros. There's
> > arch_setup_new_exec() that gets called after SET_PERSONALITY, so you can
> > check whether the task is compat or not.
> 
> If the task is preemptible between the update to TIF_32BIT and the
> update to cntkctl then a window exists where preemption will confuse the
> workaround logic we have in switch_to, causing the desync between compat
> state and counter access trapping to propagate into other tasks.
> 
> Agreed that this change gunks up a previously simple macro though. Maybe
> we could have SET_PERSONALITY only set a flag in arch_elf_state and
> arch_setup_new_exec() take over setting TIF_32BIT? Or move
> set_personality's implementation to a helper? Or something else?

Is reading ckntkctl_el1 as expensive as the write? If not we could
change the __switch_to() logic to simply check that the bit is set or
cleared depending on is_compat_thread() and skip the comparison with the
previous task.
D Scott Phillips Dec. 17, 2021, 7:29 p.m. UTC | #5
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Fri, Dec 17, 2021 at 08:42:01AM -0800, D Scott Phillips wrote:
>> Catalin Marinas <catalin.marinas@arm.com> writes:
>> > On Thu, Dec 16, 2021 at 11:16:18AM -0800, D Scott Phillips wrote:
>> >> diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
>> >> index 97932fbf973d..24036b914226 100644
>> >> --- a/arch/arm64/include/asm/elf.h
>> >> +++ b/arch/arm64/include/asm/elf.h
>> >> @@ -160,8 +160,16 @@ typedef struct user_fpsimd_state elf_fpregset_t;
>> >>  
>> >>  #define SET_PERSONALITY(ex)						\
>> >>  ({									\
>> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
>> >> +		preempt_disable();					\
>> >>  	clear_thread_flag(TIF_32BIT);					\
>> >>  	current->personality &= ~READ_IMPLIES_EXEC;			\
>> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
>> >> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
>> >> +			sysreg_clear_set(cntkctl_el1, 0,		\
>> >> +					 ARCH_TIMER_USR_VCT_ACCESS_EN);	\
>> >> +		preempt_enable();					\
>> >> +	}								\
>> >>  })
>> >>  
>> >>  /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
>> >> @@ -223,7 +231,16 @@ int compat_elf_check_arch(const struct elf32_hdr *);
>> >>   */
>> >>  #define COMPAT_SET_PERSONALITY(ex)					\
>> >>  ({									\
>> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
>> >> +		preempt_disable();					\
>> >>  	set_thread_flag(TIF_32BIT);					\
>> >> +	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
>> >> +		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
>> >> +			sysreg_clear_set(cntkctl_el1,			\
>> >> +					 ARCH_TIMER_USR_VCT_ACCESS_EN,	\
>> >> +					 0);				\
>> >> +		preempt_enable();					\
>> >> +	}								\
>> >
>> > I don't particularly like adding more to these macros. There's
>> > arch_setup_new_exec() that gets called after SET_PERSONALITY, so you can
>> > check whether the task is compat or not.
>> 
>> If the task is preemptible between the update to TIF_32BIT and the
>> update to cntkctl then a window exists where preemption will confuse the
>> workaround logic we have in switch_to, causing the desync between compat
>> state and counter access trapping to propagate into other tasks.
>> 
>> Agreed that this change gunks up a previously simple macro though. Maybe
>> we could have SET_PERSONALITY only set a flag in arch_elf_state and
>> arch_setup_new_exec() take over setting TIF_32BIT? Or move
>> set_personality's implementation to a helper? Or something else?
>
> Is reading ckntkctl_el1 as expensive as the write? If not we could
> change the __switch_to() logic to simply check that the bit is set or
> cleared depending on is_compat_thread() and skip the comparison with the
> previous task.

Ah, yes that's a good idea. From a quick check on an N1 core, it looks
like reading cntkctl has about the same time cost as reading a general
purpose register. I'd assume the other cores affected by the erratum
would be similar, so I'll go that route. Thanks,

Scott

> -- 
> Catalin
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
index 97932fbf973d..24036b914226 100644
--- a/arch/arm64/include/asm/elf.h
+++ b/arch/arm64/include/asm/elf.h
@@ -160,8 +160,16 @@  typedef struct user_fpsimd_state elf_fpregset_t;
 
 #define SET_PERSONALITY(ex)						\
 ({									\
+	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
+		preempt_disable();					\
 	clear_thread_flag(TIF_32BIT);					\
 	current->personality &= ~READ_IMPLIES_EXEC;			\
+	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
+		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
+			sysreg_clear_set(cntkctl_el1, 0,		\
+					 ARCH_TIMER_USR_VCT_ACCESS_EN);	\
+		preempt_enable();					\
+	}								\
 })
 
 /* update AT_VECTOR_SIZE_ARCH if the number of NEW_AUX_ENT entries changes */
@@ -223,7 +231,16 @@  int compat_elf_check_arch(const struct elf32_hdr *);
  */
 #define COMPAT_SET_PERSONALITY(ex)					\
 ({									\
+	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040))			\
+		preempt_disable();					\
 	set_thread_flag(TIF_32BIT);					\
+	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_1418040)) {			\
+		if (this_cpu_has_cap(ARM64_WORKAROUND_1418040))		\
+			sysreg_clear_set(cntkctl_el1,			\
+					 ARCH_TIMER_USR_VCT_ACCESS_EN,	\
+					 0);				\
+		preempt_enable();					\
+	}								\
  })
 #ifdef CONFIG_COMPAT_VDSO
 #define COMPAT_ARCH_DLINFO						\