
[v4,2/5] arm64: errata: Assume that unknown CPUs _are_ vulnerable to Spectre BHB

Message ID 20250107120555.v4.2.I2040fa004dafe196243f67ebcc647cbedbb516e6@changeid (mailing list archive)
State Not Applicable, archived
Series arm64: errata: Rework Spectre BHB mitigations to not assume "safe"

Commit Message

Doug Anderson Jan. 7, 2025, 8:05 p.m. UTC
The code for detecting CPUs that are vulnerable to Spectre BHB was
based on a hardcoded list of CPU IDs that were known to be affected.
Unfortunately, the list mostly only contained the IDs of standard ARM
cores. The IDs for many cores that are minor variants of the standard
ARM cores (like many Qualcomm Kryo CPUs) weren't listed. This led the
code to assume that those variants were not affected.

Flip the code on its head and instead assume that a core is vulnerable
if it doesn't have CSV2_3 and isn't recognized as safe. This
involves creating a "Spectre BHB safe" list.

As of right now, the only CPU IDs added to the "Spectre BHB safe" list
are ARM Cortex A35, A53, A55, A510, and A520. This list was created by
looking for cores that weren't listed in ARM's list [1] as per review
feedback on v2 of this patch [2]. Additionally Brahma B53 is added as
per mailing list feedback [3].

NOTE: this patch will not actually _mitigate_ anyone; it will simply
cause them to report themselves as vulnerable. If any cores in the
system are reported as vulnerable but not mitigated, then the whole
system will be reported as vulnerable, though the system will attempt
to mitigate with the information it has about the known cores.
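
To illustrate, the resulting detection logic boils down to roughly the
following (a simplified sketch of the diff below; the real code also
updates `max_bhb_k` for the loop mitigation before returning true):

  bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
                               int scope)
  {
          if (supports_csv2p3(scope))
                  return false;   /* hardware reports immunity (CSV2_3) */

          if (is_spectre_bhb_safe(scope))
                  return false;   /* on the known-safe MIDR list */

          /* Everything else is now assumed vulnerable. */
          return true;
  }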

[1] https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB
[2] https://lore.kernel.org/r/20241219175128.GA25477@willie-the-truck
[3] https://lore.kernel.org/r/18dbd7d1-a46c-4112-a425-320c99f67a8d@broadcom.com


Fixes: 558c303c9734 ("arm64: Mitigate spectre style branch history side channels")
Cc: stable@vger.kernel.org
Reviewed-by: Julius Werner <jwerner@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v4:
- Add MIDR_BRAHMA_B53 as safe.
- Get rid of `spectre_bhb_firmware_mitigated_list`.

Changes in v3:
- Don't guess the mitigation; just report unknown cores as vulnerable.
- Restructure the code since is_spectre_bhb_affected() defaults to true.

Changes in v2:
- New

 arch/arm64/include/asm/spectre.h |   1 -
 arch/arm64/kernel/proton-pack.c  | 203 ++++++++++++++++---------------
 2 files changed, 102 insertions(+), 102 deletions(-)

Comments

James Morse Jan. 29, 2025, 4:43 p.m. UTC | #1
Hi Doug,

On 07/01/2025 20:05, Douglas Anderson wrote:
> The code for detecting CPUs that are vulnerable to Spectre BHB was
> based on a hardcoded list of CPU IDs that were known to be affected.
> Unfortunately, the list mostly only contained the IDs of standard ARM
> cores. The IDs for many cores that are minor variants of the standard
> ARM cores (like many Qualcomm Kryo CPUs) weren't listed. This led the
> code to assume that those variants were not affected.
> 
> Flip the code on its head and instead assume that a core is vulnerable
> if it doesn't have CSV2_3 and isn't recognized as safe. This
> involves creating a "Spectre BHB safe" list.
> 
> As of right now, the only CPU IDs added to the "Spectre BHB safe" list
> are ARM Cortex A35, A53, A55, A510, and A520. This list was created by
> looking for cores that weren't listed in ARM's list [1] as per review
> feedback on v2 of this patch [2]. Additionally Brahma B53 is added as
> per mailing list feedback [3].
> 
> NOTE: this patch will not actually _mitigate_ anyone; it will simply
> cause them to report themselves as vulnerable. If any cores in the
> system are reported as vulnerable but not mitigated, then the whole
> system will be reported as vulnerable, though the system will attempt
> to mitigate with the information it has about the known cores.

> arch/arm64/include/asm/spectre.h |   1 -
> arch/arm64/kernel/proton-pack.c  | 203 ++++++++++++++++---------------
> 2 files changed, 102 insertions(+), 102 deletions(-)

This is a pretty hefty diff-stat for adding a list of six CPUs. It looks like there are
multiple things going on here: I think you're adding the 'safe' list of CPUs, then
removing the firmware-mitigated CPU list, then removing some indentation to do the
mitigation detection differently. Any chance this can be split up?
(I'm not sure about the last chunk - it breaks automatic backporting)


> diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
> index e149efadff20..17aa836fe46d 100644
> --- a/arch/arm64/kernel/proton-pack.c
> +++ b/arch/arm64/kernel/proton-pack.c


> +static u8 spectre_bhb_loop_affected(void)
>  {
>  	u8 k = 0;
> -	static u8 max_bhb_k;
> -
> -	if (scope == SCOPE_LOCAL_CPU) {
> -		static const struct midr_range spectre_bhb_k32_list[] = {
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
> -			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> -			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> -			{},
> -		};
> -		static const struct midr_range spectre_bhb_k24_list[] = {
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
> -			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> -			MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
> -			{},
> -		};
> -		static const struct midr_range spectre_bhb_k11_list[] = {
> -			MIDR_ALL_VERSIONS(MIDR_AMPERE1),
> -			{},
> -		};
> -		static const struct midr_range spectre_bhb_k8_list[] = {
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> -			MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> -			{},
> -		};
> -
> -		if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
> -			k = 32;
> -		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
> -			k = 24;
> -		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
> -			k = 11;
> -		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
> -			k =  8;
> -
> -		max_bhb_k = max(max_bhb_k, k);
> -	} else {
> -		k = max_bhb_k;
> -	}
> +
> +	static const struct midr_range spectre_bhb_k32_list[] = {
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
> +		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> +		{},
> +	};
> +	static const struct midr_range spectre_bhb_k24_list[] = {
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
> +		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> +		MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
> +		{},
> +	};
> +	static const struct midr_range spectre_bhb_k11_list[] = {
> +		MIDR_ALL_VERSIONS(MIDR_AMPERE1),
> +		{},
> +	};
> +	static const struct midr_range spectre_bhb_k8_list[] = {
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> +		MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> +		{},
> +	};
> +
> +	if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
> +		k = 32;
> +	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
> +		k = 24;
> +	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
> +		k = 11;
> +	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
> +		k =  8;
>  
>  	return k;
>  }

This previous hunk reduces the indent to remove the static variable from inside the
function. Your patch-1 can be picked up automatically by stable branches, but after this
change, that will have to be done by hand. Arm have recently updated that table of CPUs
with extra entries (thanks for picking those up!) - but now that patch can't be easily
applied to older kernels.
I suspect making the reporting assume-vulnerable may make other CPUs come out of the
woodwork too...

Could we avoid changing this unless we really need to?


As background, the idea is that CPUs that are newer than the kernel will discover the need
for mitigation from firmware. That sucks for performance, but it can be improved once the
kernel is updated to know about the 'local' workaround.
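
For reference, that firmware discovery is the SMCCC probe behind
spectre_bhb_get_cpu_fw_mitigation_state() in proton-pack.c. A rough
sketch, assuming the usual arm-smccc constant names:

  static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void)
  {
          struct arm_smccc_res res;

          /* Ask firmware whether SMCCC_ARCH_WORKAROUND_3 is implemented. */
          arm_smccc_1_1_invoke(ARM_SMCCC_ARCH_FEATURES_FUNC_ID,
                               ARM_SMCCC_ARCH_WORKAROUND_3, &res);

          switch ((int)res.a0) {
          case SMCCC_RET_SUCCESS:
                  return SPECTRE_MITIGATED;
          case SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED:
                  return SPECTRE_UNAFFECTED;
          default:
                  return SPECTRE_VULNERABLE;
          }
  }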


> @@ -955,6 +956,8 @@ static bool supports_ecbhb(int scope)
>  						    ID_AA64MMFR1_EL1_ECBHB_SHIFT);
>  }
>  
> +static u8 max_bhb_k;
> +
>  bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
>  			     int scope)
>  {
> @@ -963,16 +966,18 @@ bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
>  	if (supports_csv2p3(scope))
>  		return false;
>  
> -	if (supports_clearbhb(scope))
> -		return true;
> -
> -	if (spectre_bhb_loop_affected(scope))
> -		return true;
> +	if (is_spectre_bhb_safe(scope))
> +		return false;
>  
> -	if (is_spectre_bhb_fw_affected(scope))
> -		return true;
> +	/*
> +	 * At this point the core isn't known to be "safe" so we're going to
> +	 * assume it's vulnerable. We still need to update `max_bhb_k` though,
> +	 * but only if we aren't mitigating with clearbhb.

+	or firmware.


> +	 */
> +	if (scope == SCOPE_LOCAL_CPU && !supports_clearbhb(SCOPE_LOCAL_CPU))
> +		max_bhb_k = max(max_bhb_k, spectre_bhb_loop_affected());

CPUs that need a firmware mitigation will get in here too. It's probably harmless as
they'll report '0' as their k value... but you went to the trouble to weed out the CPUs
that support the clearbhb instruction ...

For the change described in the commit message, isn't it enough to replace the final
'return false' with 'return !is_spectre_bhb_safe(scope)' ?
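
i.e. keeping the earlier checks as-is and only changing the tail,
something like:

        if (spectre_bhb_loop_affected(scope))
                return true;

        if (is_spectre_bhb_fw_affected(scope))
                return true;

        return !is_spectre_bhb_safe(scope);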


>  
> -	return false;
> +	return true;
>  }


Thanks,

James
Doug Anderson Jan. 29, 2025, 7:14 p.m. UTC | #2
Hi,


On Wed, Jan 29, 2025 at 8:43 AM James Morse <james.morse@arm.com> wrote:
>
> Hi Doug,
>
> On 07/01/2025 20:05, Douglas Anderson wrote:
> > The code for detecting CPUs that are vulnerable to Spectre BHB was
> > based on a hardcoded list of CPU IDs that were known to be affected.
> > Unfortunately, the list mostly only contained the IDs of standard ARM
> > cores. The IDs for many cores that are minor variants of the standard
> > ARM cores (like many Qualcomm Kryo CPUs) weren't listed. This led the
> > code to assume that those variants were not affected.
> >
> > Flip the code on its head and instead assume that a core is vulnerable
> > if it doesn't have CSV2_3 and isn't recognized as safe. This
> > involves creating a "Spectre BHB safe" list.
> >
> > As of right now, the only CPU IDs added to the "Spectre BHB safe" list
> > are ARM Cortex A35, A53, A55, A510, and A520. This list was created by
> > looking for cores that weren't listed in ARM's list [1] as per review
> > feedback on v2 of this patch [2]. Additionally Brahma B53 is added as
> > per mailing list feedback [3].
> >
> > NOTE: this patch will not actually _mitigate_ anyone; it will simply
> > cause them to report themselves as vulnerable. If any cores in the
> > system are reported as vulnerable but not mitigated, then the whole
> > system will be reported as vulnerable, though the system will attempt
> > to mitigate with the information it has about the known cores.
>
> > arch/arm64/include/asm/spectre.h |   1 -
> > arch/arm64/kernel/proton-pack.c  | 203 ++++++++++++++++---------------
> > 2 files changed, 102 insertions(+), 102 deletions(-)
>
> This is a pretty hefty diff-stat for adding a list of six CPUs. It looks like there are
> multiple things going on here: I think you're adding the 'safe' list of CPUs, then
> removing the firmware-mitigated CPU list, then removing some indentation to do the
> mitigation detection differently. Any chance this can be split up?

I have to get back into the headspace of this patch since it's been a
few weeks, but let's see.

At the end, you suggest that maybe we could simplify this patch by
just adding the function is_spectre_bhb_safe(), calling it at the
beginning of is_spectre_bhb_affected(), and then changing
is_spectre_bhb_affected() to return "true" at the end instead of
false. Looking back at the patch, I believe this is correct.

...that being said, all of the other changes that are part of this
patch came in response to review feedback on earlier versions of the
patch and do make sense. Specifically:

1. Once the default is to return true, there's no longer a reason to
call supports_clearbhb() and is_spectre_bhb_fw_affected() in
is_spectre_bhb_affected(), right? Those two functions' _only_ job was
to be able to cause is_spectre_bhb_affected() to sometimes return
"true" instead of "false", but the function returns true by default
now so the calls have no purpose.

2. After we do #1, it doesn't really make sense to still have the
snippet of code that looks like this:

  if (spectre_bhb_loop_affected(scope))
    return true;

  return true;

We could just change that to:

  spectre_bhb_loop_affected(scope);
  return true;

...but then it seems really strange to call a function that returns a
value and ignore the result. As per review feedback on the previous
versions, I moved the calculation of "max_bhb_k" out of
spectre_bhb_loop_affected() and made that just for local scope and it
turned out much cleaner.

3. Given that we no longer have a reason to call
is_spectre_bhb_fw_affected() in is_spectre_bhb_affected(), it also
made sense to get rid of the "scope" variable for that function and
simplify it and the other caller.
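
For reference, that helper ends up as just the following (taken from the
diff below):

  static bool has_spectre_bhb_fw_mitigation(void)
  {
          enum mitigation_state fw_state;
          bool has_smccc = arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_NONE;

          fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
          return has_smccc && fw_state == SPECTRE_MITIGATED;
  }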

...so I could break some of that stuff up into separate patches, but I
don't think we'd just want to drop any of the changes.


> (I'm not sure about the last chunk - it breaks automatic backporting)

Why would it break things? This entire series should just be
backported including the change of the default value to "affected". If
someone is running an old "stable" kernel and they're missing a
mitigation this would make it more obvious to them. Without
backporting everything then people running old kernels but lacking a
needed FW mitigation would show as "unaffected" when really they are
"vulnerable".


> > diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
> > index e149efadff20..17aa836fe46d 100644
> > --- a/arch/arm64/kernel/proton-pack.c
> > +++ b/arch/arm64/kernel/proton-pack.c
>
>
> > +static u8 spectre_bhb_loop_affected(void)
> >  {
> >       u8 k = 0;
> > -     static u8 max_bhb_k;
> > -
> > -     if (scope == SCOPE_LOCAL_CPU) {
> > -             static const struct midr_range spectre_bhb_k32_list[] = {
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
> > -                     MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> > -                     MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> > -                     {},
> > -             };
> > -             static const struct midr_range spectre_bhb_k24_list[] = {
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
> > -                     MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> > -                     MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
> > -                     {},
> > -             };
> > -             static const struct midr_range spectre_bhb_k11_list[] = {
> > -                     MIDR_ALL_VERSIONS(MIDR_AMPERE1),
> > -                     {},
> > -             };
> > -             static const struct midr_range spectre_bhb_k8_list[] = {
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> > -                     MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> > -                     {},
> > -             };
> > -
> > -             if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
> > -                     k = 32;
> > -             else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
> > -                     k = 24;
> > -             else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
> > -                     k = 11;
> > -             else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
> > -                     k =  8;
> > -
> > -             max_bhb_k = max(max_bhb_k, k);
> > -     } else {
> > -             k = max_bhb_k;
> > -     }
> > +
> > +     static const struct midr_range spectre_bhb_k32_list[] = {
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
> > +             MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> > +             MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
> > +             {},
> > +     };
> > +     static const struct midr_range spectre_bhb_k24_list[] = {
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
> > +             MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
> > +             MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
> > +             {},
> > +     };
> > +     static const struct midr_range spectre_bhb_k11_list[] = {
> > +             MIDR_ALL_VERSIONS(MIDR_AMPERE1),
> > +             {},
> > +     };
> > +     static const struct midr_range spectre_bhb_k8_list[] = {
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
> > +             MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
> > +             {},
> > +     };
> > +
> > +     if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
> > +             k = 32;
> > +     else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
> > +             k = 24;
> > +     else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
> > +             k = 11;
> > +     else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
> > +             k =  8;
> >
> >       return k;
> >  }
>
> This previous hunk reduces the indent to remove the static variable from inside the
> function. Your patch-1 can be picked up automatically by stable branches, but after this
> change, that will have to be done by hand.

As per above, I think the whole series should simply be backported so
this shouldn't be an issue.


> Arm have recently updated that table of CPUs
> with extra entries (thanks for picking those up!) - but now that patch can't be easily
> applied to older kernels.
> I suspect making the reporting assume-vulnerable may make other CPUs come out of the
> woodwork too...
>
> Could we avoid changing this unless we really need to?

Will / Catalin: Do either of you have an opinion here?


> As background, the idea is that CPUs that are newer than the kernel will discover the need
> for mitigation from firmware. That sucks for performance, but it can be improved once the
> kernel is updated to know about the 'local' workaround.
>
>
> > @@ -955,6 +956,8 @@ static bool supports_ecbhb(int scope)
> >                                                   ID_AA64MMFR1_EL1_ECBHB_SHIFT);
> >  }
> >
> > +static u8 max_bhb_k;
> > +
> >  bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
> >                            int scope)
> >  {
> > @@ -963,16 +966,18 @@ bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
> >       if (supports_csv2p3(scope))
> >               return false;
> >
> > -     if (supports_clearbhb(scope))
> > -             return true;
> > -
> > -     if (spectre_bhb_loop_affected(scope))
> > -             return true;
> > +     if (is_spectre_bhb_safe(scope))
> > +             return false;
> >
> > -     if (is_spectre_bhb_fw_affected(scope))
> > -             return true;
> > +     /*
> > +      * At this point the core isn't known to be "safe" so we're going to
> > +      * assume it's vulnerable. We still need to update `max_bhb_k` though,
> > +      * but only if we aren't mitigating with clearbhb.
>
> +       or firmware.

No, I believe my comment is correct as-is. The code in
spectre_bhb_enable_mitigation() prior to my change prefers a loop
mitigation over a firmware mitigation. My patch doesn't change that
behavior. So we'll always update `max_bhb_k` even if there is a FW
mitigation that might be in place. Updating it by comparing it to a
value that might be 0 is still an "update" in my mind.

If a FW mitigation _is_ needed for a given core then
spectre_bhb_loop_affected() should return a "k" of 0. That will fall
through to the FW mitigation.

...so in the context of this code here, we always will "update"
`max_bhb_k` regardless of whether there is a firmware mitigation.
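
To make that ordering concrete, the mitigation selection in
spectre_bhb_enable_mitigation() after this patch looks roughly like this
(sketched from the diff below, leaving aside the ECBHB hardware case):

          if (supports_clearbhb(SCOPE_LOCAL_CPU)) {
                  /* 1. CLRBHB instruction in the exception vectors */
          } else if (spectre_bhb_loop_affected()) {
                  /* 2. branchy loop, k iterations per the MIDR lists */
          } else if (has_spectre_bhb_fw_mitigation()) {
                  /* 3. firmware WA3 call, as a last resort */
          }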


> > +      */
> > +     if (scope == SCOPE_LOCAL_CPU && !supports_clearbhb(SCOPE_LOCAL_CPU))
> > +             max_bhb_k = max(max_bhb_k, spectre_bhb_loop_affected());
>
> CPUs that need a firmware mitigation will get in here too. It's probably harmless as
> they'll report '0' as their k value... but you went to the trouble to weed out the CPUs
> that support the clearbhb instruction ...

In order to understand why the patch does what it does, I derived my
understanding of how mitigations should be applied from the existing
behavior of spectre_bhb_enable_mitigation(). Specifically, if
a FW mitigation is _needed_ for a given core then it's important that
spectre_bhb_loop_affected() return 0. This means that whoever wrote
that code assumes that the loop mitigation trumps the FW mitigation.

I don't think there's any other reason for weeding more CPUs out.


> For the change described in the commit message, isn't it enough to replace the final
> 'return false' with 'return !is_spectre_bhb_safe(scope)' ?

See my response to your first comment.


So I guess in summary I won't plan on sending a v5 at this time and
I'd hope that v4 can still land (maybe Qualcomm can give an Ack?). If
necessary I could break this patch into more than one to break the
diffstat into multiple patches, but I'd prefer to still keep the
resulting code the same. I like how it turned out and it matches all
the existing review feedback that Julius provided. If Will or Catalin
weighs in, or enough other people vote that I should make more
changes, then I could do it.

I could move the last two patches up to the front and send a v5 if
needed. That would make the short-term job of backporting those
faster, but any future cores would have the same problem. As per
above, I'm of the opinion that the whole series should go to stable
for as far back as people are willing to take it (which is why I
CCed stable on all the patches).


-Doug
Catalin Marinas Feb. 12, 2025, 5:52 p.m. UTC | #3
On Wed, Jan 29, 2025 at 11:14:20AM -0800, Doug Anderson wrote:
> On Wed, Jan 29, 2025 at 8:43 AM James Morse <james.morse@arm.com> wrote:
> > Arm have recently updated that table of CPUs
> > with extra entries (thanks for picking those up!) - but now that patch can't be easily
> > applied to older kernels.
> > I suspect making the reporting assume-vulnerable may make other CPUs come out of the
> > woodwork too...
> >
> > Could we avoid changing this unless we really need to?
> 
> Will / Catalin: Do either of you have an opinion here?

Is this about whether to report "vulnerable" for unknown CPUs? I think
Will suggested this:

https://lore.kernel.org/all/20241219175128.GA25477@willie-the-truck/

That said, some patch splitting will help to make review easier. Should
such a change be back-portable as well? I think so; it's not only for CPUs
we'll see in the future.
Doug Anderson Feb. 12, 2025, 6:05 p.m. UTC | #4
Hi,

On Wed, Feb 12, 2025 at 9:52 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Wed, Jan 29, 2025 at 11:14:20AM -0800, Doug Anderson wrote:
> > On Wed, Jan 29, 2025 at 8:43 AM James Morse <james.morse@arm.com> wrote:
> > > Arm have recently updated that table of CPUs
> > > with extra entries (thanks for picking those up!) - but now that patch can't be easily
> > > applied to older kernels.
> > > I suspect making the reporting assume-vulnerable may make other CPUs come out of the
> > > woodwork too...
> > >
> > > Could we avoid changing this unless we really need to?
> >
> > Will / Catalin: Do either of you have an opinion here?
>
> Is this about whether to report "vulnerable" for unknown CPUs? I think
> Will suggested this:
>
> https://lore.kernel.org/all/20241219175128.GA25477@willie-the-truck/

Right. The patch in its current form is in direct response to what
Will requested and then subsequent feedback on the mailing lists from
Julius Werner.


> That said, some patch splitting will help to make review easier. Should
> such a change be back-portable as well? I think so; it's not only for CPUs
> we'll see in the future.

Personally, I don't think the patch will be terribly hard to backport
as-is. I would expect it to just cherry-pick cleanly since it only
touches spectre code and I'd imagine all that stuff is being
backported verbatim. I did at least confirm that it applies cleanly to
v5.15.178 (I didn't try compiling though). I guess there are conflicts
back to v5.10.234, though...

While I've had plenty of time to work on this patch in the last three
months since I posted the early versions, recently my assignments at
work have changed and I have a lot less time to work on this patch
series. If breaking this up is blocking this patch from landing then
I'll try to find some time in the next month or two to do it. Let me
know.

-Doug

Patch

diff --git a/arch/arm64/include/asm/spectre.h b/arch/arm64/include/asm/spectre.h
index 0c4d9045c31f..f1524cdeacf1 100644
--- a/arch/arm64/include/asm/spectre.h
+++ b/arch/arm64/include/asm/spectre.h
@@ -97,7 +97,6 @@  enum mitigation_state arm64_get_meltdown_state(void);
 
 enum mitigation_state arm64_get_spectre_bhb_state(void);
 bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry, int scope);
-u8 spectre_bhb_loop_affected(int scope);
 void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *__unused);
 bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr);
 
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index e149efadff20..17aa836fe46d 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -845,53 +845,70 @@  static unsigned long system_bhb_mitigations;
  * This must be called with SCOPE_LOCAL_CPU for each type of CPU, before any
  * SCOPE_SYSTEM call will give the right answer.
  */
-u8 spectre_bhb_loop_affected(int scope)
+static bool is_spectre_bhb_safe(int scope)
+{
+	static const struct midr_range spectre_bhb_safe_list[] = {
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A35),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A53),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A510),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A520),
+		MIDR_ALL_VERSIONS(MIDR_BRAHMA_B53),
+		{},
+	};
+	static bool all_safe = true;
+
+	if (scope != SCOPE_LOCAL_CPU)
+		return all_safe;
+
+	if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_safe_list))
+		return true;
+
+	all_safe = false;
+
+	return false;
+}
+
+static u8 spectre_bhb_loop_affected(void)
 {
 	u8 k = 0;
-	static u8 max_bhb_k;
-
-	if (scope == SCOPE_LOCAL_CPU) {
-		static const struct midr_range spectre_bhb_k32_list[] = {
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
-			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
-			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
-			{},
-		};
-		static const struct midr_range spectre_bhb_k24_list[] = {
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
-			MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
-			MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
-			{},
-		};
-		static const struct midr_range spectre_bhb_k11_list[] = {
-			MIDR_ALL_VERSIONS(MIDR_AMPERE1),
-			{},
-		};
-		static const struct midr_range spectre_bhb_k8_list[] = {
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
-			MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
-			{},
-		};
-
-		if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
-			k = 32;
-		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
-			k = 24;
-		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
-			k = 11;
-		else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
-			k =  8;
-
-		max_bhb_k = max(max_bhb_k, k);
-	} else {
-		k = max_bhb_k;
-	}
+
+	static const struct midr_range spectre_bhb_k32_list[] = {
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78AE),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A78C),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_X1),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_X2),
+		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1),
+		{},
+	};
+	static const struct midr_range spectre_bhb_k24_list[] = {
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A76),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A77),
+		MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1),
+		MIDR_ALL_VERSIONS(MIDR_QCOM_KRYO_4XX_GOLD),
+		{},
+	};
+	static const struct midr_range spectre_bhb_k11_list[] = {
+		MIDR_ALL_VERSIONS(MIDR_AMPERE1),
+		{},
+	};
+	static const struct midr_range spectre_bhb_k8_list[] = {
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
+		{},
+	};
+
+	if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k32_list))
+		k = 32;
+	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k24_list))
+		k = 24;
+	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k11_list))
+		k = 11;
+	else if (is_midr_in_range_list(read_cpuid_id(), spectre_bhb_k8_list))
+		k =  8;
 
 	return k;
 }
@@ -917,29 +934,13 @@  static enum mitigation_state spectre_bhb_get_cpu_fw_mitigation_state(void)
 	}
 }
 
-static bool is_spectre_bhb_fw_affected(int scope)
+static bool has_spectre_bhb_fw_mitigation(void)
 {
-	static bool system_affected;
 	enum mitigation_state fw_state;
 	bool has_smccc = arm_smccc_1_1_get_conduit() != SMCCC_CONDUIT_NONE;
-	static const struct midr_range spectre_bhb_firmware_mitigated_list[] = {
-		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
-		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
-		{},
-	};
-	bool cpu_in_list = is_midr_in_range_list(read_cpuid_id(),
-					 spectre_bhb_firmware_mitigated_list);
-
-	if (scope != SCOPE_LOCAL_CPU)
-		return system_affected;
 
 	fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
-	if (cpu_in_list || (has_smccc && fw_state == SPECTRE_MITIGATED)) {
-		system_affected = true;
-		return true;
-	}
-
-	return false;
+	return has_smccc && fw_state == SPECTRE_MITIGATED;
 }
 
 static bool supports_ecbhb(int scope)
@@ -955,6 +956,8 @@  static bool supports_ecbhb(int scope)
 						    ID_AA64MMFR1_EL1_ECBHB_SHIFT);
 }
 
+static u8 max_bhb_k;
+
 bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
 			     int scope)
 {
@@ -963,16 +966,18 @@  bool is_spectre_bhb_affected(const struct arm64_cpu_capabilities *entry,
 	if (supports_csv2p3(scope))
 		return false;
 
-	if (supports_clearbhb(scope))
-		return true;
-
-	if (spectre_bhb_loop_affected(scope))
-		return true;
+	if (is_spectre_bhb_safe(scope))
+		return false;
 
-	if (is_spectre_bhb_fw_affected(scope))
-		return true;
+	/*
+	 * At this point the core isn't known to be "safe" so we're going to
+	 * assume it's vulnerable. We still need to update `max_bhb_k` though,
+	 * but only if we aren't mitigating with clearbhb.
+	 */
+	if (scope == SCOPE_LOCAL_CPU && !supports_clearbhb(SCOPE_LOCAL_CPU))
+		max_bhb_k = max(max_bhb_k, spectre_bhb_loop_affected());
 
-	return false;
+	return true;
 }
 
 static void this_cpu_set_vectors(enum arm64_bp_harden_el1_vectors slot)
@@ -1003,7 +1008,7 @@  early_param("nospectre_bhb", parse_spectre_bhb_param);
 void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 {
 	bp_hardening_cb_t cpu_cb;
-	enum mitigation_state fw_state, state = SPECTRE_VULNERABLE;
+	enum mitigation_state state = SPECTRE_VULNERABLE;
 	struct bp_hardening_data *data = this_cpu_ptr(&bp_hardening_data);
 
 	if (!is_spectre_bhb_affected(entry, SCOPE_LOCAL_CPU))
@@ -1029,7 +1034,7 @@  void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 		this_cpu_set_vectors(EL1_VECTOR_BHB_CLEAR_INSN);
 		state = SPECTRE_MITIGATED;
 		set_bit(BHB_INSN, &system_bhb_mitigations);
-	} else if (spectre_bhb_loop_affected(SCOPE_LOCAL_CPU)) {
+	} else if (spectre_bhb_loop_affected()) {
 		/*
 		 * Ensure KVM uses the indirect vector which will have the
 		 * branchy-loop added. A57/A72-r0 will already have selected
@@ -1042,32 +1047,29 @@  void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 		this_cpu_set_vectors(EL1_VECTOR_BHB_LOOP);
 		state = SPECTRE_MITIGATED;
 		set_bit(BHB_LOOP, &system_bhb_mitigations);
-	} else if (is_spectre_bhb_fw_affected(SCOPE_LOCAL_CPU)) {
-		fw_state = spectre_bhb_get_cpu_fw_mitigation_state();
-		if (fw_state == SPECTRE_MITIGATED) {
-			/*
-			 * Ensure KVM uses one of the spectre bp_hardening
-			 * vectors. The indirect vector doesn't include the EL3
-			 * call, so needs upgrading to
-			 * HYP_VECTOR_SPECTRE_INDIRECT.
-			 */
-			if (!data->slot || data->slot == HYP_VECTOR_INDIRECT)
-				data->slot += 1;
-
-			this_cpu_set_vectors(EL1_VECTOR_BHB_FW);
-
-			/*
-			 * The WA3 call in the vectors supersedes the WA1 call
-			 * made during context-switch. Uninstall any firmware
-			 * bp_hardening callback.
-			 */
-			cpu_cb = spectre_v2_get_sw_mitigation_cb();
-			if (__this_cpu_read(bp_hardening_data.fn) != cpu_cb)
-				__this_cpu_write(bp_hardening_data.fn, NULL);
-
-			state = SPECTRE_MITIGATED;
-			set_bit(BHB_FW, &system_bhb_mitigations);
-		}
+	} else if (has_spectre_bhb_fw_mitigation()) {
+		/*
+		 * Ensure KVM uses one of the spectre bp_hardening
+		 * vectors. The indirect vector doesn't include the EL3
+		 * call, so needs upgrading to
+		 * HYP_VECTOR_SPECTRE_INDIRECT.
+		 */
+		if (!data->slot || data->slot == HYP_VECTOR_INDIRECT)
+			data->slot += 1;
+
+		this_cpu_set_vectors(EL1_VECTOR_BHB_FW);
+
+		/*
+		 * The WA3 call in the vectors supersedes the WA1 call
+		 * made during context-switch. Uninstall any firmware
+		 * bp_hardening callback.
+		 */
+		cpu_cb = spectre_v2_get_sw_mitigation_cb();
+		if (__this_cpu_read(bp_hardening_data.fn) != cpu_cb)
+			__this_cpu_write(bp_hardening_data.fn, NULL);
+
+		state = SPECTRE_MITIGATED;
+		set_bit(BHB_FW, &system_bhb_mitigations);
 	}
 
 	update_mitigation_state(&spectre_bhb_state, state);
@@ -1101,7 +1103,6 @@  void noinstr spectre_bhb_patch_loop_iter(struct alt_instr *alt,
 {
 	u8 rd;
 	u32 insn;
-	u16 loop_count = spectre_bhb_loop_affected(SCOPE_SYSTEM);
 
 	BUG_ON(nr_inst != 1); /* MOV -> MOV */
 
@@ -1110,7 +1111,7 @@  void noinstr spectre_bhb_patch_loop_iter(struct alt_instr *alt,
 
 	insn = le32_to_cpu(*origptr);
 	rd = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RD, insn);
-	insn = aarch64_insn_gen_movewide(rd, loop_count, 0,
+	insn = aarch64_insn_gen_movewide(rd, max_bhb_k, 0,
 					 AARCH64_INSN_VARIANT_64BIT,
 					 AARCH64_INSN_MOVEWIDE_ZERO);
 	*updptr++ = cpu_to_le32(insn);