diff mbox series

[v2,1/4] x86/bugs: Add SRSO_USER_KERNEL_NO support

Message ID 20241202120416.6054-2-bp@kernel.org (mailing list archive)
State New
Headers show
Series x86/bugs: Adjust SRSO mitigation to new features | expand

Commit Message

Borislav Petkov Dec. 2, 2024, 12:04 p.m. UTC
From: "Borislav Petkov (AMD)" <bp@alien8.de>

If the machine has:

  CPUID Fn8000_0021_EAX[30] (SRSO_USER_KERNEL_NO) -- If this bit is 1,
  it indicates the CPU is not subject to the SRSO vulnerability across
  user/kernel boundaries.

have it fall back to IBPB on VMEXIT only, in the case it is going to run
VMs:

  Speculative Return Stack Overflow: CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT
  Speculative Return Stack Overflow: Mitigation: IBPB on VMEXIT only

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/bugs.c         | 6 ++++++
 arch/x86/kernel/cpu/common.c       | 1 +
 3 files changed, 8 insertions(+)

Comments

Josh Poimboeuf Dec. 10, 2024, 6:53 a.m. UTC | #1
On Mon, Dec 02, 2024 at 01:04:13PM +0100, Borislav Petkov wrote:
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -2615,6 +2615,11 @@ static void __init srso_select_mitigation(void)
>  		break;
>  
>  	case SRSO_CMD_SAFE_RET:
> +		if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO)) {
> +			pr_notice("CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT\n");
> +			goto ibpb_on_vmexit;

The presence of SRSO_USER_KERNEL_NO should indeed change the default,
but if the user requests "safe_ret" specifically, shouldn't we give it
to them?  That would be more consistent with how we handle requested
mitigations.

Also it doesn't really make sense to add a printk here as the mitigation
will be printed at the end of the function.
Borislav Petkov Dec. 10, 2024, 3:37 p.m. UTC | #2
On Mon, Dec 09, 2024 at 10:53:31PM -0800, Josh Poimboeuf wrote:
> The presence of SRSO_USER_KERNEL_NO should indeed change the default,
> but if the user requests "safe_ret" specifically, shouldn't we give it
> to them?

Hardly a valid use case except for debugging but if you're doing that, you can
change the kernel too.

Because we just fixed this and if some poor soul has left
spec_rstack_overflow=safe-ret in her /etc/default/grub (it has happened to me
a bunch on my test boxes), she'll never get her performance back and that
would be a yucky situation.

> That would be more consistent with how we handle requested
> mitigations.

Yeah, but if there were a point to this, I guess. We don't really need this
mitigation as there's nothing to mitigate on the user/kernel transition
anymore.

> Also it doesn't really make sense to add a printk here as the mitigation
> will be printed at the end of the function.

This is us letting the user know that we don't need Safe-RET anymore and we're
falling back. But I'm not that hung up on that printk...

Thx.
Josh Poimboeuf Dec. 11, 2024, 7:53 a.m. UTC | #3
On Tue, Dec 10, 2024 at 04:37:10PM +0100, Borislav Petkov wrote:
> > Also it doesn't really make sense to add a printk here as the mitigation
> > will be printed at the end of the function.
> 
> This is us letting the user know that we don't need Safe-RET anymore and we're
> falling back. But I'm not that hung up on that printk...

The printk makes sense when it's actually a fallback from
"spec_rstack_overflow=safe-ret", but if nothing was specified on the
cmdline, it's the default rather than a fallback.  In which case I think
the printk would be confusing.
Borislav Petkov Dec. 11, 2024, 8:38 p.m. UTC | #4
On Tue, Dec 10, 2024 at 11:53:15PM -0800, Josh Poimboeuf wrote:
> The printk makes sense when it's actually a fallback from
> "spec_rstack_overflow=safe-ret"

Well, it kinda is as safe-ret is the default and we're falling back from it...

> but if nothing was specified on the cmdline, it's the default rather than
> a fallback.  In which case I think the printk would be confusing.

... but as I said, I'm not hung on that printk - zapped it is.

Btw, Sean, how should we merge this?

Should I take it all through tip and give you an immutable branch?
Sean Christopherson Dec. 11, 2024, 10:35 p.m. UTC | #5
On Wed, Dec 11, 2024, Borislav Petkov wrote:
> Btw, Sean, how should we merge this?
> 
> Should I take it all through tip and give you an immutable branch?

Hmm, that should work.  I don't anticipate any conflicts other than patch 2
(Advertise SRSO_USER_KERNEL_NO to userspace), which is amusingly the most trivial
patch.

Patch 2 is going to conflict with the CPUID/cpu_caps rework[*], but the conflict
won't be hard to resolve, and I'm pretty sure that if I merge in your branch after
applying the rework, the merge commit will show an "obviously correct" resolution.
Or if I screw it up, an obviously wrong resolution :-)

Alternatively, take 1, 3, and 4 through tip, and 2 through my tree, but that
seems unnecessarily convoluted.

[*] https://lore.kernel.org/all/20241128013424.4096668-40-seanjc@google.com
Borislav Petkov Dec. 16, 2024, 5:21 p.m. UTC | #6
On Wed, Dec 11, 2024 at 02:35:39PM -0800, Sean Christopherson wrote:
> On Wed, Dec 11, 2024, Borislav Petkov wrote:
> > Btw, Sean, how should we merge this?
> > 
> > Should I take it all through tip and give you an immutable branch?
> 
> Hmm, that should work.  I don't anticipate any conflicts other than patch 2
> (Advertise SRSO_USER_KERNEL_NO to userspace), which is amusingly the most trivial
> patch.
> 
> Patch 2 is going to conflict with the CPUID/cpu_caps rework[*], but the conflict
> won't be hard to resolve, and I'm pretty sure that if I merge in your branch after
> applying the rework, the merge commit will show an "obviously correct" resolution.
> Or if I screw it up, an obviously wrong resolution :-)
> 
> Alternatively, take 1, 3, and 4 through tip, and 2 through my tree, but that
> seems unnecessarily convoluted.

Ok, lemme queue them all through tip and we'll see what conflicts we encounter
along the way and then sync again.

Thx.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 17b6590748c0..2787227a8b42 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -464,6 +464,7 @@ 
 #define X86_FEATURE_SBPB		(20*32+27) /* Selective Branch Prediction Barrier */
 #define X86_FEATURE_IBPB_BRTYPE		(20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
 #define X86_FEATURE_SRSO_NO		(20*32+29) /* CPU is not affected by SRSO */
+#define X86_FEATURE_SRSO_USER_KERNEL_NO	(20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 47a01d4028f6..8854d9bce2a5 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2615,6 +2615,11 @@  static void __init srso_select_mitigation(void)
 		break;
 
 	case SRSO_CMD_SAFE_RET:
+		if (boot_cpu_has(X86_FEATURE_SRSO_USER_KERNEL_NO)) {
+			pr_notice("CPU user/kernel transitions protected, falling back to IBPB-on-VMEXIT\n");
+			goto ibpb_on_vmexit;
+		}
+
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			/*
 			 * Enable the return thunk for generated code
@@ -2658,6 +2663,7 @@  static void __init srso_select_mitigation(void)
 		}
 		break;
 
+ibpb_on_vmexit:
 	case SRSO_CMD_IBPB_ON_VMEXIT:
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a5c28975c608..954f9c727f11 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1270,6 +1270,7 @@  static const struct x86_cpu_id cpu_vuln_blacklist[] __initconst = {
 	VULNBL_AMD(0x17, RETBLEED | SMT_RSB | SRSO),
 	VULNBL_HYGON(0x18, RETBLEED | SMT_RSB | SRSO),
 	VULNBL_AMD(0x19, SRSO),
+	VULNBL_AMD(0x1a, SRSO),
 	{}
 };