Message ID: 20241202120416.6054-4-bp@kernel.org (mailing list archive)
State:      New
Series:     x86/bugs: Adjust SRSO mitigation to new features
On Mon, Dec 02, 2024, Borislav Petkov wrote:
> diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
> index 2ad1c05b8c88..79a8f7dea06d 100644
> --- a/Documentation/admin-guide/hw-vuln/srso.rst
> +++ b/Documentation/admin-guide/hw-vuln/srso.rst
> @@ -104,7 +104,17 @@ The possible values in this file are:
>
>     (spec_rstack_overflow=ibpb-vmexit)
>
> + * 'Mitigation: Reduced Speculation':
> +
> +   This mitigation gets automatically enabled when the above one "IBPB on
> +   VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
> +
> +   Currently, the mitigation is automatically enabled when KVM enables
> +   virtualization and can incur some cost.

How much cost are we talking?

>  static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
> @@ -2665,6 +2667,12 @@ static void __init srso_select_mitigation(void)
>
>  ibpb_on_vmexit:
>  	case SRSO_CMD_IBPB_ON_VMEXIT:
> +		if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
> +			pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
> +			srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
> +			break;
> +		}
> +
>  		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
>  			if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
>  				setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index dd15cc635655..e4fad330cd25 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
>  	kvm_cpu_svm_disable();
>
>  	amd_pmu_disable_virt();
> +
> +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> +		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
>  }
>
>  static int svm_enable_virtualization_cpu(void)
> @@ -685,6 +688,9 @@ static int svm_enable_virtualization_cpu(void)
>  		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
>  	}
>
> +	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
> +		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);

IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
to mitigate potential VM=>host attacks, and that reducing speculation also
reduces overall performance.

If that's correct, then enabling the magic bit needs to be gated by an
appropriate mitigation being enabled, not forced on automatically just because
the CPU supports X86_FEATURE_SRSO_MSR_FIX.

And depending on the cost, it might also make sense to set the bit on-demand,
and then clean up when KVM disables virtualization.  E.g. wait to set the bit
until entry to a guest is imminent.
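A minimal sketch of the gating Sean is asking for, keying the MSR write to the
selected mitigation rather than to the raw CPUID bit; the
srso_bp_spec_reduce_active() helper is a hypothetical name, not an existing
kernel interface:

	/* Hypothetical helper in bugs.c: report whether the BpSpecReduce
	 * mitigation was actually selected, so callers don't key off the raw
	 * X86_FEATURE_SRSO_MSR_FIX CPUID bit alone.
	 */
	bool srso_bp_spec_reduce_active(void)
	{
		return srso_mitigation == SRSO_MITIGATION_BP_SPEC_REDUCE;
	}

	/* Then, in svm_enable_virtualization_cpu(), gated on the mitigation: */
	if (srso_bp_spec_reduce_active())
		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);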
On Wed, Dec 11, 2024 at 02:27:42PM -0800, Sean Christopherson wrote:
> How much cost are we talking?

Likely 1-2%.

That's why I'm simply enabling it by default.

> IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
> to mitigate potential VM=>host attacks, and that reducing speculation also
> reduces overall performance.
>
> If that's correct, then enabling the magic bit needs to be gated by an
> appropriate mitigation being enabled, not forced on automatically just because
> the CPU supports X86_FEATURE_SRSO_MSR_FIX.

Well, in the default case we have safe RET - the default - but since it is not
needed anymore, the selection falls back to this thing, which is needed when
the mitigation is enabled.

That's why it also is in the SRSO_CMD_IBPB_ON_VMEXIT case, as it is part of the
spec_rstack_overflow=ibpb-vmexit mitigation option.

So it kinda already does that. When you disable the mitigation, this one won't
get enabled either.

> And depending on the cost, it might also make sense to set the bit on-demand,
> and then clean up when KVM disables virtualization.  E.g. wait to set the bit
> until entry to a guest is imminent.

So the "when to set that bit" discussion kinda remained unfinished the last
time. Here's the gist:

You:

| "It's not strictly KVM module load, it's when KVM enables virtualization. E.g.
| if userspace clears enable_virt_at_load, the MSR will be toggled every time the
| number of VMs goes from 0=>1 and 1=>0.
|
| But why do this in KVM? E.g. why not set-and-forget in init_amd_zen4()?"

I:

| "Because there's no need to impose an unnecessary - albeit small - perf impact
| on users who don't do virt.
|
| I'm currently gravitating towards the MSR toggling thing, i.e., only when the
| VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
| yes, the toggling thing sounds better. I.e., set it only when really needed."

So to answer your current question, it sounds like the user can control the
on-demand thing with enable_virt_at_load=0, right?

Or do you mean something different?

Thx.
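A rough sketch of the 0=>1/1=>0 toggling variant being discussed; the hook
name and its call sites (KVM's VM-count transitions) are assumptions for
illustration, not code from this series:

	/* Sketch only: toggle BpSpecReduce with the VM count.  Note that
	 * msr_set_bit()/msr_clear_bit() only affect the current CPU, so a
	 * real version has to update every online CPU, e.g. via on_each_cpu().
	 */
	static void __srso_msr_fix_toggle(void *set)
	{
		if (set)
			msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
		else
			msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
	}

	/* Hypothetical hook, called when the VM count goes 0=>1 (set=true)
	 * and 1=>0 (set=false).
	 */
	static void srso_msr_fix_toggle(bool set)
	{
		if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
			on_each_cpu(__srso_msr_fix_toggle,
				    (void *)(unsigned long)set, 1);
	}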
On Mon, Dec 16, 2024, Borislav Petkov wrote:
> On Wed, Dec 11, 2024 at 02:27:42PM -0800, Sean Christopherson wrote:
> > How much cost are we talking?
>
> Likely 1-2%.
>
> That's why I'm simply enabling it by default.
>
> > IIUC, this magic bit reduces how much the CPU is allowed to speculate in order
> > to mitigate potential VM=>host attacks, and that reducing speculation also
> > reduces overall performance.
> >
> > If that's correct, then enabling the magic bit needs to be gated by an
> > appropriate mitigation being enabled, not forced on automatically just because
> > the CPU supports X86_FEATURE_SRSO_MSR_FIX.
>
> Well, in the default case we have safe RET - the default - but since it is not
> needed anymore, the selection falls back to this thing, which is needed when
> the mitigation is enabled.
>
> That's why it also is in the SRSO_CMD_IBPB_ON_VMEXIT case, as it is part of
> the spec_rstack_overflow=ibpb-vmexit mitigation option.
>
> So it kinda already does that. When you disable the mitigation, this one won't
> get enabled either.

But it's a hardware-defined feature flag, so won't it be set by this code?

	if (c->extended_cpuid_level >= 0x8000001f)
		c->x86_capability[CPUID_8000_001F_EAX] = cpuid_eax(0x8000001f);

srso_select_mitigation() checks the flag for the SRSO_CMD_IBPB_ON_VMEXIT case:

	case SRSO_CMD_IBPB_ON_VMEXIT:
		if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
			pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
			srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
			break;
		}

but I don't see any code that would clear X86_FEATURE_SRSO_MSR_FIX. Am I
missing something?

> > And depending on the cost, it might also make sense to set the bit on-demand,
> > and then clean up when KVM disables virtualization.  E.g. wait to set the bit
> > until entry to a guest is imminent.
>
> So the "when to set that bit" discussion kinda remained unfinished the last
> time.

Gah, sorry. I suspect I got to thinking about how best to "set it only when
really needed", and got lost in analysis paralysis.

> Here's the gist:
>
> You:
>
> | "It's not strictly KVM module load, it's when KVM enables virtualization. E.g.
> | if userspace clears enable_virt_at_load, the MSR will be toggled every time the
> | number of VMs goes from 0=>1 and 1=>0.
> |
> | But why do this in KVM? E.g. why not set-and-forget in init_amd_zen4()?"
>
> I:
>
> | "Because there's no need to impose an unnecessary - albeit small - perf impact
> | on users who don't do virt.
> |
> | I'm currently gravitating towards the MSR toggling thing, i.e., only when the
> | VMs number goes 0=>1 but I'm not sure. If udev rules *always* load kvm.ko then
> | yes, the toggling thing sounds better. I.e., set it only when really needed."
>
> So to answer your current question, it sounds like the user can control the
> on-demand thing with enable_virt_at_load=0, right?

To some extent. But I strongly suspect that the vast, vast majority of end
users will end up with systems that automatically load kvm.ko, but don't run
VMs the majority of the time. Expecting non-KVM users to detect a 1-2%
regression and track down enable_virt_at_load doesn't seem like a winning
strategy.

> Or do you mean something different?

The other possibility would be to wait to set the bit until a CPU is actually
going to do VMRUN. If we use KVM's "user-return MSR" framework, the bit would
be cleared when the CPU returns to userspace. The only downside to that is
that KVM would toggle the bit on CPUs running vCPUs on every exit to
userspace, e.g. to emulate MMIO/IO and other things.
But, userspace exits are relatively slow paths, so if the below is a wash for
performance when running VMs, i.e. the cost of the WRMSRs is either in the
noise or is offset by the regained 1-2% performance for userspace, then I
think it's a no-brainer. Enabling "full" speculation on return to userspace
means non-KVM tasks won't be affected, and there's no "sticky" behavior. E.g.
another idea would be to defer setting the bit until VMRUN is imminent, but
then wait to clear the bit until virtualization is disabled. But that has the
downside of the bit being set on all CPUs over time, especially if
enable_virt_at_load is true.

Compile tested only...

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e4fad330cd25..061ac5940432 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -256,6 +256,7 @@ DEFINE_PER_CPU(struct svm_cpu_data, svm_data);
  * defer the restoration of TSC_AUX until the CPU returns to userspace.
  */
 static int tsc_aux_uret_slot __read_mostly = -1;
+static int zen4_bp_cfg_uret_slot __ro_after_init = -1;
 
 static const u32 msrpm_ranges[] = {0, 0xc0000000, 0xc0010000};
 
@@ -608,9 +609,6 @@ static void svm_disable_virtualization_cpu(void)
 	kvm_cpu_svm_disable();
 
 	amd_pmu_disable_virt();
-
-	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
-		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
 }
 
 static int svm_enable_virtualization_cpu(void)
@@ -688,9 +686,6 @@ static int svm_enable_virtualization_cpu(void)
 		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
 	}
 
-	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
-		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
-
 	return 0;
 }
 
@@ -1547,6 +1542,11 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 	    (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))
 		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+		kvm_set_user_return_msr(zen4_bp_cfg_uret_slot,
+					BIT(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT),
+					BIT(MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT));
+
 	svm->guest_state_loaded = true;
 }
 
@@ -5313,6 +5313,14 @@ static __init int svm_hardware_setup(void)
 
 	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX)) {
+		zen4_bp_cfg_uret_slot = kvm_add_user_return_msr(MSR_ZEN4_BP_CFG);
+		if (WARN_ON_ONCE(zen4_bp_cfg_uret_slot < 0)) {
+			r = -EIO;
+			goto err;
+		}
+	}
+
 	if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
 		kvm_enable_efer_bits(EFER_AUTOIBRS);
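For comparison, the "sticky" alternative described in the prose above might
look roughly like the sketch below; the per-CPU flag and helper names are
hypothetical, not code from this series:

	/* Sketch of the sticky variant: set the bit lazily just before VMRUN,
	 * clear it only when virtualization is disabled on the CPU.  The
	 * per-CPU flag and helper names are made up for illustration.
	 */
	static DEFINE_PER_CPU(bool, bp_spec_reduce_set);

	static void svm_set_bp_spec_reduce(void)
	{
		if (!cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX) ||
		    __this_cpu_read(bp_spec_reduce_set))
			return;

		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
		__this_cpu_write(bp_spec_reduce_set, true);
	}

	/* Called from svm_disable_virtualization_cpu() in this variant. */
	static void svm_clear_bp_spec_reduce(void)
	{
		if (!__this_cpu_read(bp_spec_reduce_set))
			return;

		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
		__this_cpu_write(bp_spec_reduce_set, false);
	}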
diff --git a/Documentation/admin-guide/hw-vuln/srso.rst b/Documentation/admin-guide/hw-vuln/srso.rst
index 2ad1c05b8c88..79a8f7dea06d 100644
--- a/Documentation/admin-guide/hw-vuln/srso.rst
+++ b/Documentation/admin-guide/hw-vuln/srso.rst
@@ -104,7 +104,17 @@ The possible values in this file are:
 
    (spec_rstack_overflow=ibpb-vmexit)
 
+ * 'Mitigation: Reduced Speculation':
+
+   This mitigation gets automatically enabled when the above one "IBPB on
+   VMEXIT" has been selected and the CPU supports the BpSpecReduce bit.
+
+   Currently, the mitigation is automatically enabled when KVM enables
+   virtualization and can incur some cost. If no VMs will run on the system,
+   you can either disable virtualization or set kvm.enable_virt_at_load=0 to
+   enable it only when a VM gets started and thus when really needed. See the
+   text in Documentation/admin-guide/kernel-parameters.txt on this parameter
+   for more details.
 
 In order to exploit vulnerability, an attacker needs to:
 
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 2787227a8b42..94582c0ed9f2 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -465,6 +465,7 @@
 #define X86_FEATURE_IBPB_BRTYPE		(20*32+28) /* MSR_PRED_CMD[IBPB] flushes all branch type predictions */
 #define X86_FEATURE_SRSO_NO		(20*32+29) /* CPU is not affected by SRSO */
 #define X86_FEATURE_SRSO_USER_KERNEL_NO	(20*32+30) /* CPU is not affected by SRSO across user/kernel boundaries */
+#define X86_FEATURE_SRSO_MSR_FIX	(20*32+31) /* MSR BP_CFG[BpSpecReduce] can be used to mitigate SRSO for VMs */
 
 /*
  * Extended auxiliary flags: Linux defined - for features scattered in various
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 3ae84c3b8e6d..1372a569fb58 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -717,6 +717,7 @@
 
 /* Zen4 */
 #define MSR_ZEN4_BP_CFG				0xc001102e
+#define MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT	4
 #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT	5
 
 /* Fam 19h MSRs */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 8854d9bce2a5..a2eb7c0700da 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -2523,6 +2523,7 @@ enum srso_mitigation {
 	SRSO_MITIGATION_SAFE_RET,
 	SRSO_MITIGATION_IBPB,
 	SRSO_MITIGATION_IBPB_ON_VMEXIT,
+	SRSO_MITIGATION_BP_SPEC_REDUCE,
 };
 
 enum srso_mitigation_cmd {
@@ -2540,7 +2541,8 @@ static const char * const srso_strings[] = {
 	[SRSO_MITIGATION_MICROCODE]		= "Vulnerable: Microcode, no safe RET",
 	[SRSO_MITIGATION_SAFE_RET]		= "Mitigation: Safe RET",
 	[SRSO_MITIGATION_IBPB]			= "Mitigation: IBPB",
-	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only"
+	[SRSO_MITIGATION_IBPB_ON_VMEXIT]	= "Mitigation: IBPB on VMEXIT only",
+	[SRSO_MITIGATION_BP_SPEC_REDUCE]	= "Mitigation: Reduced Speculation"
 };
 
 static enum srso_mitigation srso_mitigation __ro_after_init = SRSO_MITIGATION_NONE;
@@ -2665,6 +2667,12 @@ static void __init srso_select_mitigation(void)
 
 ibpb_on_vmexit:
 	case SRSO_CMD_IBPB_ON_VMEXIT:
+		if (boot_cpu_has(X86_FEATURE_SRSO_MSR_FIX)) {
+			pr_notice("Reducing speculation to address VM/HV SRSO attack vector.\n");
+			srso_mitigation = SRSO_MITIGATION_BP_SPEC_REDUCE;
+			break;
+		}
+
 		if (IS_ENABLED(CONFIG_MITIGATION_SRSO)) {
 			if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
 				setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index dd15cc635655..e4fad330cd25 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -608,6 +608,9 @@ static void svm_disable_virtualization_cpu(void)
 	kvm_cpu_svm_disable();
 
 	amd_pmu_disable_virt();
+
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+		msr_clear_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
 }
 
 static int svm_enable_virtualization_cpu(void)
@@ -685,6 +688,9 @@ static int svm_enable_virtualization_cpu(void)
 		rdmsr(MSR_TSC_AUX, sev_es_host_save_area(sd)->tsc_aux, msr_hi);
 	}
 
+	if (cpu_feature_enabled(X86_FEATURE_SRSO_MSR_FIX))
+		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_BP_SPEC_REDUCE_BIT);
+
 	return 0;
 }
 
diff --git a/arch/x86/lib/msr.c b/arch/x86/lib/msr.c
index 4bf4fad5b148..5a18ecc04a6c 100644
--- a/arch/x86/lib/msr.c
+++ b/arch/x86/lib/msr.c
@@ -103,6 +103,7 @@ int msr_set_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, true);
 }
+EXPORT_SYMBOL_GPL(msr_set_bit);
 
 /**
  * msr_clear_bit - Clear @bit in a MSR @msr.
@@ -118,6 +119,7 @@ int msr_clear_bit(u32 msr, u8 bit)
 {
 	return __flip_bit(msr, bit, false);
 }
+EXPORT_SYMBOL_GPL(msr_clear_bit);
 
 #ifdef CONFIG_TRACEPOINTS
 void do_trace_write_msr(unsigned int msr, u64 val, int failed)
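For reference, with this patch applied and the new mitigation selected (the
SRSO_CMD_IBPB_ON_VMEXIT path on a CPU with BpSpecReduce), the sysfs reporting
described in srso.rst would show the new string from srso_strings[]:

	$ cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow
	Mitigation: Reduced Speculation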