Message ID | 20230206060545.628502-3-manali.shukla@amd.com
State      | New, archived
Series     | PreventHostIBS feature for SEV-ES and SNP guests
On 06/02/23 11:35, Manali Shukla wrote:
> Currently, the hypervisor is able to inspect instruction based samples
> from a guest and gather execution information. SEV-ES and SNP guests
> can disallow the use of instruction based sampling by hypervisor by
> enabling the PreventHostIBS feature for the guest. (More information
> in Section 15.36.17 APM Volume 2)
>
> The MSR_AMD64_IBSFETCHCTL[IbsFetchEn] and MSR_AMD64_IBSOPCTL[IbsOpEn]
> bits need to be disabled before VMRUN is called when PreventHostIBS
> feature is enabled. If either of these bits are not 0, VMRUN will fail
> with VMEXIT_INVALID error code.
>
> Because of an IBS race condition when disabling IBS, KVM needs to
> indicate when it is in a PreventHostIBS window. Activate the window
> based on whether IBS is currently active or inactive.
>
> Signed-off-by: Manali Shukla <manali.shukla@amd.com>

Looks good.

Reviewed-by: Nikunj A Dadhania <nikunj@amd.com>
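The enforcement described in the commit message — VMRUN failing with VMEXIT_INVALID if either IBS enable bit is set — can be modeled as a simple predicate. This is a sketch, not kernel code: the bit positions below (IbsFetchEn at bit 48 of IBSFETCHCTL, IbsOpEn at bit 17 of IBSOPCTL) follow the layout used by the kernel's AMD IBS perf driver, and the MSRs are modeled as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IBS_FETCH_ENABLE (1ULL << 48) /* IBSFETCHCTL.IbsFetchEn */
#define IBS_OP_ENABLE    (1ULL << 17) /* IBSOPCTL.IbsOpEn */

/*
 * Model of the hardware check at VMRUN when PreventHostIBS is active:
 * both enable bits must be clear, otherwise VMRUN fails with
 * VMEXIT_INVALID.
 */
bool vmrun_would_fail(uint64_t ibs_fetch_ctl, uint64_t ibs_op_ctl)
{
    return (ibs_fetch_ctl & IBS_FETCH_ENABLE) ||
           (ibs_op_ctl & IBS_OP_ENABLE);
}
```

This is why the series must clear both enable bits on the VMRUN path before entering a PreventHostIBS guest.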
On Mon, Feb 06, 2023, Manali Shukla wrote:
> Currently, the hypervisor is able to inspect instruction based samples
> from a guest and gather execution information. SEV-ES and SNP guests
> can disallow the use of instruction based sampling by hypervisor by
> enabling the PreventHostIBS feature for the guest. (More information
> in Section 15.36.17 APM Volume 2)
>
> The MSR_AMD64_IBSFETCHCTL[IbsFetchEn] and MSR_AMD64_IBSOPCTL[IbsOpEn]
> bits need to be disabled before VMRUN is called when PreventHostIBS
> feature is enabled. If either of these bits are not 0, VMRUN will fail
> with VMEXIT_INVALID error code.
>
> Because of an IBS race condition when disabling IBS, KVM needs to
> indicate when it is in a PreventHostIBS window. Activate the window
> based on whether IBS is currently active or inactive.
>
> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
> ---
>  arch/x86/include/asm/cpufeatures.h |  1 +
>  arch/x86/kvm/svm/sev.c             | 10 ++++++++
>  arch/x86/kvm/svm/svm.c             | 39 ++++++++++++++++++++++++++++--
>  arch/x86/kvm/svm/svm.h             |  1 +
>  4 files changed, 49 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 61012476d66e..1812e74f846a 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -425,6 +425,7 @@
>  #define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
>  #define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */
>  #define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
> +#define X86_FEATURE_PREVENT_HOST_IBS (19*32+15) /* "" AMD prevent host ibs */
>
>  /*
>   * BUG word(s)
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 86d6897f4806..b348b8931721 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -569,6 +569,12 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>  	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
>  		return -EINVAL;
>
> +	if (sev_es_guest(svm->vcpu.kvm) &&
> +	    guest_cpuid_has(&svm->vcpu, X86_FEATURE_PREVENT_HOST_IBS)) {
> +		save->sev_features |= BIT(6);
> +		svm->prevent_hostibs_enabled = true;
> +	}
> +
>  	/*
>  	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
>  	 * the traditional VMSA that is part of the VMCB. Copy the
> @@ -2158,6 +2164,10 @@ void __init sev_set_cpu_caps(void)
>  		kvm_cpu_cap_clear(X86_FEATURE_SEV);
>  	if (!sev_es_enabled)
>  		kvm_cpu_cap_clear(X86_FEATURE_SEV_ES);
> +
> +	/* Enable PreventhostIBS feature for SEV-ES and higher guests */
> +	if (sev_es_enabled)
> +		kvm_cpu_cap_set(X86_FEATURE_PREVENT_HOST_IBS);

Uh, you can't just force a cap, there needs to be actual hardware support.  Just
copy what was done for X86_FEATURE_SEV_ES.

>  }
>
>  void __init sev_hardware_setup(void)
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9a194aa1a75a..47c1e0fff23e 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -3914,10 +3914,39 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
>
>  	guest_state_enter_irqoff();
>
> -	if (sev_es_guest(vcpu->kvm))
> +	if (sev_es_guest(vcpu->kvm)) {
> +		bool ibs_fetch_active, ibs_op_active;
> +		u64 ibs_fetch_ctl, ibs_op_ctl;
> +
> +		if (svm->prevent_hostibs_enabled) {
> +			/*
> +			 * With PreventHostIBS enabled, IBS profiling cannot
> +			 * be active when VMRUN is executed. Disable IBS before
> +			 * executing VMRUN and, because of a race condition,
> +			 * enable the PreventHostIBS window if IBS profiling was
> +			 * active.

And the race can't be fixed because...?

> +			 */
> +			ibs_fetch_active =
> +				amd_disable_ibs_fetch(&ibs_fetch_ctl);
> +			ibs_op_active =
> +				amd_disable_ibs_op(&ibs_op_ctl);
> +
> +			amd_prevent_hostibs_window(ibs_fetch_active ||
> +						   ibs_op_active);
> +		}
> +
>  		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
> -	else
> +
> +		if (svm->prevent_hostibs_enabled) {
> +			if (ibs_fetch_active)
> +				amd_restore_ibs_fetch(ibs_fetch_ctl);
> +
> +			if (ibs_op_active)
> +				amd_restore_ibs_op(ibs_op_ctl);

IIUC, this adds up to 2 RDMSRs and 4 WRMSRs to the VMRUN path.  Blech.  There's
gotta be a better way to implement this.

Like PeterZ said, this is basically exclude_guest.
On 3/25/2023 1:25 AM, Sean Christopherson wrote:
> On Mon, Feb 06, 2023, Manali Shukla wrote:
>> Currently, the hypervisor is able to inspect instruction based samples
>> from a guest and gather execution information. SEV-ES and SNP guests
>> can disallow the use of instruction based sampling by hypervisor by
>> enabling the PreventHostIBS feature for the guest. (More information
>> in Section 15.36.17 APM Volume 2)
>>
>> The MSR_AMD64_IBSFETCHCTL[IbsFetchEn] and MSR_AMD64_IBSOPCTL[IbsOpEn]
>> bits need to be disabled before VMRUN is called when PreventHostIBS
>> feature is enabled. If either of these bits are not 0, VMRUN will fail
>> with VMEXIT_INVALID error code.
>>
>> Because of an IBS race condition when disabling IBS, KVM needs to
>> indicate when it is in a PreventHostIBS window. Activate the window
>> based on whether IBS is currently active or inactive.
>>
>> Signed-off-by: Manali Shukla <manali.shukla@amd.com>
>> ---
>>  arch/x86/include/asm/cpufeatures.h |  1 +
>>  arch/x86/kvm/svm/sev.c             | 10 ++++++++
>>  arch/x86/kvm/svm/svm.c             | 39 ++++++++++++++++++++++++++++--
>>  arch/x86/kvm/svm/svm.h             |  1 +
>>  4 files changed, 49 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
>> index 61012476d66e..1812e74f846a 100644
>> --- a/arch/x86/include/asm/cpufeatures.h
>> +++ b/arch/x86/include/asm/cpufeatures.h
>> @@ -425,6 +425,7 @@
>>  #define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
>>  #define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */
>>  #define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
>> +#define X86_FEATURE_PREVENT_HOST_IBS (19*32+15) /* "" AMD prevent host ibs */
>>
>>  /*
>>   * BUG word(s)
>> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
>> index 86d6897f4806..b348b8931721 100644
>> --- a/arch/x86/kvm/svm/sev.c
>> +++ b/arch/x86/kvm/svm/sev.c
>> @@ -569,6 +569,12 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
>>  	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
>>  		return -EINVAL;
>>
>> +	if (sev_es_guest(svm->vcpu.kvm) &&
>> +	    guest_cpuid_has(&svm->vcpu, X86_FEATURE_PREVENT_HOST_IBS)) {
>> +		save->sev_features |= BIT(6);
>> +		svm->prevent_hostibs_enabled = true;
>> +	}
>> +
>>  	/*
>>  	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
>>  	 * the traditional VMSA that is part of the VMCB. Copy the
>> @@ -2158,6 +2164,10 @@ void __init sev_set_cpu_caps(void)
>>  		kvm_cpu_cap_clear(X86_FEATURE_SEV);
>>  	if (!sev_es_enabled)
>>  		kvm_cpu_cap_clear(X86_FEATURE_SEV_ES);
>> +
>> +	/* Enable PreventhostIBS feature for SEV-ES and higher guests */
>> +	if (sev_es_enabled)
>> +		kvm_cpu_cap_set(X86_FEATURE_PREVENT_HOST_IBS);
>
> Uh, you can't just force a cap, there needs to be actual hardware support.  Just
> copy what was done for X86_FEATURE_SEV_ES.

Okay. I will do the suggested changes.

>>  }
>>
>>  void __init sev_hardware_setup(void)
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index 9a194aa1a75a..47c1e0fff23e 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -3914,10 +3914,39 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
>>
>>  	guest_state_enter_irqoff();
>>
>> -	if (sev_es_guest(vcpu->kvm))
>> +	if (sev_es_guest(vcpu->kvm)) {
>> +		bool ibs_fetch_active, ibs_op_active;
>> +		u64 ibs_fetch_ctl, ibs_op_ctl;
>> +
>> +		if (svm->prevent_hostibs_enabled) {
>> +			/*
>> +			 * With PreventHostIBS enabled, IBS profiling cannot
>> +			 * be active when VMRUN is executed. Disable IBS before
>> +			 * executing VMRUN and, because of a race condition,
>> +			 * enable the PreventHostIBS window if IBS profiling was
>> +			 * active.
>
> And the race can't be fixed because...?

The race cannot be fixed because the VALID and ENABLE bits for IBS_FETCH_CTL
and IBS_OP_CTL are contained in the same respective MSRs. As a result, the
following scenario can occur:

  Read IBS_FETCH_CTL  (IbsFetchEn bit is 1 and IbsFetchVal bit is 0)
  Write IBS_FETCH_CTL (IbsFetchEn is 0 now)

If IbsFetchVal changes to 1 between the read and the write, the write to
IBS_FETCH_CTL will clear the IbsFetchVal bit. When STGI is executed after
VMEXIT, the NMI is taken, the check for the valid mask fails, and "Dazed
and confused" NMI messages are generated.
Please refer to the cover letter for more details.

>> +			 */
>> +			ibs_fetch_active =
>> +				amd_disable_ibs_fetch(&ibs_fetch_ctl);
>> +			ibs_op_active =
>> +				amd_disable_ibs_op(&ibs_op_ctl);
>> +
>> +			amd_prevent_hostibs_window(ibs_fetch_active ||
>> +						   ibs_op_active);
>> +		}
>> +
>>  		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
>> -	else
>> +
>> +		if (svm->prevent_hostibs_enabled) {
>> +			if (ibs_fetch_active)
>> +				amd_restore_ibs_fetch(ibs_fetch_ctl);
>> +
>> +			if (ibs_op_active)
>> +				amd_restore_ibs_op(ibs_op_ctl);
>
> IIUC, this adds up to 2 RDMSRs and 4 WRMSRs to the VMRUN path.  Blech.  There's
> gotta be a better way to implement this.

I will try to find a better way to implement this.

> Like PeterZ said, this is basically exclude_guest.

As I mentioned before, exclude_guest lets the profiler decide whether it wants
to trace the guest data or not, whereas PreventHostIBS lets the owner of the
guest decide whether the host can trace the guest's data or not.

Thank you for reviewing the patches.

- Manali
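The lost-valid-bit scenario described above can be sketched in plain C. This is a model, not kernel code: the bit positions (IbsFetchEn at bit 48, IbsFetchVal at bit 49 of IBS_FETCH_CTL) follow the layout used by the kernel's AMD IBS perf driver, the MSR is modeled as a plain variable, and the "hardware sets IbsFetchVal" step stands in for the sample completing between the RDMSR and the WRMSR.

```c
#include <assert.h>
#include <stdint.h>

#define IBS_FETCH_ENABLE (1ULL << 48) /* IbsFetchEn */
#define IBS_FETCH_VAL    (1ULL << 49) /* IbsFetchVal: a sample is ready */

/*
 * Model of the racy disable path: RDMSR, clear the enable bit, WRMSR.
 * Because IbsFetchEn and IbsFetchVal live in the same MSR, the write-back
 * of the stale value also clears an IbsFetchVal that was set in between.
 */
static uint64_t disable_fetch_racy(uint64_t msr, int sample_completes_midway)
{
    uint64_t ctl = msr;                /* RDMSR: IbsFetchVal happens to be 0 */

    if (sample_completes_midway)
        msr |= IBS_FETCH_VAL;          /* hardware sets IbsFetchVal here */

    msr = ctl & ~IBS_FETCH_ENABLE;     /* WRMSR of stale value: Val is lost */
    return msr;
}
```

The later IBS NMI then finds no valid sample, which is what produces the "Dazed and confused" messages — and why the series instead opens a PreventHostIBS window telling the NMI handler to expect such an NMI.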
On Wed, Mar 29, 2023, Manali Shukla wrote:
> On 3/25/2023 1:25 AM, Sean Christopherson wrote:
> > On Mon, Feb 06, 2023, Manali Shukla wrote:
> >> -	if (sev_es_guest(vcpu->kvm))
> >> +	if (sev_es_guest(vcpu->kvm)) {
> >> +		bool ibs_fetch_active, ibs_op_active;
> >> +		u64 ibs_fetch_ctl, ibs_op_ctl;
> >> +
> >> +		if (svm->prevent_hostibs_enabled) {
> >> +			/*
> >> +			 * With PreventHostIBS enabled, IBS profiling cannot
> >> +			 * be active when VMRUN is executed. Disable IBS before
> >> +			 * executing VMRUN and, because of a race condition,
> >> +			 * enable the PreventHostIBS window if IBS profiling was
> >> +			 * active.
> >
> > And the race can't be fixed because...?
>
> The race cannot be fixed because the VALID and ENABLE bits for IBS_FETCH_CTL
> and IBS_OP_CTL are contained in the same respective MSRs. As a result, the
> following scenario can occur:
>
>   Read IBS_FETCH_CTL  (IbsFetchEn bit is 1 and IbsFetchVal bit is 0)
>   Write IBS_FETCH_CTL (IbsFetchEn is 0 now)
>
> If IbsFetchVal changes to 1 between the read and the write, the write to
> IBS_FETCH_CTL will clear the IbsFetchVal bit. When STGI is executed after
> VMEXIT, the NMI is taken, the check for the valid mask fails, and "Dazed
> and confused" NMI messages are generated.
> Please refer to the cover letter for more details.

I understand the race, I'm asking why this series doesn't fix the race.
Effectively suppressing potentially unexpected NMIs because PreventHostIBS
was enabled is ugly.

> >> +			 */
> >> +			ibs_fetch_active =
> >> +				amd_disable_ibs_fetch(&ibs_fetch_ctl);
> >> +			ibs_op_active =
> >> +				amd_disable_ibs_op(&ibs_op_ctl);
> >> +
> >> +			amd_prevent_hostibs_window(ibs_fetch_active ||
> >> +						   ibs_op_active);
> >> +		}
> >> +
> >>  		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
> >> -	else
> >> +
> >> +		if (svm->prevent_hostibs_enabled) {
> >> +			if (ibs_fetch_active)
> >> +				amd_restore_ibs_fetch(ibs_fetch_ctl);
> >> +
> >> +			if (ibs_op_active)
> >> +				amd_restore_ibs_op(ibs_op_ctl);
> >
> > IIUC, this adds up to 2 RDMSRs and 4 WRMSRs to the VMRUN path.  Blech.  There's
> > gotta be a better way to implement this.
>
> I will try to find a better way to implement this.
>
> > Like PeterZ said, this is basically exclude_guest.
>
> As I mentioned before, exclude_guest lets the profiler decide whether it wants
> to trace the guest data or not, whereas PreventHostIBS lets the owner of the
> guest decide whether the host can trace the guest's data or not.

PreventHostIBS is purely an enforcement, it does not actually do anything to
disable tracing of the guest.  What PeterZ and I are complaining about is that
instead of integrating this feature with exclude_guest, e.g. finding a way to
make guest tracing mutually exclusive with KVM_RUN so that PreventHostIBS can
be context switched accordingly, this series instead backdoors into perf to
forcefully disable tracing.

In other words, please try to create a sane contract between userspace, perf,
and KVM, e.g. disallow tracing a guest with PreventHostIBS at some level
instead of silently toggling tracing around VMRUN.
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 61012476d66e..1812e74f846a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -425,6 +425,7 @@
 #define X86_FEATURE_SEV_ES (19*32+ 3) /* AMD Secure Encrypted Virtualization - Encrypted State */
 #define X86_FEATURE_V_TSC_AUX (19*32+ 9) /* "" Virtual TSC_AUX */
 #define X86_FEATURE_SME_COHERENT (19*32+10) /* "" AMD hardware-enforced cache coherency */
+#define X86_FEATURE_PREVENT_HOST_IBS (19*32+15) /* "" AMD prevent host ibs */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 86d6897f4806..b348b8931721 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -569,6 +569,12 @@ static int sev_es_sync_vmsa(struct vcpu_svm *svm)
 	if (svm->vcpu.guest_debug || (svm->vmcb->save.dr7 & ~DR7_FIXED_1))
 		return -EINVAL;
 
+	if (sev_es_guest(svm->vcpu.kvm) &&
+	    guest_cpuid_has(&svm->vcpu, X86_FEATURE_PREVENT_HOST_IBS)) {
+		save->sev_features |= BIT(6);
+		svm->prevent_hostibs_enabled = true;
+	}
+
 	/*
 	 * SEV-ES will use a VMSA that is pointed to by the VMCB, not
 	 * the traditional VMSA that is part of the VMCB. Copy the
@@ -2158,6 +2164,10 @@ void __init sev_set_cpu_caps(void)
 		kvm_cpu_cap_clear(X86_FEATURE_SEV);
 	if (!sev_es_enabled)
 		kvm_cpu_cap_clear(X86_FEATURE_SEV_ES);
+
+	/* Enable PreventhostIBS feature for SEV-ES and higher guests */
+	if (sev_es_enabled)
+		kvm_cpu_cap_set(X86_FEATURE_PREVENT_HOST_IBS);
 }
 
 void __init sev_hardware_setup(void)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9a194aa1a75a..47c1e0fff23e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3914,10 +3914,39 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu, bool spec_ctrl_in
 
 	guest_state_enter_irqoff();
 
-	if (sev_es_guest(vcpu->kvm))
+	if (sev_es_guest(vcpu->kvm)) {
+		bool ibs_fetch_active, ibs_op_active;
+		u64 ibs_fetch_ctl, ibs_op_ctl;
+
+		if (svm->prevent_hostibs_enabled) {
+			/*
+			 * With PreventHostIBS enabled, IBS profiling cannot
+			 * be active when VMRUN is executed. Disable IBS before
+			 * executing VMRUN and, because of a race condition,
+			 * enable the PreventHostIBS window if IBS profiling was
+			 * active.
+			 */
+			ibs_fetch_active =
+				amd_disable_ibs_fetch(&ibs_fetch_ctl);
+			ibs_op_active =
+				amd_disable_ibs_op(&ibs_op_ctl);
+
+			amd_prevent_hostibs_window(ibs_fetch_active ||
+						   ibs_op_active);
+		}
+
 		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
-	else
+
+		if (svm->prevent_hostibs_enabled) {
+			if (ibs_fetch_active)
+				amd_restore_ibs_fetch(ibs_fetch_ctl);
+
+			if (ibs_op_active)
+				amd_restore_ibs_op(ibs_op_ctl);
+		}
+	} else {
 		__svm_vcpu_run(svm, spec_ctrl_intercepted);
+	}
 
 	guest_state_exit_irqoff();
 }
@@ -4008,6 +4037,12 @@ static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	/* Any pending NMI will happen here */
 
+	/*
+	 * Disable the PreventHostIBS window since any pending IBS NMIs will
+	 * have been handled.
+	 */
+	amd_prevent_hostibs_window(false);
+
 	if (unlikely(svm->vmcb->control.exit_code == SVM_EXIT_NMI))
 		kvm_after_interrupt(vcpu);
 
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 4826e6cc611b..71f32fcfd219 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -254,6 +254,7 @@ struct vcpu_svm {
 	bool pause_filter_enabled : 1;
 	bool pause_threshold_enabled : 1;
 	bool vgif_enabled : 1;
+	bool prevent_hostibs_enabled : 1;
 
 	u32 ldr_reg;
 	u32 dfr_reg;
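The patch opens the window with amd_prevent_hostibs_window() before VMRUN and closes it after pending NMIs have been handled, but the helper itself is defined elsewhere in this series. A plausible reading, sketched below under that assumption: a flag (presumably per-CPU in the real code, a plain bool here) that the IBS NMI handler consults so that an NMI with no valid sample is swallowed inside the window instead of escalating to an unknown ("dazed and confused") NMI.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the (presumably per-CPU) flag the series maintains. */
static bool prevent_hostibs_window;

void amd_prevent_hostibs_window(bool activate)
{
    prevent_hostibs_window = activate;
}

/*
 * Sketch of the NMI handler's decision: an IBS NMI with no valid sample
 * is normally unhandled (and reported as an unknown NMI), but inside the
 * PreventHostIBS window it is assumed to be the racily-lost sample.
 * Returns nonzero if the NMI is considered handled.
 */
int ibs_nmi_handled(bool fetch_val, bool op_val)
{
    if (fetch_val || op_val)
        return 1;                      /* genuine sample, handle it */
    return prevent_hostibs_window;     /* swallow only inside the window */
}
```

This matches Sean's characterization above: the window does not prevent the race, it only suppresses the NMI the race produces.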
Currently, the hypervisor is able to inspect instruction based samples
from a guest and gather execution information. SEV-ES and SNP guests
can disallow the use of instruction based sampling by hypervisor by
enabling the PreventHostIBS feature for the guest. (More information
in Section 15.36.17 APM Volume 2)

The MSR_AMD64_IBSFETCHCTL[IbsFetchEn] and MSR_AMD64_IBSOPCTL[IbsOpEn]
bits need to be disabled before VMRUN is called when PreventHostIBS
feature is enabled. If either of these bits are not 0, VMRUN will fail
with VMEXIT_INVALID error code.

Because of an IBS race condition when disabling IBS, KVM needs to
indicate when it is in a PreventHostIBS window. Activate the window
based on whether IBS is currently active or inactive.

Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/kvm/svm/sev.c             | 10 ++++++++
 arch/x86/kvm/svm/svm.c             | 39 ++++++++++++++++++++++++++++--
 arch/x86/kvm/svm/svm.h             |  1 +
 4 files changed, 49 insertions(+), 2 deletions(-)
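The amd_disable_ibs_op()/amd_restore_ibs_op() helpers called on the VMRUN path are also introduced elsewhere in this series; their contract can be sketched as follows. This is a userspace model under stated assumptions: IbsOpEn at bit 17 of IBS_OP_CTL (the layout used by the kernel's AMD IBS perf driver), the MSR modeled as a variable, and the `_sketch` names marking these as illustrations rather than the series' actual implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IBS_OP_ENABLE (1ULL << 17) /* IBSOPCTL.IbsOpEn */

static uint64_t ibs_op_ctl_msr;    /* stands in for MSR_AMD64_IBSOPCTL */

/*
 * Save the current control value and clear the enable bit so VMRUN
 * does not fail with VMEXIT_INVALID. Returns whether op sampling was
 * active, so the caller knows to restore it after VMRUN (one RDMSR
 * plus up to one WRMSR, matching the cost Sean complains about).
 */
bool amd_disable_ibs_op_sketch(uint64_t *saved)
{
    *saved = ibs_op_ctl_msr;                   /* RDMSR */
    if (!(*saved & IBS_OP_ENABLE))
        return false;                          /* already inactive */
    ibs_op_ctl_msr = *saved & ~IBS_OP_ENABLE;  /* WRMSR */
    return true;
}

void amd_restore_ibs_op_sketch(uint64_t saved)
{
    ibs_op_ctl_msr = saved;                    /* WRMSR the saved value back */
}
```

The fetch-side pair would mirror this with the IBS_FETCH_CTL bit layout; per this contract, the restore step re-arms sampling with the exact pre-VMRUN configuration.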