Message ID | 025fd734d35acbbbbca74c4b3ed671a02d4af628.1694721045.git.thomas.lendacky@amd.com (mailing list archive)
---|---
State | New, archived
Series | SEV-ES TSC_AUX virtualization fix and optimization
On Thu, Sep 14, 2023, Tom Lendacky wrote:
> When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
> within the VMSA. This means that the guest value is loaded on VMRUN and
> the host value is restored from the host save area on #VMEXIT.
>
> Since the value is restored on #VMEXIT, the KVM user return MSR support
> for TSC_AUX can be replaced by populating the host save area with current
> host value of TSC_AUX. This replaces two WRMSR instructions with a single
> RDMSR instruction.
>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>  arch/x86/kvm/svm/sev.c | 14 +++++++++++++-
>  arch/x86/kvm/svm/svm.c | 26 ++++++++++++++++----------
>  arch/x86/kvm/svm/svm.h |  4 +++-
>  3 files changed, 32 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 565c9de87c6d..1bbaae2fed96 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2969,6 +2969,7 @@ static void sev_es_init_vmcb_after_set_cpuid(struct vcpu_svm *svm)
>  	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
>  	    (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
>  	     guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
> +		svm->v_tsc_aux = true;
>  		set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
>  		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
>  			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
> @@ -3071,8 +3072,10 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
>  					    sev_enc_bit));
>  }
>
> -void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
> +void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
>  {
> +	u32 msr_hi;
> +
>  	/*
>  	 * All host state for SEV-ES guests is categorized into three swap types
>  	 * based on how it is handled by hardware during a world switch:
> @@ -3109,6 +3112,15 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
>  		hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
>  		hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3);
>  	}
> +
> +	/*
> +	 * If TSC_AUX virtualization is enabled, MSR_TSC_AUX is loaded but NOT
> +	 * saved by the CPU (Type-B). If TSC_AUX is not virtualized, the user
> +	 * return MSR support takes care of restoring MSR_TSC_AUX. This
> +	 * exchanges two WRMSRs for one RDMSR.
> +	 */
> +	if (svm->v_tsc_aux)
> +		rdmsr(MSR_TSC_AUX, hostsa->tsc_aux, msr_hi);

IIUC, when V_TSC_AUX is supported, SEV-ES guests context switch MSR_TSC_AUX
regardless of what has been exposed to the guest.  So rather than condition the
hostsa->tsc_aux update on guest CPUID, just do it if V_TSC_AUX is supported.

And then to avoid the RDMSR, which is presumably the motivation for checking
guest CPUID, grab the host value from user return framework.  The host values
are per-CPU, but constant after boot, so the only requirement is that KVM sets
up MSR_TSC_AUX in the user return framework.

>  }
>
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index c58d5632e74a..905b1a2664ed 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1529,13 +1529,13 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
>  		struct sev_es_save_area *hostsa;
>  		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
>
> -		sev_es_prepare_switch_to_guest(hostsa);
> +		sev_es_prepare_switch_to_guest(svm, hostsa);
>  	}
>
>  	if (tsc_scaling)
>  		__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
>
> -	if (likely(tsc_aux_uret_slot >= 0))
> +	if (likely(tsc_aux_uret_slot >= 0) && !svm->v_tsc_aux)

And then this too becomes something like

	if (likely(tsc_aux_uret_slot >= 0) &&
	    (!boot_cpu_has(X86_FEATURE_V_TSC_AUX) || !sev_es_guest(vcpu->kvm)))

>  		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
>
>  	svm->guest_state_loaded = true;
> @@ -3090,15 +3090,21 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
>  		break;
>  	case MSR_TSC_AUX:

And this also is simply:

	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) && sev_es_guest(vcpu->kvm))
		break;

Because svm->tsc_aux will never be consumed.

>  		/*
> -		 * TSC_AUX is usually changed only during boot and never read
> -		 * directly.  Intercept TSC_AUX instead of exposing it to the
> -		 * guest via direct_access_msrs, and switch it via user return.
> +		 * If TSC_AUX is being virtualized, do not use the user return
> +		 * MSR support because TSC_AUX is restored on #VMEXIT.
>  		 */
> -		preempt_disable();
> -		ret = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
> -		preempt_enable();
> -		if (ret)
> -			break;
> +		if (!svm->v_tsc_aux) {
> +			/*
> +			 * TSC_AUX is usually changed only during boot and never read
> +			 * directly.  Intercept TSC_AUX instead of exposing it to the
> +			 * guest via direct_access_msrs, and switch it via user return.
> +			 */
> +			preempt_disable();
> +			ret = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
> +			preempt_enable();
> +			if (ret)
> +				break;
> +		}
>
>  		svm->tsc_aux = data;
>  		break;
On Fri, Sep 15, 2023, Sean Christopherson wrote:
> On Thu, Sep 14, 2023, Tom Lendacky wrote:
> > When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
> > within the VMSA. This means that the guest value is loaded on VMRUN and
> > the host value is restored from the host save area on #VMEXIT.
> >
> > [...]
> >
> > +	if (svm->v_tsc_aux)
> > +		rdmsr(MSR_TSC_AUX, hostsa->tsc_aux, msr_hi);
>
> IIUC, when V_TSC_AUX is supported, SEV-ES guests context switch MSR_TSC_AUX
> regardless of what has been exposed to the guest.  So rather than condition the
> hostsa->tsc_aux update on guest CPUID, just do it if V_TSC_AUX is supported.
>
> And then to avoid the RDMSR, which is presumably the motivation for checking
> guest CPUID, grab the host value from user return framework.  The host values
> are per-CPU, but constant after boot, so the only requirement is that KVM sets
> up MSR_TSC_AUX in the user return framework.

Actually, duh.  The save area is also per-CPU, so just fill hostsa->tsc_aux in
svm_hardware_setup() and then sev_es_prepare_switch_to_guest() never has to do
anything.
On 9/15/23 09:51, Sean Christopherson wrote:
> On Fri, Sep 15, 2023, Sean Christopherson wrote:
>> On Thu, Sep 14, 2023, Tom Lendacky wrote:
>>> When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
>>> within the VMSA. This means that the guest value is loaded on VMRUN and
>>> the host value is restored from the host save area on #VMEXIT.
>>>
>>> [...]
>>>
>>> +	if (svm->v_tsc_aux)
>>> +		rdmsr(MSR_TSC_AUX, hostsa->tsc_aux, msr_hi);
>>
>> IIUC, when V_TSC_AUX is supported, SEV-ES guests context switch MSR_TSC_AUX
>> regardless of what has been exposed to the guest.  So rather than condition the
>> hostsa->tsc_aux update on guest CPUID, just do it if V_TSC_AUX is supported.
>>
>> And then to avoid the RDMSR, which is presumably the motivation for checking
>> guest CPUID, grab the host value from user return framework.  The host values
>> are per-CPU, but constant after boot, so the only requirement is that KVM sets
>> up MSR_TSC_AUX in the user return framework.
>
> Actually, duh.  The save area is also per-CPU, so just fill hostsa->tsc_aux in
> svm_hardware_setup() and then sev_es_prepare_switch_to_guest() never has to do
> anything.

Ah, right, because Linux never changes TSC_AUX post boot. Much simpler. I'll
rework based on the comments and send a v2 series.

Thanks,
Tom
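For readers following along: the approach agreed on above (fill each CPU's host save area once at hardware setup, since TSC_AUX is boot-constant) might look roughly like the following kernel-style sketch. This is illustrative only, not the actual v2 patch; the function name and the `this_cpu_host_save_area()` helper are hypothetical.

```c
/*
 * Hypothetical sketch (not the v2 patch as posted): populate this CPU's
 * SEV-ES host save area with its TSC_AUX value once, at hardware setup.
 * Linux programs TSC_AUX during boot and never changes it afterwards, so
 * the cached value stays valid.  Would run on every CPU, e.g. via
 * on_each_cpu(), so that rdmsr() reads the local CPU's MSR.
 */
static void sev_es_setup_host_tsc_aux(void *unused)
{
	struct sev_es_save_area *hostsa = this_cpu_host_save_area(); /* hypothetical helper */
	u32 lo, hi;

	rdmsr(MSR_TSC_AUX, lo, hi);
	hostsa->tsc_aux = lo;
}
```

With the save area pre-filled this way, sev_es_prepare_switch_to_guest() needs no per-switch TSC_AUX work at all, which is the optimization Sean proposed.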
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 565c9de87c6d..1bbaae2fed96 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2969,6 +2969,7 @@ static void sev_es_init_vmcb_after_set_cpuid(struct vcpu_svm *svm)
 	if (boot_cpu_has(X86_FEATURE_V_TSC_AUX) &&
 	    (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) ||
 	     guest_cpuid_has(vcpu, X86_FEATURE_RDPID))) {
+		svm->v_tsc_aux = true;
 		set_msr_interception(vcpu, svm->msrpm, MSR_TSC_AUX, 1, 1);
 		if (guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP))
 			svm_clr_intercept(svm, INTERCEPT_RDTSCP);
@@ -3071,8 +3072,10 @@ void sev_es_vcpu_reset(struct vcpu_svm *svm)
 					    sev_enc_bit));
 }
 
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa)
 {
+	u32 msr_hi;
+
 	/*
 	 * All host state for SEV-ES guests is categorized into three swap types
 	 * based on how it is handled by hardware during a world switch:
@@ -3109,6 +3112,15 @@ void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa)
 		hostsa->dr2_addr_mask = amd_get_dr_addr_mask(2);
 		hostsa->dr3_addr_mask = amd_get_dr_addr_mask(3);
 	}
+
+	/*
+	 * If TSC_AUX virtualization is enabled, MSR_TSC_AUX is loaded but NOT
+	 * saved by the CPU (Type-B). If TSC_AUX is not virtualized, the user
+	 * return MSR support takes care of restoring MSR_TSC_AUX. This
+	 * exchanges two WRMSRs for one RDMSR.
+	 */
+	if (svm->v_tsc_aux)
+		rdmsr(MSR_TSC_AUX, hostsa->tsc_aux, msr_hi);
 }
 
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index c58d5632e74a..905b1a2664ed 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1529,13 +1529,13 @@ static void svm_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
 		struct sev_es_save_area *hostsa;
 		hostsa = (struct sev_es_save_area *)(page_address(sd->save_area) + 0x400);
 
-		sev_es_prepare_switch_to_guest(hostsa);
+		sev_es_prepare_switch_to_guest(svm, hostsa);
 	}
 
 	if (tsc_scaling)
 		__svm_write_tsc_multiplier(vcpu->arch.tsc_scaling_ratio);
 
-	if (likely(tsc_aux_uret_slot >= 0))
+	if (likely(tsc_aux_uret_slot >= 0) && !svm->v_tsc_aux)
 		kvm_set_user_return_msr(tsc_aux_uret_slot, svm->tsc_aux, -1ull);
 
 	svm->guest_state_loaded = true;
@@ -3090,15 +3090,21 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 		break;
 	case MSR_TSC_AUX:
 		/*
-		 * TSC_AUX is usually changed only during boot and never read
-		 * directly.  Intercept TSC_AUX instead of exposing it to the
-		 * guest via direct_access_msrs, and switch it via user return.
+		 * If TSC_AUX is being virtualized, do not use the user return
+		 * MSR support because TSC_AUX is restored on #VMEXIT.
 		 */
-		preempt_disable();
-		ret = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
-		preempt_enable();
-		if (ret)
-			break;
+		if (!svm->v_tsc_aux) {
+			/*
+			 * TSC_AUX is usually changed only during boot and never read
+			 * directly.  Intercept TSC_AUX instead of exposing it to the
+			 * guest via direct_access_msrs, and switch it via user return.
+			 */
+			preempt_disable();
+			ret = kvm_set_user_return_msr(tsc_aux_uret_slot, data, -1ull);
+			preempt_enable();
+			if (ret)
+				break;
+		}
 
 		svm->tsc_aux = data;
 		break;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index c0d17da46fae..49427858474e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -213,6 +213,8 @@ struct vcpu_svm {
 	u32 asid;
 	u32 sysenter_esp_hi;
 	u32 sysenter_eip_hi;
+
+	bool v_tsc_aux;
 	uint64_t tsc_aux;
 
 	u64 msr_decfg;
@@ -690,7 +692,7 @@ int sev_handle_vmgexit(struct kvm_vcpu *vcpu);
 int sev_es_string_io(struct vcpu_svm *svm, int size, unsigned int port, int in);
 void sev_es_vcpu_reset(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
-void sev_es_prepare_switch_to_guest(struct sev_es_save_area *hostsa);
+void sev_es_prepare_switch_to_guest(struct vcpu_svm *svm, struct sev_es_save_area *hostsa);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
 
 /* vmenter.S */
When the TSC_AUX MSR is virtualized, the TSC_AUX value is swap type "B"
within the VMSA. This means that the guest value is loaded on VMRUN and
the host value is restored from the host save area on #VMEXIT.

Since the value is restored on #VMEXIT, the KVM user return MSR support
for TSC_AUX can be replaced by populating the host save area with current
host value of TSC_AUX. This replaces two WRMSR instructions with a single
RDMSR instruction.

Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/kvm/svm/sev.c | 14 +++++++++++++-
 arch/x86/kvm/svm/svm.c | 26 ++++++++++++++++----------
 arch/x86/kvm/svm/svm.h |  4 +++-
 3 files changed, 32 insertions(+), 12 deletions(-)