[v3,7/8] x86/HVM: don't needlessly intercept APERF/MPERF/TSC MSR reads
diff mbox series

Message ID 0d2c44ca-d3ce-bb83-e3fc-0e5037c90143@suse.com
State New
Headers show
Series
  • x86emul: further work
Related show

Commit Message

Jan Beulich Jan. 6, 2020, 4:39 p.m. UTC
If the hardware can handle accesses, we should allow it to do so. This
way we can expose EFRO to HVM guests, and "all" that's left for exposing
APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
that the leaf 6 guest CPUID checks will evaluate to false for now, as
recalculate_misc() zaps the entire leaf for now.)

For TSC the intercepts are made mirror the RDTSC ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Make TSC intercepts mirror RDTSC ones. Re-base.
v3: New.

Comments

Tian, Kevin Jan. 19, 2020, 2:44 a.m. UTC | #1
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, January 7, 2020 12:39 AM
> 
> If the hardware can handle accesses, we should allow it to do so. This
> way we can expose EFRO to HVM guests, and "all" that's left for exposing
> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
> that the leaf 6 guest CPUID checks will evaluate to false for now, as
> recalculate_misc() zaps the entire leaf for now.)
> 
> For TSC the intercepts are made mirror the RDTSC ones.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich Jan. 20, 2020, 8:32 a.m. UTC | #2
On 19.01.2020 03:44, Tian, Kevin wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Tuesday, January 7, 2020 12:39 AM
>>
>> If the hardware can handle accesses, we should allow it to do so. This
>> way we can expose EFRO to HVM guests, and "all" that's left for exposing
>> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
>> that the leaf 6 guest CPUID checks will evaluate to false for now, as
>> recalculate_misc() zaps the entire leaf for now.)
>>
>> For TSC the intercepts are made mirror the RDTSC ones.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Thanks. I assume you've seen Andrew's comment, and hence I take it
that the R-b also applies to the adjusted version (not posted yet):

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1140,8 +1140,13 @@ static int construct_vmcs(struct vcpu *v
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
+
+        if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
+            vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+
         if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
             vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
+
         if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
              (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
             vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);

plus this extra vmx.c hunk:

@@ -1249,7 +1261,12 @@ static void vmx_set_rdtsc_exiting(struct
     vmx_vmcs_enter(v);
     v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
     if ( enable )
+    {
         v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
+        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+    }
+    else
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
     vmx_update_cpu_exec_control(v);
     vmx_vmcs_exit(v);
 }

Jan
Tian, Kevin Jan. 21, 2020, 2:45 a.m. UTC | #3
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, January 20, 2020 4:33 PM
> 
> On 19.01.2020 03:44, Tian, Kevin wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Tuesday, January 7, 2020 12:39 AM
> >>
> >> If the hardware can handle accesses, we should allow it to do so. This
> >> way we can expose EFRO to HVM guests, and "all" that's left for exposing
> >> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
> >> that the leaf 6 guest CPUID checks will evaluate to false for now, as
> >> recalculate_misc() zaps the entire leaf for now.)
> >>
> >> For TSC the intercepts are made mirror the RDTSC ones.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> 
> Thanks. I assume you've seen Andrew's comment, and hence I take it
> that the R-b also applies to the adjusted version (not posted yet):

sorry I'm not sure which comment is referred here. If you will
anyway send out a new version, please drop my R-b and I will
double check again though the below version alone looks good.

> 
> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1140,8 +1140,13 @@ static int construct_vmcs(struct vcpu *v
>          vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
>          vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
>          vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
> +
> +        if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
> +            vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
> +
>          if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
>              vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
> +
>          if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
>               (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
>              vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);
> 
> plus this extra vmx.c hunk:
> 
> @@ -1249,7 +1261,12 @@ static void vmx_set_rdtsc_exiting(struct
>      vmx_vmcs_enter(v);
>      v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
>      if ( enable )
> +    {
>          v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
> +        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
> +    }
> +    else
> +        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
>      vmx_update_cpu_exec_control(v);
>      vmx_vmcs_exit(v);
>  }
> 
> Jan

Patch
diff mbox series

--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -595,6 +595,7 @@  static void svm_cpuid_policy_changed(str
     struct vmcb_struct *vmcb = svm->vmcb;
     const struct cpuid_policy *cp = v->domain->arch.cpuid;
     u32 bitmap = vmcb_get_exception_intercepts(vmcb);
+    unsigned int mode;
 
     if ( opt_hvm_fep ||
          (v->domain->arch.cpuid->x86_vendor != boot_cpu_data.x86_vendor) )
@@ -607,6 +608,17 @@  static void svm_cpuid_policy_changed(str
     /* Give access to MSR_PRED_CMD if the guest has been told about it. */
     svm_intercept_msr(v, MSR_PRED_CMD,
                       cp->extd.ibpb ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW);
+
+    /* Allow direct reads from APERF/MPERF if permitted by the policy. */
+    mode = cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY
+           ? MSR_INTERCEPT_WRITE : MSR_INTERCEPT_RW;
+    svm_intercept_msr(v, MSR_IA32_APERF, mode);
+    svm_intercept_msr(v, MSR_IA32_MPERF, mode);
+
+    /* Allow direct access to their r/o counterparts if permitted. */
+    mode = cp->extd.efro ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW;
+    svm_intercept_msr(v, MSR_APERF_RD_ONLY, mode);
+    svm_intercept_msr(v, MSR_MPERF_RD_ONLY, mode);
 }
 
 void svm_sync_vmcb(struct vcpu *v, enum vmcb_sync_state new_state)
@@ -860,7 +872,10 @@  static void svm_set_rdtsc_exiting(struct
     {
         general1_intercepts |= GENERAL1_INTERCEPT_RDTSC;
         general2_intercepts |= GENERAL2_INTERCEPT_RDTSCP;
+        svm_enable_intercept_for_msr(v, MSR_IA32_TSC);
     }
+    else
+        svm_intercept_msr(v, MSR_IA32_TSC, MSR_INTERCEPT_WRITE);
 
     vmcb_set_general1_intercepts(vmcb, general1_intercepts);
     vmcb_set_general2_intercepts(vmcb, general2_intercepts);
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -108,6 +108,7 @@  static int construct_vmcb(struct vcpu *v
     {
         vmcb->_general1_intercepts |= GENERAL1_INTERCEPT_RDTSC;
         vmcb->_general2_intercepts |= GENERAL2_INTERCEPT_RDTSCP;
+        svm_intercept_msr(v, MSR_IA32_TSC, MSR_INTERCEPT_WRITE);
     }
 
     /* Guest segment limits. */
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1140,8 +1140,13 @@  static int construct_vmcs(struct vcpu *v
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
         vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
+
+        if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
+            vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+
         if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
             vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
+
         if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
              (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
             vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -584,6 +584,18 @@  static void vmx_cpuid_policy_changed(str
         vmx_clear_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
     else
         vmx_set_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
+
+    /* Allow direct reads from APERF/MPERF if permitted by the policy. */
+    if ( cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY )
+    {
+        vmx_clear_msr_intercept(v, MSR_IA32_APERF, VMX_MSR_R);
+        vmx_clear_msr_intercept(v, MSR_IA32_MPERF, VMX_MSR_R);
+    }
+    else
+    {
+        vmx_set_msr_intercept(v, MSR_IA32_APERF, VMX_MSR_R);
+        vmx_set_msr_intercept(v, MSR_IA32_MPERF, VMX_MSR_R);
+    }
 }
 
 int vmx_guest_x86_mode(struct vcpu *v)
@@ -1249,7 +1261,12 @@  static void vmx_set_rdtsc_exiting(struct
     vmx_vmcs_enter(v);
     v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
     if ( enable )
+    {
         v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
+        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+    }
+    else
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
     vmx_update_cpu_exec_control(v);
     vmx_vmcs_exit(v);
 }
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -242,7 +242,7 @@  XEN_CPUFEATURE(MOVDIR64B,     6*32+28) /
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(ITSC,          7*32+ 8) /*   Invariant TSC */
-XEN_CPUFEATURE(EFRO,          7*32+10) /*   APERF/MPERF Read Only interface */
+XEN_CPUFEATURE(EFRO,          7*32+10) /*S  APERF/MPERF Read Only interface */
 
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
 XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */