Message ID: 0d2c44ca-d3ce-bb83-e3fc-0e5037c90143@suse.com
State: New, archived
Series: x86emul: further work
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, January 7, 2020 12:39 AM
>
> If the hardware can handle accesses, we should allow it to do so. This
> way we can expose EFRO to HVM guests, and "all" that's left for exposing
> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
> that the leaf 6 guest CPUID checks will evaluate to false for now, as
> recalculate_misc() zaps the entire leaf for now.)
>
> For TSC the intercepts are made to mirror the RDTSC ones.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
On 19.01.2020 03:44, Tian, Kevin wrote:
>> From: Jan Beulich <jbeulich@suse.com>
>> Sent: Tuesday, January 7, 2020 12:39 AM
>>
>> If the hardware can handle accesses, we should allow it to do so. This
>> way we can expose EFRO to HVM guests, and "all" that's left for exposing
>> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
>> that the leaf 6 guest CPUID checks will evaluate to false for now, as
>> recalculate_misc() zaps the entire leaf for now.)
>>
>> For TSC the intercepts are made to mirror the RDTSC ones.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

Thanks. I assume you've seen Andrew's comment, and hence I take it
that the R-b also applies to the adjusted version (not posted yet):

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1140,8 +1140,13 @@ static int construct_vmcs(struct vcpu *v
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
+
+    if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+
     if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
         vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
+
     if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
          (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
         vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);

plus this extra vmx.c hunk:

@@ -1249,7 +1261,12 @@ static void vmx_set_rdtsc_exiting(struct
     vmx_vmcs_enter(v);
     v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
     if ( enable )
+    {
         v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
+        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+    }
+    else
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
     vmx_update_cpu_exec_control(v);
     vmx_vmcs_exit(v);
 }

Jan
> From: Jan Beulich <jbeulich@suse.com>
> Sent: Monday, January 20, 2020 4:33 PM
>
> On 19.01.2020 03:44, Tian, Kevin wrote:
> >> From: Jan Beulich <jbeulich@suse.com>
> >> Sent: Tuesday, January 7, 2020 12:39 AM
> >>
> >> If the hardware can handle accesses, we should allow it to do so. This
> >> way we can expose EFRO to HVM guests, and "all" that's left for exposing
> >> APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
> >> that the leaf 6 guest CPUID checks will evaluate to false for now, as
> >> recalculate_misc() zaps the entire leaf for now.)
> >>
> >> For TSC the intercepts are made to mirror the RDTSC ones.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> >
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
>
> Thanks. I assume you've seen Andrew's comment, and hence I take it
> that the R-b also applies to the adjusted version (not posted yet):

Sorry, I'm not sure which comment is being referred to here. If you will
anyway send out a new version, please drop my R-b and I will double check
again, though the version below on its own looks good.

> --- a/xen/arch/x86/hvm/vmx/vmcs.c
> +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> @@ -1140,8 +1140,13 @@ static int construct_vmcs(struct vcpu *v
>      vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
>      vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
>      vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
> +
> +    if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
> +        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
> +
>      if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
>          vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
> +
>      if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
>           (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
>          vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);
>
> plus this extra vmx.c hunk:
>
> @@ -1249,7 +1261,12 @@ static void vmx_set_rdtsc_exiting(struct
>      vmx_vmcs_enter(v);
>      v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
>      if ( enable )
> +    {
>          v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
> +        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
> +    }
> +    else
> +        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
>      vmx_update_cpu_exec_control(v);
>      vmx_vmcs_exit(v);
>  }
>
> Jan
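The net effect of the two vmcs.c/vmx.c hunks above is that the read intercept for MSR_IA32_TSC always tracks CPU_BASED_RDTSC_EXITING: MSR reads of the TSC are passed through to hardware exactly when RDTSC itself does not exit. A minimal sketch of that invariant follows; the helper name vmx_sync_tsc_msr_intercept() is hypothetical and not part of the posted patch, but it only uses the functions and symbols the hunks themselves rely on:

/*
 * Hypothetical helper, not part of the posted patch: keep the
 * MSR_IA32_TSC read intercept in sync with CPU_BASED_RDTSC_EXITING.
 */
static void vmx_sync_tsc_msr_intercept(struct vcpu *v)
{
    if ( v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING )
        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);   /* reads exit */
    else
        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R); /* pass through */
}

Writes to MSR_IA32_TSC keep their intercept in both cases; only the read path is handed to hardware here.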
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -595,6 +595,7 @@ static void svm_cpuid_policy_changed(str
     struct vmcb_struct *vmcb = svm->vmcb;
     const struct cpuid_policy *cp = v->domain->arch.cpuid;
     u32 bitmap = vmcb_get_exception_intercepts(vmcb);
+    unsigned int mode;
 
     if ( opt_hvm_fep ||
          (v->domain->arch.cpuid->x86_vendor != boot_cpu_data.x86_vendor) )
@@ -607,6 +608,17 @@ static void svm_cpuid_policy_changed(str
     /* Give access to MSR_PRED_CMD if the guest has been told about it. */
     svm_intercept_msr(v, MSR_PRED_CMD,
                       cp->extd.ibpb ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW);
+
+    /* Allow direct reads from APERF/MPERF if permitted by the policy. */
+    mode = cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY
+           ? MSR_INTERCEPT_WRITE : MSR_INTERCEPT_RW;
+    svm_intercept_msr(v, MSR_IA32_APERF, mode);
+    svm_intercept_msr(v, MSR_IA32_MPERF, mode);
+
+    /* Allow direct access to their r/o counterparts if permitted. */
+    mode = cp->extd.efro ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW;
+    svm_intercept_msr(v, MSR_APERF_RD_ONLY, mode);
+    svm_intercept_msr(v, MSR_MPERF_RD_ONLY, mode);
 }
 
 void svm_sync_vmcb(struct vcpu *v, enum vmcb_sync_state new_state)
@@ -860,7 +872,10 @@ static void svm_set_rdtsc_exiting(struct
     {
         general1_intercepts |= GENERAL1_INTERCEPT_RDTSC;
         general2_intercepts |= GENERAL2_INTERCEPT_RDTSCP;
+        svm_enable_intercept_for_msr(v, MSR_IA32_TSC);
     }
+    else
+        svm_intercept_msr(v, MSR_IA32_TSC, MSR_INTERCEPT_WRITE);
 
     vmcb_set_general1_intercepts(vmcb, general1_intercepts);
     vmcb_set_general2_intercepts(vmcb, general2_intercepts);
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -108,6 +108,7 @@ static int construct_vmcb(struct vcpu *v
     {
         vmcb->_general1_intercepts |= GENERAL1_INTERCEPT_RDTSC;
         vmcb->_general2_intercepts |= GENERAL2_INTERCEPT_RDTSCP;
+        svm_intercept_msr(v, MSR_IA32_TSC, MSR_INTERCEPT_WRITE);
     }
 
     /* Guest segment limits. */
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -1140,8 +1140,13 @@ static int construct_vmcs(struct vcpu *v
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_CS, VMX_MSR_RW);
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_ESP, VMX_MSR_RW);
     vmx_clear_msr_intercept(v, MSR_IA32_SYSENTER_EIP, VMX_MSR_RW);
+
+    if ( !(v->arch.hvm.vmx.exec_control & CPU_BASED_RDTSC_EXITING) )
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+
     if ( paging_mode_hap(d) && (!is_iommu_enabled(d) || iommu_snoop) )
         vmx_clear_msr_intercept(v, MSR_IA32_CR_PAT, VMX_MSR_RW);
+
     if ( (vmexit_ctl & VM_EXIT_CLEAR_BNDCFGS) &&
          (vmentry_ctl & VM_ENTRY_LOAD_BNDCFGS) )
         vmx_clear_msr_intercept(v, MSR_IA32_BNDCFGS, VMX_MSR_RW);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -584,6 +584,18 @@ static void vmx_cpuid_policy_changed(str
         vmx_clear_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
     else
         vmx_set_msr_intercept(v, MSR_FLUSH_CMD, VMX_MSR_RW);
+
+    /* Allow direct reads from APERF/MPERF if permitted by the policy. */
+    if ( cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY )
+    {
+        vmx_clear_msr_intercept(v, MSR_IA32_APERF, VMX_MSR_R);
+        vmx_clear_msr_intercept(v, MSR_IA32_MPERF, VMX_MSR_R);
+    }
+    else
+    {
+        vmx_set_msr_intercept(v, MSR_IA32_APERF, VMX_MSR_R);
+        vmx_set_msr_intercept(v, MSR_IA32_MPERF, VMX_MSR_R);
+    }
 }
 
 int vmx_guest_x86_mode(struct vcpu *v)
@@ -1249,7 +1261,12 @@ static void vmx_set_rdtsc_exiting(struct
     vmx_vmcs_enter(v);
     v->arch.hvm.vmx.exec_control &= ~CPU_BASED_RDTSC_EXITING;
     if ( enable )
+    {
         v->arch.hvm.vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
+        vmx_set_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
+    }
+    else
+        vmx_clear_msr_intercept(v, MSR_IA32_TSC, VMX_MSR_R);
     vmx_update_cpu_exec_control(v);
     vmx_vmcs_exit(v);
 }
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -242,7 +242,7 @@ XEN_CPUFEATURE(MOVDIR64B, 6*32+28) /
 
 /* AMD-defined CPU features, CPUID level 0x80000007.edx, word 7 */
 XEN_CPUFEATURE(ITSC,      7*32+ 8) /*   Invariant TSC */
-XEN_CPUFEATURE(EFRO,      7*32+10) /*   APERF/MPERF Read Only interface */
+XEN_CPUFEATURE(EFRO,      7*32+10) /*S  APERF/MPERF Read Only interface */
 
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
 XEN_CPUFEATURE(CLZERO,    8*32+ 0) /*A  CLZERO instruction */
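For context on what the relaxed APERF/MPERF read intercepts buy: a guest estimates its effective (average) frequency by sampling both counters and scaling the base frequency by the ratio of the deltas, so each sample otherwise costs four intercepted RDMSRs. A rough guest-kernel-side sketch, assuming only the architectural MSR indices from the SDM; the rdmsr64() wrapper and the base_mhz parameter are illustrative, not taken from the patch:

#include <stdint.h>

#define MSR_IA32_MPERF 0xe7   /* counts at the invariant/base frequency */
#define MSR_IA32_APERF 0xe8   /* counts at the actual (effective) frequency */

/* Illustrative raw RDMSR wrapper; a real guest kernel would use its own. */
static inline uint64_t rdmsr64(uint32_t msr)
{
    uint32_t lo, hi;
    asm volatile ( "rdmsr" : "=a" (lo), "=d" (hi) : "c" (msr) );
    return ((uint64_t)hi << 32) | lo;
}

/* Effective MHz over one sampling interval, given the base frequency. */
static uint64_t effective_mhz(uint64_t base_mhz,
                              uint64_t aperf_start, uint64_t mperf_start)
{
    uint64_t aperf = rdmsr64(MSR_IA32_APERF) - aperf_start;
    uint64_t mperf = rdmsr64(MSR_IA32_MPERF) - mperf_start;

    return mperf ? base_mhz * aperf / mperf : base_mhz;
}

On the AMD side the patch additionally opens the read-only aliases (MSR_APERF_RD_ONLY/MSR_MPERF_RD_ONLY) when EFRO is exposed, which serve the same measurement purpose without ever permitting writes.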
If the hardware can handle accesses, we should allow it to do so. This
way we can expose EFRO to HVM guests, and "all" that's left for exposing
APERF/MPERF is to figure out how to handle writes to these MSRs. (Note
that the leaf 6 guest CPUID checks will evaluate to false for now, as
recalculate_misc() zaps the entire leaf.)

For TSC the intercepts are made to mirror the RDTSC ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v4: Make TSC intercepts mirror RDTSC ones. Re-base.
v3: New.
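The "leaf 6 guest CPUID checks" mentioned above are the cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY tests in the SVM and VMX hunks; CPUID.06H:ECX bit 0 is the architectural bit advertising IA32_APERF/IA32_MPERF. A minimal sketch of that predicate, with the bit value written out here purely for illustration (Xen defines the macro in its own headers):

/* CPUID.06H:ECX[0] advertises IA32_APERF/IA32_MPERF (SDM); the literal
 * below is spelled out only for illustration. */
#define CPUID6_ECX_APERFMPERF_CAPABILITY 0x1

static bool guest_has_aperfmperf(const struct cpuid_policy *cp)
{
    /* Leaf 6 ECX in the guest's CPUID policy; currently always zero
     * because recalculate_misc() clears the whole leaf. */
    return cp->basic.raw[6].c & CPUID6_ECX_APERFMPERF_CAPABILITY;
}

Until recalculate_misc() stops zapping leaf 6 (or writes to these MSRs gain handling), this evaluates to false and the MSRs stay intercepted, which is exactly the "for now" caveat in the description.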