Message ID | 20220609083916.36658-4-weijiang.yang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Fix up test failures induced by !enable_pmu | expand |
On 9/6/2022 4:39 pm, Yang Weijiang wrote:
> executing rdpmc leads to #GP,
RDPMC still works on processors that do not support architectural performance
monitoring.
The #GP will violate ISA, and try to treat it as NOP (plus EAX=EDX=0) if
!enable_pmu.
On 6/10/2022 8:14 AM, Like Xu wrote: > On 9/6/2022 4:39 pm, Yang Weijiang wrote: >> executing rdpmc leads to #GP, > > RDPMC still works on processors that do not support architectural > performance monitoring. > > The #GP will violate ISA, and try to treat it as NOP (plus EAX=EDX=0) > if !enable_pmu. After a quick check in SDM, I cannot find wordings supporting your above comments, can you point me to it? Another concern is, when !enable_pmu, should we make RDPMC "work" with returning EAX=EDX=0? Or just simply inject #GP to VM in this case?
On Thu, Jun 9, 2022 at 6:47 PM Yang, Weijiang <weijiang.yang@intel.com> wrote: > > > On 6/10/2022 8:14 AM, Like Xu wrote: > > On 9/6/2022 4:39 pm, Yang Weijiang wrote: > >> executing rdpmc leads to #GP, > > > > RDPMC still works on processors that do not support architectural > > performance monitoring. > > > > The #GP will violate ISA, and try to treat it as NOP (plus EAX=EDX=0) > > if !enable_pmu. > > After a quick check in SDM, I cannot find wordings supporting your above > comments, can you > > point me to it? In volume 2, under RDPMC... o If the processor does not support architectural performance monitoring (CPUID.0AH:EAX[7:0]=0), ECX[30:0] specifies the index of the PMC to be read. Setting ECX[31] selects “fast” read mode if supported. In this mode, RDPMC returns bits 31:0 of the PMC in EAX while clearing EDX to zero. For more details, see the following sections of volume 3: 19.6.3 Performance Monitoring (Processors Based on Intel NetBurst Microarchitecture) 19.6.8 Performance Monitoring (P6 Family Processor) 19.6.9 Performance Monitoring (Pentium Processors) > Another concern is, when !enable_pmu, should we make RDPMC "work" with > returning EAX=EDX=0? > > Or just simply inject #GP to VM in this case? Unless KVM is running on a Prescott, it's going to be very difficult to emulate one of these three pre-architectural performance monitoring PMUs. There certainly isn't any code to do it today. In fact, there is no code in KVM to virtualize the NetBurst PMU, even on Prescott. I think Like is being overly pedantic (which is usually my role). RDPMC should behave exactly the same way that RDMSR behaves when accessing the same counter. The last time I looked, RDMSR synthesizes #GP for PMC accesses when !enable_pmu.
On 10/6/2022 10:24 am, Jim Mattson wrote: > On Thu, Jun 9, 2022 at 6:47 PM Yang, Weijiang <weijiang.yang@intel.com> wrote: >> >> >> On 6/10/2022 8:14 AM, Like Xu wrote: >>> On 9/6/2022 4:39 pm, Yang Weijiang wrote: >>>> executing rdpmc leads to #GP, >>> >>> RDPMC still works on processors that do not support architectural >>> performance monitoring. >>> >>> The #GP will violate ISA, and try to treat it as NOP (plus EAX=EDX=0) >>> if !enable_pmu. >> >> After a quick check in SDM, I cannot find wordings supporting your above >> comments, can you >> >> point me to it? > > In volume 2, under RDPMC... > > o If the processor does not support architectural performance > monitoring (CPUID.0AH:EAX[7:0]=0), ECX[30:0] specifies the index of > the PMC to be read. Setting ECX[31] selects “fast” read mode if > supported. In this mode, RDPMC returns bits 31:0 of the PMC in EAX > while clearing EDX to zero. We also miss this part in the KVM code: for (CPUID.0AH:EAX[7:0]=0), the width of general-purpose performance PMCs is 40 bits while the widths of special-purpose PMCs are implementation specific. We may consider the "specific implementation" as "at the discretion of KVM". > > For more details, see the following sections of volume 3: > 19.6.3 Performance Monitoring (Processors Based on Intel NetBurst > Microarchitecture) > 19.6.8 Performance Monitoring (P6 Family Processor) > 19.6.9 Performance Monitoring (Pentium Processors) > >> Another concern is, when !enable_pmu, should we make RDPMC "work" with >> returning EAX=EDX=0? >> >> Or just simply inject #GP to VM in this case? > > Unless KVM is running on a Prescott, it's going to be very difficult > to emulate one of these three pre-architectural performance monitoring > PMUs. There certainly isn't any code to do it today. In fact, there is I don't think so. How arbitrary is this assertion. We have user space like QEMU or GCP to set/unset cpuid.0xa fine-grained, the combination of features will be more flexible in virtualization world. > no code in KVM to virtualize the NetBurst PMU, even on Prescott. > > I think Like is being overly pedantic (which is usually my role). I am indeed greatly influenced by you. :D > RDPMC should behave exactly the same way that RDMSR behaves when > accessing the same counter. The last time I looked, RDMSR synthesizes > #GP for PMC accesses when !enable_pmu. The handling of the available MSR ranges and the available ISA instructions (especially user space available) are different. Users want to make sure their code (using RDPMC on whatever RDPMC-available guest) is robust. The emulation of "use RDPMC if !enable_pmu" should be consistent with the emulation of "use RDPMC to access an unsupported counter". RDPMC Intel Operation: MSCB = Most Significant Counter Bit (* Model-specific *) IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported counter)) THEN EAX := counter[31:0]; EDX := ZeroExtend(counter[MSCB:32]); ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *) #GP(0); FI; Therefore, we will not have a #GP if !enable_pmu for legacy or future user space programs.
On Thu, Jun 9, 2022 at 7:49 PM Like Xu <like.xu.linux@gmail.com> wrote: > RDPMC Intel Operation: > > MSCB = Most Significant Counter Bit (* Model-specific *) > IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported > counter)) > THEN > EAX := counter[31:0]; > EDX := ZeroExtend(counter[MSCB:32]); > ELSE (* ECX is not valid or CR4.PCE is 0 and CPL is 1, 2, or 3 and CR0.PE is 1 *) > #GP(0); > FI; > > Therefore, we will not have a #GP if !enable_pmu for legacy or future user space > programs. I beg to differ. Continue on a bit further... #GP If an invalid performance counter index is specified. If !enable_pmu, no performance counters are supported by kvm. Hence, all performance counter indices are invalid. The only CPUs for which one might argue that userspace could reasonably assume that some PMCs are valid, in spite of CPUID.0AH:EAX[7:0]=0, are the three legacy families I mentioned previously.
On Thu, Jun 9, 2022 at 9:16 PM Jim Mattson <jmattson@google.com> wrote: > > On Thu, Jun 9, 2022 at 7:49 PM Like Xu <like.xu.linux@gmail.com> wrote: > > > RDPMC Intel Operation: Actually, the key phrase is also present in the pseudocode you quoted: > > MSCB = Most Significant Counter Bit (* Model-specific *) > > IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported > > counter)) ... The final conjunct in that condition is false under KVM when !enable_pmu, because there are no supported counters.
On 10/6/2022 12:22 pm, Jim Mattson wrote: > On Thu, Jun 9, 2022 at 9:16 PM Jim Mattson <jmattson@google.com> wrote: >> >> On Thu, Jun 9, 2022 at 7:49 PM Like Xu <like.xu.linux@gmail.com> wrote: >> >>> RDPMC Intel Operation: > > Actually, the key phrase is also present in the pseudocode you quoted: > >>> MSCB = Most Significant Counter Bit (* Model-specific *) >>> IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates a supported >>> counter)) > ... > > The final conjunct in that condition is false under KVM when > !enable_pmu, because there are no supported counters. Uh, I have lifted a stone and smashed my own feet. Please move on with #GP expectation.
On 6/10/2022 12:56 PM, Like Xu wrote: > On 10/6/2022 12:22 pm, Jim Mattson wrote: >> On Thu, Jun 9, 2022 at 9:16 PM Jim Mattson <jmattson@google.com> wrote: >>> >>> On Thu, Jun 9, 2022 at 7:49 PM Like Xu <like.xu.linux@gmail.com> wrote: >>> >>>> RDPMC Intel Operation: >> >> Actually, the key phrase is also present in the pseudocode you quoted: >> >>>> MSCB = Most Significant Counter Bit (* Model-specific *) >>>> IF (((CR4.PCE = 1) or (CPL = 0) or (CR0.PE = 0)) and (ECX indicates >>>> a supported >>>> counter)) >> ... >> >> The final conjunct in that condition is false under KVM when >> !enable_pmu, because there are no supported counters. > > Uh, I have lifted a stone and smashed my own feet. > > Please move on with #GP expectation. Thank you two!
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c index 4d581e7..dd6fc13 100644 --- a/x86/vmx_tests.c +++ b/x86/vmx_tests.c @@ -944,6 +944,16 @@ static void insn_intercept_main(void) continue; } + if (insn_table[cur_insn].flag == CPU_RDPMC) { + struct cpuid id = cpuid(10); + + if (!(id.a & 0xff)) { + printf("\tFeature required for %s is not supported.\n", + insn_table[cur_insn].name); + continue; + } + } + if (insn_table[cur_insn].disabled) { printf("\tFeature required for %s is not supported.\n", insn_table[cur_insn].name); @@ -7490,6 +7500,13 @@ static void test_perf_global_ctrl(u32 nr, const char *name, u32 ctrl_nr, static void test_load_host_perf_global_ctrl(void) { + struct cpuid id = cpuid(10); + + if (!(id.a & 0xff)) { + report_skip("test_load_host_perf_global_ctrl"); + return; + } + if (!(ctrl_exit_rev.clr & EXI_LOAD_PERF)) { printf("\"load IA32_PERF_GLOBAL_CTRL\" exit control not supported\n"); return; @@ -7502,6 +7519,13 @@ static void test_load_host_perf_global_ctrl(void) static void test_load_guest_perf_global_ctrl(void) { + struct cpuid id = cpuid(10); + + if (!(id.a & 0xff)) { + report_skip("test_load_guest_perf_global_ctrl"); + return; + } + if (!(ctrl_enter_rev.clr & ENT_LOAD_PERF)) { printf("\"load IA32_PERF_GLOBAL_CTRL\" entry control not supported\n"); return;
When pmu is disabled in KVM, reading MSR_CORE_PERF_GLOBAL_CTRL or executing rdpmc leads to #GP, so skip related tests in this case. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> --- x86/vmx_tests.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)