Message ID | 20240218043003.2424683-1-dapeng1.mi@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] KVM: selftests: Test top-down slots event | expand |
On Sun, 18 Feb 2024 12:30:03 +0800, Dapeng Mi wrote: > Although the fixed counter 3 and its exclusive pseudo slots event are > not supported by KVM yet, the architectural slots event is supported by > KVM and can be programed on any GP counter. Thus add validation for this > architectural slots event. > > Top-down slots event "counts the total number of available slots for an > unhalted logical processor, and increments by machine-width of the > narrowest pipeline as employed by the Top-down Microarchitecture > Analysis method." > > [...] Applied to kvm-x86 pmu, with a very slight tweak to the shortlog. Thanks a ton for the verbose changelog! [1/1] KVM: selftests: Test top-down slots event in x86's pmu_counters_test https://github.com/kvm-x86/linux/commit/4a447b135e45 -- https://github.com/kvm-x86/linux/tree/next
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c index ae5f6042f1e8..29609b52f8fa 100644 --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c @@ -119,6 +119,9 @@ static void guest_assert_event_count(uint8_t idx, case INTEL_ARCH_REFERENCE_CYCLES_INDEX: GUEST_ASSERT_NE(count, 0); break; + case INTEL_ARCH_TOPDOWN_SLOTS_INDEX: + GUEST_ASSERT(count >= NUM_INSNS_RETIRED); + break; default: break; }
Although the fixed counter 3 and its exclusive pseudo slots event are not supported by KVM yet, the architectural slots event is supported by KVM and can be programed on any GP counter. Thus add validation for this architectural slots event. Top-down slots event "counts the total number of available slots for an unhalted logical processor, and increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method." As for the slot, it's an abstract concept which indicates how many uops (decoded from instructions) can be processed simultaneously (per cycle) on HW. In Top-down Microarchitecture Analysis (TMA) method, the processor is divided into two parts, frond-end and back-end. Assume there is a processor with classic 5-stage pipeline, fetch, decode, execute, memory access and register writeback. The former 2 stages (fetch/decode) are classified to frond-end and the latter 3 stages are classified to back-end. In modern Intel processors, a complicated instruction would be decoded into several uops (micro-operations) and so these uops can be processed simultaneously and then improve the performance. Thus, assume a processor can decode and dispatch 4 uops in front-end and execute 4 uops in back-end simultaneously (per-cycle), so the machine-width of this processor is 4 and this processor has 4 topdown slots per-cycle. If a slot is spare and can be used to process a new upcoming uop, then the slot is available, but if a uop occupies a slot for several cycles and can't be retired (maybe blocked by memory access), then this slot is stall and unavailable. Considering the testing instruction sequence can't be macro-fused on x86 platforms, the measured slots count should not be less than NUM_INSNS_RETIRED. Thus assert the slots count against NUM_INSNS_RETIRED. pmu_counters_test passed with this patch on Intel Sapphire Rapids. About the more information about TMA method, please refer the below link. https://www.intel.com/content/www/us/en/docs/vtune-profiler/cookbook/2023-0/top-down-microarchitecture-analysis-method.html Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> --- tools/testing/selftests/kvm/x86_64/pmu_counters_test.c | 3 +++ 1 file changed, 3 insertions(+) base-commit: f0f3b810edda57f317d79f452056786257089667