
[1/7] KVM: selftests: Test Intel PMU architectural events on gp counters

Message ID 20230323072714.82289-2-likexu@tencent.com (mailing list archive)
State New, archived
Series KVM: selftests: Test the consistency of the PMU's CPUID and its features

Commit Message

Like Xu March 23, 2023, 7:27 a.m. UTC
From: Like Xu <likexu@tencent.com>

Add test cases to check whether different architectural events are still
available after they are marked as unavailable via CPUID. This covers the
vPMU event filtering logic based on Intel CPUID, complementing
pmu_event_filter.

According to the Intel SDM, the number of architectural events is reported
through CPUID.0AH:EAX[31:24], and architectural event x is supported if
EBX[x]=0 && EAX[31:24]>x.

Co-developed-by: Jinrong Liang <cloudliang@tencent.com>
Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
Signed-off-by: Like Xu <likexu@tencent.com>
---
 tools/testing/selftests/kvm/Makefile          |   1 +
 .../selftests/kvm/x86_64/pmu_cpuid_test.c     | 202 ++++++++++++++++++
 2 files changed, 203 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
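
For reference (not part of the patch), the availability rule above boils down
to a check like the one below; the helper name and signature are made up for
illustration, and standard <stdint.h>/<stdbool.h> types are assumed:

static bool arch_event_is_supported(uint32_t cpuid_0a_eax,
				    uint32_t cpuid_0a_ebx, uint8_t idx)
{
	/* EAX[31:24] reports the length of the EBX availability bit vector. */
	uint8_t evt_vec_len = (cpuid_0a_eax >> 24) & 0xff;

	/* Event 'idx' is supported iff EBX[idx]=0 && EAX[31:24]>idx. */
	return !(cpuid_0a_ebx & (1u << idx)) && evt_vec_len > idx;
}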

Comments

Sean Christopherson May 24, 2023, 10:32 p.m. UTC | #1
On Thu, Mar 23, 2023, Like Xu wrote:
> From: Like Xu <likexu@tencent.com>
> 
> Add test cases to check if different Architectural events are available
> after it's marked as unavailable via CPUID. It covers vPMU event filtering
> logic based on Intel CPUID, which is a complement to pmu_event_filter.
> 
> According to Intel SDM, the number of architectural events is reported
> through CPUID.0AH:EAX[31:24] and the architectural event x is
> supported if EBX[x]=0 && EAX[31:24]>x.
> 
> Co-developed-by: Jinrong Liang <cloudliang@tencent.com>
> Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/x86_64/pmu_cpuid_test.c     | 202 ++++++++++++++++++

"cpuid" is a rather confusing name, e.g. I was half expecting tests where the guest
is executing CPUID.  IIUC, this is really just a basic functionality test, for a
somewhat loose definition of "basic".

Maybe something like pmu_basic_functionality_test?  Or pmu_functional_test?

>  2 files changed, 203 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c

...

> +#define EVENTSEL_OS BIT_ULL(17)
> +#define EVENTSEL_EN BIT_ULL(22)
> +#define PMU_CAP_FW_WRITES BIT_ULL(13)
> +#define EVENTS_MASK GENMASK_ULL(7, 0)
> +#define PMU_VERSION_MASK GENMASK_ULL(7, 0)
> +#define GP_CTR_NUM_OFS_BIT 8
> +#define GP_CTR_NUM_MASK GENMASK_ULL(15, GP_CTR_NUM_OFS_BIT)
> +#define EVT_LEN_OFS_BIT 24
> +#define EVT_LEN_MASK GENMASK_ULL(31, EVT_LEN_OFS_BIT)
> +
> +#define ARCH_EVENT(select, umask) (((select) & 0xff) | ((umask) & 0xff) << 8)
> +
> +/*
> + * Intel Pre-defined Architectural Performance Events. Note some events
> + * are skipped for testing due to difficulties in stable reproduction.
> + */
> +static const uint64_t arch_events[] = {
> +	[0] = ARCH_EVENT(0x3c, 0x0),
> +	[1] = ARCH_EVENT(0xc0, 0x0),
> +	[2] = ARCH_EVENT(0x3c, 0x1),
> +	[3] = ARCH_EVENT(0x2e, 0x4f), /* LLC Reference */
> +	[4] = ARCH_EVENT(0x2e, 0x41), /* LLC Misses */
> +	[5] = ARCH_EVENT(0xc4, 0x0),
> +	[6] = ARCH_EVENT(0xc5, 0x0),  /* Branch Misses Retired */
> +	[7] = ARCH_EVENT(0xa4, 0x1), /* Topdown Slots */
> +};

All of these definitions belong in library code.  Probably something like
tools/testing/selftests/kvm/include/x86_64/pmu.h, or maybe intel_pmu.h?
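
For illustration, such a header could simply collect the encodings this patch
open codes; the file path and macro names below are placeholders, not an
existing file:

/* tools/testing/selftests/kvm/include/x86_64/pmu.h (sketch) */
#ifndef SELFTEST_KVM_PMU_H
#define SELFTEST_KVM_PMU_H

#define ARCH_EVENT(select, umask) (((select) & 0xff) | ((umask) & 0xff) << 8)

#define INTEL_ARCH_CPU_CYCLES			ARCH_EVENT(0x3c, 0x0)
#define INTEL_ARCH_INSTRUCTIONS_RETIRED		ARCH_EVENT(0xc0, 0x0)
#define INTEL_ARCH_REFERENCE_CYCLES		ARCH_EVENT(0x3c, 0x1)
#define INTEL_ARCH_LLC_REFERENCES		ARCH_EVENT(0x2e, 0x4f)
#define INTEL_ARCH_LLC_MISSES			ARCH_EVENT(0x2e, 0x41)
#define INTEL_ARCH_BRANCHES_RETIRED		ARCH_EVENT(0xc4, 0x0)
#define INTEL_ARCH_BRANCHES_MISPREDICTED	ARCH_EVENT(0xc5, 0x0)
#define INTEL_ARCH_TOPDOWN_SLOTS		ARCH_EVENT(0xa4, 0x1)

#endif /* SELFTEST_KVM_PMU_H */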

> +static struct kvm_vcpu *new_vcpu(void *guest_code)

Assuming the next version is sent before the aforementioned cleanup, how about
defining this as

static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
						  void *guest_code)

so that the cleanup can be a more straightforward replacement.  That should also
make the free_vcpu() wrapper completely unnecessary.
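
I.e. something like this, reusing the body of new_vcpu() (sketch only):

static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
						  void *guest_code)
{
	struct kvm_vm *vm;

	vm = vm_create_with_one_vcpu(vcpu, guest_code);
	vm_init_descriptor_tables(vm);
	vcpu_init_descriptor_tables(*vcpu);

	return vm;
}

Callers then hold the struct kvm_vm and can call kvm_vm_free(vm) directly,
which is what makes free_vcpu() unnecessary.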

> +{
> +	struct kvm_vm *vm;
> +	struct kvm_vcpu *vcpu;
> +
> +	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> +	vm_init_descriptor_tables(vm);
> +	vcpu_init_descriptor_tables(vcpu);

Ugh, I really need to put together that series to clean up the descriptor table
mess.

https://lore.kernel.org/all/Y8hCBOndYMD9zsDL@google.com

> +
> +	return vcpu;
> +}
> +
> +static void free_vcpu(struct kvm_vcpu *vcpu)
> +{
> +	kvm_vm_free(vcpu->vm);

Just call kvm_vm_free().  Adding layers of unnecessary helpers just to provide
symmetry is not a good tradeoff.  As above, tweak the name/prototype of new_vcpu()
if you really want something approaching symmetry.

> +static void run_vcpu(struct kvm_vcpu *vcpu, const char *msg,
> +		     bool (*check_ucall)(struct ucall *uc, void *data),
> +		     void *expect_args)
> +{
> +	struct ucall uc;
> +
> +	for (;;) {
> +		vcpu_run(vcpu);
> +		switch (get_ucall(vcpu, &uc)) {
> +		case UCALL_SYNC:
> +			TEST_ASSERT(check_ucall(&uc, expect_args), "%s", msg);

A callback is both overkill and debug hostile.  If the assert fails, it will report
this common code and not the caller.  Since all uses grab the first ucall param,
just "return" that from the run_vcpu() wrapper as out param.  Callers will need
to define their own loop, but that's one line of code, and it has the advantage
of making the code much more self-documenting

E.g.

static uint64_t run_vcpu(struct kvm_vcpu *vcpu, uint64_t *ucall_arg)
{
	struct ucall uc;

	vcpu_run(vcpu);
	switch (get_ucall(vcpu, &uc)) {
	case UCALL_SYNC:
		*ucall_arg = uc.args[1];
		break;
	case UCALL_DONE:
		break;
	default:
		TEST_ASSERT(false, "Unexpected exit: %s",
				exit_reason_str(vcpu->run->exit_reason));
	}
	return uc.cmd;
}

and in intel_test_fixed_counters() as an example:

			while (run_vcpu(vcpu, &val) != UCALL_DONE)
				TEST_ASSERT(val, "One or more fixed counters not counting");

> +static void test_arch_events_setup(struct kvm_vcpu *vcpu, uint8_t evt_vector,
> +				   uint8_t unavl_mask, uint8_t idx)
> +{
> +	struct kvm_cpuid_entry2 *entry;
> +	uint32_t ctr_msr = MSR_IA32_PERFCTR0;
> +	bool is_supported;
> +
> +	entry = vcpu_get_cpuid_entry(vcpu, 0xa);
> +	entry->eax = (entry->eax & ~EVT_LEN_MASK) |
> +		(evt_vector << EVT_LEN_OFS_BIT);
> +	entry->ebx = (entry->ebx & ~EVENTS_MASK) | unavl_mask;
> +	vcpu_set_cpuid(vcpu);
> +
> +	if (vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> +		ctr_msr = MSR_IA32_PMC0;
> +
> +	/* Arch event x is supported if EBX[x]=0 && EAX[31:24]>x */
> +	is_supported = !(entry->ebx & BIT_ULL(idx)) &&
> +		(((entry->eax & EVT_LEN_MASK) >> EVT_LEN_OFS_BIT) > idx);
> +
> +	vcpu_args_set(vcpu, 5, entry->eax & PMU_VERSION_MASK,
> +		      (entry->eax & GP_CTR_NUM_MASK) >> GP_CTR_NUM_OFS_BIT,
> +		      is_supported, ctr_msr, arch_events[idx]);

Why pass all this stuff in as params?  Wouldn't it be easier to query CPUID from
the guest side?  And define KVM_X86_CPU_PROPERTY macros where possible to avoid
having to open code things in multiple places.
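
E.g. (X86_PROPERTY_PMU_VERSION already exists; the other property names and the
guest-side this_cpu_property() usage below are illustrative, not existing code):

/* processor.h: describe the CPUID.0AH fields once. */
#define X86_PROPERTY_PMU_NR_GP_COUNTERS		KVM_X86_CPU_PROPERTY(0xa, 0, EAX, 8, 15)
#define X86_PROPERTY_PMU_EVENTS_BIT_VECTOR_LEN	KVM_X86_CPU_PROPERTY(0xa, 0, EAX, 24, 31)

/* Guest code: query CPUID directly instead of taking params. */
	uint8_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
	uint8_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);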

> +static void intel_check_arch_event_is_unavl(uint8_t idx)
> +{
> +	const char *msg = "Unavailable arch event is counting.";
> +	uint8_t eax_evt_vec, ebx_unavl_mask, i, j;
> +	struct kvm_vcpu *vcpu;
> +
> +	/*
> +	 * A brute force iteration of all combinations of values is likely to
> +	 * exhaust the limit of the single-threaded thread fd nums, so it's
> +	 * tested here by iterating through all valid values on a single bit.
> +	 */
> +	for (i = 0; i < ARRAY_SIZE(arch_events); i++) {
> +		eax_evt_vec = BIT_ULL(i);
> +		for (j = 0; j < ARRAY_SIZE(arch_events); j++) {
> +			ebx_unavl_mask = BIT_ULL(j);
> +
> +			vcpu = new_vcpu(intel_guest_run_arch_event);
> +			test_arch_events_setup(vcpu, eax_evt_vec,
> +					       ebx_unavl_mask, idx);
> +			run_vcpu(vcpu, msg, first_uc_arg_non_zero, NULL);

Oof, this is super confusing.  At first, second, and third glances, it looks like
this is asserting that the counter _is_ counting.  It took me a while to find this
code in the guest:

		GUEST_SYNC(supported == !!_rdpmc(RDPMC_FIXED_BASE | i));

Either have the guest do GUEST_ASSERT() directly and process the output in the
common run_vcpu(), or just have the guest return the counter value and do the
assert here.  My vote would be the latter, e.g.

			while (run_vcpu(vcpu, &counter_val) != UCALL_DONE)
				TEST_ASSERT(is_supported == !!counter_val,
					    "blah blah blah");

That'd also likely avoid having to plumb the "supported" info into the guest.
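
The guest side would then only need to report the raw counter value, e.g.
(sketch):

		GUEST_SYNC(_rdpmc(i));

and the expected-vs-actual logic stays on the host, next to the CPUID mangling
in test_arch_events_setup().
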
Jim Mattson May 24, 2023, 10:59 p.m. UTC | #2
On Thu, Mar 23, 2023 at 12:27 AM Like Xu <like.xu.linux@gmail.com> wrote:
>
> From: Like Xu <likexu@tencent.com>
>
> Add test cases to check if different Architectural events are available
> after it's marked as unavailable via CPUID. It covers vPMU event filtering
> logic based on Intel CPUID, which is a complement to pmu_event_filter.
>
> According to Intel SDM, the number of architectural events is reported
> through CPUID.0AH:EAX[31:24] and the architectural event x is
> supported if EBX[x]=0 && EAX[31:24]>x.
>
> Co-developed-by: Jinrong Liang <cloudliang@tencent.com>
> Signed-off-by: Jinrong Liang <cloudliang@tencent.com>
> Signed-off-by: Like Xu <likexu@tencent.com>
> ---
>  tools/testing/selftests/kvm/Makefile          |   1 +
>  .../selftests/kvm/x86_64/pmu_cpuid_test.c     | 202 ++++++++++++++++++
>  2 files changed, 203 insertions(+)
>  create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 84a627c43795..8aa63081b3e6 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -78,6 +78,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
>  TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
>  TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
>  TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
> +TEST_GEN_PROGS_x86_64 += x86_64/pmu_cpuid_test
>  TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
>  TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
>  TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c b/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
> new file mode 100644
> index 000000000000..faab0a91e191
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
> @@ -0,0 +1,202 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Test the consistency of the PMU's CPUID and its features
> + *
> + * Copyright (C) 2023, Tencent, Inc.
> + *
> + * Check that the VM's PMU behaviour is consistent with the
> + * VM CPUID definition.
> + */
> +
> +#define _GNU_SOURCE /* for program_invocation_short_name */
> +#include <x86intrin.h>
> +
> +#include "processor.h"
> +
> +/* Guest payload for any performance counter counting */
> +#define NUM_BRANCHES 10
> +
> +#define EVENTSEL_OS BIT_ULL(17)
> +#define EVENTSEL_EN BIT_ULL(22)
> +#define PMU_CAP_FW_WRITES BIT_ULL(13)
> +#define EVENTS_MASK GENMASK_ULL(7, 0)
> +#define PMU_VERSION_MASK GENMASK_ULL(7, 0)
> +#define GP_CTR_NUM_OFS_BIT 8
> +#define GP_CTR_NUM_MASK GENMASK_ULL(15, GP_CTR_NUM_OFS_BIT)
> +#define EVT_LEN_OFS_BIT 24
> +#define EVT_LEN_MASK GENMASK_ULL(31, EVT_LEN_OFS_BIT)
> +
> +#define ARCH_EVENT(select, umask) (((select) & 0xff) | ((umask) & 0xff) << 8)
> +
> +/*
> + * Intel Pre-defined Architectural Performance Events. Note some events
> + * are skipped for testing due to difficulties in stable reproduction.
> + */
> +static const uint64_t arch_events[] = {
> +       [0] = ARCH_EVENT(0x3c, 0x0),
> +       [1] = ARCH_EVENT(0xc0, 0x0),
> +       [2] = ARCH_EVENT(0x3c, 0x1),
> +       [3] = ARCH_EVENT(0x2e, 0x4f), /* LLC Reference */
> +       [4] = ARCH_EVENT(0x2e, 0x41), /* LLC Misses */
> +       [5] = ARCH_EVENT(0xc4, 0x0),
> +       [6] = ARCH_EVENT(0xc5, 0x0),  /* Branch Misses Retired */
> +       [7] = ARCH_EVENT(0xa4, 0x1), /* Topdown Slots */
> +};
> +
> +static struct kvm_vcpu *new_vcpu(void *guest_code)
> +{
> +       struct kvm_vm *vm;
> +       struct kvm_vcpu *vcpu;
> +
> +       vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> +       vm_init_descriptor_tables(vm);
> +       vcpu_init_descriptor_tables(vcpu);
> +
> +       return vcpu;
> +}
> +
> +static void free_vcpu(struct kvm_vcpu *vcpu)
> +{
> +       kvm_vm_free(vcpu->vm);
> +}
> +
> +static void run_vcpu(struct kvm_vcpu *vcpu, const char *msg,
> +                    bool (*check_ucall)(struct ucall *uc, void *data),
> +                    void *expect_args)
> +{
> +       struct ucall uc;
> +
> +       for (;;) {
> +               vcpu_run(vcpu);
> +               switch (get_ucall(vcpu, &uc)) {
> +               case UCALL_SYNC:
> +                       TEST_ASSERT(check_ucall(&uc, expect_args), "%s", msg);
> +                       continue;
> +               case UCALL_DONE:
> +                       break;
> +               default:
> +                       TEST_ASSERT(false, "Unexpected exit: %s",
> +                                   exit_reason_str(vcpu->run->exit_reason));
> +               }
> +               break;
> +       }
> +}
> +
> +static bool first_uc_arg_non_zero(struct ucall *uc, void *data)
> +{
> +       return uc->args[1];
> +}
> +
> +static void intel_guest_run_arch_event(uint8_t version, uint8_t max_gp_num,
> +                                      bool supported, uint32_t ctr_base_msr,
> +                                      uint64_t evt_code)
> +{
> +       uint32_t global_msr = MSR_CORE_PERF_GLOBAL_CTRL;
> +       unsigned int i;
> +
> +       for (i = 0; i < max_gp_num; i++) {
> +               wrmsr(ctr_base_msr + i, 0);
> +               wrmsr(MSR_P6_EVNTSEL0 + i, EVENTSEL_OS | EVENTSEL_EN | evt_code);
> +               if (version > 1)
> +                       wrmsr(global_msr, BIT_ULL(i));
> +
> +               __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> +
> +               if (version > 1)
> +                       wrmsr(global_msr, 0);
> +
> +               GUEST_SYNC(supported == !!_rdpmc(i));

Just because the counter is non-zero doesn't mean that the event is
supported.  "Supported" would imply that it actually works, doesn't
it?

> +       }
> +
> +       GUEST_DONE();
> +}
> +
> +static void test_arch_events_setup(struct kvm_vcpu *vcpu, uint8_t evt_vector,
> +                                  uint8_t unavl_mask, uint8_t idx)
> +{
> +       struct kvm_cpuid_entry2 *entry;
> +       uint32_t ctr_msr = MSR_IA32_PERFCTR0;
> +       bool is_supported;
> +
> +       entry = vcpu_get_cpuid_entry(vcpu, 0xa);
> +       entry->eax = (entry->eax & ~EVT_LEN_MASK) |
> +               (evt_vector << EVT_LEN_OFS_BIT);
> +       entry->ebx = (entry->ebx & ~EVENTS_MASK) | unavl_mask;
> +       vcpu_set_cpuid(vcpu);
> +
> +       if (vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> +               ctr_msr = MSR_IA32_PMC0;
> +
> +       /* Arch event x is supported if EBX[x]=0 && EAX[31:24]>x */
> +       is_supported = !(entry->ebx & BIT_ULL(idx)) &&
> +               (((entry->eax & EVT_LEN_MASK) >> EVT_LEN_OFS_BIT) > idx);
> +
> +       vcpu_args_set(vcpu, 5, entry->eax & PMU_VERSION_MASK,
> +                     (entry->eax & GP_CTR_NUM_MASK) >> GP_CTR_NUM_OFS_BIT,
> +                     is_supported, ctr_msr, arch_events[idx]);
> +}
> +
> +static void intel_check_arch_event_is_unavl(uint8_t idx)
> +{
> +       const char *msg = "Unavailable arch event is counting.";
> +       uint8_t eax_evt_vec, ebx_unavl_mask, i, j;
> +       struct kvm_vcpu *vcpu;
> +
> +       /*
> +        * A brute force iteration of all combinations of values is likely to
> +        * exhaust the limit of the single-threaded thread fd nums, so it's
> +        * tested here by iterating through all valid values on a single bit.
> +        */
> +       for (i = 0; i < ARRAY_SIZE(arch_events); i++) {
> +               eax_evt_vec = BIT_ULL(i);
> +               for (j = 0; j < ARRAY_SIZE(arch_events); j++) {
> +                       ebx_unavl_mask = BIT_ULL(j);
> +
> +                       vcpu = new_vcpu(intel_guest_run_arch_event);
> +                       test_arch_events_setup(vcpu, eax_evt_vec,
> +                                              ebx_unavl_mask, idx);
> +                       run_vcpu(vcpu, msg, first_uc_arg_non_zero, NULL);
> +                       free_vcpu(vcpu);
> +               }
> +       }
> +}
> +
> +static void intel_test_arch_events(void)
> +{
> +       uint8_t idx;
> +
> +       for (idx = 0; idx < ARRAY_SIZE(arch_events); idx++) {
> +               /*
> +                * Given the stability of performance event recurrence,
> +                * only these arch events are currently being tested:
> +                * - Core cycle event (idx = 0)
> +                * - Instruction retired event (idx = 1)
> +                * - Reference cycles event (idx = 2)

Note that reference cycles is one event that actually cannot be
successfully virtualized.

> +                * - Branch instruction retired event (idx = 5)
> +                */
> +               if (idx > 2 && idx != 5)
> +                       continue;
> +
> +               intel_check_arch_event_is_unavl(idx);
> +       }
> +}
> +
> +static void intel_test_pmu_cpuid(void)
> +{
> +       intel_test_arch_events();
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +       TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> +
> +       if (host_cpu_is_intel) {
> +               TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> +               TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> +               TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
> +
> +               intel_test_pmu_cpuid();
> +       }
> +
> +       return 0;
> +}
> --
> 2.40.0
>

Patch

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 84a627c43795..8aa63081b3e6 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -78,6 +78,7 @@  TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
 TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
 TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
 TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
+TEST_GEN_PROGS_x86_64 += x86_64/pmu_cpuid_test
 TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
 TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
 TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c b/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
new file mode 100644
index 000000000000..faab0a91e191
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/pmu_cpuid_test.c
@@ -0,0 +1,202 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test the consistency of the PMU's CPUID and its features
+ *
+ * Copyright (C) 2023, Tencent, Inc.
+ *
+ * Check that the VM's PMU behaviour is consistent with the
+ * VM CPUID definition.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <x86intrin.h>
+
+#include "processor.h"
+
+/* Guest payload for any performance counter counting */
+#define NUM_BRANCHES 10
+
+#define EVENTSEL_OS BIT_ULL(17)
+#define EVENTSEL_EN BIT_ULL(22)
+#define PMU_CAP_FW_WRITES BIT_ULL(13)
+#define EVENTS_MASK GENMASK_ULL(7, 0)
+#define PMU_VERSION_MASK GENMASK_ULL(7, 0)
+#define GP_CTR_NUM_OFS_BIT 8
+#define GP_CTR_NUM_MASK GENMASK_ULL(15, GP_CTR_NUM_OFS_BIT)
+#define EVT_LEN_OFS_BIT 24
+#define EVT_LEN_MASK GENMASK_ULL(31, EVT_LEN_OFS_BIT)
+
+#define ARCH_EVENT(select, umask) (((select) & 0xff) | ((umask) & 0xff) << 8)
+
+/*
+ * Intel Pre-defined Architectural Performance Events. Note some events
+ * are skipped for testing due to difficulties in stable reproduction.
+ */
+static const uint64_t arch_events[] = {
+	[0] = ARCH_EVENT(0x3c, 0x0),
+	[1] = ARCH_EVENT(0xc0, 0x0),
+	[2] = ARCH_EVENT(0x3c, 0x1),
+	[3] = ARCH_EVENT(0x2e, 0x4f), /* LLC Reference */
+	[4] = ARCH_EVENT(0x2e, 0x41), /* LLC Misses */
+	[5] = ARCH_EVENT(0xc4, 0x0),
+	[6] = ARCH_EVENT(0xc5, 0x0),  /* Branch Misses Retired */
+	[7] = ARCH_EVENT(0xa4, 0x1), /* Topdown Slots */
+};
+
+static struct kvm_vcpu *new_vcpu(void *guest_code)
+{
+	struct kvm_vm *vm;
+	struct kvm_vcpu *vcpu;
+
+	vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(vcpu);
+
+	return vcpu;
+}
+
+static void free_vcpu(struct kvm_vcpu *vcpu)
+{
+	kvm_vm_free(vcpu->vm);
+}
+
+static void run_vcpu(struct kvm_vcpu *vcpu, const char *msg,
+		     bool (*check_ucall)(struct ucall *uc, void *data),
+		     void *expect_args)
+{
+	struct ucall uc;
+
+	for (;;) {
+		vcpu_run(vcpu);
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_SYNC:
+			TEST_ASSERT(check_ucall(&uc, expect_args), "%s", msg);
+			continue;
+		case UCALL_DONE:
+			break;
+		default:
+			TEST_ASSERT(false, "Unexpected exit: %s",
+				    exit_reason_str(vcpu->run->exit_reason));
+		}
+		break;
+	}
+}
+
+static bool first_uc_arg_non_zero(struct ucall *uc, void *data)
+{
+	return uc->args[1];
+}
+
+static void intel_guest_run_arch_event(uint8_t version, uint8_t max_gp_num,
+				       bool supported, uint32_t ctr_base_msr,
+				       uint64_t evt_code)
+{
+	uint32_t global_msr = MSR_CORE_PERF_GLOBAL_CTRL;
+	unsigned int i;
+
+	for (i = 0; i < max_gp_num; i++) {
+		wrmsr(ctr_base_msr + i, 0);
+		wrmsr(MSR_P6_EVNTSEL0 + i, EVENTSEL_OS | EVENTSEL_EN | evt_code);
+		if (version > 1)
+			wrmsr(global_msr, BIT_ULL(i));
+
+		__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+
+		if (version > 1)
+			wrmsr(global_msr, 0);
+
+		GUEST_SYNC(supported == !!_rdpmc(i));
+	}
+
+	GUEST_DONE();
+}
+
+static void test_arch_events_setup(struct kvm_vcpu *vcpu, uint8_t evt_vector,
+				   uint8_t unavl_mask, uint8_t idx)
+{
+	struct kvm_cpuid_entry2 *entry;
+	uint32_t ctr_msr = MSR_IA32_PERFCTR0;
+	bool is_supported;
+
+	entry = vcpu_get_cpuid_entry(vcpu, 0xa);
+	entry->eax = (entry->eax & ~EVT_LEN_MASK) |
+		(evt_vector << EVT_LEN_OFS_BIT);
+	entry->ebx = (entry->ebx & ~EVENTS_MASK) | unavl_mask;
+	vcpu_set_cpuid(vcpu);
+
+	if (vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
+		ctr_msr = MSR_IA32_PMC0;
+
+	/* Arch event x is supported if EBX[x]=0 && EAX[31:24]>x */
+	is_supported = !(entry->ebx & BIT_ULL(idx)) &&
+		(((entry->eax & EVT_LEN_MASK) >> EVT_LEN_OFS_BIT) > idx);
+
+	vcpu_args_set(vcpu, 5, entry->eax & PMU_VERSION_MASK,
+		      (entry->eax & GP_CTR_NUM_MASK) >> GP_CTR_NUM_OFS_BIT,
+		      is_supported, ctr_msr, arch_events[idx]);
+}
+
+static void intel_check_arch_event_is_unavl(uint8_t idx)
+{
+	const char *msg = "Unavailable arch event is counting.";
+	uint8_t eax_evt_vec, ebx_unavl_mask, i, j;
+	struct kvm_vcpu *vcpu;
+
+	/*
+	 * A brute force iteration of all combinations of values is likely to
+	 * exhaust the limit of the single-threaded thread fd nums, so it's
+	 * tested here by iterating through all valid values on a single bit.
+	 */
+	for (i = 0; i < ARRAY_SIZE(arch_events); i++) {
+		eax_evt_vec = BIT_ULL(i);
+		for (j = 0; j < ARRAY_SIZE(arch_events); j++) {
+			ebx_unavl_mask = BIT_ULL(j);
+
+			vcpu = new_vcpu(intel_guest_run_arch_event);
+			test_arch_events_setup(vcpu, eax_evt_vec,
+					       ebx_unavl_mask, idx);
+			run_vcpu(vcpu, msg, first_uc_arg_non_zero, NULL);
+			free_vcpu(vcpu);
+		}
+	}
+}
+
+static void intel_test_arch_events(void)
+{
+	uint8_t idx;
+
+	for (idx = 0; idx < ARRAY_SIZE(arch_events); idx++) {
+		/*
+		 * Given the stability of performance event recurrence,
+		 * only these arch events are currently being tested:
+		 * - Core cycle event (idx = 0)
+		 * - Instruction retired event (idx = 1)
+		 * - Reference cycles event (idx = 2)
+		 * - Branch instruction retired event (idx = 5)
+		 */
+		if (idx > 2 && idx != 5)
+			continue;
+
+		intel_check_arch_event_is_unavl(idx);
+	}
+}
+
+static void intel_test_pmu_cpuid(void)
+{
+	intel_test_arch_events();
+}
+
+int main(int argc, char *argv[])
+{
+	TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+
+	if (host_cpu_is_intel) {
+		TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
+		TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+		TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
+
+		intel_test_pmu_cpuid();
+	}
+
+	return 0;
+}