From patchwork Mon Dec 12 12:58:37 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071099
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Marc Zyngier, Fenghua Yu, kvmarm@lists.linux.dev, linux-perf-users@vger.kernel.org
Subject: [PATCH RFC 1/8] perf/core: Add *group_leader to perf_event_create_kernel_counter()
Date: Mon, 12 Dec 2022 20:58:37 +0800
Message-Id: <20221212125844.41157-2-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

Like syscall users, kernel-space perf_event creators may also want to use
the counter-group abstraction to gain PMU functionality. An in-kernel
counter group behaves much like a normal 'single' counter, following the
group semantics. No functional changes at this time.

One example is KVM creating the Intel slots event as group leader and the
topdown metric events as group members in order to emulate the
MSR_PERF_METRICS PMU capability for guests.

Cc: Peter Zijlstra
Cc: Marc Zyngier
Cc: Fenghua Yu
Cc: kvmarm@lists.linux.dev
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Like Xu
---
 arch/arm64/kvm/pmu-emul.c                 | 4 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 ++--
 arch/x86/kvm/pmu.c                        | 2 +-
 arch/x86/kvm/vmx/pmu_intel.c              | 2 +-
 include/linux/perf_event.h                | 1 +
 kernel/events/core.c                      | 4 +++-
 kernel/events/hw_breakpoint.c             | 4 ++--
 kernel/events/hw_breakpoint_test.c        | 2 +-
 kernel/watchdog_hld.c                     | 2 +-
 9 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c
index 24908400e190..11c3386bc86b 100644
--- a/arch/arm64/kvm/pmu-emul.c
+++ b/arch/arm64/kvm/pmu-emul.c
@@ -624,7 +624,7 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
 	attr.sample_period = compute_period(pmc, kvm_pmu_get_pmc_value(pmc));
 
-	event = perf_event_create_kernel_counter(&attr, -1, current,
+	event = perf_event_create_kernel_counter(&attr, -1, current, NULL,
 						 kvm_pmu_perf_overflow, pmc);
 
 	if (IS_ERR(event)) {
@@ -713,7 +713,7 @@ static struct arm_pmu *kvm_pmu_probe_armpmu(void)
 	attr.config = ARMV8_PMUV3_PERFCTR_CPU_CYCLES;
 	attr.sample_period = GENMASK(63, 0);
 
-	event = perf_event_create_kernel_counter(&attr, -1, current,
+	event = perf_event_create_kernel_counter(&attr, -1, current, NULL,
 						 kvm_pmu_perf_overflow, &attr);
 
 	if (IS_ERR(event)) {
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index d961ae3ed96e..43e54bb200cd 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -952,12 +952,12 @@ static int measure_residency_fn(struct perf_event_attr *miss_attr,
 	u64 tmp;
 
 	miss_event = perf_event_create_kernel_counter(miss_attr, plr->cpu,
-						      NULL, NULL, NULL);
+						      NULL, NULL, NULL, NULL);
 	if (IS_ERR(miss_event))
 		goto out;
 
 	hit_event = perf_event_create_kernel_counter(hit_attr, plr->cpu,
-						     NULL, NULL, NULL);
+						     NULL, NULL, NULL, NULL);
 	if (IS_ERR(hit_event))
 		goto out_miss;
 
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index eb594620dd75..f6c8180241d7 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -204,7 +204,7 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		attr.precise_ip = 3;
 	}
 
-	event = perf_event_create_kernel_counter(&attr, -1, current,
+	event = perf_event_create_kernel_counter(&attr, -1, current, NULL,
 						 kvm_perf_overflow, pmc);
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f951dc756456..b746381307c7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -299,7 +299,7 @@ int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu)
 	}
 
 	event = perf_event_create_kernel_counter(&attr, -1,
-						current, NULL, NULL);
+						current, NULL, NULL, NULL);
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("%s: failed %ld\n",
				     __func__, PTR_ERR(event));
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0031f7b4d9ab..5f34e1d0bff8 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1023,6 +1023,7 @@ extern struct perf_event *
 perf_event_create_kernel_counter(struct perf_event_attr *attr,
 				int cpu,
 				struct task_struct *task,
+				struct perf_event *group_leader,
 				perf_overflow_handler_t callback,
 				void *context);
 extern void perf_pmu_migrate_context(struct pmu *pmu,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 7f04f995c975..f671b1a9a691 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -12674,12 +12674,14 @@ SYSCALL_DEFINE5(perf_event_open,
  * @attr: attributes of the counter to create
  * @cpu: cpu in which the counter is bound
  * @task: task to profile (NULL for percpu)
+ * @group_leader: event group leader
  * @overflow_handler: callback to trigger when we hit the event
  * @context: context data could be used in overflow_handler callback
  */
 struct perf_event *
 perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 				 struct task_struct *task,
+				 struct perf_event *group_leader,
 				 perf_overflow_handler_t overflow_handler,
 				 void *context)
 {
@@ -12694,7 +12696,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr, int cpu,
 	if (attr->aux_output)
 		return ERR_PTR(-EINVAL);
 
-	event = perf_event_alloc(attr, cpu, task, NULL, NULL,
+	event = perf_event_alloc(attr, cpu, task, group_leader, NULL,
 				 overflow_handler, context, -1);
 	if (IS_ERR(event)) {
 		err = PTR_ERR(event);
diff --git a/kernel/events/hw_breakpoint.c b/kernel/events/hw_breakpoint.c
index c3797701339c..65b5b1421e62 100644
--- a/kernel/events/hw_breakpoint.c
+++ b/kernel/events/hw_breakpoint.c
@@ -771,7 +771,7 @@ register_user_hw_breakpoint(struct perf_event_attr *attr,
 			    void *context,
 			    struct task_struct *tsk)
 {
-	return perf_event_create_kernel_counter(attr, -1, tsk, triggered,
+	return perf_event_create_kernel_counter(attr, -1, tsk, NULL, triggered,
 						context);
 }
 EXPORT_SYMBOL_GPL(register_user_hw_breakpoint);
@@ -881,7 +881,7 @@ register_wide_hw_breakpoint(struct perf_event_attr *attr,
 	cpus_read_lock();
 	for_each_online_cpu(cpu) {
-		bp = perf_event_create_kernel_counter(attr, cpu, NULL,
+		bp = perf_event_create_kernel_counter(attr, cpu, NULL, NULL,
 						      triggered, context);
 		if (IS_ERR(bp)) {
 			err = PTR_ERR(bp);
diff --git a/kernel/events/hw_breakpoint_test.c b/kernel/events/hw_breakpoint_test.c
index c57610f52bb4..b3597df12284 100644
--- a/kernel/events/hw_breakpoint_test.c
+++ b/kernel/events/hw_breakpoint_test.c
@@ -39,7 +39,7 @@ static struct perf_event *register_test_bp(int cpu, struct task_struct *tsk, int
 	attr.bp_addr = (unsigned long)&break_vars[idx];
 	attr.bp_len = HW_BREAKPOINT_LEN_1;
 	attr.bp_type = HW_BREAKPOINT_RW;
-	return perf_event_create_kernel_counter(&attr, cpu, tsk, NULL, NULL);
+	return perf_event_create_kernel_counter(&attr, cpu, tsk, NULL, NULL, NULL);
 }
 
 static void unregister_test_bp(struct perf_event **bp)
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..bb755dadba54 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -173,7 +173,7 @@ static int hardlockup_detector_event_create(void)
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
 	/* Try to register using hardware perf events */
-	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
+	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL, NULL,
 					       watchdog_overflow_callback, NULL);
 	if (IS_ERR(evt)) {
 		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
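A rough sketch (not part of the patch) of what the extended API enables:
an in-kernel user creates the first event with a NULL leader, then passes
it as group_leader for the siblings. The attribute values and function
name below are illustrative only; each event is created disabled and
enabled once the whole group exists, matching how the series uses it.

/* Sketch only: create a two-event group with the new parameter. */
static struct perf_event *leader, *sibling;

static int create_sample_group(void)
{
	struct perf_event_attr attr = {
		.type = PERF_TYPE_HARDWARE,
		.size = sizeof(attr),
		.config = PERF_COUNT_HW_CPU_CYCLES,
		.disabled = 1,	/* enable only after the whole group exists */
	};

	/* NULL keeps the old single-counter behavior for the leader. */
	leader = perf_event_create_kernel_counter(&attr, -1, current, NULL,
						  NULL, NULL);
	if (IS_ERR(leader))
		return PTR_ERR(leader);

	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
	/* Passing the first event makes this one a group sibling. */
	sibling = perf_event_create_kernel_counter(&attr, -1, current, leader,
						   NULL, NULL);
	if (IS_ERR(sibling))
		return PTR_ERR(sibling);

	perf_event_enable(leader);
	perf_event_enable(sibling);
	return 0;
}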
From patchwork Mon Dec 12 12:58:38 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071100
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: [PATCH RFC 2/8] perf: x86/core: Expose the available number of Topdown metrics
Date: Mon, 12 Dec 2022 20:58:38 +0800
Message-Id: <20221212125844.41157-3-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

The Intel Sapphire Rapids server has 8 metrics events, while Ice Lake
only supports 4. The available number of Topdown metrics is model
specific, with no architectural hint. To support guest Topdown metrics,
KVM would otherwise have to rely on the CPU model alone to emulate the
correct number of metrics events per platform. It is nicer to have the
perf core tell KVM the available number of Topdown metrics, just like
x86_pmu.num_counters.

Cc: Peter Zijlstra
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Like Xu
---
 arch/x86/events/core.c            | 1 +
 arch/x86/include/asm/perf_event.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index b30b8bbcd1e2..d0d84c7a6876 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -3006,6 +3006,7 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
 	 * which available for all cores.
 	 */
 	cap->num_counters_gp = x86_pmu.num_counters;
+	cap->num_topdown_events = x86_pmu.num_topdown_events;
 	cap->num_counters_fixed = x86_pmu.num_counters_fixed;
 	cap->bit_width_gp = x86_pmu.cntval_bits;
 	cap->bit_width_fixed = x86_pmu.cntval_bits;
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 5d0f6891ae61..3e263d291595 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -219,6 +219,7 @@ struct x86_pmu_capability {
 	int version;
 	int num_counters_gp;
 	int num_counters_fixed;
+	int num_topdown_events;
 	int bit_width_gp;
 	int bit_width_fixed;
 	unsigned int events_mask;
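A minimal sketch of how an in-kernel consumer such as KVM could pick up
the new field; perf_get_x86_pmu_capability() already exists, only
num_topdown_events is new here, and the helper name is illustrative.

static unsigned int probe_topdown_events(void)
{
	struct x86_pmu_capability cap;

	perf_get_x86_pmu_capability(&cap);
	/* e.g. 4 on Ice Lake, 8 on Sapphire Rapids, 0 without Topdown */
	return cap.num_topdown_events;
}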
From patchwork Mon Dec 12 12:58:39 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071101
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, linux-perf-users@vger.kernel.org
Subject: [PATCH RFC 3/8] perf: x86/core: Sync PERF_METRICS bit together with fixed counter 3
Date: Mon, 12 Dec 2022 20:58:39 +0800
Message-Id: <20221212125844.41157-4-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

When the guest uses Topdown (fixed counter 3 plus the PERF_METRICS MSR),
the sharing rule for the PERF_METRICS bit in the GLOBAL_CTRL MSR does not
change: it should be updated synchronously with fixed counter 3.
Considering that the guest Topdown feature has only just been enabled,
this is not strictly a bug fix.
Cc: Peter Zijlstra
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Like Xu
---
 arch/x86/events/intel/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 1b92bf05fd65..e7897fd9f7ab 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2436,6 +2436,8 @@ static void intel_pmu_disable_fixed(struct perf_event *event)
 		 */
 		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
 			return;
+
+		intel_clear_masks(event, GLOBAL_CTRL_EN_PERF_METRICS);
 		idx = INTEL_PMC_IDX_FIXED_SLOTS;
 	}
 
@@ -2729,6 +2731,7 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
 		if (*(u64 *)cpuc->active_mask & INTEL_PMC_OTHER_TOPDOWN_BITS(idx))
 			return;
 
+		intel_set_masks(event, GLOBAL_CTRL_EN_PERF_METRICS);
 		idx = INTEL_PMC_IDX_FIXED_SLOTS;
 	}
From patchwork Mon Dec 12 12:58:40 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071102
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC 4/8] KVM: x86/pmu: Add Intel CPUID-hinted Topdown Slots event
Date: Mon, 12 Dec 2022 20:58:40 +0800
Message-Id: <20221212125844.41157-5-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

This event counts the total number of available slots for an unhalted
logical processor. Software can use this event as the denominator for
the top-level metrics of the Top-down Microarchitecture Analysis method.

Although the MSR_PERF_METRICS MSR required for Topdown events is not yet
available to the guest, relying on the data provided by the slots event
alone is sufficient for PMU users to perceive differences in CPU pipeline
machine-width across microarchitectures.

The standalone slots event, like the instructions event, can be counted
with a GP counter or with fixed counter 3 (if present). As the last of
the CPUID-hinted architectural performance events, its availability is
also controlled by CPUID.0AH.EBX. On Linux, perf users may encode
"-e cpu/event=0xa4,umask=0x01/" or "-e cpu/slots/" to count slots events.

Signed-off-by: Like Xu
---
 arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b746381307c7..d86a6ba8c3f9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -31,10 +31,11 @@
  * 4 - PERF_COUNT_HW_CACHE_MISSES
  * 5 - PERF_COUNT_HW_BRANCH_INSTRUCTIONS
  * 6 - PERF_COUNT_HW_BRANCH_MISSES
+ * 7 - CPUID-hinted Topdown Slots event (available on gp counter)
  *
  * the second part of hw_events is defined by the generic kernel perf:
  *
- * 7 - PERF_COUNT_HW_REF_CPU_CYCLES
+ * 8 - PERF_COUNT_HW_REF_CPU_CYCLES
  */
 static struct kvm_pmu_hw_event intel_arch_events[] = {
 	[0] = { 0x3c, 0x00 },
@@ -44,12 +45,13 @@ static struct kvm_pmu_hw_event intel_arch_events[] = {
 	[4] = { 0x2e, 0x41 },
 	[5] = { 0xc4, 0x00 },
 	[6] = { 0xc5, 0x00 },
+	[7] = { 0xa4, 0x01 },
 	/* The above index must match CPUID 0x0A.EBX bit vector */
-	[7] = { 0x00, 0x03 },
+	[8] = { 0x00, 0x03 },
 };
 
 /* mapping between fixed pmc index and intel_arch_events array */
-static int fixed_pmc_events[] = {1, 0, 7};
+static int fixed_pmc_events[] = {1, 0, 8};
 
 static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
 {
@@ -109,7 +111,7 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
 			continue;
 
 		/* disable event that reported as not present by cpuid */
-		if ((i < 7) && !(pmu->available_event_types & (1 << i)))
+		if (i < 8 && !(pmu->available_event_types & (1 << i)))
 			return false;
 
 		break;
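For reference, the new table entry { 0xa4, 0x01 } is the event-select /
umask pair quoted in the message above. As a sketch (not code from the
series), the equivalent raw perf_event_attr packs the umask into bits
15:8 of config:

/* Sketch: raw encoding of the Topdown slots event, config = 0x01a4. */
static struct perf_event_attr slots_attr = {
	.type	= PERF_TYPE_RAW,
	.size	= sizeof(slots_attr),
	.config	= (0x01 << 8) | 0xa4,	/* umask 0x01, event select 0xa4 */
};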
From patchwork Mon Dec 12 12:58:41 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071103
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC 5/8] KVM: x86/pmu: Add kernel-defined slots event to enable Fixed Counter3
Date: Mon, 12 Dec 2022 20:58:41 +0800
Message-Id: <20221212125844.41157-6-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

The Topdown Slots event can be enabled on a GP counter or on fixed
counter 3, and it does not differ from other fixed counters in its use
of count and sampling modes (except for the hardware logic for event
accumulation).
According to commit 6017608936c1 ("perf/x86/intel: Add Icelake support"),
KVM or any perf in-kernel user needs to reprogram fixed counter 3 via the
kernel-defined Topdown Slots event to drive the real fixed counter 3 on
the host.

Opportunistically fix a typo, s/msrs_to_saved_all/msrs_to_save_all/.

Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/vmx/pmu_intel.c    | 4 +++-
 arch/x86/kvm/x86.c              | 6 +++---
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index aa4eb8cfcd7e..413f2e104543 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -513,7 +513,7 @@ struct kvm_pmc {
 #define KVM_INTEL_PMC_MAX_GENERIC	8
 #define MSR_ARCH_PERFMON_PERFCTR_MAX	(MSR_ARCH_PERFMON_PERFCTR0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
 #define MSR_ARCH_PERFMON_EVENTSEL_MAX	(MSR_ARCH_PERFMON_EVENTSEL0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
-#define KVM_PMC_MAX_FIXED	3
+#define KVM_PMC_MAX_FIXED	4
 #define KVM_AMD_PMC_MAX_GENERIC	6
 struct kvm_pmu {
 	unsigned nr_arch_gp_counters;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index d86a6ba8c3f9..637fd709f5f4 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -36,6 +36,7 @@
  * the second part of hw_events is defined by the generic kernel perf:
  *
  * 8 - PERF_COUNT_HW_REF_CPU_CYCLES
+ * 9 - Kernel-defined Topdown Slots event (available on fixed counter 3)
  */
 static struct kvm_pmu_hw_event intel_arch_events[] = {
 	[0] = { 0x3c, 0x00 },
@@ -48,10 +49,11 @@ static struct kvm_pmu_hw_event intel_arch_events[] = {
 	[7] = { 0xa4, 0x01 },
 	/* The above index must match CPUID 0x0A.EBX bit vector */
 	[8] = { 0x00, 0x03 },
+	[9] = { 0x00, 0x04 },
 };
 
 /* mapping between fixed pmc index and intel_arch_events array */
-static int fixed_pmc_events[] = {1, 0, 8};
+static int fixed_pmc_events[] = {1, 0, 8, 9};
 
 static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 312aea1854ae..0b61cb58c877 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1435,7 +1435,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_IA32_UMWAIT_CONTROL,
 
 	MSR_ARCH_PERFMON_FIXED_CTR0, MSR_ARCH_PERFMON_FIXED_CTR1,
-	MSR_ARCH_PERFMON_FIXED_CTR0 + 2,
+	MSR_ARCH_PERFMON_FIXED_CTR0 + 2, MSR_ARCH_PERFMON_FIXED_CTR0 + 3,
 	MSR_CORE_PERF_FIXED_CTR_CTRL, MSR_CORE_PERF_GLOBAL_STATUS,
 	MSR_CORE_PERF_GLOBAL_CTRL, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
 	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,
@@ -7001,8 +7001,8 @@ static void kvm_init_msr_list(void)
 	u32 dummy[2];
 	unsigned i;
 
-	BUILD_BUG_ON_MSG(KVM_PMC_MAX_FIXED != 3,
-			 "Please update the fixed PMCs in msrs_to_saved_all[]");
+	BUILD_BUG_ON_MSG(KVM_PMC_MAX_FIXED != 4,
+			 "Please update the fixed PMCs in msrs_to_save_all[]");
 
 	num_msrs_to_save = 0;
 	num_emulated_msrs = 0;
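To make the mapping change concrete: when a fixed counter is programmed,
its index is translated through fixed_pmc_events[] into an
intel_arch_events[] entry, so fixed counter 3 now resolves to entry 9,
the kernel-defined slots pseudo-event. A simplified sketch of that lookup
(the helper name is illustrative; the real logic lives in
reprogram_fixed_counters()):

/* Sketch: fixed counter idx -> { eventsel, unit_mask } per the new tables. */
static u64 fixed_ctr_event_config(int idx)
{
	int event = fixed_pmc_events[idx];	/* {1, 0, 8, 9} after this patch */

	return intel_arch_events[event].eventsel |
	       ((u64)intel_arch_events[event].unit_mask << 8);
}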
From patchwork Mon Dec 12 12:58:42 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071104
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC 6/8] KVM: x86/pmu: Properly use the INTEL_PMC_FIXED_RDPMC_BASE macro
Date: Mon, 12 Dec 2022 20:58:42 +0800
Message-Id: <20221212125844.41157-7-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

Use the INTEL_PMC_FIXED_RDPMC_BASE macro in the RDPMC context to improve
readability. No functional change intended.
Signed-off-by: Like Xu
---
 arch/x86/kvm/vmx/pmu_intel.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 637fd709f5f4..b69d337d51d9 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -136,7 +136,7 @@ static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
 static bool intel_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	bool fixed = idx & (1u << 30);
+	bool fixed = idx & INTEL_PMC_FIXED_RDPMC_BASE;
 
 	idx &= ~(3u << 30);
 
@@ -148,7 +148,7 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 					    unsigned int idx, u64 *mask)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	bool fixed = idx & (1u << 30);
+	bool fixed = idx & INTEL_PMC_FIXED_RDPMC_BASE;
 	struct kvm_pmc *counters;
 	unsigned int num_counters;
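As background for the macro swap: the guest's RDPMC index encodes the
counter class in the upper ECX bits, and the replaced literal shows that
INTEL_PMC_FIXED_RDPMC_BASE is the bit-30 fixed-counter class flag. A
sketch of the decoding (helper name illustrative):

/* Sketch: decode a guest RDPMC ECX value into (class, index). */
static void decode_rdpmc_idx(u32 ecx, bool *fixed, u32 *counter)
{
	*fixed = ecx & INTEL_PMC_FIXED_RDPMC_BASE;	/* (1u << 30) */
	*counter = ecx & ~(3u << 30);	/* strip class bits, keep the index */
}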
From patchwork Mon Dec 12 12:58:43 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071105
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC 7/8] KVM: x86/pmu: Use flex *event arrays to implement grouped events
Date: Mon, 12 Dec 2022 20:58:43 +0800
Message-Id: <20221212125844.41157-8-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

The current vPMU expects that a pmc represents a single hardware event
and uses pmc->counter to track the event value. But from the perspective
of the counter designer, a hardware counter can sacrifice its maximum bit
width and instead reflect the values of multiple hardware events in
different fields of the same counter. For example, a 32-bit counter can
be divided into four 8-bit wide hardware counters, and these hardware
events keep incrementing independently in their own bitfields. Software
can read the counter value once to get the latest values of multiple
hardware events, reducing the number of software accesses to the hardware
interface and, under virtualization, reducing the number of vm-exits in
particular.

The most natural way to emulate this all-in-one hardware counter in KVM
is a 1-to-N mapping: a pmc can be associated with multiple perf_events,
and these events are created as a group, which the host perf schedules as
a group to obtain hardware resources and present them to the guest.

In implementation, when the guest accesses this all-in-one counter, its
pmc->max_nr_events changes according to the hardware definition,
triggering KVM's group event creation path; the events are created
centrally and then enabled in order, which eliminates code differences
from separate enablement (see the sketch after this message). The grouped
events are also released as a group. Which hardware events correspond to
each pmc, and how the available bitfields are divided, is predefined by
the hardware vendor; KVM does not combine them freely.

A many-to-one pmc is no different from a one-to-one pmc in many cases.
The first pmc->perf_event is always the source of the pmc->counter value,
which implies that most functions manipulating the state of
pmc->perf_event do not need to be modified. Note, the specific rules for
splicing multiple perf_event->count values into a single value are
defined by the hardware event, and do not and should not appear in this
generic change.
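A standalone sketch (not from the series) of the union trick used in
struct kvm_pmc below: the array view and the single-pointer view alias
the same first slot, so pmc->perf_event and pmc->perf_events[0] are
interchangeable and legacy single-counter code needs no change. The
kernel uses DECLARE_FLEX_ARRAY() for a true flexible array; a one-element
array stands in for it here.

struct kvm_pmc_sketch {
	u8 max_nr_events;	/* 1 == classic single-event pmc */
	union {
		struct perf_event *perf_event;		/* single-event view */
		struct perf_event *perf_events[1];	/* group view, [0] = leader */
	};
};

/*
 * pmc->perf_events[0] == pmc->perf_event always holds, so the first
 * element is the group leader and the sole source of pmc->counter.
 */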
Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  6 ++++-
 arch/x86/kvm/pmu.c              | 44 ++++++++++++++++++++++++++-------
 arch/x86/kvm/pmu.h              | 15 ++++++++---
 arch/x86/kvm/svm/pmu.c          |  1 +
 arch/x86/kvm/vmx/pmu_intel.c    |  2 ++
 5 files changed, 54 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 413f2e104543..73740aa891d3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -494,12 +494,12 @@ enum pmc_type {
 struct kvm_pmc {
 	enum pmc_type type;
 	u8 idx;
+	u8 max_nr_events;
 	bool is_paused;
 	bool intr;
 	u64 counter;
 	u64 prev_counter;
 	u64 eventsel;
-	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
 	/*
 	 * only for creating or reusing perf_event,
@@ -507,6 +507,10 @@ struct kvm_pmc {
 	 * ctrl value for fixed counters.
 	 */
 	u64 current_config;
+	union {
+		struct perf_event *perf_event;
+		DECLARE_FLEX_ARRAY(struct perf_event *, perf_events);
+	};
 };
 
 /* More counters may conflict with other existing Architectural MSRs */
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index f6c8180241d7..ae53a8298dec 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -160,7 +160,7 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 				 bool intr)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-	struct perf_event *event;
+	struct perf_event *event, *group_leader;
 	struct perf_event_attr attr = {
 		.type = type,
 		.size = sizeof(attr),
@@ -172,6 +172,7 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		.config = config,
 	};
 	bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
+	unsigned int i;
 
 	attr.sample_period = get_sample_period(pmc, pmc->counter);
 
@@ -204,18 +205,39 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		attr.precise_ip = 3;
 	}
 
-	event = perf_event_create_kernel_counter(&attr, -1, current, NULL,
-						 kvm_perf_overflow, pmc);
-	if (IS_ERR(event)) {
-		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
-				     PTR_ERR(event), pmc->idx);
-		return PTR_ERR(event);
+	/*
+	 * To create grouped events, the first created perf_event doesn't
+	 * know it will be the group_leader and may move to an unexpected
+	 * enabling path, thus delay all enablement until after creation,
+	 * not affecting non-grouped events to save one perf interface call.
+	 */
+	if (pmc->max_nr_events > 1)
+		attr.disabled = 1;
+
+	for (i = 0; i < pmc->max_nr_events; i++) {
+		group_leader = i ? pmc->perf_event : NULL;
+		event = perf_event_create_kernel_counter(&attr, -1, current, group_leader,
+							 kvm_perf_overflow, pmc);
+		if (IS_ERR(event)) {
+			pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
+					     PTR_ERR(event), pmc->idx);
+			return PTR_ERR(event);
+		}
+
+		pmc->perf_events[i] = event;
+		pmc_to_pmu(pmc)->event_count++;
 	}
 
-	pmc->perf_event = event;
-	pmc_to_pmu(pmc)->event_count++;
 	pmc->is_paused = false;
 	pmc->intr = intr || pebs;
+
+	if (!attr.disabled)
+		return 0;
+
+	/* Enable grouped events in order. */
+	for (i = 0; i < pmc->max_nr_events; i++)
+		perf_event_enable(pmc->perf_events[i]);
+
 	return 0;
 }
 
@@ -223,6 +245,10 @@ static void pmc_pause_counter(struct kvm_pmc *pmc)
 {
 	u64 counter = pmc->counter;
 
+	/* The perf_event_pause() is not suitable for grouped events. */
+	if (pmc->max_nr_events > 1)
+		pmc_stop_counter(pmc);
+
 	if (!pmc->perf_event || pmc->is_paused)
 		return;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b9e29a199ab8..e4b738b7c208 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -64,12 +64,19 @@ static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
 
 static inline void pmc_release_perf_event(struct kvm_pmc *pmc)
 {
-	if (pmc->perf_event) {
-		perf_event_release_kernel(pmc->perf_event);
-		pmc->perf_event = NULL;
-		pmc->current_config = 0;
+	unsigned int i;
+
+	for (i = 0; i < pmc->max_nr_events; i++) {
+		if (!pmc->perf_events[i])
+			continue;
+
+		perf_event_release_kernel(pmc->perf_events[i]);
+		pmc->perf_events[i] = NULL;
 		pmc_to_pmu(pmc)->event_count--;
 	}
+
+	pmc->current_config = 0;
+	pmc->max_nr_events = 1;
 }
 
 static inline void pmc_stop_counter(struct kvm_pmc *pmc)
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 0e313fbae055..09e81200f657 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -200,6 +200,7 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
 		pmu->gp_counters[i].vcpu = vcpu;
 		pmu->gp_counters[i].idx = i;
 		pmu->gp_counters[i].current_config = 0;
+		pmu->gp_counters[i].max_nr_events = 1;
 	}
 }
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b69d337d51d9..8e1f679d4d9d 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -642,6 +642,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 		pmu->gp_counters[i].vcpu = vcpu;
 		pmu->gp_counters[i].idx = i;
 		pmu->gp_counters[i].current_config = 0;
+		pmu->gp_counters[i].max_nr_events = 1;
 	}
 
 	for (i = 0; i < KVM_PMC_MAX_FIXED; i++) {
@@ -649,6 +650,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 		pmu->fixed_counters[i].vcpu = vcpu;
 		pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
 		pmu->fixed_counters[i].current_config = 0;
+		pmu->fixed_counters[i].max_nr_events = 1;
 	}
 
 	lbr_desc->records.nr = 0;
From patchwork Mon Dec 12 12:58:44 2022
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 13071106
From: Like Xu
To: Peter Zijlstra, Sean Christopherson
Cc: Paolo Bonzini, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH RFC 8/8] KVM: x86/pmu: Add MSR_PERF_METRICS MSR emulation to enable Topdown
Date: Mon, 12 Dec 2022 20:58:44 +0800
Message-Id: <20221212125844.41157-9-likexu@tencent.com>
In-Reply-To: <20221212125844.41157-1-likexu@tencent.com>
References: <20221212125844.41157-1-likexu@tencent.com>

The Ice Lake core PMU provides built-in support for the Top-down
Microarchitecture Analysis (TMA) method, level 1 metrics. These metrics
are always available to cross-validate performance observations, freeing
general-purpose counters to count other events in high counter
utilization scenarios.

A new MSR called MSR_PERF_METRICS, hinted by
PERF_CAPABILITIES.PERF_METRICS_AVAILABLE (bit 15), reports four (on ICX)
TMA level 1 metrics directly. The lower 32 bits are divided into four
8-bit fields, each of which is an integer fraction of 255. When
performance metrics use type 2000H (INTEL_PMC_FIXED_RDPMC_METRICS), RDPMC
can read the value of perf_metrics. Bit EN_PERF_METRICS[48] has also been
added to the following MSRs:

- MSR_CORE_PERF_GLOBAL_CTRL
- MSR_CORE_PERF_GLOBAL_STATUS
- MSR_CORE_PERF_GLOBAL_OVF_CTRL

In the KVM implementation, topdown mode is only enabled when the guest
starts both registers, PERF_METRICS and fixed counter 3, from zero. In
topdown mode, the vPMU creates a group of zero-sample-period events for
fixed counter 3. The first pmc->event is a group-leader slots event whose
event->count is used to emulate the counter value, setting bit 35 when it
generates an interrupt, as usual. (Note, fixed counter 3 can still be
used independently, for counting or sampling, but not in topdown mode at
the same time.)
The four (or more) sub-events for perf_metrics are each emulated with a
metric event (its event attr.config is kernel-defined, starting with
INTEL_TD_METRIC_RETIRING), which increments independently; they all share
the same bit, INTEL_PERF_METRICS_IDX, on overflow.

When pmu->perf_metrics is read, the vPMU collects event->count from the
metric events, but the original values have already been processed by the
host perf core (as defined in icl_get_metrics_event_value(), with a
division and a multiplication). This part of the calculation has to be
reversed in the vPMU to restore the real values and stitch them together
in turn to form pmu->perf_metrics (see the worked example after this
message).

pmc_read_counter() has been enhanced, moved out of the header file, and
wrapped with EXPORT_SYMBOL_GPL. A little trick is that the PMCs
corresponding to bits 35 and 48 (INTEL_PERF_METRICS_IDX) are both fixed
counter 3, which allows lazy release of all the grouped events above.
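A worked sketch of the reversal described above, assuming the host folds
a raw metric count out of an 8-bit field roughly as count = slots * field
/ 0xff (per icl_get_metrics_event_value()); the vPMU then recovers the
8-bit field delta from the raw count delta. The plain multiply below can
overflow where the kernel's mul_u64_u64_div_u64() cannot.

/* Sketch: recover an 8-bit PERF_METRICS field delta from raw counts. */
static u8 metric_field_delta(u64 count_delta, u64 slots_delta)
{
	/* mirrors: mul_u64_u64_div_u64(count_delta, 0xff, slots_delta) + 1 */
	return (u8)(count_delta * 0xff / slots_delta + 1);
}

/* e.g. 400 retiring counts out of 1000 slots -> 0x67, i.e. ~40% of 255. */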
Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h  |   6 ++
 arch/x86/include/asm/msr-index.h |   1 +
 arch/x86/kvm/pmu.c               | 105 ++++++++++++++++++++++++++++++-
 arch/x86/kvm/pmu.h               |  16 ++---
 arch/x86/kvm/vmx/pmu_intel.c     |  33 ++++++++++
 arch/x86/kvm/vmx/vmx.c           |   3 +
 arch/x86/kvm/x86.c               |   3 +
 7 files changed, 153 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 73740aa891d3..4f2e2ede09b6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -522,6 +522,7 @@ struct kvm_pmc {
 struct kvm_pmu {
 	unsigned nr_arch_gp_counters;
 	unsigned nr_arch_fixed_counters;
+	unsigned int nr_perf_metrics;
 	unsigned available_event_types;
 	u64 fixed_ctr_ctrl;
 	u64 fixed_ctr_ctrl_mask;
@@ -564,6 +565,11 @@ struct kvm_pmu {
 	 */
 	u64 host_cross_mapped_mask;
 
+	/* Intel Topdown Performance Metrics */
+	u64 perf_metrics;
+	u64 perf_metrics_mask;
+	bool metric_event_ovf;
+
 	/*
 	 * The gate to release perf_events not marked in
 	 * pmc_in_use only once in a vcpu time slice.
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 4a2af82553e4..5dde0242ff28 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -246,6 +246,7 @@
 #define PERF_CAP_PEBS_BASELINE		BIT_ULL(14)
 #define PERF_CAP_PEBS_MASK	(PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
 				 PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
+#define MSR_CAP_PERF_METRICS		BIT_ULL(15)
 
 #define MSR_IA32_RTIT_CTL		0x00000570
 #define RTIT_CTL_TRACEEN		BIT(0)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index ae53a8298dec..4bc888462b4f 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -96,6 +96,11 @@ static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
 	kvm_pmu_deliver_pmi(vcpu);
 }
 
+static inline bool pmc_support_perf_metrics(struct kvm_pmc *pmc)
+{
+	return pmc->idx == (INTEL_PMC_IDX_FIXED + 3);
+}
+
 static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
@@ -115,6 +120,10 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
 						      (unsigned long *)&pmu->global_status);
 		}
+	} else if (pmu->metric_event_ovf && pmc_support_perf_metrics(pmc)) {
+		/* At least one of PERF_METRICS sub-counter has overflowed */
+		__set_bit(INTEL_PERF_METRICS_IDX, (unsigned long *)&pmu->global_status);
+		pmu->metric_event_ovf = false;
 	} else {
 		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
@@ -155,6 +164,16 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 }
 
+static inline bool perf_metrics_is_enabled(struct kvm_pmc *pmc)
+{
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+	return !pmu->perf_metrics && !pmc->counter &&
+		pmc_support_perf_metrics(pmc) && pmc_speculative_in_use(pmc) &&
+		pmc_is_enabled(pmc) && test_bit(GLOBAL_CTRL_EN_PERF_METRICS,
+						(unsigned long *)&pmu->global_ctrl);
+}
+
 static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 				 bool exclude_user, bool exclude_kernel,
 				 bool intr)
@@ -172,6 +191,7 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		.config = config,
 	};
 	bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
+	bool topdown = perf_metrics_is_enabled(pmc);
 	unsigned int i;
 
 	attr.sample_period = get_sample_period(pmc, pmc->counter);
@@ -205,6 +225,10 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		attr.precise_ip = 3;
 	}
 
+	/* A group of events is needed to emulate each perf_metric. */
+	if (topdown)
+		pmc->max_nr_events = pmu->nr_perf_metrics + 1;
+
 	/*
 	 * To create grouped events, the first created perf_event doesn't
 	 * know it will be the group_leader and may move to an unexpected
@@ -215,6 +239,17 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
 		attr.disabled = 1;
 
 	for (i = 0; i < pmc->max_nr_events; i++) {
+		if (topdown) {
+			/*
+			 * According to perf core, the group_leader slots event must
+			 * not be a sampling event for topdown metric use, and the
+			 * topdown metric events don't support sampling.
+			 */
+			attr.sample_period = 0;
+			if (i)
+				attr.config = INTEL_TD_METRIC_RETIRING + 0x100 * (i - 1);
+		}
+
 		group_leader = i ? pmc->perf_event : NULL;
 		event = perf_event_create_kernel_counter(&attr, -1, current, group_leader,
							 kvm_perf_overflow, pmc);
pmc->perf_event : NULL; event = perf_event_create_kernel_counter(&attr, -1, current, group_leader, kvm_perf_overflow, pmc); @@ -229,7 +264,7 @@ static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config, } pmc->is_paused = false; - pmc->intr = intr || pebs; + pmc->intr = intr || pebs || topdown; if (!attr.disabled) return 0; @@ -263,6 +298,9 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc) if (!pmc->perf_event) return false; + if (perf_metrics_is_enabled(pmc) == (pmc->max_nr_events == 1)) + return false; + /* recalibrate sample period and check if it's accepted by perf core */ if (is_sampling_event(pmc->perf_event) && perf_event_period(pmc->perf_event, @@ -447,6 +485,60 @@ static int kvm_pmu_rdpmc_vmware(struct kvm_vcpu *vcpu, unsigned idx, u64 *data) return 0; } +void pmu_read_perf_metrics(struct kvm_pmu *pmu, u64 slots) +{ + struct kvm_pmc *pmc = &pmu->fixed_counters[3]; + u64 old_counter, enabled, running, delta; + union { + u8 field[KVM_PERF_METRICS_NUM_MAX]; + u64 data; + } perf_metrics; + unsigned int i; + + perf_metrics.data = pmu->perf_metrics; + for (i = 1; i < pmc->max_nr_events; i++) { + if (!pmc->perf_events[i]) + continue; + + old_counter = perf_metrics.field[i - 1]; + delta = perf_event_read_value(pmc->perf_events[i], &enabled, &running); + + /* + * Reverse the actual metric counter value out + * according to icl_get_metrics_event_value(). + */ + delta = mul_u64_u64_div_u64(delta, 0xff, slots) + 1; + perf_metrics.field[i - 1] = 0xff & (old_counter + delta); + + /* Check if any metric counter have been overflowed. */ + if (perf_metrics.field[i - 1] < old_counter) + pmu->metric_event_ovf = true; + } + + if (pmu->metric_event_ovf) + __kvm_perf_overflow(pmc, false); + + pmu->perf_metrics = perf_metrics.data & (~pmu->perf_metrics_mask); + __set_bit(INTEL_PERF_METRICS_IDX, pmu->pmc_in_use); +} + +u64 pmc_read_counter(struct kvm_pmc *pmc) +{ + u64 counter, enabled, running, delta; + + counter = pmc->counter; + if (pmc->perf_event && !pmc->is_paused) { + delta = perf_event_read_value(pmc->perf_event, &enabled, &running); + if (delta && pmc_support_perf_metrics(pmc)) + pmu_read_perf_metrics(pmc_to_pmu(pmc), delta); + counter += delta; + } + + /* FIXME: Scaling needed? 
+	return counter & pmc_bitmask(pmc);
+}
+EXPORT_SYMBOL_GPL(pmc_read_counter);
+
 int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 {
 	bool fast_mode = idx & (1u << 31);
@@ -469,7 +561,16 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	    (kvm_read_cr0(vcpu) & X86_CR0_PE))
 		return 1;
 
-	*data = pmc_read_counter(pmc) & mask;
+	if (idx & INTEL_PMC_FIXED_RDPMC_METRICS) {
+		if (!(vcpu->arch.perf_capabilities & MSR_CAP_PERF_METRICS))
+			return 1;
+
+		pmc_read_counter(&pmu->fixed_counters[3]);
+		*data = pmu->perf_metrics;
+	} else {
+		*data = pmc_read_counter(pmc) & mask;
+	}
+
 	return 0;
 }
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index e4b738b7c208..be800a0c5366 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -10,6 +10,8 @@
 
 #define MSR_IA32_MISC_ENABLE_PMU_RO_MASK (MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL |	\
 					  MSR_IA32_MISC_ENABLE_BTS_UNAVAIL)
+#define INTEL_PERF_METRICS_IDX		GLOBAL_CTRL_EN_PERF_METRICS
+#define KVM_PERF_METRICS_NUM_MAX	8
 
 /* retrieve the 4 bits for EN and PMI out of IA32_FIXED_CTR_CTRL */
 #define fixed_ctrl_field(ctrl_reg, idx) (((ctrl_reg) >> ((idx)*4)) & 0xf)
@@ -42,6 +44,8 @@ struct kvm_pmu_ops {
 };
 
 void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops);
+void pmu_read_perf_metrics(struct kvm_pmu *pmu, u64 slots);
+u64 pmc_read_counter(struct kvm_pmc *pmc);
 
 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 {
@@ -50,18 +54,6 @@ static inline u64 pmc_bitmask(struct kvm_pmc *pmc)
 	return pmu->counter_bitmask[pmc->type];
 }
 
-static inline u64 pmc_read_counter(struct kvm_pmc *pmc)
-{
-	u64 counter, enabled, running;
-
-	counter = pmc->counter;
-	if (pmc->perf_event && !pmc->is_paused)
-		counter += perf_event_read_value(pmc->perf_event,
-						 &enabled, &running);
-	/* FIXME: Scaling needed? */
-	return counter & pmc_bitmask(pmc);
-}
-
 static inline void pmc_release_perf_event(struct kvm_pmc *pmc)
 {
 	unsigned int i;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 8e1f679d4d9d..52fef388fdb0 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -81,6 +81,8 @@ static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx)
 	if (pmc_idx < INTEL_PMC_IDX_FIXED) {
 		return get_gp_pmc(pmu, MSR_P6_EVNTSEL0 + pmc_idx,
 				  MSR_P6_EVNTSEL0);
+	} else if (pmc_idx == INTEL_PERF_METRICS_IDX) {
+		return get_fixed_pmc(pmu, MSR_CORE_PERF_FIXED_CTR3);
 	} else {
 		u32 idx = pmc_idx - INTEL_PMC_IDX_FIXED;
 
@@ -160,6 +162,12 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
 		counters = pmu->gp_counters;
 		num_counters = pmu->nr_arch_gp_counters;
 	}
+
+	if (idx & INTEL_PMC_FIXED_RDPMC_METRICS) {
+		fixed = true;
+		idx = 3;
+	}
+
 	if (idx >= num_counters)
 		return NULL;
 	*mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
@@ -229,6 +237,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 		ret = (perf_capabilities & PERF_CAP_PEBS_BASELINE) &&
 			((perf_capabilities & PERF_CAP_PEBS_FORMAT) > 3);
 		break;
+	case MSR_PERF_METRICS:
+		ret = vcpu->arch.perf_capabilities & MSR_CAP_PERF_METRICS;
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -385,6 +396,10 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_PEBS_DATA_CFG:
 		msr_info->data = pmu->pebs_data_cfg;
 		return 0;
+	case MSR_PERF_METRICS:
+		pmc_read_counter(&pmu->fixed_counters[3]);
+		msr_info->data = pmu->perf_metrics;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -472,6 +487,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_PERF_METRICS:
+		if (pmu->perf_metrics == data)
+			return 0;
+		if (!(data & pmu->perf_metrics_mask)) {
+			pmu->perf_metrics = data;
+			return 0;
+		}
+		break;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -545,6 +568,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->fixed_ctr_ctrl_mask = ~0ull;
 	pmu->pebs_enable_mask = ~0ull;
 	pmu->pebs_data_cfg_mask = ~0ull;
+	pmu->perf_metrics_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa);
 	if (!entry || !vcpu->kvm->arch.enable_pmu)
@@ -584,6 +608,13 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		pmu->fixed_ctr_ctrl_mask &= ~(0xbull << (i * 4));
 	counter_mask = ~(((1ull << pmu->nr_arch_gp_counters) - 1) |
 		(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED));
+	if (vcpu->arch.perf_capabilities & MSR_CAP_PERF_METRICS) {
+		counter_mask &= ~(1ULL << INTEL_PERF_METRICS_IDX);
+		pmu->nr_perf_metrics = min_t(int, KVM_PERF_METRICS_NUM_MAX,
+					     kvm_pmu_cap.num_topdown_events);
+		for (i = 0; i < pmu->nr_perf_metrics; i++)
+			pmu->perf_metrics_mask &= ~(0xffull << (i * 8));
+	}
 	pmu->global_ctrl_mask = counter_mask;
 	pmu->global_ovf_ctrl_mask = pmu->global_ctrl_mask &
 			~(MSR_CORE_PERF_GLOBAL_OVF_CTRL_OVF_BUF |
@@ -656,6 +687,8 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 	lbr_desc->records.nr = 0;
 	lbr_desc->event = NULL;
 	lbr_desc->msr_passthrough = false;
+
+	BUILD_BUG_ON(KVM_PERF_METRICS_NUM_MAX > INTEL_TD_METRIC_NUM);
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index fe5615fd8295..57312b5a3d9d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7721,6 +7721,9 @@ static u64 vmx_get_perf_capabilities(void)
 			perf_cap &= ~PERF_CAP_PEBS_BASELINE;
 	}
 
+	if (kvm_pmu_cap.num_topdown_events)
+		perf_cap |= host_perf_cap & MSR_CAP_PERF_METRICS;
+
 	return perf_cap;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0b61cb58c877..a50b3ad7294c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1439,6 +1439,7 @@ static const u32 msrs_to_save_all[] = {
 	MSR_CORE_PERF_FIXED_CTR_CTRL, MSR_CORE_PERF_GLOBAL_STATUS,
 	MSR_CORE_PERF_GLOBAL_CTRL, MSR_CORE_PERF_GLOBAL_OVF_CTRL,
 	MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,
+	MSR_PERF_METRICS,
 
 	/* This part of MSRs should match KVM_INTEL_PMC_MAX_GENERIC. */
 	MSR_ARCH_PERFMON_PERFCTR0, MSR_ARCH_PERFMON_PERFCTR1,
@@ -3880,6 +3881,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_PEBS_ENABLE:
 	case MSR_IA32_DS_AREA:
 	case MSR_PEBS_DATA_CFG:
+	case MSR_PERF_METRICS:
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
 		if (kvm_pmu_is_valid_msr(vcpu, msr))
 			return kvm_pmu_set_msr(vcpu, msr_info);
@@ -3983,6 +3985,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_PEBS_ENABLE:
 	case MSR_IA32_DS_AREA:
 	case MSR_PEBS_DATA_CFG:
+	case MSR_PERF_METRICS:
 	case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
 		if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
 			return kvm_pmu_get_msr(vcpu, msr_info);
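From the guest's point of view, the net effect of the patch is that
MSR_PERF_METRICS (or RDPMC with the fixed-counter metrics encoding)
returns one 8-bit fraction per topdown metric, packed a byte per metric.
The following guest user-space sketch is illustrative only, not part of
the patch: it assumes the RDPMC metrics encoding is bit 29 of the index
(matching the host's INTEL_PMC_FIXED_RDPMC_METRICS definition) and that
the guest has CR4.PCE set and perf-metrics support exposed; the RDPMC
will fault otherwise.

#include <stdint.h>
#include <stdio.h>

static inline uint64_t rdpmc(uint32_t idx)
{
	uint32_t lo, hi;

	__asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(idx));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	/* Bit 29 selects the PERF_METRICS view paired with fixed counter 3. */
	uint64_t metrics = rdpmc(1u << 29);
	static const char * const name[] = {
		"retiring", "bad-spec", "fe-bound", "be-bound",
	};

	/* Each metric is an 8-bit fraction of the 0xff budget. */
	for (int i = 0; i < 4; i++) {
		uint8_t frac = (metrics >> (i * 8)) & 0xff;

		printf("%-8s %5.1f%%\n", name[i], frac * 100.0 / 0xff);
	}
	return 0;
}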