From patchwork Mon Jan 4 13:15:26 2021
X-Patchwork-Id: 11996721
From: Like Xu
Subject: [PATCH v3 01/17] KVM: x86/pmu: Set MSR_IA32_MISC_ENABLE_EMON bit when vPMU is enabled
Date: Mon, 4 Jan 2021 21:15:26 +0800
Message-Id: <20210104131542.495413-2-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

On Intel platforms, software may use the IA32_MISC_ENABLE[7] bit to
detect whether the performance monitoring facility is supported by the
processor. The value of this bit depends on the vPMU being enabled for
the guest, and a guest write to this PMU-available bit is ignored.
Cc: Yao Yuan
Signed-off-by: Like Xu
---
 arch/x86/kvm/vmx/pmu_intel.c | 2 ++
 arch/x86/kvm/x86.c           | 1 +
 2 files changed, 3 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a886a47daebd..01c7d84ecf3e 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -339,6 +339,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	if (!pmu->version)
 		return;
 
+	vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
+
 	perf_get_x86_pmu_capability(&x86_pmu);
 	if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
 		vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 648c677b12e9..87f97ffa9966 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3094,6 +3094,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		}
 		break;
 	case MSR_IA32_MISC_ENABLE:
+		data &= ~MSR_IA32_MISC_ENABLE_EMON;
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
 		    ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
 			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
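For reference, a minimal userspace sketch (illustration only, not part of the
patch) of how software in the guest can probe the PMU-available bit being
virtualized here. It assumes a Linux guest with the msr driver loaded and
root privileges; the MSR index 0x1a0 and the bit layout are architectural.

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define MSR_IA32_MISC_ENABLE 0x1a0	/* architectural MSR index */

int main(void)
{
	uint64_t val;
	int fd = open("/dev/cpu/0/msr", O_RDONLY);

	if (fd < 0 || pread(fd, &val, sizeof(val), MSR_IA32_MISC_ENABLE) != sizeof(val)) {
		perror("rdmsr");
		return 1;
	}
	close(fd);
	/* Bit 7 (EMON): performance monitoring is available. */
	printf("perfmon available: %s\n", (val >> 7) & 1 ? "yes" : "no");
	return 0;
}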

From patchwork Mon Jan 4 13:15:27 2021
X-Patchwork-Id: 11996719
From: Like Xu
Subject: [PATCH v3 02/17] KVM: x86/pmu: Use IA32_PERF_CAPABILITIES to adjust features visibility
Date: Mon, 4 Jan 2021 21:15:27 +0800
Message-Id: <20210104131542.495413-3-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

On Intel platforms, the KVM user space agent can configure
MSR_IA32_PERF_CAPABILITIES (e.g. unmask some vmx-supported bits in
vcpu->arch.perf_capabilities) to adjust the visibility of guest PMU
features for vPMU-enabled guests.

Once MSR_IA32_PERF_CAPABILITIES is validly changed via vmx_set_msr(),
the adjustment in intel_pmu_refresh() is triggered. So that the new
value survives subsequent refreshes, the default initialization path is
moved to intel_pmu_init().

Signed-off-by: Like Xu
---
 arch/x86/kvm/vmx/pmu_intel.c | 6 +++---
 arch/x86/kvm/vmx/vmx.c       | 5 +++++
 arch/x86/kvm/x86.c           | 2 +-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 01c7d84ecf3e..7c18c85328da 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -327,7 +327,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
 	pmu->version = 0;
 	pmu->reserved_bits = 0xffffffff00200000ull;
-	vcpu->arch.perf_capabilities = 0;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry)
@@ -342,8 +341,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_EMON;
 
 	perf_get_x86_pmu_capability(&x86_pmu);
-	if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
-		vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();
 
 	pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
 					 x86_pmu.num_counters_gp);
@@ -403,6 +400,9 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
 		pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
 		pmu->fixed_counters[i].current_config = 0;
 	}
+
+	vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
+		vmx_get_perf_capabilities() : 0;
 }
 
 static void intel_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 75c9c6a0a3a4..09bc41c53cd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2196,6 +2196,11 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		if ((data >> 32) != 0)
 			return 1;
 		goto find_uret_msr;
+	case MSR_IA32_PERF_CAPABILITIES:
+		if (data && !vcpu_to_pmu(vcpu)->version)
+			return 1;
+		ret = kvm_set_msr_common(vcpu, msr_info);
+		break;
 
 	default:
 	find_uret_msr:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 87f97ffa9966..a38ca932eec5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3037,7 +3037,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 
 		vcpu->arch.perf_capabilities = data;
-
+		kvm_pmu_refresh(vcpu);
 		return 0;
 	}
 	case MSR_EFER:
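A minimal model (illustration only; names are stand-ins for the KVM
internals) of the ordering this patch establishes: the default value is
computed once at vCPU init, and a valid write through vmx_set_msr()
re-runs the PMU refresh so the new capability set sticks.

#include <stdint.h>
#include <stdio.h>

struct vpmu { int version; uint64_t perf_capabilities; };

static void pmu_refresh(struct vpmu *p)
{
	/* Guest PMU feature visibility is re-derived from the new value. */
	printf("refresh: caps=%#llx\n", (unsigned long long)p->perf_capabilities);
}

/* Mirrors the vmx_set_msr()/kvm_set_msr_common() flow above. */
static int set_perf_capabilities(struct vpmu *p, uint64_t data)
{
	if (data && !p->version)
		return 1;		/* no vPMU: only 0 may be written */
	p->perf_capabilities = data;
	pmu_refresh(p);			/* like kvm_pmu_refresh() */
	return 0;
}

int main(void)
{
	struct vpmu p = { .version = 2, .perf_capabilities = 0x2000 }; /* init default */
	return set_perf_capabilities(&p, 0x6000);	/* a valid write */
}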

From patchwork Mon Jan 4 13:15:28 2021
X-Patchwork-Id: 11996729
From: Like Xu
Subject: [PATCH v3 03/17] KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
Date: Mon, 4 Jan 2021 21:15:28 +0800
Message-Id: <20210104131542.495413-4-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

The mask value of the fixed counter control register should be
dynamically adjusted based on the number of fixed counters. This patch
introduces a variable that holds the reserved bits of the fixed counter
control register. This is needed for the Ice Lake fixed counter changes
later in the series.

Co-developed-by: Luwei Kang
Signed-off-by: Luwei Kang
Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h | 1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 7 ++++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 39707e72b062..94c8bfee4a82 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -433,6 +433,7 @@ struct kvm_pmu {
 	unsigned nr_arch_fixed_counters;
 	unsigned available_event_types;
 	u64 fixed_ctr_ctrl;
+	u64 fixed_ctr_ctrl_mask;
 	u64 global_ctrl;
 	u64 global_status;
 	u64 global_ovf_ctrl;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 7c18c85328da..50047114c298 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -253,7 +253,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_FIXED_CTR_CTRL:
 		if (pmu->fixed_ctr_ctrl == data)
 			return 0;
-		if (!(data & 0xfffffffffffff444ull)) {
+		if (!(data & pmu->fixed_ctr_ctrl_mask)) {
 			reprogram_fixed_counters(pmu, data);
 			return 0;
 		}
@@ -320,6 +320,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	struct kvm_cpuid_entry2 *entry;
 	union cpuid10_eax eax;
 	union cpuid10_edx edx;
+	int i;
 
 	pmu->nr_arch_gp_counters = 0;
 	pmu->nr_arch_fixed_counters = 0;
@@ -327,6 +328,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
 	pmu->version = 0;
 	pmu->reserved_bits = 0xffffffff00200000ull;
+	pmu->fixed_ctr_ctrl_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry)
@@ -358,6 +360,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 			((u64)1 << edx.split.bit_width_fixed) - 1;
 	}
 
+	for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+		pmu->fixed_ctr_ctrl_mask |= (0xbull << (i * 4));
+	pmu->fixed_ctr_ctrl_mask = ~pmu->fixed_ctr_ctrl_mask;
 	pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
 		(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
 	pmu->global_ctrl_mask = ~pmu->global_ctrl;
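A standalone check (illustration only) of the nibble math behind
fixed_ctr_ctrl_mask: each fixed counter owns one nibble of
MSR_CORE_PERF_FIXED_CTR_CTRL, of which bits 0 (OS), 1 (USR) and 3 (PMI)
are writable, i.e. 0xb. Note that the accumulation must start from zero
for the final inversion to reproduce the old hard-coded constant for
three fixed counters:

#include <assert.h>
#include <stdint.h>

int main(void)
{
	uint64_t valid = 0;	/* bits a guest may legitimately set */
	int i, nr_fixed = 3;

	for (i = 0; i < nr_fixed; i++)
		valid |= 0xbull << (i * 4);

	/* Reserved mask = complement of the valid bits. */
	assert(~valid == 0xfffffffffffff444ull);
	return 0;
}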

From patchwork Mon Jan 4 13:15:29 2021
X-Patchwork-Id: 11996727
From: Like Xu
Subject: [PATCH v3 04/17] perf: x86/ds: Handle guest PEBS overflow PMI and inject it to guest
Date: Mon, 4 Jan 2021 21:15:29 +0800
Message-Id: <20210104131542.495413-5-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

With PEBS virtualization, the PEBS records get delivered to the guest,
while the host still sees the PEBS overflow PMI from guest PEBS
counters. This would normally result in a spurious host PMI, so we need
to inject that PEBS overflow PMI into the guest instead, so that the
guest PMI handler can handle the PEBS records.

Check for this case in the host perf PEBS handler. If a PEBS overflow
PMI occurs and it was not generated on the host side (determined by
checking the host DS), a fake event is triggered. The fake event causes
the KVM PMI callback to be called, thereby injecting the PEBS overflow
PMI into the guest. No matter how many guest PEBS counters have
overflowed, triggering one fake event is enough. The guest PEBS handler
then retrieves the correct information from its own PEBS records buffer.

A guest PEBS overflow PMI would be missed if a PEBS counter were also
enabled on the host side and, coincidentally, a host PEBS overflow PMI
based on the host DS_AREA were triggered right after a vm-exit caused by
the guest PEBS overflow PMI based on the guest DS_AREA. To avoid that
case, KVM disables guest PEBS before vm-entry once a host PEBS counter
is enabled on the same CPU.

Originally-by: Andi Kleen
Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Signed-off-by: Like Xu
---
 arch/x86/events/intel/ds.c | 62 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index b47cc4226934..c499bdb58373 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1721,6 +1721,65 @@ intel_pmu_save_and_restart_reload(struct perf_event *event, int count)
 	return 0;
 }
 
+/*
+ * We may be running with guest PEBS events created by KVM, and the
+ * PEBS records are logged into the guest's DS and invisible to host.
+ *
+ * In the case of guest PEBS overflow, we only trigger a fake event
+ * to emulate the PEBS overflow PMI for guest PEBS counters in KVM.
+ * The guest will then vm-entry and check the guest DS area to read
+ * the guest PEBS records.
+ *
+ * The guest PEBS overflow PMI may be dropped when both the guest and
+ * the host use PEBS. Therefore, KVM will not enable guest PEBS once
+ * the host PEBS is enabled, since it may bring a confusing unknown NMI.
+ *
+ * The contents and other behavior of the guest event do not matter.
+ */
+static int intel_pmu_handle_guest_pebs(struct cpu_hw_events *cpuc,
+				       struct pt_regs *iregs,
+				       struct debug_store *ds)
+{
+	struct perf_sample_data data;
+	struct perf_event *event = NULL;
+	u64 guest_pebs_idxs = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+	int bit;
+
+	/*
+	 * Ideally, we should check guest DS to understand if it's
+	 * a guest PEBS overflow PMI from guest PEBS counters.
+	 * However, it brings high overhead to retrieve guest DS in host.
+	 * So we check host DS instead for performance.
+	 *
+	 * If the PEBS interrupt threshold on the host is not exceeded in an NMI,
+	 * there must be a PEBS overflow PMI generated from the guest PEBS counters.
+	 * There is no ambiguity since the reported event in the PMI is guest
+	 * only. It gets handled correctly on a case-by-case basis for each event.
+	 *
+	 * Note: KVM disables the co-existence of guest PEBS and host PEBS.
+	 */
+	if (!guest_pebs_idxs || !in_nmi() ||
+	    ds->pebs_index >= ds->pebs_interrupt_threshold)
+		return 0;
+
+	for_each_set_bit(bit, (unsigned long *)&guest_pebs_idxs,
+			 INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+
+		event = cpuc->events[bit];
+		if (!event->attr.precise_ip)
+			continue;
+
+		perf_sample_data_init(&data, 0, event->hw.last_period);
+		if (perf_event_overflow(event, &data, iregs))
+			x86_pmu_stop(event, 0);
+
+		/* Injecting one fake event is enough. */
+		return 1;
+	}
+
+	return 0;
+}
+
 static __always_inline void
 __intel_pmu_pebs_event(struct perf_event *event,
 		       struct pt_regs *iregs,
@@ -1965,6 +2024,9 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs, struct perf_sample_d
 	if (!x86_pmu.pebs_active)
 		return;
 
+	if (intel_pmu_handle_guest_pebs(cpuc, iregs, ds))
+		return;
+
 	base = (struct pebs_basic *)(unsigned long)ds->pebs_buffer_base;
 	top = (struct pebs_basic *)(unsigned long)ds->pebs_index;
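The detection predicate above boils down to three conditions. A
simplified standalone restatement (illustration only; the struct and
names are stand-ins, not the kernel's):

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct ds_area { uint64_t pebs_index, pebs_interrupt_threshold; };

/* True when an NMI must have come from a guest PEBS counter: some PEBS
 * counters are guest-owned, we are in NMI context, and the host's own
 * DS buffer has not crossed its interrupt threshold. */
static bool is_guest_pebs_pmi(uint64_t pebs_enabled, uint64_t host_ctrl_mask,
			      bool in_nmi, const struct ds_area *host_ds)
{
	uint64_t guest_pebs_idxs = pebs_enabled & ~host_ctrl_mask;

	return guest_pebs_idxs && in_nmi &&
	       host_ds->pebs_index < host_ds->pebs_interrupt_threshold;
}

int main(void)
{
	struct ds_area ds = { .pebs_index = 0, .pebs_interrupt_threshold = 1 };

	assert(is_guest_pebs_pmi(0x1, 0x0, true, &ds));	/* guest-owned counter 0 */
	assert(!is_guest_pebs_pmi(0x1, 0x1, true, &ds));	/* host-owned counter */
	return 0;
}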
d="scan'208";a="241034326" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 04 Jan 2021 05:22:15 -0800 IronPort-SDR: MY/GFEr/CAORwT7Z16np5UK0CQjaAlb2tVw3/J0YZ6ueNnbh4kFlnwWQeRHjZPeoJbvkHePRkW LalRcSPsdKag== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.78,474,1599548400"; d="scan'208";a="461944563" Received: from clx-ap-likexu.sh.intel.com ([10.239.48.108]) by fmsmga001.fm.intel.com with ESMTP; 04 Jan 2021 05:22:12 -0800 From: Like Xu To: Peter Zijlstra , Paolo Bonzini , eranian@google.com, kvm@vger.kernel.org Cc: Ingo Molnar , Sean Christopherson , Thomas Gleixner , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Andi Kleen , Kan Liang , wei.w.wang@intel.com, luwei.kang@intel.com, linux-kernel@vger.kernel.org Subject: [PATCH v3 05/17] KVM: x86/pmu: Reprogram guest PEBS event to emulate guest PEBS counter Date: Mon, 4 Jan 2021 21:15:30 +0800 Message-Id: <20210104131542.495413-6-like.xu@linux.intel.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com> References: <20210104131542.495413-1-like.xu@linux.intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When a guest counter is configured as a PEBS counter through IA32_PEBS_ENABLE, a guest PEBS event will be reprogrammed by configuring a non-zero precision level in the perf_event_attr. The guest PEBS overflow PMI bit would be set in the guest GLOBAL_STATUS MSR when PEBS facility generates a PEBS overflow PMI based on guest IA32_DS_AREA MSR. The attr.precise_ip would be adjusted to a special precision level when the new PEBS-PDIR feature is supported later which would affect the host counters scheduling. The guest PEBS event would not be reused for non-PEBS guest event even with the same guest counter index. Originally-by: Andi Kleen Co-developed-by: Kan Liang Signed-off-by: Kan Liang Signed-off-by: Like Xu --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/pmu.c | 29 +++++++++++++++++++++++++++-- 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 94c8bfee4a82..09dacda33fb8 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -449,6 +449,8 @@ struct kvm_pmu { DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX); DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX); + u64 pebs_enable; + /* * The gate to release perf_events not marked in * pmc_in_use only once in a vcpu time slice. diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 67741d2a0308..2e81c50323e2 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -76,7 +76,12 @@ static void kvm_perf_overflow_intr(struct perf_event *perf_event, struct kvm_pmu *pmu = pmc_to_pmu(pmc); if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) { - __set_bit(pmc->idx, (unsigned long *)&pmu->global_status); + if (perf_event->attr.precise_ip) { + /* Indicate PEBS overflow PMI to guest. 
+			__set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
+				  (unsigned long *)&pmu->global_status);
+		} else
+			__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 
 		/*
@@ -99,6 +104,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 				  bool exclude_kernel, bool intr,
 				  bool in_tx, bool in_tx_cp)
 {
+	struct kvm_pmu *pmu = vcpu_to_pmu(pmc->vcpu);
 	struct perf_event *event;
 	struct perf_event_attr attr = {
 		.type = type,
@@ -110,6 +116,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		.exclude_kernel = exclude_kernel,
 		.config = config,
 	};
+	bool pebs = test_bit(pmc->idx, (unsigned long *)&pmu->pebs_enable);
 
 	attr.sample_period = get_sample_period(pmc, pmc->counter);
 
@@ -124,9 +131,23 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		attr.sample_period = 0;
 		attr.config |= HSW_IN_TX_CHECKPOINTED;
 	}
+	if (pebs) {
+		/*
+		 * The non-zero precision level of the guest event makes the
+		 * ordinary guest event become a guest PEBS event and triggers
+		 * the host PEBS PMI handler to determine whether the PEBS
+		 * overflow PMI comes from the host counters or the guest.
+		 *
+		 * For most PEBS hardware events, the difference in the software
+		 * precision levels of guest and host PEBS events will not affect
+		 * the accuracy of the PEBS profiling result, because the "event IP"
+		 * in the PEBS record is calibrated on the guest side.
+		 */
+		attr.precise_ip = 1;
+	}
 
 	event = perf_event_create_kernel_counter(&attr, -1, current,
-						 intr ? kvm_perf_overflow_intr :
+						 (intr || pebs) ? kvm_perf_overflow_intr :
 						 kvm_perf_overflow, pmc);
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
@@ -161,6 +182,10 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 			  get_sample_period(pmc, pmc->counter)))
 		return false;
 
+	if (!test_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->pebs_enable) &&
+	    pmc->perf_event->attr.precise_ip)
+		return false;
+
 	/* reuse perf_event to serve as pmc_reprogram_counter() does */
 	perf_event_enable(pmc->perf_event);
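For reference, a tiny standalone model (illustration only) of the
status-bit choice made in kvm_perf_overflow_intr(): a precise (PEBS)
event reports through the GLOBAL_STATUS buffer-overflow bit, bit 62,
rather than through its own counter bit, matching what hardware reports
for a PEBS PMI:

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GLOBAL_STATUS_BUFFER_OVF_BIT 62

static uint64_t overflow_status(uint64_t status, int pmc_idx, bool precise_ip)
{
	if (precise_ip)
		status |= 1ull << GLOBAL_STATUS_BUFFER_OVF_BIT;	/* PEBS overflow */
	else
		status |= 1ull << pmc_idx;			/* normal overflow */
	return status;
}

int main(void)
{
	assert(overflow_status(0, 3, true) == 1ull << 62);
	assert(overflow_status(0, 3, false) == 1ull << 3);
	return 0;
}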

From patchwork Mon Jan 4 13:15:31 2021
X-Patchwork-Id: 11996735
From: Like Xu
Subject: [PATCH v3 06/17] KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS
Date: Mon, 4 Jan 2021 21:15:31 +0800
Message-Id: <20210104131542.495413-7-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, the
IA32_PEBS_ENABLE MSR exists and all architecturally enumerated fixed and
general-purpose counters have corresponding bits in IA32_PEBS_ENABLE
that enable generation of PEBS records. The general-purpose counter bits
start at bit IA32_PEBS_ENABLE[0], and the fixed counter bits start at
bit IA32_PEBS_ENABLE[32].

When guest PEBS is enabled, the IA32_PEBS_ENABLE MSR is added to the
perf_guest_switch_msr() list and switched during the VMX transitions,
just like the CORE_PERF_GLOBAL_CTRL MSR.

Originally-by: Andi Kleen
Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Co-developed-by: Luwei Kang
Signed-off-by: Luwei Kang
Signed-off-by: Like Xu
---
 arch/x86/events/intel/core.c     | 20 ++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h  |  1 +
 arch/x86/include/asm/msr-index.h |  6 ++++++
 arch/x86/kvm/vmx/pmu_intel.c     | 28 ++++++++++++++++++++++++++++
 4 files changed, 55 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index af457f8cb29d..6453b8a6834a 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3715,6 +3715,26 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 		*nr = 2;
 	}
 
+	if (cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask) {
+		arr[1].msr = MSR_IA32_PEBS_ENABLE;
+		arr[1].host = cpuc->pebs_enabled & ~cpuc->intel_ctrl_guest_mask;
+		arr[1].guest = cpuc->pebs_enabled & ~cpuc->intel_ctrl_host_mask;
+		/*
+		 * The guest PEBS will be disabled once the host PEBS is enabled
+		 * since the both-enabled case may bring an unknown PMI that confuses
+		 * the host, and the guest PEBS overflow PMI would be missed.
+		 */
+		if (arr[1].host)
+			arr[1].guest = 0;
+		arr[0].guest |= arr[1].guest;
+		*nr = 2;
+	} else if (*nr == 1) {
+		/* Remove MSR_IA32_PEBS_ENABLE from MSR switch list in KVM */
+		arr[1].msr = MSR_IA32_PEBS_ENABLE;
+		arr[1].host = arr[1].guest = 0;
+		*nr = 2;
+	}
+
 	return arr;
 }
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09dacda33fb8..88a403fa46d4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -450,6 +450,7 @@ struct kvm_pmu {
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
 	u64 pebs_enable;
+	u64 pebs_enable_mask;
 
 	/*
 	 * The gate to release perf_events not marked in
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index abfc9b0fbd8d..11cc0b80fe7a 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -184,6 +184,12 @@
 #define MSR_PEBS_DATA_CFG		0x000003f2
 #define MSR_IA32_DS_AREA		0x00000600
 #define MSR_IA32_PERF_CAPABILITIES	0x00000345
+#define PERF_CAP_PEBS_TRAP		BIT_ULL(6)
+#define PERF_CAP_ARCH_REG		BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT		0xf00
+#define PERF_CAP_PEBS_BASELINE		BIT_ULL(14)
+#define PERF_CAP_PEBS_MASK	(PERF_CAP_PEBS_TRAP | PERF_CAP_ARCH_REG | \
+				 PERF_CAP_PEBS_FORMAT | PERF_CAP_PEBS_BASELINE)
 #define MSR_PEBS_LD_LAT_THRESHOLD	0x000003f6
 
 #define MSR_IA32_RTIT_CTL		0x00000570
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 50047114c298..2f10587bda19 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -180,6 +180,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		ret = pmu->version > 1;
 		break;
+	case MSR_IA32_PEBS_ENABLE:
+		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -221,6 +224,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
 		msr_info->data = pmu->global_ovf_ctrl;
 		return 0;
+	case MSR_IA32_PEBS_ENABLE:
+		msr_info->data = pmu->pebs_enable;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -280,6 +286,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_IA32_PEBS_ENABLE:
+		if (pmu->pebs_enable == data)
+			return 0;
+		if (!(data & pmu->pebs_enable_mask)) {
+			pmu->pebs_enable = data;
+			return 0;
+		}
+		break;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -329,6 +343,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->version = 0;
 	pmu->reserved_bits = 0xffffffff00200000ull;
 	pmu->fixed_ctr_ctrl_mask = ~0ull;
+	pmu->pebs_enable_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry)
@@ -384,6 +399,19 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	bitmap_set(pmu->all_valid_pmc_idx,
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
+	if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) {
+		if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE) {
+			pmu->pebs_enable_mask = ~pmu->global_ctrl;
+			pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
+			for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+				pmu->fixed_ctr_ctrl_mask &=
+					~(1ULL << (INTEL_PMC_IDX_FIXED + i * 4));
+		} else
+			pmu->pebs_enable_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
+	} else {
+		vcpu->arch.perf_capabilities &= ~PERF_CAP_PEBS_MASK;
+	}
+
 	nested_vmx_pmu_entry_exit_ctls_update(vcpu);
 }
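The IA32_PEBS_ENABLE layout described above, restated as a short sketch
(illustration only; the counter counts are assumed, Ice Lake-server-like
values): GP counter n maps to bit n and fixed counter n to bit 32+n,
which is why the non-baseline reserved mask is simply the complement of
the GP-counter bits.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	int nr_gp = 8, nr_fixed = 4;	/* assumed counter counts */
	uint64_t gp_bits     = (1ull << nr_gp) - 1;
	uint64_t fixed_bits  = ((1ull << nr_fixed) - 1) << 32;
	uint64_t global_ctrl = gp_bits | fixed_bits;

	/* With PEBS_BASELINE every enumerated counter is PEBS-capable ... */
	printf("baseline pebs_enable_mask:     %#llx\n",
	       (unsigned long long)~global_ctrl);
	/* ... without it, only the GP counters are. */
	printf("non-baseline pebs_enable_mask: %#llx\n",
	       (unsigned long long)~gp_bits);
	return 0;
}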

From patchwork Mon Jan 4 13:15:32 2021
X-Patchwork-Id: 11996731
From: Like Xu
Subject: [PATCH v3 07/17] KVM: x86/pmu: Add IA32_DS_AREA MSR emulation to manage guest DS buffer
Date: Mon, 4 Jan 2021 21:15:32 +0800
Message-Id: <20210104131542.495413-8-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

When CPUID.01H:EDX.DS[21] is set, the IA32_DS_AREA MSR exists and points
to the linear address of the first byte of the DS buffer management
area, which is used to manage the PEBS records.

When guest PEBS is enabled and the value differs from the host's, KVM
adds the IA32_DS_AREA MSR to the msr-switch list. The guest's DS value
is thus loaded into the real hardware before VM-entry, and is removed
again when guest PEBS is disabled. A WRMSR to the IA32_DS_AREA MSR
raises #GP(0) if the source register contains a non-canonical address.
Switching the IA32_DS_AREA MSR also sets up a quiescent period in which
any pending host PEBS records are written to the host DS area rather
than the guest DS area.
When guest PEBS is enabled, the MSR_IA32_DS_AREA MSR is added to the
perf_guest_switch_msr() list and switched during the VMX transitions,
just like the CORE_PERF_GLOBAL_CTRL MSR.

Originally-by: Andi Kleen
Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Signed-off-by: Like Xu
---
 arch/x86/events/intel/core.c    | 13 +++++++++++++
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 11 +++++++++++
 arch/x86/kvm/vmx/vmx.c          |  6 ++++++
 4 files changed, 31 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 6453b8a6834a..ccddda455bec 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3690,6 +3690,7 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs;
+	struct debug_store *ds = __this_cpu_read(cpu_hw_events.ds);
 
 	arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL;
 	arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask;
@@ -3735,6 +3736,18 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 		*nr = 2;
 	}
 
+	if (arr[1].guest) {
+		arr[2].msr = MSR_IA32_DS_AREA;
+		arr[2].host = (unsigned long)ds;
+		/* KVM will update MSR_IA32_DS_AREA with the trapped guest value. */
+		arr[2].guest = 0ull;
+		*nr = 3;
+	} else if (*nr == 2) {
+		arr[2].msr = MSR_IA32_DS_AREA;
+		arr[2].host = arr[2].guest = 0;
+		*nr = 3;
+	}
+
 	return arr;
 }
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88a403fa46d4..520a21af711b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -449,6 +449,7 @@ struct kvm_pmu {
 	DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
 
+	u64 ds_area;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
 
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 2f10587bda19..ff5fc405703f 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -183,6 +183,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_IA32_PEBS_ENABLE:
 		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT;
 		break;
+	case MSR_IA32_DS_AREA:
+		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -227,6 +230,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_PEBS_ENABLE:
 		msr_info->data = pmu->pebs_enable;
 		return 0;
+	case MSR_IA32_DS_AREA:
+		msr_info->data = pmu->ds_area;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -294,6 +300,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		}
 		break;
+	case MSR_IA32_DS_AREA:
+		if (is_noncanonical_address(data, vcpu))
+			return 1;
+		pmu->ds_area = data;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 09bc41c53cd8..42c65acc6c01 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -974,6 +974,7 @@ static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
 			return;
 		}
 		break;
+	case MSR_IA32_DS_AREA:
 	case MSR_IA32_PEBS_ENABLE:
 		/* PEBS needs a quiescent period after being disabled (to write
 		 * a record). Disabling PEBS through VMX MSR swapping doesn't
@@ -6522,12 +6523,17 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 {
 	int i, nr_msrs;
 	struct perf_guest_switch_msr *msrs;
+	struct kvm_vcpu *vcpu = &vmx->vcpu;
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
 
 	msrs = perf_guest_get_msrs(&nr_msrs);
 	if (!msrs)
 		return;
 
+	if (nr_msrs > 2 && msrs[1].guest)
+		msrs[2].guest = pmu->ds_area;
+
 	for (i = 0; i < nr_msrs; i++)
 		if (msrs[i].host == msrs[i].guest)
 			clear_atomic_switch_msr(vmx, msrs[i].msr);
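A standalone sketch (illustration only, assuming 48-bit virtual
addresses) of the canonicality test behind the #GP(0) on WRMSR to
IA32_DS_AREA: an address is canonical when bits 63:47 are a sign
extension of bit 47.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool is_canonical_48(uint64_t addr)
{
	/* Sign-extend from bit 47 and compare with the original. */
	return (uint64_t)(((int64_t)(addr << 16)) >> 16) == addr;
}

int main(void)
{
	assert(is_canonical_48(0x00007fffffffffffull));	/* top of lower half */
	assert(is_canonical_48(0xffff800000000000ull));	/* bottom of upper half */
	assert(!is_canonical_48(0x0000800000000000ull));	/* hole: #GP on WRMSR */
	return 0;
}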

From patchwork Mon Jan 4 13:15:33 2021
X-Patchwork-Id: 11996733
From: Like Xu
Subject: [PATCH v3 08/17] KVM: x86/pmu: Add PEBS_DATA_CFG MSR emulation to support adaptive PEBS
Date: Mon, 4 Jan 2021 21:15:33 +0800
Message-Id: <20210104131542.495413-9-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

If IA32_PERF_CAPABILITIES.PEBS_BASELINE [bit 14] is set, adaptive PEBS
is supported. The PEBS_DATA_CFG MSR and the adaptive record enable bits
(IA32_PERFEVTSELx.Adaptive_Record and
IA32_FIXED_CTR_CTRL.FCx_Adaptive_Record) are also supported.
Adaptive PEBS gives software the capability to configure PEBS records to
capture only the data of interest, keeping the record size compact. An
overflow of PMCx results in generation of an adaptive PEBS record with
state information based on the selections specified in MSR_PEBS_DATA_CFG
(Memory Info [bit 0], GPRs [bit 1], XMMs [bit 2], LBRs [bit 3], and LBR
Entries [bits 31:24]). By default, the PEBS record only contains the
Basic group.

When guest adaptive PEBS is enabled, the MSR_PEBS_DATA_CFG MSR is added
to the perf_guest_switch_msr() list and switched during the VMX
transitions, just like the CORE_PERF_GLOBAL_CTRL MSR.

Co-developed-by: Luwei Kang
Signed-off-by: Luwei Kang
Signed-off-by: Like Xu
---
 arch/x86/events/intel/core.c    | 12 ++++++++++++
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx/pmu_intel.c    | 16 ++++++++++++++++
 arch/x86/kvm/vmx/vmx.c          |  5 ++++-
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ccddda455bec..736487e6c5e3 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3748,6 +3748,18 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr)
 		*nr = 3;
 	}
 
+	if (arr[1].guest && x86_pmu.intel_cap.pebs_baseline) {
+		arr[3].msr = MSR_PEBS_DATA_CFG;
+		arr[3].host = cpuc->pebs_data_cfg;
+		/* KVM will update MSR_PEBS_DATA_CFG with the trapped guest value. */
+		arr[3].guest = 0ull;
+		*nr = 4;
+	} else if (*nr == 3) {
+		arr[3].msr = MSR_PEBS_DATA_CFG;
+		arr[3].host = arr[3].guest = 0;
+		*nr = 4;
+	}
+
 	return arr;
 }
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 520a21af711b..4ff6aa00a325 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -452,6 +452,8 @@ struct kvm_pmu {
 	u64 ds_area;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
+	u64 pebs_data_cfg;
+	u64 pebs_data_cfg_mask;
 
 	/*
 	 * The gate to release perf_events not marked in
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index ff5fc405703f..c04e12812797 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -186,6 +186,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
 	case MSR_IA32_DS_AREA:
 		ret = guest_cpuid_has(vcpu, X86_FEATURE_DS);
 		break;
+	case MSR_PEBS_DATA_CFG:
+		ret = vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE;
+		break;
 	default:
 		ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
 			get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -233,6 +236,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_DS_AREA:
 		msr_info->data = pmu->ds_area;
 		return 0;
+	case MSR_PEBS_DATA_CFG:
+		msr_info->data = pmu->pebs_data_cfg;
+		return 0;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -305,6 +311,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 1;
 		pmu->ds_area = data;
 		return 0;
+	case MSR_PEBS_DATA_CFG:
+		if (pmu->pebs_data_cfg == data)
+			return 0;
+		if (!(data & pmu->pebs_data_cfg_mask)) {
+			pmu->pebs_data_cfg = data;
+			return 0;
+		}
+		break;
 	default:
 		if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
 		    (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -355,6 +369,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 	pmu->reserved_bits = 0xffffffff00200000ull;
 	pmu->fixed_ctr_ctrl_mask = ~0ull;
 	pmu->pebs_enable_mask = ~0ull;
+	pmu->pebs_data_cfg_mask = ~0ull;
 
 	entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
 	if (!entry)
@@ -417,6 +432,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 			for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
 				pmu->fixed_ctr_ctrl_mask &=
 					~(1ULL << (INTEL_PMC_IDX_FIXED + i * 4));
+			pmu->pebs_data_cfg_mask = ~0xff00000full;
 		} else
 			pmu->pebs_enable_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
 	} else {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 42c65acc6c01..dbb0e49aae64 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6531,8 +6531,11 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	if (!msrs)
 		return;
 
-	if (nr_msrs > 2 && msrs[1].guest)
+	if (nr_msrs > 2 && msrs[1].guest) {
 		msrs[2].guest = pmu->ds_area;
+		if (nr_msrs > 3)
+			msrs[3].guest = pmu->pebs_data_cfg;
+	}
 
 	for (i = 0; i < nr_msrs; i++)
 		if (msrs[i].host == msrs[i].guest)
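The MSR_PEBS_DATA_CFG fields named above, as a short validity sketch
(illustration only; the macro names follow the perf convention but are
local stand-ins): the group-enable bits sit in [3:0] and the LBR entry
count in [31:24], which is exactly why the patch derives the reserved
mask as ~0xff00000full.

#include <stdint.h>
#include <stdio.h>

#define PEBS_DATACFG_MEMINFO	(1ull << 0)
#define PEBS_DATACFG_GPRS	(1ull << 1)
#define PEBS_DATACFG_XMMS	(1ull << 2)
#define PEBS_DATACFG_LBRS	(1ull << 3)
#define PEBS_DATACFG_LBR_SHIFT	24

int main(void)
{
	uint64_t reserved = ~0xff00000full;
	uint64_t cfg = PEBS_DATACFG_MEMINFO | PEBS_DATACFG_LBRS |
		       (16ull << PEBS_DATACFG_LBR_SHIFT);	/* 16 LBR entries */

	printf("cfg=%#llx -> %s\n", (unsigned long long)cfg,
	       (cfg & reserved) ? "rejected (reserved bits)" : "accepted");
	return 0;
}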

From patchwork Mon Jan 4 13:15:34 2021
X-Patchwork-Id: 11996761
From: Like Xu
Subject: [PATCH v3 09/17] KVM: x86: Set PEBS_UNAVAIL in IA32_MISC_ENABLE when PEBS is enabled
Date: Mon, 4 Jan 2021 21:15:34 +0800
Message-Id: <20210104131542.495413-10-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

Bit 12 of IA32_MISC_ENABLE represents "Processor Event Based Sampling
Unavailable (RO)": 1 = PEBS is not supported; 0 = PEBS is supported. A
guest write that sets this PEBS_UNAVAIL bit raises #GP(0) when guest
PEBS is enabled. Some PEBS drivers in the guest may care about this bit.
Signed-off-by: Like Xu
---
 arch/x86/kvm/vmx/pmu_intel.c | 2 ++
 arch/x86/kvm/x86.c           | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index c04e12812797..99d9453e0176 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -426,6 +426,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);
 
 	if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) {
+		vcpu->arch.ia32_misc_enable_msr &= ~MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
 		if (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_BASELINE) {
 			pmu->pebs_enable_mask = ~pmu->global_ctrl;
 			pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
@@ -436,6 +437,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
 		} else
 			pmu->pebs_enable_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
 	} else {
+		vcpu->arch.ia32_misc_enable_msr |= MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL;
 		vcpu->arch.perf_capabilities &= ~PERF_CAP_PEBS_MASK;
 	}
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a38ca932eec5..213368e47500 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3095,6 +3095,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		break;
 	case MSR_IA32_MISC_ENABLE:
 		data &= ~MSR_IA32_MISC_ENABLE_EMON;
+		if (!msr_info->host_initiated &&
+		    (vcpu->arch.perf_capabilities & PERF_CAP_PEBS_FORMAT) &&
+		    (data & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL))
+			return 1;
 		if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT) &&
 		    ((vcpu->arch.ia32_misc_enable_msr ^ data) & MSR_IA32_MISC_ENABLE_MWAIT)) {
 			if (!guest_cpuid_has(vcpu, X86_FEATURE_XMM3))
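A minimal model (illustration only) of the write rule added above: once
guest PEBS is on, a guest attempt to set PEBS_UNAVAIL (bit 12) is
refused with #GP(0), while host-initiated writes remain allowed.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MISC_ENABLE_PEBS_UNAVAIL	(1ull << 12)

static int write_misc_enable(bool host_initiated, bool guest_pebs, uint64_t data)
{
	if (!host_initiated && guest_pebs &&
	    (data & MISC_ENABLE_PEBS_UNAVAIL))
		return 1;	/* #GP(0) */
	return 0;
}

int main(void)
{
	assert(write_misc_enable(false, true,  MISC_ENABLE_PEBS_UNAVAIL) == 1);
	assert(write_misc_enable(true,  true,  MISC_ENABLE_PEBS_UNAVAIL) == 0);
	assert(write_misc_enable(false, false, MISC_ENABLE_PEBS_UNAVAIL) == 0);
	return 0;
}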

From patchwork Mon Jan 4 13:15:35 2021
X-Patchwork-Id: 11996765
From: Like Xu
Subject: [PATCH v3 10/17] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
Date: Mon, 4 Jan 2021 21:15:35 +0800
Message-Id: <20210104131542.495413-11-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>
X-Mailing-List: kvm@vger.kernel.org

The CPUID features PDCM, DS and DTES64 are required for the PEBS
feature. KVM exposes the CPUID feature bits PDCM, DS and DTES64 to the
guest when PEBS is supported by KVM, i.e. on the Ice Lake server
platforms.

Originally-by: Andi Kleen
Co-developed-by: Kan Liang
Signed-off-by: Kan Liang
Co-developed-by: Luwei Kang
Signed-off-by: Luwei Kang
Signed-off-by: Like Xu
---
 arch/x86/kvm/pmu.h              |  6 ++++++
 arch/x86/kvm/vmx/capabilities.h | 17 ++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c          | 15 +++++++++++++++
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 067fef51760c..ee8f15cc4b5e 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -3,6 +3,7 @@
 #define __KVM_X86_PMU_H
 
 #include <linux/nospec.h>
+#include <asm/cpu_device_id.h>
 
 #define vcpu_to_pmu(vcpu) (&(vcpu)->arch.pmu)
 #define pmu_to_vcpu(pmu)  (container_of((pmu), struct kvm_vcpu, arch.pmu))
@@ -16,6 +17,11 @@
 #define VMWARE_BACKDOOR_PMC_APPARENT_TIME	0x10002
 
 #define MAX_FIXED_COUNTERS	3
+static const struct x86_cpu_id vmx_icl_pebs_cpu[] = {
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+	X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
+	{}
+};
 
 struct kvm_event_hw_type_mapping {
 	u8 eventsel;
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 3a1861403d73..2f22ce34b165 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -5,6 +5,7 @@
 #include <asm/vmx.h>
 
 #include "lapic.h"
+#include "pmu.h"
 
 extern bool __read_mostly enable_vpid;
 extern bool __read_mostly flexpriority_enabled;
@@ -369,13 +370,27 @@ static inline bool vmx_pt_mode_is_host_guest(void)
 	return pt_mode == PT_MODE_HOST_GUEST;
 }
 
+static inline bool vmx_pebs_supported(void)
+{
+	return boot_cpu_has(X86_FEATURE_PEBS) && x86_match_cpu(vmx_icl_pebs_cpu);
+}
+
 static inline u64 vmx_get_perf_capabilities(void)
 {
 	/*
 	 * Since counters are virtualized, KVM would support full
 	 * width counting unconditionally, even if the host lacks it.
	 */
-	return PMU_CAP_FW_WRITES;
+	u64 value = PMU_CAP_FW_WRITES;
+	u64 perf_cap = 0;
+
+	if (boot_cpu_has(X86_FEATURE_PDCM))
+		rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+
+	if (vmx_pebs_supported())
+		value |= perf_cap & PERF_CAP_PEBS_MASK;
+
+	return value;
 }

 #endif /* __KVM_X86_VMX_CAPS_H */

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dbb0e49aae64..341794b67f9a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2200,6 +2200,17 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_PERF_CAPABILITIES:
 		if (data && !vcpu_to_pmu(vcpu)->version)
 			return 1;
+		if (data & PERF_CAP_PEBS_FORMAT) {
+			if ((data & PERF_CAP_PEBS_MASK) !=
+			    (vmx_get_perf_capabilities() & PERF_CAP_PEBS_MASK))
+				return 1;
+			if (!guest_cpuid_has(vcpu, X86_FEATURE_DS))
+				return 1;
+			if (!guest_cpuid_has(vcpu, X86_FEATURE_DTES64))
+				return 1;
+			if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
+				return 1;
+		}
 		ret = kvm_set_msr_common(vcpu, msr_info);
 		break;
@@ -7277,6 +7288,10 @@ static __init void vmx_set_cpu_caps(void)
 		kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
 	if (vmx_pt_mode_is_host_guest())
 		kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+	if (vmx_pebs_supported()) {
+		kvm_cpu_cap_check_and_set(X86_FEATURE_DS);
+		kvm_cpu_cap_check_and_set(X86_FEATURE_DTES64);
+	}

 	if (vmx_umip_emulated())
 		kvm_cpu_cap_set(X86_FEATURE_UMIP);
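For reference, the PEBS-related fields of IA32_PERF_CAPABILITIES that
PERF_CAP_PEBS_MASK covers decompose as below. A minimal sketch; the bit
positions follow the SDM and the kernel's PERF_CAP_* definitions, and
the example value passed to the decoder is arbitrary:

#include <stdint.h>
#include <stdio.h>

#define PERF_CAP_PEBS_TRAP     (1ULL << 6)   /* trap vs. fault-like PEBS */
#define PERF_CAP_ARCH_REG      (1ULL << 7)   /* arch regs saved in record */
#define PERF_CAP_PEBS_FORMAT   (0xfULL << 8) /* PEBS record format, bits 11:8 */
#define PERF_CAP_PEBS_BASELINE (1ULL << 14)  /* adaptive ("baseline") PEBS */

static void decode_pebs_caps(uint64_t caps)
{
	printf("trap=%u arch_reg=%u format=%u baseline=%u\n",
	       (unsigned)!!(caps & PERF_CAP_PEBS_TRAP),
	       (unsigned)!!(caps & PERF_CAP_ARCH_REG),
	       (unsigned)((caps & PERF_CAP_PEBS_FORMAT) >> 8),
	       (unsigned)!!(caps & PERF_CAP_PEBS_BASELINE));
}

int main(void)
{
	decode_pebs_caps(0x40c0);	/* arbitrary example value */
	return 0;
}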
From patchwork Mon Jan 4 13:15:36 2021
From: Like Xu
Subject: [PATCH v3 11/17] KVM: x86/pmu: Adjust precise_ip to emulate Ice Lake guest PDIR counter
Date: Mon, 4 Jan 2021 21:15:36 +0800
Message-Id: <20210104131542.495413-12-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

The PEBS-PDIR facility on the Ice Lake server is supported on the first
fixed counter (IA32_FIXED_CTR0) only, which KVM numbers as counter
index 32. If the guest configures counter 32 and PEBS is enabled, the
PEBS-PDIR facility is supposed to be used, in which case KVM adjusts
attr.precise_ip to 3 and asks host perf to assign exactly the requested
counter or fail. The CPU model check is also required, since other
platforms may place the PEBS-PDIR facility on a different counter index.

Signed-off-by: Like Xu
---
 arch/x86/kvm/pmu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 2e81c50323e2..67c20ab81991 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -144,6 +144,8 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 		 * in the PEBS record is calibrated on the guest side.
 		 */
 		attr.precise_ip = 1;
+		if (x86_match_cpu(vmx_icl_pebs_cpu) && pmc->idx == 32)
+			attr.precise_ip = 3;
 	}

 	event = perf_event_create_kernel_counter(&attr, -1, current,
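For reference, attr.precise_ip carries the same meaning as the perf
tool's :p/:pp/:ppp event suffixes. A host-side sketch using the standard
perf_event_open(2) interface (not KVM-specific); precise_ip = 3 requests
the most precise sampling the hardware offers, which on Ice Lake is PDIR
and can only be satisfied on the first fixed counter:

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static int open_precise_cycles(int precision)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;
	attr.exclude_kernel = 1;	/* keep it usable unprivileged */
	attr.precise_ip = precision;	/* 0..3; 3 == PDIR on Ice Lake */

	/* measure the calling thread on any CPU */
	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}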
From patchwork Mon Jan 4 13:15:37 2021
From: Like Xu
Subject: [PATCH v3 12/17] KVM: x86/pmu: Disable guest PEBS when counters are cross-mapped
Date: Mon, 4 Jan 2021 21:15:37 +0800
Message-Id: <20210104131542.495413-13-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

KVM checks whether a guest PEBS counter X is cross-mapped to a different
host counter Y. In that case, the applicable_counters field in the guest
PEBS records is filled with the real host counter index(es), which is
incorrect from the guest's point of view. For now, KVM disables guest
PEBS before vm-entry in this situation; later patches add more emulation
to keep the PEBS functionality working as on the host, such as rewriting
the applicable_counters field in the guest PEBS record buffer.

The cross-mapping check has to be done right before vm-entry but after
local_irq_disable(), since the perf scheduler may rotate pmc->perf_event
to another host counter or put the event into an error state via the
hrtimer irq.

Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/pmu.c              | 25 +++++++++++++++++++++++++
 arch/x86/kvm/pmu.h              |  1 +
 arch/x86/kvm/vmx/vmx.c          |  3 +++
 arch/x86/kvm/x86.c              |  4 ++++
 5 files changed, 35 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4ff6aa00a325..5de4c14cf526 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -455,6 +455,8 @@ struct kvm_pmu {
 	u64 pebs_data_cfg;
 	u64 pebs_data_cfg_mask;

+	bool counter_cross_mapped;
+
 	/*
 	 * The gate to release perf_events not marked in
 	 * pmc_in_use only once in a vcpu time slice.

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 67c20ab81991..3bfed803ed17 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -550,3 +550,28 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 	kfree(filter);
 	return r;
 }
+
+/*
+ * The caller needs to ensure that there is no time window for
+ * perf hrtimer irq or any chance to reschedule pmc->perf_event.
+ */
+void kvm_pmu_counter_cross_mapped_check(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc = NULL;
+	int bit;
+
+	pmu->counter_cross_mapped = false;
+
+	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+		if (!pmc || !pmc_speculative_in_use(pmc) || !pmc_is_enabled(pmc))
+			continue;
+
+		if (pmc->perf_event && (pmc->idx != pmc->perf_event->hw.idx)) {
+			pmu->counter_cross_mapped = true;
+			break;
+		}
+	}
+}

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index ee8f15cc4b5e..f5ec94e9a1dc 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -163,6 +163,7 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu);
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu);
 void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);
+void kvm_pmu_counter_cross_mapped_check(struct kvm_vcpu *vcpu);

 bool is_vmware_backdoor_pmc(u32 pmc_idx);

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 341794b67f9a..bc30c83e0a62 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6542,6 +6542,9 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	if (!msrs)
 		return;

+	if (pmu->counter_cross_mapped)
+		msrs[1].guest = 0;
+
 	if (nr_msrs > 2 && msrs[1].guest) {
 		msrs[2].guest = pmu->ds_area;
 		if (nr_msrs > 3)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 213368e47500..4ab1ce26244d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8936,6 +8936,10 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	 * result in virtual interrupt delivery.
 	 */
 	local_irq_disable();
+
+	if (vcpu_to_pmu(vcpu)->global_ctrl & vcpu_to_pmu(vcpu)->pebs_enable)
+		kvm_pmu_counter_cross_mapped_check(vcpu);
+
 	vcpu->mode = IN_GUEST_MODE;

 	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
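The cross-mapping condition itself is simple to state in isolation. A
minimal sketch with plain arrays standing in for the kvm_pmc and
perf_event structures (the struct and function names here are invented
for illustration):

#include <stdbool.h>
#include <stddef.h>

struct vcounter {
	int guest_idx;	/* index the guest programmed */
	int host_idx;	/* index host perf actually assigned, -1 if none */
	bool enabled;
};

/* True if any enabled guest counter landed on a different host counter. */
static bool any_counter_cross_mapped(const struct vcounter *c, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (c[i].enabled && c[i].host_idx >= 0 &&
		    c[i].host_idx != c[i].guest_idx)
			return true;
	return false;
}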
From patchwork Mon Jan 4 13:15:38 2021
From: Like Xu
Subject: [PATCH v3 13/17] KVM: x86/pmu: Add hook to emulate pebs for cross-mapped counters
Date: Mon, 4 Jan 2021 21:15:38 +0800
Message-Id: <20210104131542.495413-14-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

To emulate the PEBS facility, KVM may need to set up context such as the
guest DS PEBS fields correctly before vm-entry; this part is implemented
in the vmx handle_event() hook. When cross-mapping happens to any
enabled PEBS counter, KVM makes a PMU request and exits to
kvm_pmu_handle_event() to do the rewrite work, then goes back to the
cross-map check and finally to vm-entry. In this hook, KVM only rewrites
state for the guest and does not move events, hence races with the NMI
PMI are not a problem.

Signed-off-by: Like Xu
---
 arch/x86/kvm/pmu.c           | 3 +++
 arch/x86/kvm/pmu.h           | 1 +
 arch/x86/kvm/vmx/pmu_intel.c | 9 +++++++++
 arch/x86/kvm/vmx/vmx.c       | 3 ---
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3bfed803ed17..e898da4699c9 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -340,6 +340,9 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	 */
 	if (unlikely(pmu->need_cleanup))
 		kvm_pmu_cleanup(vcpu);
+
+	if (kvm_x86_ops.pmu_ops->handle_event)
+		kvm_x86_ops.pmu_ops->handle_event(vcpu);
 }

 /* check if idx is a valid index to access PMU */

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index f5ec94e9a1dc..b1e52e33f08c 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -45,6 +45,7 @@ struct kvm_pmu_ops {
 	void (*refresh)(struct kvm_vcpu *vcpu);
 	void (*init)(struct kvm_vcpu *vcpu);
 	void (*reset)(struct kvm_vcpu *vcpu);
+	void (*handle_event)(struct kvm_vcpu *vcpu);
 };

 static inline u64 pmc_bitmask(struct kvm_pmc *pmc)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 99d9453e0176..2a06f923fbc7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -491,6 +491,14 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 	pmu->global_ovf_ctrl = 0;
 }

+static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+	if (!(pmu->global_ctrl & pmu->pebs_enable))
+		return;
+}
+
 struct kvm_pmu_ops intel_pmu_ops = {
 	.find_arch_event = intel_find_arch_event,
 	.find_fixed_event = intel_find_fixed_event,
@@ -505,4 +513,5 @@ struct kvm_pmu_ops intel_pmu_ops = {
 	.refresh = intel_pmu_refresh,
 	.init = intel_pmu_init,
 	.reset = intel_pmu_reset,
+	.handle_event = intel_pmu_handle_event,
 };

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bc30c83e0a62..341794b67f9a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6542,9 +6542,6 @@ static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx)
 	if (!msrs)
 		return;

-	if (pmu->counter_cross_mapped)
-		msrs[1].guest = 0;
-
 	if (nr_msrs > 2 && msrs[1].guest) {
 		msrs[2].guest = pmu->ds_area;
 		if (nr_msrs > 3)
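Put together with the previous patch, the intended control flow around
vm-entry looks roughly like this. A sketch only, compressing the
vcpu_enter_guest() request loop into one invented function; the kvm_*
helpers named here are existing KVM primitives:

static void enter_guest_with_pebs(struct kvm_vcpu *vcpu)
{
	for (;;) {
		local_irq_disable();
		/* may raise KVM_REQ_PMU if the mapping changed */
		kvm_pmu_counter_cross_mapped_check(vcpu);
		if (!kvm_check_request(KVM_REQ_PMU, vcpu))
			break;	/* mapping stable: safe to enter the guest */
		local_irq_enable();
		/* -> pmu_ops->handle_event(): rewrite guest DS state */
		kvm_pmu_handle_event(vcpu);
	}
	/* actual vm-entry happens here, irqs still disabled */
}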
From patchwork Mon Jan 4 13:15:39 2021
From: Like Xu
Subject: [PATCH v3 14/17] KVM: vmx/pmu: Limit pebs_interrupt_threshold in the guest DS area
Date: Mon, 4 Jan 2021 21:15:39 +0800
Message-Id: <20210104131542.495413-15-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

If the host counter X is scheduled to the guest PEBS counter Y, the
pebs_interrupt_threshold field in the guest DS area is lowered to hold
only ONE record before vm-entry, which helps KVM handle the
cross-mapping emulation more easily and accurately when the PEBS
overflow PMI is generated.

In most cases, the guest counters are not scheduled in a cross-mapped
way, which means there is no need to change the guest DS
pebs_interrupt_threshold; the applicable_counters fields in the guest
PEBS records are then naturally correct. The PEBS facility writes
multiple PEBS records into the guest DS without interception, so the
performance is good. AFAIK, we don't expect that changing the
pebs_interrupt_threshold value from the KVM side will break any guest
PEBS drivers.
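The threshold arithmetic involved is worth spelling out. A minimal
sketch with local stand-in types that mirror the host's struct
debug_store and struct pebs_basic layouts (asm/perf_event.h); pointing
pebs_interrupt_threshold one basic record past pebs_buffer_base makes
the hardware raise the overflow PMI after a single record:

#include <stdint.h>

struct pebs_basic_sketch {		/* stand-in for struct pebs_basic */
	uint64_t format_size;		/* record size lives in bits 63:48 */
	uint64_t ip;
	uint64_t applicable_counters;
	uint64_t tsc;
};

struct ds_sketch {			/* stand-in for struct debug_store */
	uint64_t pebs_buffer_base;
	uint64_t pebs_index;
	uint64_t pebs_absolute_maximum;
	uint64_t pebs_interrupt_threshold;
};

static void limit_to_one_record(struct ds_sketch *ds)
{
	/* One basic record is enough to trigger the overflow PMI. */
	ds->pebs_interrupt_threshold =
		ds->pebs_buffer_base + sizeof(struct pebs_basic_sketch);
}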
Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  3 ++
 arch/x86/kvm/pmu.c              | 17 +++-----
 arch/x86/kvm/pmu.h              | 11 +++++
 arch/x86/kvm/vmx/pmu_intel.c    | 77 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c              |  1 +
 5 files changed, 98 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5de4c14cf526..ea204c628f45 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -450,12 +450,15 @@ struct kvm_pmu {
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);

 	u64 ds_area;
+	u64 cached_ds_area;
+	struct gfn_to_hva_cache ds_area_cache;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
 	u64 pebs_data_cfg;
 	u64 pebs_data_cfg_mask;

 	bool counter_cross_mapped;
+	bool need_rewrite_ds_pebs_interrupt_threshold;

 	/*
 	 * The gate to release perf_events not marked in

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index e898da4699c9..c0f18b304933 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -472,17 +472,6 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu)
 	kvm_pmu_refresh(vcpu);
 }

-static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
-{
-	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-
-	if (pmc_is_fixed(pmc))
-		return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
-					pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
-
-	return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
-}
-
 /* Release perf_events for vPMCs that have been unused for a full time slice. */
 void kvm_pmu_cleanup(struct kvm_vcpu *vcpu)
 {
@@ -577,4 +566,10 @@ void kvm_pmu_counter_cross_mapped_check(struct kvm_vcpu *vcpu)
 			break;
 		}
 	}
+
+	if (!pmu->counter_cross_mapped)
+		return;
+
+	if (pmu->need_rewrite_ds_pebs_interrupt_threshold)
+		kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 }

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index b1e52e33f08c..6cdc9fd03195 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -147,6 +147,17 @@ static inline u64 get_sample_period(struct kvm_pmc *pmc, u64 counter_value)
 	return sample_period;
 }

+static inline bool pmc_speculative_in_use(struct kvm_pmc *pmc)
+{
+	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+	if (pmc_is_fixed(pmc))
+		return fixed_ctrl_field(pmu->fixed_ctr_ctrl,
+					pmc->idx - INTEL_PMC_IDX_FIXED) & 0x3;
+
+	return pmc->eventsel & ARCH_PERFMON_EVENTSEL_ENABLE;
+}
+
 void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel);
 void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int fixed_idx);
 void reprogram_counter(struct kvm_pmu *pmu, int pmc_idx);

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 2a06f923fbc7..b69e7c47fb05 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -211,6 +211,36 @@ static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
 	return pmc;
 }

+static void intel_pmu_pebs_setup(struct kvm_pmu *pmu)
+{
+	struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
+	struct kvm_pmc *pmc = NULL;
+	int bit, idx;
+	gpa_t gpa;
+
+	pmu->need_rewrite_ds_pebs_interrupt_threshold = false;
+
+	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+		if (pmc && pmc_speculative_in_use(pmc)) {
+			pmu->need_rewrite_ds_pebs_interrupt_threshold = true;
+			break;
+		}
+	}
+
+	if (pmu->pebs_enable && pmu->cached_ds_area != pmu->ds_area) {
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+		gpa = kvm_mmu_gva_to_gpa_system(vcpu, pmu->ds_area, NULL);
+		if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &pmu->ds_area_cache,
+					      gpa, sizeof(struct debug_store)))
+			goto out;
+		pmu->cached_ds_area = pmu->ds_area;
+out:
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+	}
+}
+
 static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -287,6 +317,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			return 0;
 		if (kvm_valid_perf_global_ctrl(pmu, data)) {
 			global_ctrl_changed(pmu, data);
+			if (pmu->global_ctrl & pmu->pebs_enable)
+				intel_pmu_pebs_setup(pmu);
 			return 0;
 		}
 		break;
@@ -491,12 +523,57 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 	pmu->global_ovf_ctrl = 0;
 }

+static int rewrite_ds_pebs_interrupt_threshold(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct debug_store *ds = NULL;
+	u64 new_threshold, offset;
+	int srcu_idx, ret = -ENOMEM;
+
+	ds = kmalloc(sizeof(struct debug_store), GFP_KERNEL);
+	if (!ds)
+		goto out;
+
+	ret = -EFAULT;
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+	if (kvm_read_guest_cached(vcpu->kvm, &pmu->ds_area_cache,
+				  ds, sizeof(struct debug_store)))
+		goto unlock_out;
+
+	/* Adding sizeof(struct pebs_basic) offset is enough to generate PMI. */
+	new_threshold = ds->pebs_buffer_base + sizeof(struct pebs_basic);
+	offset = offsetof(struct debug_store, pebs_interrupt_threshold);
+	if (kvm_write_guest_offset_cached(vcpu->kvm, &pmu->ds_area_cache,
+					  &new_threshold, offset, sizeof(u64)))
+		goto unlock_out;
+
+	ret = 0;
+
+unlock_out:
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+
+out:
+	kfree(ds);
+	return ret;
+}
+
 static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	int ret = 0;

 	if (!(pmu->global_ctrl & pmu->pebs_enable))
 		return;
+
+	if (pmu->counter_cross_mapped && pmu->need_rewrite_ds_pebs_interrupt_threshold) {
+		ret = rewrite_ds_pebs_interrupt_threshold(vcpu);
+		pmu->need_rewrite_ds_pebs_interrupt_threshold = false;
+	}
+
+	if (ret == -ENOMEM)
+		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to OOM.", __func__);
+	else if (ret == -EFAULT)
+		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to GPA fault.", __func__);
 }

 struct kvm_pmu_ops intel_pmu_ops = {

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4ab1ce26244d..118e6752b563 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5917,6 +5917,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 {
 	return vcpu->arch.walk_mmu->gva_to_gpa(vcpu, gva, 0, exception);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_gva_to_gpa_system);

 static int kvm_read_guest_virt_helper(gva_t addr, void *val, unsigned int bytes,
 				      struct kvm_vcpu *vcpu, u32 access,
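The guest-memory access pattern used above is the standard KVM
cached-translation idiom: initialize a gfn_to_hva_cache once per guest
address, then perform cached reads and offset writes under SRCU. A
condensed sketch; the kvm_* helpers are existing KVM functions, only the
wrapper function is invented for illustration:

static int poke_guest_u64(struct kvm_vcpu *vcpu, struct gfn_to_hva_cache *ghc,
			  gpa_t gpa, unsigned long len,
			  unsigned long offset, u64 val)
{
	int idx, ret = -EFAULT;

	idx = srcu_read_lock(&vcpu->kvm->srcu);
	/* resolve gpa -> hva once; later accesses reuse the cache */
	if (kvm_gfn_to_hva_cache_init(vcpu->kvm, ghc, gpa, len))
		goto out;
	if (kvm_write_guest_offset_cached(vcpu->kvm, ghc, &val,
					  offset, sizeof(val)))
		goto out;
	ret = 0;
out:
	srcu_read_unlock(&vcpu->kvm->srcu, idx);
	return ret;
}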
From patchwork Mon Jan 4 13:15:40 2021
From: Like Xu
Subject: [PATCH v3 15/17] KVM: vmx/pmu: Rewrite applicable_counters field in guest PEBS records
Date: Mon, 4 Jan 2021 21:15:40 +0800
Message-Id: <20210104131542.495413-16-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

The PEBS event counters scheduled by the host may differ from the
counters required by the guest. The host counter index then leaks into
the guest PEBS records; the guest driver will be confused by the counter
indexes in the "Applicable Counters" field of those records and ignore
them.

Before the guest PEBS overflow PMI is injected into the guest through
the global status, KVM needs to rewrite the "Applicable Counters" field
in the guest PEBS records with the index(es) of the enabled guest PEBS
counters.
Co-developed-by: Luwei Kang
Signed-off-by: Luwei Kang
Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  2 +
 arch/x86/kvm/pmu.c              |  1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 84 +++++++++++++++++++++++++++++++--
 3 files changed, 82 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ea204c628f45..e6394ac54f81 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -452,6 +452,7 @@ struct kvm_pmu {
 	u64 ds_area;
 	u64 cached_ds_area;
 	struct gfn_to_hva_cache ds_area_cache;
+	struct gfn_to_hva_cache pebs_buffer_base_cache;
 	u64 pebs_enable;
 	u64 pebs_enable_mask;
 	u64 pebs_data_cfg;
@@ -459,6 +460,7 @@ struct kvm_pmu {

 	bool counter_cross_mapped;
 	bool need_rewrite_ds_pebs_interrupt_threshold;
+	bool need_rewrite_pebs_records;

 	/*
 	 * The gate to release perf_events not marked in

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index c0f18b304933..581653589108 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -77,6 +77,7 @@ static void kvm_perf_overflow_intr(struct perf_event *perf_event,
 	if (!test_and_set_bit(pmc->idx, pmu->reprogram_pmi)) {
 		if (perf_event->attr.precise_ip) {
+			pmu->need_rewrite_pebs_records = pmu->counter_cross_mapped;
 			/* Indicate PEBS overflow PMI to guest. */
 			__set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
 				  (unsigned long *)&pmu->global_status);

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b69e7c47fb05..4c095c31db38 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -557,22 +557,96 @@ static int rewrite_ds_pebs_interrupt_threshold(struct kvm_vcpu *vcpu)
 	return ret;
 }

+static int rewrite_ds_pebs_records(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc = NULL;
+	struct debug_store *ds = NULL;
+	gpa_t gpa;
+	u64 pebs_buffer_base, offset, buffer_base, status, new_status, format_size;
+	int srcu_idx, bit, ret = 0;
+
+	if (!pmu->counter_cross_mapped)
+		return ret;
+
+	ds = kmalloc(sizeof(struct debug_store), GFP_KERNEL);
+	if (!ds)
+		return -ENOMEM;
+
+	ret = -EFAULT;
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+	if (kvm_read_guest_cached(vcpu->kvm, &pmu->ds_area_cache,
+				  ds, sizeof(struct debug_store)))
+		goto out;
+
+	if (ds->pebs_index <= ds->pebs_buffer_base)
+		goto out;
+
+	pebs_buffer_base = ds->pebs_buffer_base;
+	offset = offsetof(struct pebs_basic, applicable_counters);
+	buffer_base = 0;
+
+	gpa = kvm_mmu_gva_to_gpa_system(vcpu, pebs_buffer_base, NULL);
+	if (kvm_gfn_to_hva_cache_init(vcpu->kvm, &pmu->pebs_buffer_base_cache,
+				      gpa, sizeof(struct pebs_basic)))
+		goto out;
+
+	do {
+		ret = -EFAULT;
+		if (kvm_read_guest_offset_cached(vcpu->kvm, &pmu->pebs_buffer_base_cache,
+						 &status, buffer_base + offset, sizeof(u64)))
+			goto out;
+		if (kvm_read_guest_offset_cached(vcpu->kvm, &pmu->pebs_buffer_base_cache,
+						 &format_size, buffer_base, sizeof(u64)))
+			goto out;
+
+		new_status = 0ull;
+		for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+			pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+			if (!pmc || !pmc->perf_event)
+				continue;
+
+			if (test_bit(pmc->perf_event->hw.idx, (unsigned long *)&status))
+				new_status |= BIT_ULL(pmc->idx);
+		}
+		if (kvm_write_guest_offset_cached(vcpu->kvm, &pmu->pebs_buffer_base_cache,
+						  &new_status, buffer_base + offset, sizeof(u64)))
+			goto out;
+
+		ret = 0;
+		buffer_base += format_size >> 48;
+	} while (pebs_buffer_base + buffer_base < ds->pebs_index);
+
+out:
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+	kfree(ds);
+	return ret;
+}
+
 static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	int ret = 0;
+	int ret1 = 0, ret2 = 0;
+
+	if (pmu->need_rewrite_pebs_records) {
+		pmu->need_rewrite_pebs_records = false;
+		ret1 = rewrite_ds_pebs_records(vcpu);
+	}

 	if (!(pmu->global_ctrl & pmu->pebs_enable))
-		return;
+		goto out;

 	if (pmu->counter_cross_mapped && pmu->need_rewrite_ds_pebs_interrupt_threshold) {
-		ret = rewrite_ds_pebs_interrupt_threshold(vcpu);
 		pmu->need_rewrite_ds_pebs_interrupt_threshold = false;
+		ret2 = rewrite_ds_pebs_interrupt_threshold(vcpu);
 	}

-	if (ret == -ENOMEM)
+out:
+	if (ret1 == -ENOMEM || ret2 == -ENOMEM)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to OOM.", __func__);
-	else if (ret == -EFAULT)
+	else if (ret1 == -EFAULT || ret2 == -EFAULT)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to GPA fault.", __func__);
 }
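The record walk above relies on the adaptive PEBS layout, in which each
record begins with a basic header whose upper 16 bits of format_size
hold the full record size. A self-contained userspace sketch of the same
walk over a plain byte buffer (the types here are invented stand-ins):

#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pebs_basic_hdr {
	uint64_t format_size;	/* bits 63:48 = total record size in bytes */
	uint64_t ip;
	uint64_t applicable_counters;
	uint64_t tsc;
};

static void rewrite_applicable_counters(uint8_t *buf, size_t used,
					uint64_t new_mask)
{
	size_t off = 0;

	while (off + sizeof(struct pebs_basic_hdr) <= used) {
		struct pebs_basic_hdr hdr;
		size_t rec_size;

		memcpy(&hdr, buf + off, sizeof(hdr));
		rec_size = hdr.format_size >> 48;
		if (!rec_size)
			break;	/* malformed record: avoid looping forever */
		/* patch only the applicable_counters field in place */
		memcpy(buf + off + offsetof(struct pebs_basic_hdr,
					    applicable_counters),
		       &new_mask, sizeof(new_mask));
		off += rec_size;	/* advance to the next record */
	}
}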
From patchwork Mon Jan 4 13:15:41 2021
From: Like Xu
Subject: [PATCH v3 16/17] KVM: x86/pmu: Save guest pebs reset values when pebs is configured
Date: Mon, 4 Jan 2021 21:15:41 +0800
Message-Id: <20210104131542.495413-17-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

The guest PEBS counter X may be cross-mapped to the host counter Y. The
PEBS facility reloads the reset value once a PEBS record has been
written to the guest DS area, and may keep generating PEBS records
before the guest reads the previous records.

KVM will adjust the guest DS PEBS reset counter values for cross-mapped
host counters, but before that it needs to save the originally expected
guest reset counter values right after the counter is fully enabled via
a trap. We assume that every time the guest PEBS driver enables a
counter for large PEBS, it configures the DS reset counter values just
as Linux does.

Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/vmx/pmu_intel.c    | 47 ++++++++++++++++++++++++++++++---
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index e6394ac54f81..1d17e51c5c8a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -418,6 +418,7 @@ struct kvm_pmc {
 	enum pmc_type type;
 	u8 idx;
 	u64 counter;
+	u64 reset_counter;
 	u64 eventsel;
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
@@ -461,6 +462,7 @@ struct kvm_pmu {
 	bool counter_cross_mapped;
 	bool need_rewrite_ds_pebs_interrupt_threshold;
 	bool need_rewrite_pebs_records;
+	bool need_save_reset_counter;

 	/*
 	 * The gate to release perf_events not marked in

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4c095c31db38..4e6ed0e8bddf 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -219,12 +219,14 @@ static void intel_pmu_pebs_setup(struct kvm_pmu *pmu)
 	gpa_t gpa;

 	pmu->need_rewrite_ds_pebs_interrupt_threshold = false;
+	pmu->need_save_reset_counter = false;

 	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
 		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);

 		if (pmc && pmc_speculative_in_use(pmc)) {
 			pmu->need_rewrite_ds_pebs_interrupt_threshold = true;
+			pmu->need_save_reset_counter = true;
 			break;
 		}
 	}
@@ -624,10 +626,44 @@ static int rewrite_ds_pebs_records(struct kvm_vcpu *vcpu)
 	return ret;
 }

+static int save_ds_pebs_reset_values(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc = NULL;
+	struct debug_store *ds = NULL;
+	int srcu_idx, bit, idx, ret;
+
+	ds = kmalloc(sizeof(struct debug_store), GFP_KERNEL);
+	if (!ds)
+		return -ENOMEM;
+
+	ret = -EFAULT;
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+	if (kvm_read_guest_cached(vcpu->kvm, &pmu->ds_area_cache,
+				  ds, sizeof(struct debug_store)))
+		goto out;
+
+	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+		if (pmc) {
+			idx = (pmc->idx < INTEL_PMC_IDX_FIXED) ?
+			      pmc->idx : (MAX_PEBS_EVENTS + pmc->idx - INTEL_PMC_IDX_FIXED);
+			pmc->reset_counter = ds->pebs_event_reset[idx];
+		}
+	}
+	ret = 0;
+
+out:
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+	kfree(ds);
+	return ret;
+}
+
 static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	int ret1 = 0, ret2 = 0;
+	int ret1 = 0, ret2 = 0, ret3 = 0;

 	if (pmu->need_rewrite_pebs_records) {
 		pmu->need_rewrite_pebs_records = false;
@@ -642,11 +678,16 @@ static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 		ret2 = rewrite_ds_pebs_interrupt_threshold(vcpu);
 	}

+	if (pmu->need_save_reset_counter) {
+		pmu->need_save_reset_counter = false;
+		ret3 = save_ds_pebs_reset_values(vcpu);
+	}
+
 out:
-	if (ret1 == -ENOMEM || ret2 == -ENOMEM)
+	if (ret1 == -ENOMEM || ret2 == -ENOMEM || ret3 == -ENOMEM)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to OOM.", __func__);
-	else if (ret1 == -EFAULT || ret2 == -EFAULT)
+	else if (ret1 == -EFAULT || ret2 == -EFAULT || ret3 == -EFAULT)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to GPA fault.", __func__);
 }
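The pebs_event_reset[] slot arithmetic used by the save and rewrite
helpers can be isolated as below: GP counter n uses slot n, fixed
counter n uses slot MAX_PEBS_EVENTS + n. A sketch with the kernel's
constant values inlined (MAX_PEBS_EVENTS is 8, INTEL_PMC_IDX_FIXED,
KVM's base index for fixed counters, is 32):

#define SK_INTEL_PMC_IDX_FIXED	32	/* first fixed-counter index */
#define SK_MAX_PEBS_EVENTS	8	/* GP reset slots precede fixed ones */

/* Map a KVM counter index to its slot in ds->pebs_event_reset[]. */
static inline int pebs_reset_slot(int pmc_idx)
{
	return pmc_idx < SK_INTEL_PMC_IDX_FIXED ?
	       pmc_idx :
	       SK_MAX_PEBS_EVENTS + pmc_idx - SK_INTEL_PMC_IDX_FIXED;
}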
From patchwork Mon Jan 4 13:15:42 2021
From: Like Xu
Subject: [PATCH v3 17/17] KVM: x86/pmu: Adjust guest pebs reset values for cross-mapped counters
Date: Mon, 4 Jan 2021 21:15:42 +0800
Message-Id: <20210104131542.495413-18-like.xu@linux.intel.com>
In-Reply-To: <20210104131542.495413-1-like.xu@linux.intel.com>

The original PEBS reset counter value has been saved to
pmc->reset_counter. When the guest PEBS counter X is enabled, its reset
value RST-x is written to the guest DS reset field RST-y and is
auto-reloaded into the real host counter Y that is mapped to the guest
PEBS counter X during this vm-entry period.

KVM records the last host counter index for each guest PEBS counter and
triggers a rewrite of the reset values once any entry in the host-guest
counter mapping table changes before vm-entry. Frequent changes in the
mapping relationship should only happen when perf multiplexes the
counters with the default 1ms timer. The time cost of adjusting the
guest reset values does not exceed 1ms (13347 ns on ICX), so there is no
race with the multiplex timer that could create a livelock.

Signed-off-by: Like Xu
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/pmu.c              | 15 ++++++++++++
 arch/x86/kvm/pmu.h              |  1 +
 arch/x86/kvm/vmx/pmu_intel.c    | 43 ++++++++++++++++++++++++++++++---
 4 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1d17e51c5c8a..b97e73d16e65 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -418,6 +418,7 @@ struct kvm_pmc {
 	enum pmc_type type;
 	u8 idx;
 	u64 counter;
+	u8 host_idx;
 	u64 reset_counter;
 	u64 eventsel;
 	struct perf_event *perf_event;
@@ -463,6 +464,7 @@ struct kvm_pmu {
 	bool need_rewrite_ds_pebs_interrupt_threshold;
 	bool need_rewrite_pebs_records;
 	bool need_save_reset_counter;
+	bool need_rewrite_reset_counter;

 	/*
 	 * The gate to release perf_events not marked in

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 581653589108..2dbca3f02e33 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -155,6 +155,7 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
 				     PTR_ERR(event), pmc->idx);
+		pmc->host_idx = -1;
 		return;
 	}
@@ -555,6 +556,7 @@ void kvm_pmu_counter_cross_mapped_check(struct kvm_vcpu *vcpu)
 	int bit;

 	pmu->counter_cross_mapped = false;
+	pmu->need_rewrite_reset_counter = false;

 	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
 		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
@@ -568,6 +570,19 @@ void kvm_pmu_counter_cross_mapped_check(struct kvm_vcpu *vcpu)
 		}
 	}

+	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+		if (!pmc || !pmc_speculative_in_use(pmc) || !pmc_is_enabled(pmc))
+			continue;
+
+		if ((pmc->perf_event && (pmc->host_idx != pmc->perf_event->hw.idx))) {
+			pmu->need_rewrite_reset_counter = true;
+			kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+			break;
+		}
+	}
+
 	if (!pmu->counter_cross_mapped)
 		return;

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 6cdc9fd03195..2776a048fd27 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -74,6 +74,7 @@ static inline void pmc_release_perf_event(struct kvm_pmc *pmc)
 		pmc->perf_event = NULL;
 		pmc->current_config = 0;
 		pmc_to_pmu(pmc)->event_count--;
+		pmc->host_idx = -1;
 	}
 }

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4e6ed0e8bddf..d0bfde29d2b0 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -660,10 +660,42 @@ static int save_ds_pebs_reset_values(struct kvm_vcpu *vcpu)
 	return ret;
 }

+static int rewrite_ds_pebs_reset_counters(struct kvm_vcpu *vcpu)
+{
+	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+	struct kvm_pmc *pmc = NULL;
+	int srcu_idx, bit, ret;
+	u64 offset, host_idx, idx;
+
+	ret = -EFAULT;
+	srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
+	for_each_set_bit(bit, (unsigned long *)&pmu->pebs_enable, X86_PMC_IDX_MAX) {
+		pmc = kvm_x86_ops.pmu_ops->pmc_idx_to_pmc(pmu, bit);
+
+		if (!pmc || !pmc->perf_event)
+			continue;
+
+		host_idx = pmc->perf_event->hw.idx;
+		idx = (host_idx < INTEL_PMC_IDX_FIXED) ?
+		      host_idx : (MAX_PEBS_EVENTS + host_idx - INTEL_PMC_IDX_FIXED);
+		offset = offsetof(struct debug_store, pebs_event_reset) +
+			 sizeof(u64) * idx;
+		if (kvm_write_guest_offset_cached(vcpu->kvm, &pmu->ds_area_cache,
+						  &pmc->reset_counter, offset,
+						  sizeof(u64)))
+			goto out;
+
+		pmc->host_idx = pmc->perf_event->hw.idx;
+	}
+	ret = 0;
+
+out:
+	srcu_read_unlock(&vcpu->kvm->srcu, srcu_idx);
+	return ret;
+}
+
 static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
 	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
-	int ret1 = 0, ret2 = 0, ret3 = 0;
+	int ret1 = 0, ret2 = 0, ret3 = 0, ret4 = 0;

 	if (pmu->need_rewrite_pebs_records) {
 		pmu->need_rewrite_pebs_records = false;
@@ -683,11 +715,16 @@ static void intel_pmu_handle_event(struct kvm_vcpu *vcpu)
 		ret3 = save_ds_pebs_reset_values(vcpu);
 	}

+	if (pmu->need_rewrite_reset_counter) {
+		pmu->need_rewrite_reset_counter = false;
+		ret4 = rewrite_ds_pebs_reset_counters(vcpu);
+	}
+
 out:
-	if (ret1 == -ENOMEM || ret2 == -ENOMEM || ret3 == -ENOMEM)
+	if (ret1 == -ENOMEM || ret2 == -ENOMEM || ret3 == -ENOMEM || ret4 == -ENOMEM)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to OOM.", __func__);
-	else if (ret1 == -EFAULT || ret2 == -EFAULT || ret3 == -EFAULT)
+	else if (ret1 == -EFAULT || ret2 == -EFAULT || ret3 == -EFAULT || ret4 == -EFAULT)
 		pr_debug_ratelimited("%s: Fail to emulate guest PEBS due to GPA fault.", __func__);
 }