From patchwork Sat Jun 13 08:09:46 2020
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 11602585
From: Like Xu
To: Paolo Bonzini
Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, ak@linux.intel.com, wei.w.wang@intel.com,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: [PATCH v12 01/11] perf/x86: Fix variable types for LBR registers
Date: Sat, 13 Jun 2020 16:09:46 +0800
Message-Id: <20200613080958.132489-2-like.xu@linux.intel.com>
In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com>
References: <20200613080958.132489-1-like.xu@linux.intel.com>

From: Wei Wang

The MSR variable type can be 'unsigned int', which uses less memory than
the longer 'unsigned long'. Fix 'struct x86_pmu' for that. The lbr_nr
won't be a negative number, so make it 'unsigned int' as well.
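For context, a simplified sketch (not part of this patch) of how these
fields are consumed: lbr_from/lbr_to hold MSR base addresses and lbr_nr is
a small stack depth, so 32-bit unsigned values are wide enough for all of
them. The loop below is an illustrative reduction of the real LBR read
path in arch/x86/events/intel/lbr.c; the function name is made up for the
example.

	/* Illustrative only -- not code from this patch. */
	static void lbr_read_sketch(u64 *from, u64 *to)
	{
		unsigned int i;

		for (i = 0; i < x86_pmu.lbr_nr; i++) {
			/* lbr_from/lbr_to are MSR addresses, so 32 bits suffice. */
			rdmsrl(x86_pmu.lbr_from + i, from[i]);
			rdmsrl(x86_pmu.lbr_to + i, to[i]);
		}
	}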
Suggested-by: Peter Zijlstra (Intel)
Signed-off-by: Wei Wang
Reviewed-by: Andi Kleen
---
 arch/x86/events/perf_event.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h
index e17a3d8a47ed..eb37f6c43c96 100644
--- a/arch/x86/events/perf_event.h
+++ b/arch/x86/events/perf_event.h
@@ -673,8 +673,8 @@ struct x86_pmu {
 	/*
 	 * Intel LBR
 	 */
-	unsigned long	lbr_tos, lbr_from, lbr_to; /* MSR base regs */
-	int		lbr_nr;			   /* hardware stack size */
+	unsigned int	lbr_tos, lbr_from, lbr_to,
+			lbr_nr;			   /* LBR base regs and size */
 	u64		lbr_sel_mask;		   /* LBR_SELECT valid bits */
 	const int	*lbr_sel_map;		   /* lbr_select mappings */
 	bool		lbr_double_abort;	   /* duplicated lbr aborts */

From patchwork Sat Jun 13 08:09:47 2020
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 11602587
From: Like Xu
To: Paolo Bonzini
Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, ak@linux.intel.com, wei.w.wang@intel.com,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu
Subject: [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup
Date: Sat, 13 Jun 2020 16:09:47 +0800
Message-Id: <20200613080958.132489-3-like.xu@linux.intel.com>
In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com>
References: <20200613080958.132489-1-like.xu@linux.intel.com>

For intel_pmu_en/disable_event(), reorder the hw->idx branch checks and
sort them by probability: gp, fixed, bts, others.

Clean up x86_assign_hw_event() by converting multiple if-else statements
into a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period(), replace
the "idx == INTEL_PMC_IDX_FIXED_BTS" check with the more generic
'!hwc->event_base', since event_base is 0 for all non-gp/fixed cases.

Wrap the related bit operations into intel_set/clear_masks() and make the
main path cleaner and more readable.
No functional changes. Original-by: Peter Zijlstra (Intel) Signed-off-by: Like Xu --- arch/x86/events/core.c | 25 +++++++---- arch/x86/events/intel/core.c | 85 +++++++++++++++++++----------------- 2 files changed, 62 insertions(+), 48 deletions(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 9e63ee50b19a..9a5056472b67 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -71,10 +71,9 @@ u64 x86_perf_event_update(struct perf_event *event) struct hw_perf_event *hwc = &event->hw; int shift = 64 - x86_pmu.cntval_bits; u64 prev_raw_count, new_raw_count; - int idx = hwc->idx; u64 delta; - if (idx == INTEL_PMC_IDX_FIXED_BTS) + if (unlikely(!hwc->event_base)) return 0; /* @@ -1097,22 +1096,30 @@ static inline void x86_assign_hw_event(struct perf_event *event, struct cpu_hw_events *cpuc, int i) { struct hw_perf_event *hwc = &event->hw; + int idx; - hwc->idx = cpuc->assign[i]; + idx = hwc->idx = cpuc->assign[i]; hwc->last_cpu = smp_processor_id(); hwc->last_tag = ++cpuc->tags[i]; - if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) { + switch (hwc->idx) { + case INTEL_PMC_IDX_FIXED_BTS: hwc->config_base = 0; hwc->event_base = 0; - } else if (hwc->idx >= INTEL_PMC_IDX_FIXED) { + break; + + case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1: hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL; - hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED); - hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30; - } else { + hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + + (idx - INTEL_PMC_IDX_FIXED); + hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30; + break; + + default: hwc->config_base = x86_pmu_config_addr(hwc->idx); hwc->event_base = x86_pmu_event_addr(hwc->idx); hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx); + break; } } @@ -1233,7 +1240,7 @@ int x86_perf_event_set_period(struct perf_event *event) s64 period = hwc->sample_period; int ret = 0, idx = hwc->idx; - if (idx == INTEL_PMC_IDX_FIXED_BTS) + if (unlikely(!hwc->event_base)) return 0; /* diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index ca35c8b5ee10..8dac4c61bf76 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2136,8 +2136,35 @@ static inline void intel_pmu_ack_status(u64 ack) wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack); } -static void intel_pmu_disable_fixed(struct hw_perf_event *hwc) +static inline bool event_is_checkpointed(struct perf_event *event) +{ + return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0; +} + +static inline void intel_set_masks(struct perf_event *event, int idx) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + if (event->attr.exclude_host) + __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask); + if (event->attr.exclude_guest) + __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask); + if (event_is_checkpointed(event)) + __set_bit(idx, (unsigned long *)&cpuc->intel_cp_status); +} + +static inline void intel_clear_masks(struct perf_event *event, int idx) { + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask); + __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask); + __clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status); +} + +static void intel_pmu_disable_fixed(struct perf_event *event) +{ + struct hw_perf_event *hwc = &event->hw; int idx = hwc->idx - INTEL_PMC_IDX_FIXED; u64 ctrl_val, mask; @@ -2148,31 +2175,22 @@ static void 
intel_pmu_disable_fixed(struct hw_perf_event *hwc) wrmsrl(hwc->config_base, ctrl_val); } -static inline bool event_is_checkpointed(struct perf_event *event) -{ - return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0; -} - static void intel_pmu_disable_event(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; - struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + int idx = hwc->idx; - if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) { + if (idx < INTEL_PMC_IDX_FIXED) { + intel_clear_masks(event, idx); + x86_pmu_disable_event(event); + } else if (idx < INTEL_PMC_IDX_FIXED_BTS) { + intel_clear_masks(event, idx); + intel_pmu_disable_fixed(event); + } else if (idx == INTEL_PMC_IDX_FIXED_BTS) { intel_pmu_disable_bts(); intel_pmu_drain_bts_buffer(); - return; } - cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx); - cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx); - cpuc->intel_cp_status &= ~(1ull << hwc->idx); - - if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) - intel_pmu_disable_fixed(hwc); - else - x86_pmu_disable_event(event); - /* * Needs to be called after x86_pmu_disable_event, * so we don't trigger the event without PEBS bit set. @@ -2238,33 +2256,22 @@ static void intel_pmu_enable_fixed(struct perf_event *event) static void intel_pmu_enable_event(struct perf_event *event) { struct hw_perf_event *hwc = &event->hw; - struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - - if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) { - if (!__this_cpu_read(cpu_hw_events.enabled)) - return; - - intel_pmu_enable_bts(hwc->config); - return; - } - - if (event->attr.exclude_host) - cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx); - if (event->attr.exclude_guest) - cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx); - - if (unlikely(event_is_checkpointed(event))) - cpuc->intel_cp_status |= (1ull << hwc->idx); + int idx = hwc->idx; if (unlikely(event->attr.precise_ip)) intel_pmu_pebs_enable(event); - if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) { + if (idx < INTEL_PMC_IDX_FIXED) { + intel_set_masks(event, idx); + __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE); + } else if (idx < INTEL_PMC_IDX_FIXED_BTS) { + intel_set_masks(event, idx); intel_pmu_enable_fixed(event); - return; + } else if (idx == INTEL_PMC_IDX_FIXED_BTS) { + if (!__this_cpu_read(cpu_hw_events.enabled)) + return; + intel_pmu_enable_bts(hwc->config); } - - __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE); } static void intel_pmu_add_event(struct perf_event *event) From patchwork Sat Jun 13 08:09:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602603 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7495160D for ; Sat, 13 Jun 2020 08:12:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 51E9520739 for ; Sat, 13 Jun 2020 08:12:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726536AbgFMILb (ORCPT ); Sat, 13 Jun 2020 04:11:31 -0400 Received: from mga07.intel.com ([134.134.136.100]:64586 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726460AbgFMILZ (ORCPT ); Sat, 13 Jun 2020 04:11:25 -0400 IronPort-SDR: pYzNTe0Xd55z7HhcqY9028+PY5CElDg0l108df3B0PUavyE36pWA9GV1MKm9KBWSlvrk9dYWWj 
wl+bfclPiWUA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:22 -0700 IronPort-SDR: EPDCaGcVwlryAc7sVuB5xVqbAiL5ymdfmM3YZiQJ55hQ8tNuxa1tJhT4DpoK2axaVgpNlcNM7N ad9aMQ6Hv4yA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467349" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:08 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 03/11] perf/x86/lbr: Add interface to get LBR information Date: Sat, 13 Jun 2020 16:09:48 +0800 Message-Id: <20200613080958.132489-4-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The LBR records msrs are model specific. The perf subsystem has already obtained the base addresses of LBR records based on the cpu model. Therefore, an interface is added to allow callers outside the perf subsystem to obtain these LBR information. It's useful for hypervisors to emulate the LBR feature for guests with less code. Co-developed-by: Wei Wang Signed-off-by: Wei Wang Signed-off-by: Like Xu --- arch/x86/events/intel/lbr.c | 20 ++++++++++++++++++++ arch/x86/include/asm/perf_event.h | 12 ++++++++++++ 2 files changed, 32 insertions(+) diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index 65113b16804a..2ed3f2a51bdf 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -1343,3 +1343,23 @@ void intel_pmu_lbr_init_knl(void) if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP) x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS; } + +/** + * x86_perf_get_lbr - get the LBR records information + * + * @lbr: the caller's memory to store the LBR records information + * + * Returns: 0 indicates the LBR info has been successfully obtained + */ +int x86_perf_get_lbr(struct x86_pmu_lbr *lbr) +{ + int lbr_fmt = x86_pmu.intel_cap.lbr_format; + + lbr->nr = x86_pmu.lbr_nr; + lbr->from = x86_pmu.lbr_from; + lbr->to = x86_pmu.lbr_to; + lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? 
MSR_LBR_INFO_0 : 0; + + return 0; +} +EXPORT_SYMBOL_GPL(x86_perf_get_lbr); diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index e855e9cf2c37..5d2c30f0df02 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -333,6 +333,13 @@ struct perf_guest_switch_msr { u64 host, guest; }; +struct x86_pmu_lbr { + unsigned int nr; + unsigned int from; + unsigned int to; + unsigned int info; +}; + extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap); extern void perf_check_microcode(void); extern int x86_perf_rdpmc_index(struct perf_event *event); @@ -348,12 +355,17 @@ static inline void perf_check_microcode(void) { } #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL) extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr); +extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr); #else static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr) { *nr = 0; return NULL; } +static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr) +{ + return -1; +} #endif #ifdef CONFIG_CPU_SUP_INTEL From patchwork Sat Jun 13 08:09:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602611 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A5DF0618 for ; Sat, 13 Jun 2020 08:13:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9509D20836 for ; Sat, 13 Jun 2020 08:13:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726499AbgFMIMh (ORCPT ); Sat, 13 Jun 2020 04:12:37 -0400 Received: from mga07.intel.com ([134.134.136.100]:64586 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726497AbgFMIL2 (ORCPT ); Sat, 13 Jun 2020 04:11:28 -0400 IronPort-SDR: oM6iM1XxgcOK+7bsyaX2grHhTlTLJSgERRWSSShUH9FpxwDuj5+oH2/avQlsjxsfwuhH4b+tmv 5mfHkSAa1fvA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:23 -0700 IronPort-SDR: zy3TlMWoIN6mYecJxRXR2z7b0M+tT8VKkR3z1QXs5Zuz1FI/pWw7iWkprQYU0i9G+iVltwNQng 3tN5OH6dhaCQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467364" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:13 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 04/11] perf/x86: Add constraint to create guest LBR event without hw counter Date: Sat, 13 Jun 2020 16:09:49 +0800 Message-Id: <20200613080958.132489-5-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The hypervisor may request the perf subsystem to schedule a time window to directly access the LBR records msrs for its own use. 
Normally, it would create a guest LBR event with callstack mode enabled, which is scheduled along with other ordinary LBR events on the host but in an exclusive way. To avoid wasting a counter for the guest LBR event, the perf tracks its hw->idx via INTEL_PMC_IDX_FIXED_VLBR and assigns it with a fake VLBR counter with the help of new vlbr_constraint. As with the BTS event, there is actually no hardware counter assigned for the guest LBR event. Signed-off-by: Like Xu Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20200514083054.62538-5-like.xu@linux.intel.com --- arch/x86/events/core.c | 1 + arch/x86/events/intel/core.c | 18 ++++++++++++++++++ arch/x86/events/intel/lbr.c | 4 ++++ arch/x86/events/perf_event.h | 1 + arch/x86/include/asm/perf_event.h | 22 +++++++++++++++++++++- 5 files changed, 45 insertions(+), 1 deletion(-) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 9a5056472b67..1996f2ed7c83 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -1104,6 +1104,7 @@ static inline void x86_assign_hw_event(struct perf_event *event, switch (hwc->idx) { case INTEL_PMC_IDX_FIXED_BTS: + case INTEL_PMC_IDX_FIXED_VLBR: hwc->config_base = 0; hwc->event_base = 0; break; diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 8dac4c61bf76..51e1fba7b1d1 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2621,6 +2621,20 @@ intel_bts_constraints(struct perf_event *event) return NULL; } +/* + * Note: matches a fake event, like Fixed2. + */ +static struct event_constraint * +intel_vlbr_constraints(struct perf_event *event) +{ + struct event_constraint *c = &vlbr_constraint; + + if (unlikely(constraint_match(c, event->hw.config))) + return c; + + return NULL; +} + static int intel_alt_er(int idx, u64 config) { int alt_idx = idx; @@ -2811,6 +2825,10 @@ __intel_get_event_constraints(struct cpu_hw_events *cpuc, int idx, { struct event_constraint *c; + c = intel_vlbr_constraints(event); + if (c) + return c; + c = intel_bts_constraints(event); if (c) return c; diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index 2ed3f2a51bdf..d285d26c1578 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -1363,3 +1363,7 @@ int x86_perf_get_lbr(struct x86_pmu_lbr *lbr) return 0; } EXPORT_SYMBOL_GPL(x86_perf_get_lbr); + +struct event_constraint vlbr_constraint = + FIXED_EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT, + (INTEL_PMC_IDX_FIXED_VLBR - INTEL_PMC_IDX_FIXED)); diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index eb37f6c43c96..77a6dd66bd9a 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -990,6 +990,7 @@ void release_ds_buffers(void); void reserve_ds_buffers(void); extern struct event_constraint bts_constraint; +extern struct event_constraint vlbr_constraint; void intel_pmu_enable_bts(u64 config); diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 5d2c30f0df02..2df707311d17 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -192,9 +192,29 @@ struct x86_pmu_capability { #define GLOBAL_STATUS_UNC_OVF BIT_ULL(61) #define GLOBAL_STATUS_ASIF BIT_ULL(60) #define GLOBAL_STATUS_COUNTERS_FROZEN BIT_ULL(59) -#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(58) +#define GLOBAL_STATUS_LBRS_FROZEN_BIT 58 +#define GLOBAL_STATUS_LBRS_FROZEN BIT_ULL(GLOBAL_STATUS_LBRS_FROZEN_BIT) #define GLOBAL_STATUS_TRACE_TOPAPMI BIT_ULL(55) +/* + * We model guest LBR event 
tracing as another fixed-mode PMC like BTS. + * + * We choose bit 58 because it's used to indicate LBR stack frozen state + * for architectural perfmon v4, also we unconditionally mask that bit in + * the handle_pmi_common(), so it'll never be set in the overflow handling. + * + * With this fake counter assigned, the guest LBR event user (such as KVM), + * can program the LBR registers on its own, and we don't actually do anything + * with then in the host context. + */ +#define INTEL_PMC_IDX_FIXED_VLBR (GLOBAL_STATUS_LBRS_FROZEN_BIT) + +/* + * Pseudo-encoding the guest LBR event as event=0x00,umask=0x1b, + * since it would claim bit 58 which is effectively Fixed26. + */ +#define INTEL_FIXED_VLBR_EVENT 0x1b00 + /* * Adaptive PEBS v4 */ From patchwork Sat Jun 13 08:09:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602609 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5F71060D for ; Sat, 13 Jun 2020 08:12:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4B1ED2081A for ; Sat, 13 Jun 2020 08:12:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726688AbgFMIMi (ORCPT ); Sat, 13 Jun 2020 04:12:38 -0400 Received: from mga07.intel.com ([134.134.136.100]:64588 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726513AbgFMIL0 (ORCPT ); Sat, 13 Jun 2020 04:11:26 -0400 IronPort-SDR: djlHTgaFUm1ujGv1ucsBrDTphlUfpxVD9vH8VmpEldU0FnGJoWnPnVBR1vNuNAaAi0h3bSW+hd F9Oy1XYZeKGg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:22 -0700 IronPort-SDR: DdZKjpsButtjhrf5IMP5F+XZkTpElM1eY5xak5IPAKJSlhxEytJJkKm1rJYEo0SRyePsqhLAFZ 5nL68bUazhnA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467378" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:16 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 05/11] perf/x86: Keep LBR records unchanged in host context for guest usage Date: Sat, 13 Jun 2020 16:09:50 +0800 Message-Id: <20200613080958.132489-6-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When a guest wants to use the LBR registers, its hypervisor creates a guest LBR event and let host perf schedules it. The LBR records msrs are accessible to the guest when its guest LBR event is scheduled on by the perf subsystem. Before scheduling this event out, we should avoid host changes on IA32_DEBUGCTLMSR or LBR_SELECT. Otherwise, some unexpected branch operations may interfere with guest behavior, pollute LBR records, and even cause host branches leakage. In addition, the read operation on host is also avoidable. 
To ensure that guest LBR records are not lost during the context switch, the guest LBR event would enable the callstack mode which could save/restore guest unread LBR records with the help of intel_pmu_lbr_sched_task() naturally. However, the guest LBR_SELECT may changes for its own use and the host LBR event doesn't save/restore it. To ensure that we doesn't lost the guest LBR_SELECT value when the guest LBR event is running, the vlbr_constraint is bound up with a new constraint flag PERF_X86_EVENT_LBR_SELECT. Signed-off-by: Like Xu Signed-off-by: Wei Wang Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20200514083054.62538-6-like.xu@linux.intel.com --- arch/x86/events/intel/core.c | 6 ++++-- arch/x86/events/intel/lbr.c | 31 ++++++++++++++++++++++++++----- arch/x86/events/perf_event.h | 3 +++ 3 files changed, 33 insertions(+), 7 deletions(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 51e1fba7b1d1..582ddff9a359 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2189,7 +2189,8 @@ static void intel_pmu_disable_event(struct perf_event *event) } else if (idx == INTEL_PMC_IDX_FIXED_BTS) { intel_pmu_disable_bts(); intel_pmu_drain_bts_buffer(); - } + } else if (idx == INTEL_PMC_IDX_FIXED_VLBR) + intel_clear_masks(event, idx); /* * Needs to be called after x86_pmu_disable_event, @@ -2271,7 +2272,8 @@ static void intel_pmu_enable_event(struct perf_event *event) if (!__this_cpu_read(cpu_hw_events.enabled)) return; intel_pmu_enable_bts(hwc->config); - } + } else if (idx == INTEL_PMC_IDX_FIXED_VLBR) + intel_set_masks(event, idx); } static void intel_pmu_add_event(struct perf_event *event) diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c index d285d26c1578..d03de7539957 100644 --- a/arch/x86/events/intel/lbr.c +++ b/arch/x86/events/intel/lbr.c @@ -383,6 +383,9 @@ static void __intel_pmu_lbr_restore(struct x86_perf_task_context *task_ctx) wrmsrl(x86_pmu.lbr_tos, tos); task_ctx->lbr_stack_state = LBR_NONE; + + if (cpuc->lbr_select) + wrmsrl(MSR_LBR_SELECT, task_ctx->lbr_sel); } static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx) @@ -415,6 +418,9 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx) cpuc->last_task_ctx = task_ctx; cpuc->last_log_id = ++task_ctx->log_id; + + if (cpuc->lbr_select) + rdmsrl(MSR_LBR_SELECT, task_ctx->lbr_sel); } void intel_pmu_lbr_swap_task_ctx(struct perf_event_context *prev, @@ -485,6 +491,9 @@ void intel_pmu_lbr_add(struct perf_event *event) if (!x86_pmu.lbr_nr) return; + if (event->hw.flags & PERF_X86_EVENT_LBR_SELECT) + cpuc->lbr_select = 1; + cpuc->br_sel = event->hw.branch_reg.reg; if (branch_user_callstack(cpuc->br_sel) && event->ctx->task_ctx_data) { @@ -532,6 +541,9 @@ void intel_pmu_lbr_del(struct perf_event *event) task_ctx->lbr_callstack_users--; } + if (event->hw.flags & PERF_X86_EVENT_LBR_SELECT) + cpuc->lbr_select = 0; + if (x86_pmu.intel_cap.pebs_baseline && event->attr.precise_ip > 0) cpuc->lbr_pebs_users--; cpuc->lbr_users--; @@ -540,11 +552,19 @@ void intel_pmu_lbr_del(struct perf_event *event) perf_sched_cb_dec(event->ctx->pmu); } +static inline bool vlbr_exclude_host(void) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + return test_bit(INTEL_PMC_IDX_FIXED_VLBR, + (unsigned long *)&cpuc->intel_ctrl_guest_mask); +} + void intel_pmu_lbr_enable_all(bool pmi) { struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - if (cpuc->lbr_users) + if (cpuc->lbr_users && !vlbr_exclude_host()) 
__intel_pmu_lbr_enable(pmi); } @@ -552,7 +572,7 @@ void intel_pmu_lbr_disable_all(void) { struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - if (cpuc->lbr_users) + if (cpuc->lbr_users && !vlbr_exclude_host()) __intel_pmu_lbr_disable(); } @@ -694,7 +714,8 @@ void intel_pmu_lbr_read(void) * This could be smarter and actually check the event, * but this simple approach seems to work for now. */ - if (!cpuc->lbr_users || cpuc->lbr_users == cpuc->lbr_pebs_users) + if (!cpuc->lbr_users || vlbr_exclude_host() || + cpuc->lbr_users == cpuc->lbr_pebs_users) return; if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_32) @@ -1365,5 +1386,5 @@ int x86_perf_get_lbr(struct x86_pmu_lbr *lbr) EXPORT_SYMBOL_GPL(x86_perf_get_lbr); struct event_constraint vlbr_constraint = - FIXED_EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT, - (INTEL_PMC_IDX_FIXED_VLBR - INTEL_PMC_IDX_FIXED)); + __EVENT_CONSTRAINT(INTEL_FIXED_VLBR_EVENT, (1ULL << INTEL_PMC_IDX_FIXED_VLBR), + FIXED_EVENT_FLAGS, 1, 0, PERF_X86_EVENT_LBR_SELECT); diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 77a6dd66bd9a..81475963df99 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -78,6 +78,7 @@ static inline bool constraint_match(struct event_constraint *c, u64 ecode) #define PERF_X86_EVENT_LARGE_PEBS 0x0400 /* use large PEBS */ #define PERF_X86_EVENT_PEBS_VIA_PT 0x0800 /* use PT buffer for PEBS */ #define PERF_X86_EVENT_PAIR 0x1000 /* Large Increment per Cycle */ +#define PERF_X86_EVENT_LBR_SELECT 0x2000 /* Save/Restore MSR_LBR_SELECT */ struct amd_nb { int nb_id; /* NorthBridge id */ @@ -237,6 +238,7 @@ struct cpu_hw_events { u64 br_sel; struct x86_perf_task_context *last_task_ctx; int last_log_id; + int lbr_select; /* * Intel host/guest exclude bits @@ -722,6 +724,7 @@ struct x86_perf_task_context { u64 lbr_from[MAX_LBR_ENTRIES]; u64 lbr_to[MAX_LBR_ENTRIES]; u64 lbr_info[MAX_LBR_ENTRIES]; + u64 lbr_sel; int tos; int valid_lbrs; int lbr_callstack_users; From patchwork Sat Jun 13 08:09:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602607 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 41D74618 for ; Sat, 13 Jun 2020 08:12:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 33392207DD for ; Sat, 13 Jun 2020 08:12:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726678AbgFMIMi (ORCPT ); Sat, 13 Jun 2020 04:12:38 -0400 Received: from mga07.intel.com ([134.134.136.100]:64600 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726510AbgFMIL2 (ORCPT ); Sat, 13 Jun 2020 04:11:28 -0400 IronPort-SDR: lCJBEv6yoeivg0OH1VDe5Bs5s7m5jupXbDgrjLHcmWoQ12UyxWeDdfy4K2N3wB5QtZc2S0XfxK y6VKCuQ+ZKWA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:23 -0700 IronPort-SDR: 7in4YY3BPNZOJr2n1l8cSCsQmx5eSbtT9kmNtVEJe/0rPQwt+DztugEGNexQXtGJjhf+FZOu6g L4EvCSkGxTnA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467393" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:19 
-0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES Date: Sat, 13 Jun 2020 16:09:51 +0800 Message-Id: <20200613080958.132489-7-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The bits [0, 5] of the read-only MSR_IA32_PERF_CAPABILITIES tells about the record format stored in the LBR records. Userspace could expose guest LBR when host supports LBR and the exactly supported LBR format value is initialized to the MSR_IA32_PERF_CAPABILITIES and vcpu model is compatible. Signed-off-by: Like Xu --- arch/x86/kvm/vmx/capabilities.h | 11 ++++++- arch/x86/kvm/vmx/pmu_intel.c | 52 +++++++++++++++++++++++++++++++-- arch/x86/kvm/vmx/vmx.h | 6 ++++ 3 files changed, 65 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index 4bbd8b448d22..b633a90320ee 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -19,6 +19,7 @@ extern int __read_mostly pt_mode; #define PT_MODE_HOST_GUEST 1 #define PMU_CAP_FW_WRITES (1ULL << 13) +#define PMU_CAP_LBR_FMT 0x3f struct nested_vmx_msrs { /* @@ -375,7 +376,15 @@ static inline u64 vmx_get_perf_capabilities(void) * Since counters are virtualized, KVM would support full * width counting unconditionally, even if the host lacks it. */ - return PMU_CAP_FW_WRITES; + u64 perf_cap = PMU_CAP_FW_WRITES; + + if (boot_cpu_has(X86_FEATURE_PDCM)) + rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap); + + /* From now on, KVM will support LBR. */ + perf_cap |= perf_cap & PMU_CAP_LBR_FMT; + + return perf_cap; } #endif /* __KVM_X86_VMX_CAPS_H */ diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index bdcce65c7a1d..a953c7d633f6 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -168,6 +168,13 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr) return get_gp_pmc(pmu, msr, MSR_IA32_PMC0); } +static inline bool lbr_is_enabled(struct kvm_vcpu *vcpu) +{ + struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr; + + return lbr->nr && (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT); +} + static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -251,6 +258,30 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; } +static inline bool lbr_fmt_is_matched(u64 data) +{ + return (data & PMU_CAP_LBR_FMT) == + (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT); +} + +static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + + if (pmu->version < 2) + return false; + + /* + * As a first step, a guest could only enable LBR feature if its cpu + * model is the same as the host because the LBR registers would + * be pass-through to the guest and they're model specific. 
+ */ + if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu)) + return false; + + return true; +} + static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -295,6 +326,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ? (data & ~vmx_get_perf_capabilities()) : data) return 1; + if (data & PMU_CAP_LBR_FMT) { + if (!lbr_fmt_is_matched(data)) + return 1; + if (!lbr_is_compatible(vcpu)) + return 1; + if (x86_perf_get_lbr(&to_vmx(vcpu)->lbr_desc.lbr)) + return 1; + } vcpu->arch.perf_capabilities = data; return 0; default: @@ -337,6 +376,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) struct kvm_cpuid_entry2 *entry; union cpuid10_eax eax; union cpuid10_edx edx; + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; pmu->nr_arch_gp_counters = 0; pmu->nr_arch_fixed_counters = 0; @@ -344,7 +384,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) pmu->counter_bitmask[KVM_PMC_FIXED] = 0; pmu->version = 0; pmu->reserved_bits = 0xffffffff00200000ull; - vcpu->arch.perf_capabilities = 0; entry = kvm_find_cpuid_entry(vcpu, 0xa, 0); if (!entry) @@ -357,8 +396,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) return; perf_get_x86_pmu_capability(&x86_pmu); - if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM)) - vcpu->arch.perf_capabilities = vmx_get_perf_capabilities(); pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters, x86_pmu.num_counters_gp); @@ -397,6 +434,10 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) bitmap_set(pmu->all_valid_pmc_idx, INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters); + if ((vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) && + x86_perf_get_lbr(&lbr_desc->lbr)) + vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT; + nested_vmx_pmu_entry_exit_ctls_update(vcpu); } @@ -404,6 +445,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) { int i; struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) { pmu->gp_counters[i].type = KVM_PMC_GP; @@ -418,6 +460,10 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED; pmu->fixed_counters[i].current_config = 0; } + + vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ? + vmx_get_perf_capabilities() : 0; + lbr_desc->lbr.nr = 0; } static void intel_pmu_reset(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 8a83b5edc820..ef24338b194d 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -91,6 +91,11 @@ struct pt_desc { struct pt_ctx guest; }; +struct lbr_desc { + /* Basic information about LBR records. */ + struct x86_pmu_lbr lbr; +}; + /* * The nested_vmx structure is part of vcpu_vmx, and holds information we need * for correct emulation of VMX (i.e., nested VMX) on this vcpu. 
@@ -302,6 +307,7 @@ struct vcpu_vmx { u64 ept_pointer; struct pt_desc pt_desc; + struct lbr_desc lbr_desc; }; enum ept_pointers_status { From patchwork Sat Jun 13 08:09:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602605 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9438F618 for ; Sat, 13 Jun 2020 08:12:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 83E7C207DD for ; Sat, 13 Jun 2020 08:12:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726665AbgFMIM2 (ORCPT ); Sat, 13 Jun 2020 04:12:28 -0400 Received: from mga07.intel.com ([134.134.136.100]:64588 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726397AbgFMILb (ORCPT ); Sat, 13 Jun 2020 04:11:31 -0400 IronPort-SDR: 3Km5Mu4iNQzGva9f+fJk4iO72qdDdedrR++ImzFP5+uQApZqcltKjXn3xCEJtrJ6ADp8wHs1rX IXpRA6Q/zmlg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:25 -0700 IronPort-SDR: IiYrePAYV0LKMKQOm7qWxd3AXL9wzyugZCnBFMF6qEAfCNHlunUbPbqgipIxawT+A1gemwlmCN KlyzWvgYFESA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467406" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:22 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 07/11] KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emualtion Date: Sat, 13 Jun 2020 16:09:52 +0800 Message-Id: <20200613080958.132489-8-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When the LBR feature is reported by the vmx_get_perf_capabilities(), the LBR fields in the [vmx|vcpu]_supported debugctl should be unmasked. The debugctl msr is handled separately in vmx/svm and they're not completely identical, hence remove the common msr handling code. Signed-off-by: Like Xu --- arch/x86/kvm/vmx/capabilities.h | 12 ++++++++++++ arch/x86/kvm/vmx/pmu_intel.c | 19 +++++++++++++++++++ arch/x86/kvm/x86.c | 13 ------------- 3 files changed, 31 insertions(+), 13 deletions(-) diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index b633a90320ee..f6fcfabb1026 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -21,6 +21,8 @@ extern int __read_mostly pt_mode; #define PMU_CAP_FW_WRITES (1ULL << 13) #define PMU_CAP_LBR_FMT 0x3f +#define DEBUGCTLMSR_LBR_MASK (DEBUGCTLMSR_LBR | DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) + struct nested_vmx_msrs { /* * We only store the "true" versions of the VMX capability MSRs. 
We @@ -387,4 +389,14 @@ static inline u64 vmx_get_perf_capabilities(void) return perf_cap; } +static inline u64 vmx_get_supported_debugctl(void) +{ + u64 val = 0; + + if (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT) + val |= DEBUGCTLMSR_LBR_MASK; + + return val; +} + #endif /* __KVM_X86_VMX_CAPS_H */ diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index a953c7d633f6..d92e95b64c74 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -187,6 +187,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) case MSR_CORE_PERF_GLOBAL_OVF_CTRL: ret = pmu->version > 1; break; + case MSR_IA32_DEBUGCTLMSR: case MSR_IA32_PERF_CAPABILITIES: ret = 1; break; @@ -237,6 +238,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; msr_info->data = vcpu->arch.perf_capabilities; return 0; + case MSR_IA32_DEBUGCTLMSR: + msr_info->data = vmcs_read64(GUEST_IA32_DEBUGCTL); + return 0; default: if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) { @@ -282,6 +286,16 @@ static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu) return true; } +static inline u64 vcpu_get_supported_debugctl(struct kvm_vcpu *vcpu) +{ + u64 debugctlmsr = vmx_get_supported_debugctl(); + + if (!lbr_is_enabled(vcpu)) + debugctlmsr &= ~DEBUGCTLMSR_LBR_MASK; + + return debugctlmsr; +} + static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -336,6 +350,11 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } vcpu->arch.perf_capabilities = data; return 0; + case MSR_IA32_DEBUGCTLMSR: + if (data & ~vcpu_get_supported_debugctl(vcpu)) + return 1; + vmcs_write64(GUEST_IA32_DEBUGCTL, data); + return 0; default: if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || (pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 00c88c2f34e4..56f275eb4554 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2840,18 +2840,6 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; } break; - case MSR_IA32_DEBUGCTLMSR: - if (!data) { - /* We support the non-activated case already */ - break; - } else if (data & ~(DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF)) { - /* Values other than LBR and BTF are vendor-specific, - thus reserved and should throw a #GP */ - return 1; - } - vcpu_unimpl(vcpu, "%s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop\n", - __func__, data); - break; case 0x200 ... 
0x2ff: return kvm_mtrr_set_msr(vcpu, msr, data); case MSR_IA32_APICBASE: @@ -3120,7 +3108,6 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) switch (msr_info->index) { case MSR_IA32_PLATFORM_ID: case MSR_IA32_EBL_CR_POWERON: - case MSR_IA32_DEBUGCTLMSR: case MSR_IA32_LASTBRANCHFROMIP: case MSR_IA32_LASTBRANCHTOIP: case MSR_IA32_LASTINTFROMIP:

From patchwork Sat Jun 13 08:09:53 2020
X-Patchwork-Submitter: Like Xu
X-Patchwork-Id: 11602593
From: Like Xu
To: Paolo Bonzini
Cc: Peter Zijlstra, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Joerg Roedel, ak@linux.intel.com, wei.w.wang@intel.com,
    linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu
Subject: [PATCH v12 08/11] KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
Date: Sat, 13 Jun 2020 16:09:53 +0800
Message-Id: <20200613080958.132489-9-like.xu@linux.intel.com>
In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com>
References: <20200613080958.132489-1-like.xu@linux.intel.com>

The guest's first access to the LBR-related msrs (including DEBUGCTLMSR
and the records msrs) is always intercepted. The KVM handler creates a
guest LBR event which enables callstack mode and is assigned no hardware
counter. Host perf schedules and enables this event as usual, but in an
exclusive way.

If the guest LBR event is scheduled in with the corresponding vcpu
context, KVM passes all LBR records msrs through to the guest. The LBR
callstack mechanism implemented in the host helps save/restore the guest
LBR records during event context switches, which avoids a lot of overhead
compared to saving/restoring tens of LBR msrs (e.g. 32 LBR record
entries) on the much more frequent VMX transitions.

Because a higher priority host event may reclaim the LBR resources at any
time, KVM always checks the existence and state of the guest LBR event as
late as possible before vm-entry.
A negative result would cancel the pass-through state, and it also prevents real registers accesses and potential data leakage. If host reclaims the LBR between two checks, the interception state and LBR records can be safely preserved due to native save/restore support from guest LBR event. The KVM emits a pr_warn() when the LBR hardware is unavailable to the guest LBR event. The administer is supposed to reminder users that the guest result may be inaccurate if someone is using LBR to record hypervisor on the host side. The guest LBR event will be released when the vPMU is reset but soon, the lazy release mechanism would be applied to this event like a regular vPMC. Suggested-by: Andi Kleen Co-developed-by: Wei Wang Signed-off-by: Wei Wang Signed-off-by: Like Xu --- arch/x86/kvm/vmx/pmu_intel.c | 138 ++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.c | 70 +++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 8 ++ 3 files changed, 212 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index d92e95b64c74..a78c440ebff2 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -175,6 +175,24 @@ static inline bool lbr_is_enabled(struct kvm_vcpu *vcpu) return lbr->nr && (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT); } +static bool intel_is_valid_lbr_record_msr(struct kvm_vcpu *vcpu, u32 index) +{ + struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr; + bool ret = false; + + if (!lbr_is_enabled(vcpu)) + return ret; + + ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS) || + (index >= lbr->from && index < lbr->from + lbr->nr) || + (index >= lbr->to && index < lbr->to + lbr->nr); + + if (!ret && lbr->info) + ret = (index >= lbr->info && index < lbr->info + lbr->nr); + + return ret; +} + static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -194,7 +212,8 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) default: ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) || get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) || - get_fixed_pmc(pmu, msr) || get_fw_gp_pmc(pmu, msr); + get_fixed_pmc(pmu, msr) || get_fw_gp_pmc(pmu, msr) || + intel_is_valid_lbr_record_msr(vcpu, msr); break; } @@ -213,6 +232,113 @@ static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr) return pmc; } +static int intel_pmu_create_lbr_event(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + struct perf_event *event; + + /* + * The perf_event_attr is constructed in the minimum efficient way: + * - set 'pinned = true' to make it task pinned so that if another + * cpu pinned event reclaims LBR, the event->oncpu will be set to -1; + * - set '.exclude_host = true' to record guest branches behavior; + * + * - set '.config = INTEL_FIXED_VLBR_EVENT' to indicates host perf + * schedule the event without a real HW counter but a fake one; + * check is_guest_lbr_event() and __intel_get_event_constraints(); + * + * - set 'sample_type = PERF_SAMPLE_BRANCH_STACK' and + * 'branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK | + * PERF_SAMPLE_BRANCH_USER' to configure it as a LBR callstack + * event, which helps KVM to save/restore guest LBR records + * during host context switches and reduces quite a lot overhead, + * check branch_user_callstack() and intel_pmu_lbr_sched_task(); + */ + struct perf_event_attr attr = { + .type = PERF_TYPE_RAW, + .size = sizeof(attr), + .config = INTEL_FIXED_VLBR_EVENT, + .sample_type = 
PERF_SAMPLE_BRANCH_STACK, + .pinned = true, + .exclude_host = true, + .branch_sample_type = PERF_SAMPLE_BRANCH_CALL_STACK | + PERF_SAMPLE_BRANCH_USER, + }; + + if (unlikely(lbr_desc->event)) + return 0; + + event = perf_event_create_kernel_counter(&attr, -1, + current, NULL, NULL); + if (IS_ERR(event)) { + pr_debug_ratelimited("%s: failed %ld\n", + __func__, PTR_ERR(event)); + return -ENOENT; + } + lbr_desc->event = event; + pmu->event_count++; + return 0; +} + +static void intel_pmu_free_lbr_event(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + struct perf_event *event = lbr_desc->event; + + if (!event) + return; + + perf_event_release_kernel(event); + lbr_desc->event = NULL; + pmu->event_count--; +} + +/* + * It's safe to access LBR msrs from guest when they have not + * been passthrough since the host would help restore or reset + * the LBR msrs records when the guest LBR event is scheduled in. + */ +static bool access_lbr_record_msr(struct kvm_vcpu *vcpu, + struct msr_data *msr_info, bool read) +{ + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + u32 index = msr_info->index; + + if (!intel_is_valid_lbr_record_msr(vcpu, index)) + return false; + + if (msr_info->host_initiated) + goto dummy; + + if (!lbr_desc->event && !intel_pmu_create_lbr_event(vcpu)) + goto dummy; + + /* + * Disable irq to ensure the LBR feature doesn't get reclaimed by the + * host at the time the value is read from the msr, and this avoids the + * host LBR value to be leaked to the guest. If LBR has been reclaimed, + * return 0 on guest reads. + */ + local_irq_disable(); + if (lbr_desc->event->state == PERF_EVENT_STATE_ACTIVE) { + if (read) + rdmsrl(index, msr_info->data); + else + wrmsrl(index, msr_info->data); + } else if (read) + msr_info->data = 0; + local_irq_enable(); + + return true; + +dummy: + if (read) + msr_info->data = 0; + return true; +} + static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -256,7 +382,8 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) { msr_info->data = pmc->eventsel; return 0; - } + } else if (access_lbr_record_msr(vcpu, msr_info, true)) + return 0; } return 1; @@ -354,6 +481,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (data & ~vcpu_get_supported_debugctl(vcpu)) return 1; vmcs_write64(GUEST_IA32_DEBUGCTL, data); + if (!msr_info->host_initiated && !to_vmx(vcpu)->lbr_desc.event) + intel_pmu_create_lbr_event(vcpu); return 0; default: if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || @@ -382,7 +511,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) reprogram_gp_counter(pmc, data); return 0; } - } + } else if (access_lbr_record_msr(vcpu, msr_info, false)) + return 0; } return 1; @@ -483,6 +613,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ? 
vmx_get_perf_capabilities() : 0; lbr_desc->lbr.nr = 0; + lbr_desc->event = NULL; } static void intel_pmu_reset(struct kvm_vcpu *vcpu) @@ -507,6 +638,7 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = pmu->global_ovf_ctrl = 0; + intel_pmu_free_lbr_event(vcpu); } struct kvm_pmu_ops intel_pmu_ops = { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 08e26a9518c2..58a8af433741 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3857,6 +3857,71 @@ void pt_update_intercept_for_msr(struct vcpu_vmx *vmx) } } +static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set) +{ + unsigned long *msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap; + struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr; + int i; + + WARN_ON_ONCE(!lbr->nr); + + vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_SELECT, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_TOS, MSR_TYPE_RW, set); + for (i = 0; i < lbr->nr; i++) { + vmx_set_intercept_for_msr(msr_bitmap, + lbr->from + i, MSR_TYPE_RW, set); + vmx_set_intercept_for_msr(msr_bitmap, + lbr->to + i, MSR_TYPE_RW, set); + if (lbr->info) + vmx_set_intercept_for_msr(msr_bitmap, + lbr->info + i, MSR_TYPE_RW, set); + } +} + +static inline void vmx_lbr_disable_passthrough(struct kvm_vcpu *vcpu) +{ + vmx_update_intercept_for_lbr_msrs(vcpu, true); +} + +static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu) +{ + vmx_update_intercept_for_lbr_msrs(vcpu, false); +} + +/* + * Higher priority host perf events (e.g. cpu pinned) could reclaim the + * pmu resources (e.g. LBR) that were assigned to the guest. This is + * usually done via ipi calls (more details in perf_install_in_context). + * + * Before entering the non-root mode (with irq disabled here), double + * confirm that the pmu features enabled to the guest are not reclaimed + * by higher priority host events. Otherwise, disallow vcpu's access to + * the reclaimed features. + */ +static void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu) +{ + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + + if (!lbr_desc->event) { + vmx_lbr_disable_passthrough(vcpu); + if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR) + goto warn; + return; + } + + if (lbr_desc->event->state < PERF_EVENT_STATE_ACTIVE) { + vmx_lbr_disable_passthrough(vcpu); + goto warn; + } else + vmx_lbr_enable_passthrough(vcpu); + + return; + +warn: + pr_warn_ratelimited("kvm: vcpu-%d: fail to passthrough LBR.\n", + vcpu->vcpu_id); +} + static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6728,8 +6793,11 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) pt_guest_enter(vmx); - if (vcpu_to_pmu(vcpu)->version) + if (vcpu_to_pmu(vcpu)->version) { atomic_switch_perf_msrs(vmx); + if (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) + vmx_passthrough_lbr_msrs(vcpu); + } atomic_switch_umwait_control_msr(vmx); if (enable_preemption_timer) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index ef24338b194d..c67ce758412e 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -94,6 +94,14 @@ struct pt_desc { struct lbr_desc { /* Basic information about LBR records. */ struct x86_pmu_lbr lbr; + + /* + * Emulate LBR feature via passthrough LBR registers when the + * per-vcpu guest LBR event is scheduled on the current pcpu. + * + * The records may be inaccurate if the host reclaims the LBR. 
+ */ + struct perf_event *event; }; /* From patchwork Sat Jun 13 08:09:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602601 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2484660D for ; Sat, 13 Jun 2020 08:12:25 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 16778207FF for ; Sat, 13 Jun 2020 08:12:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726563AbgFMILh (ORCPT ); Sat, 13 Jun 2020 04:11:37 -0400 Received: from mga07.intel.com ([134.134.136.100]:64600 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726541AbgFMILd (ORCPT ); Sat, 13 Jun 2020 04:11:33 -0400 IronPort-SDR: 918yfGiljIrESC7/iWY7XuNlvoy1LYbQ8ocM/zK4xX6GOU3Zq2vkdDSEXjhWnYyN4W9B5GrrLz 9j3aEm2VaWIg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:32 -0700 IronPort-SDR: IA2aq0Psj6XKnGsFTqLXpJJl3J4I5Bd0cqRpKIbdbx7qYyl+Sk6Ud0xX8CMATdQ/mD3m1pDavQ oKwJpUaNaUig== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467442" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:29 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 09/11] KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI Date: Sat, 13 Jun 2020 16:09:54 +0800 Message-Id: <20200613080958.132489-10-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The current vPMU only supports Architecture Version 2. According to Intel SDM section 17.4.7, "Freezing LBR and Performance Counters on PMI", if IA32_DEBUGCTL.Freeze_LBR_On_PMI = 1, the LBR stack is frozen on a PMI, so KVM emulates the freeze by clearing the LBR enable bit (bit 0) in the guest's IA32_DEBUGCTL when the virtual PMI is delivered. The guest then needs to set IA32_DEBUGCTL.LBR again to resume recording branches.
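To make the last point concrete, the guest's side of this contract might look like the sketch below. It is not part of this series: the handler name is hypothetical, and it only assumes the usual Linux MSR helpers (rdmsrl/wrmsrl) and the DEBUGCTLMSR_* bit definitions from msr-index.h.

static void guest_pmi_handler_resume_lbr(void)
{
	u64 debugctl;

	rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);

	/*
	 * With Freeze_LBR_On_PMI in effect, the (emulated) hardware has
	 * cleared bit 0 (LBR) when the PMI was delivered; set it again
	 * so that branch recording resumes.
	 */
	if ((debugctl & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) &&
	    !(debugctl & DEBUGCTLMSR_LBR))
		wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl | DEBUGCTLMSR_LBR);
}

Once KVM has emulated the freeze, resuming recording is entirely the guest's responsibility; nothing on the KVM side sets the bit again.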
Signed-off-by: Like Xu --- arch/x86/kvm/pmu.c | 5 ++++- arch/x86/kvm/pmu.h | 1 + arch/x86/kvm/vmx/pmu_intel.c | 31 +++++++++++++++++++++++++++++++ 3 files changed, 36 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index b86346903f2e..5053f4238218 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -378,8 +378,11 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data) void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu) { - if (lapic_in_kernel(vcpu)) + if (lapic_in_kernel(vcpu)) { + if (kvm_x86_ops.pmu_ops->deliver_pmi) + kvm_x86_ops.pmu_ops->deliver_pmi(vcpu); kvm_apic_local_deliver(vcpu->arch.apic, APIC_LVTPC); + } } bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index ab85eed8a6cc..095b84392b89 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -37,6 +37,7 @@ struct kvm_pmu_ops { void (*refresh)(struct kvm_vcpu *vcpu); void (*init)(struct kvm_vcpu *vcpu); void (*reset)(struct kvm_vcpu *vcpu); + void (*deliver_pmi)(struct kvm_vcpu *vcpu); }; static inline u64 pmc_bitmask(struct kvm_pmc *pmc) diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index a78c440ebff2..85a675004cbb 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -641,6 +641,36 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) intel_pmu_free_lbr_event(vcpu); } +/* + * Emulate LBR_On_PMI behavior for 1 < pmu.version < 4. + * + * If Freeze_LBR_On_PMI = 1, the LBR is frozen on PMI and + * the KVM emulates to clear the LBR bit (bit 0) in IA32_DEBUGCTL. + * + * Guest needs to re-enable LBR to resume branches recording. + */ +static void intel_pmu_legacy_freezing_lbrs_on_pmi(struct kvm_vcpu *vcpu) +{ + u64 data; + + data = vmcs_read64(GUEST_IA32_DEBUGCTL); + if (data & DEBUGCTLMSR_FREEZE_LBRS_ON_PMI) { + data &= ~DEBUGCTLMSR_LBR; + vmcs_write64(GUEST_IA32_DEBUGCTL, data); + } +} + +static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu) +{ + u8 version = vcpu_to_pmu(vcpu)->version; + + if (!lbr_is_enabled(vcpu)) + return; + + if (version > 1 && version < 4) + intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu); +} + struct kvm_pmu_ops intel_pmu_ops = { .find_arch_event = intel_find_arch_event, .find_fixed_event = intel_find_fixed_event, @@ -655,4 +685,5 @@ struct kvm_pmu_ops intel_pmu_ops = { .refresh = intel_pmu_refresh, .init = intel_pmu_init, .reset = intel_pmu_reset, + .deliver_pmi = intel_pmu_deliver_pmi, }; From patchwork Sat Jun 13 08:09:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602595 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C6CEB60D for ; Sat, 13 Jun 2020 08:12:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B82E520739 for ; Sat, 13 Jun 2020 08:12:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726551AbgFMILk (ORCPT ); Sat, 13 Jun 2020 04:11:40 -0400 Received: from mga07.intel.com ([134.134.136.100]:64588 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726553AbgFMILf (ORCPT ); Sat, 13 Jun 2020 04:11:35 -0400 IronPort-SDR: ROT/FVHTNhFtjDcxJ2n4uSZ5SoYC89vC0Al4d9tBXqtoAO/gkBi288w8VXZDmPy2Q1DdZxGN4g yH2k/sAYE+7g== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: 
False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:34 -0700 IronPort-SDR: DsydGXdWvck1TnL/AtC4qj2yDMHJCD7/zQgLXVI3sRRGgrSiuaDcE7wf5HHOtaCtpDyrvEHnf0 W2jVNiH5HSGA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467453" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:32 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 10/11] KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation Date: Sat, 13 Jun 2020 16:09:55 +0800 Message-Id: <20200613080958.132489-11-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Once the LBR record MSRs have already been passed through to the guest, there is no need to call vmx_update_intercept_for_lbr_msrs() again on every vcpu run; the same holds when the pass-through has already been cancelled. Signed-off-by: Like Xu --- arch/x86/kvm/vmx/pmu_intel.c | 1 + arch/x86/kvm/vmx/vmx.c | 12 ++++++++++++ arch/x86/kvm/vmx/vmx.h | 3 +++ 3 files changed, 16 insertions(+) diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 85a675004cbb..75ba0444b4d1 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -614,6 +614,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) vmx_get_perf_capabilities() : 0; lbr_desc->lbr.nr = 0; lbr_desc->event = NULL; + lbr_desc->already_passthrough = false; } static void intel_pmu_reset(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 58a8af433741..800a26e3b571 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3880,12 +3880,24 @@ static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set) static inline void vmx_lbr_disable_passthrough(struct kvm_vcpu *vcpu) { + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + + if (!lbr_desc->already_passthrough) + return; + vmx_update_intercept_for_lbr_msrs(vcpu, true); + lbr_desc->already_passthrough = false; } static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu) { + struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; + + if (lbr_desc->already_passthrough) + return; + vmx_update_intercept_for_lbr_msrs(vcpu, false); + lbr_desc->already_passthrough = true; } /* diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index c67ce758412e..c931463f75d9 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -102,6 +102,9 @@ struct lbr_desc { * The records may be inaccurate if the host reclaims the LBR. */ struct perf_event *event; + + /* A flag to reduce the overhead of LBR pass-through or cancellation.
*/ + bool already_passthrough; }; /* From patchwork Sat Jun 13 08:09:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 11602599 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id EDC0260D for ; Sat, 13 Jun 2020 08:12:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D497B207FF for ; Sat, 13 Jun 2020 08:12:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726574AbgFMIMR (ORCPT ); Sat, 13 Jun 2020 04:12:17 -0400 Received: from mga07.intel.com ([134.134.136.100]:64611 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726570AbgFMILj (ORCPT ); Sat, 13 Jun 2020 04:11:39 -0400 IronPort-SDR: FgeE9e97xthOFHzWlJ0Nsop+I38LwSDLSOyn36DpCUXeRkrknJQLhT2Uii8uaCmCQkrg4C5WTN BKNah2O01/dg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Jun 2020 01:11:37 -0700 IronPort-SDR: EWhDzJPbkRJ4BkuFB+j0sRW1P5napTO3QjlklnI/66bD6YiCojHmZdDwqF27bnNQcf0O9aaiAN NZl96hYG1CNQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,506,1583222400"; d="scan'208";a="474467465" Received: from sqa-gate.sh.intel.com (HELO clx-ap-likexu.tsp.org) ([10.239.48.212]) by fmsmga006.fm.intel.com with ESMTP; 13 Jun 2020 01:11:34 -0700 From: Like Xu To: Paolo Bonzini Cc: Peter Zijlstra , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , ak@linux.intel.com, wei.w.wang@intel.com, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Like Xu Subject: [PATCH v12 11/11] KVM: vmx/pmu: Release guest LBR event via lazy release mechanism Date: Sat, 13 Jun 2020 16:09:56 +0800 Message-Id: <20200613080958.132489-12-like.xu@linux.intel.com> X-Mailer: git-send-email 2.21.3 In-Reply-To: <20200613080958.132489-1-like.xu@linux.intel.com> References: <20200613080958.132489-1-like.xu@linux.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The vPMU uses the INTEL_GUEST_LBR_INUSE bit (INTEL_PMC_IDX_FIXED_VLBR, bit 58) in 'pmu->pmc_in_use' to indicate whether a guest LBR event is still needed by the vcpu. If the vcpu no longer accesses the LBR-related MSRs within a scheduling time slice and the LBR enable bit has been cleared, the vPMU treats the guest LBR event like any other vPMC event and releases it as usual. The pass-through state of the LBR record MSRs is also cancelled.
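To make the lifecycle easier to follow, here is a condensed sketch of the release decision. The helper below is hypothetical; it only rearranges the logic that the diff spreads across kvm_pmu_cleanup(), access_lbr_record_msr() and intel_pmu_cleanup().

static void vlbr_lazy_release_check(struct kvm_vcpu *vcpu)
{
	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);

	/*
	 * Every guest access to an LBR record MSR, and every write that
	 * sets DEBUGCTLMSR_LBR, re-marks the vLBR as in use for the
	 * current time slice.
	 */
	if (test_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use))
		return;

	/* Untouched for a whole slice and recording disabled: release. */
	if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR))
		intel_pmu_free_lbr_event(vcpu);
}

Releasing the event drops the host perf resources backing the vLBR, and vmx_passthrough_lbr_msrs() re-intercepts the LBR MSRs on the next vcpu entry, so an idle guest no longer pins the hardware LBR stack.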
Signed-off-by: Like Xu --- arch/x86/kvm/pmu.c | 7 +++++++ arch/x86/kvm/pmu.h | 4 ++++ arch/x86/kvm/vmx/pmu_intel.c | 14 +++++++++++++- arch/x86/kvm/vmx/vmx.c | 4 ++++ 4 files changed, 28 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 5053f4238218..e5b76f1c3ce8 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -458,6 +458,7 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu) struct kvm_pmc *pmc = NULL; DECLARE_BITMAP(bitmask, X86_PMC_IDX_MAX); int i; + bool arch_cleanup = false; pmu->need_cleanup = false; @@ -469,8 +470,14 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu) if (pmc && pmc->perf_event && !pmc_speculative_in_use(pmc)) pmc_stop_counter(pmc); + + if (i == INTEL_GUEST_LBR_INUSE) + arch_cleanup = true; } + if (arch_cleanup && kvm_x86_ops.pmu_ops->cleanup) + kvm_x86_ops.pmu_ops->cleanup(vcpu); + bitmap_zero(pmu->pmc_in_use, X86_PMC_IDX_MAX); } diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index 095b84392b89..d5023eacd8ed 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -15,6 +15,9 @@ #define VMWARE_BACKDOOR_PMC_REAL_TIME 0x10001 #define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002 +/* Indicates whether Intel LBR msrs were accessed during the last time slice. */ +#define INTEL_GUEST_LBR_INUSE INTEL_PMC_IDX_FIXED_VLBR + struct kvm_event_hw_type_mapping { u8 eventsel; u8 unit_mask; @@ -38,6 +41,7 @@ struct kvm_pmu_ops { void (*init)(struct kvm_vcpu *vcpu); void (*reset)(struct kvm_vcpu *vcpu); void (*deliver_pmi)(struct kvm_vcpu *vcpu); + void (*cleanup)(struct kvm_vcpu *vcpu); }; static inline u64 pmc_bitmask(struct kvm_pmc *pmc) diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 75ba0444b4d1..c1c5058acc90 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -303,6 +303,7 @@ static void intel_pmu_free_lbr_event(struct kvm_vcpu *vcpu) static bool access_lbr_record_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info, bool read) { + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; u32 index = msr_info->index; @@ -331,6 +332,7 @@ static bool access_lbr_record_msr(struct kvm_vcpu *vcpu, msr_info->data = 0; local_irq_enable(); + __set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use); return true; dummy: @@ -483,6 +485,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) vmcs_write64(GUEST_IA32_DEBUGCTL, data); if (!msr_info->host_initiated && !to_vmx(vcpu)->lbr_desc.event) intel_pmu_create_lbr_event(vcpu); + __set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use); return 0; default: if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || @@ -584,7 +587,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters); if ((vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) && - x86_perf_get_lbr(&lbr_desc->lbr)) + !x86_perf_get_lbr(&lbr_desc->lbr)) + bitmap_set(pmu->all_valid_pmc_idx, INTEL_GUEST_LBR_INUSE, 1); + else vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT; nested_vmx_pmu_entry_exit_ctls_update(vcpu); @@ -672,6 +677,12 @@ static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu) intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu); } +static void intel_pmu_cleanup(struct kvm_vcpu *vcpu) +{ + if (!(vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR)) + intel_pmu_free_lbr_event(vcpu); +} + struct kvm_pmu_ops intel_pmu_ops = { .find_arch_event = intel_find_arch_event, .find_fixed_event = intel_find_fixed_event, @@ -687,4 +698,5 @@ struct kvm_pmu_ops intel_pmu_ops = { .init = 
intel_pmu_init, .reset = intel_pmu_reset, .deliver_pmi = intel_pmu_deliver_pmi, + .cleanup = intel_pmu_cleanup, }; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 800a26e3b571..8521fc640b95 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -3912,17 +3912,21 @@ static inline void vmx_lbr_enable_passthrough(struct kvm_vcpu *vcpu) */ static void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu) { + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc; if (!lbr_desc->event) { vmx_lbr_disable_passthrough(vcpu); if (vmcs_read64(GUEST_IA32_DEBUGCTL) & DEBUGCTLMSR_LBR) goto warn; + if (test_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use)) + goto warn; return; } if (lbr_desc->event->state < PERF_EVENT_STATE_ACTIVE) { vmx_lbr_disable_passthrough(vcpu); + __clear_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use); goto warn; } else vmx_lbr_enable_passthrough(vcpu);