From patchwork Thu Nov 1 10:04:01 2018
X-Patchwork-Submitter: "Wang, Wei W"
X-Patchwork-Id: 10663821
From: Wei Wang
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com,
    ak@linux.intel.com, peterz@infradead.org
Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com,
    wei.w.wang@intel.com
Subject: [PATCH v1 1/8] perf/x86: add support to mask counters from host
Date: Thu, 1 Nov 2018 18:04:01 +0800
Message-Id: <1541066648-40690-2-git-send-email-wei.w.wang@intel.com>
In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com>
References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com>
X-Mailing-List: kvm@vger.kernel.org

Add x86_perf_mask_perf_counters() to reserve counters from the host perf
subsystem. The masked counters will not be assigned to any host perf
events; a hypervisor can use this to reserve perf counters for a guest to
use.

The function is currently supported on Intel CPUs only, but it is placed
in the x86 perf core because counter assignment is implemented there, and
the pmu (also defined in the x86 perf core) needs to be stopped and
restarted when a counter to be masked happens to be in use by the host.
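As a rough illustration (not part of this patch), a hypervisor component running on the current CPU could reserve and later return counters as in the sketch below. The example_* helpers are hypothetical; BIT_ULL() and INTEL_PMC_IDX_FIXED follow the existing perf conventions, and per-CPU placement and error handling are omitted.

#include <linux/bits.h>
#include <asm/perf_event.h>

/*
 * Illustrative sketch only: reserve GP counter 0 and fixed counter 0 on
 * the current CPU for a guest, then hand everything back.
 */
static void example_reserve_counters_for_guest(void)
{
	u64 mask = BIT_ULL(0) | BIT_ULL(INTEL_PMC_IDX_FIXED);

	/* Counters whose bits are set are no longer used for host events. */
	x86_perf_mask_perf_counters(mask);
}

static void example_return_counters_to_host(void)
{
	/* Clearing the bits returns the reserved counters to the host. */
	x86_perf_mask_perf_counters(0);
}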
Signed-off-by: Wei Wang Cc: Peter Zijlstra Cc: Andi Kleen Cc: Paolo Bonzini --- arch/x86/events/core.c | 37 +++++++++++++++++++++++++++++++++++++ arch/x86/include/asm/perf_event.h | 1 + 2 files changed, 38 insertions(+) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index 106911b..e73135a 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -716,6 +716,7 @@ struct perf_sched { static void perf_sched_init(struct perf_sched *sched, struct event_constraint **constraints, int num, int wmin, int wmax, int gpmax) { + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); int idx; memset(sched, 0, sizeof(*sched)); @@ -723,6 +724,9 @@ static void perf_sched_init(struct perf_sched *sched, struct event_constraint ** sched->max_weight = wmax; sched->max_gp = gpmax; sched->constraints = constraints; +#ifdef CONFIG_CPU_SUP_INTEL + sched->state.used[0] = cpuc->intel_ctrl_guest_mask; +#endif for (idx = 0; idx < num; idx++) { if (constraints[idx]->weight == wmin) @@ -2386,6 +2390,39 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re } } +#ifdef CONFIG_CPU_SUP_INTEL +/** + * x86_perf_mask_perf_counters - mask perf counters + * @mask: the bitmask of counters + * + * Mask the perf counters that are not available to be used by the perf core. + * If the counter to be masked has been assigned, it will be taken back and + * then the perf core will re-assign usable counters to its events. + * + * This can be used by a component outside the perf core to reserve counters. + * For example, a hypervisor uses it to reserve counters for a guest to use, + * and later return the counters by another call with the related bits cleared. + */ +void x86_perf_mask_perf_counters(u64 mask) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + /* + * If the counter happens to be used by a host event, take it back + * first, and then restart the pmu after mask that counter as being + * reserved. 
+ */ + if (mask & cpuc->intel_ctrl_host_mask) { + perf_pmu_disable(&pmu); + cpuc->intel_ctrl_guest_mask = mask; + perf_pmu_enable(&pmu); + } else { + cpuc->intel_ctrl_guest_mask = mask; + } +} +EXPORT_SYMBOL_GPL(x86_perf_mask_perf_counters); +#endif + static inline int valid_user_frame(const void __user *fp, unsigned long size) { diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 8bdf749..5b4463e 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -297,6 +297,7 @@ static inline void perf_check_microcode(void) { } #ifdef CONFIG_CPU_SUP_INTEL extern void intel_pt_handle_vmx(int on); +extern void x86_perf_mask_perf_counters(u64 mask); #endif #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) From patchwork Thu Nov 1 10:04:02 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663835 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 31C9814E2 for ; Thu, 1 Nov 2018 10:37:22 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1C96C28397 for ; Thu, 1 Nov 2018 10:37:22 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 107512B7C3; Thu, 1 Nov 2018 10:37:22 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 77C9B28397 for ; Thu, 1 Nov 2018 10:37:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728244AbeKATiw (ORCPT ); Thu, 1 Nov 2018 15:38:52 -0400 Received: from mga06.intel.com ([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728199AbeKATiv (ORCPT ); Thu, 1 Nov 2018 15:38:51 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:29 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100601954" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:25 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 2/8] perf/x86/intel: add pmi callback support Date: Thu, 1 Nov 2018 18:04:02 +0800 Message-Id: <1541066648-40690-3-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds the pmi callback support for the counters reserved by components outside the perf core. 
For example, a hypervisor may register such a callback to get the guest notified about the receiving of the pmi. The host PMI handling requires the active_events to be non-zero, so we need active_events to be at least 1 in x86_perf_mask_perf_counters when there are counters used by the guest. Signed-off-by: Wei Wang Cc: Peter Zijlstra Cc: Andi Kleen Cc: Paolo Bonzini --- arch/x86/events/core.c | 10 ++++++++++ arch/x86/events/intel/core.c | 27 +++++++++++++++++++++++++++ arch/x86/events/perf_event.h | 4 ++++ arch/x86/include/asm/perf_event.h | 5 +++++ 4 files changed, 46 insertions(+) diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c index e73135a..504d102 100644 --- a/arch/x86/events/core.c +++ b/arch/x86/events/core.c @@ -2408,6 +2408,16 @@ void x86_perf_mask_perf_counters(u64 mask) struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); /* + * Increment active_events by 1 if there are counters reserved for + * the guest to use, and no need to do more increments if there are + * already counters taken by the guest. + */ + if (!cpuc->intel_ctrl_guest_mask && mask) + atomic_inc(&active_events); + else if (cpuc->intel_ctrl_guest_mask && !mask) + atomic_dec(&active_events); + + /* * If the counter happens to be used by a host event, take it back * first, and then restart the pmu after mask that counter as being * reserved. diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index 0fb8659..f5e5191 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -2283,6 +2283,13 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status) */ status |= cpuc->intel_cp_status; + if (status & cpuc->intel_ctrl_guest_mask) { + cpuc->pmi_callback(cpuc->pmi_opaque, + status & cpuc->intel_ctrl_guest_mask); + status &= ~cpuc->intel_ctrl_guest_mask; + handled++; + } + for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) { struct perf_event *event = cpuc->events[bit]; @@ -3162,6 +3169,26 @@ struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr) } EXPORT_SYMBOL_GPL(perf_guest_get_msrs); +void x86_perf_register_pmi_callback(pmi_callback_t callback, void *opaque) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + cpuc->pmi_callback = callback; + cpuc->pmi_opaque = opaque; + + apic_write(APIC_LVTPC, APIC_DM_NMI); +} +EXPORT_SYMBOL_GPL(x86_perf_register_pmi_callback); + +void x86_perf_unregister_pmi_callback(void) +{ + struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); + + cpuc->pmi_callback = NULL; + cpuc->pmi_opaque = NULL; +} +EXPORT_SYMBOL_GPL(x86_perf_unregister_pmi_callback); + static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr) { struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index adae087..06404eb 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -197,6 +197,10 @@ struct cpu_hw_events { unsigned int txn_flags; int is_fake; + /* PMI related fields */ + pmi_callback_t pmi_callback; + void *pmi_opaque; + /* * Intel DebugStore bits */ diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 5b4463e..fd33688 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -275,6 +275,8 @@ struct perf_guest_switch_msr { u64 host, guest; }; +typedef void (*pmi_callback_t)(void *opaque, u64 status); + extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr); extern void perf_get_x86_pmu_capability(struct x86_pmu_capability 
*cap); extern void perf_check_microcode(void); @@ -298,6 +300,9 @@ static inline void perf_check_microcode(void) { } #ifdef CONFIG_CPU_SUP_INTEL extern void intel_pt_handle_vmx(int on); extern void x86_perf_mask_perf_counters(u64 mask); +extern void x86_perf_register_pmi_callback(pmi_callback_t callback, + void *opaque); +extern void x86_perf_unregister_pmi_callback(void); #endif #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) From patchwork Thu Nov 1 10:04:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663833 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 106F514E2 for ; Thu, 1 Nov 2018 10:37:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E7F2728397 for ; Thu, 1 Nov 2018 10:37:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D9C0B2B7C3; Thu, 1 Nov 2018 10:37:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2449828397 for ; Thu, 1 Nov 2018 10:37:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728288AbeKATix (ORCPT ); Thu, 1 Nov 2018 15:38:53 -0400 Received: from mga06.intel.com ([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728252AbeKATiw (ORCPT ); Thu, 1 Nov 2018 15:38:52 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100601965" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:26 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 3/8] KVM/x86/vPMU: optimize intel vPMU Date: Thu, 1 Nov 2018 18:04:03 +0800 Message-Id: <1541066648-40690-4-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Current vPMU relies on the host perf software stack to update the guest change of the perf counter MSRs. The whole process includes releasing the old perf event and re-creating a new one. This results in around 2500000ns overhead to update a perf counter control MSR. This can be avoided by having the vPMU layer directly sits on the hardware perf counters, and in this case the guest accesses to the virtual perf counters can be directly applied to the related hardware counter by vPMU. 
The guest used counters are taken from the host perf core via x86_perf_mask_perf_counters, which in most cases is a bit-setting of the guest mask. This patch implements the handling of guest accesses to the perf counter MSRs. A host perf counter is assigned to the guest when the guest has the vPMC enabled and returned to the host when the vPMC gets disabled. Signed-off-by: Wei Wang Cc: Andi Kleen Cc: Paolo Bonzini Cc: Peter Zijlstra --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/pmu_intel.c | 257 ++++++++++++++++++++++++++++++++-------- 2 files changed, 209 insertions(+), 49 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 55e51ff..f8bc46d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -463,6 +463,7 @@ struct kvm_pmu { u64 global_ovf_ctrl; u64 counter_bitmask[2]; u64 global_ctrl_mask; + u64 assigned_pmc_bitmap; u64 reserved_bits; u8 version; struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC]; diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 5ab4a36..8c2d37f 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -185,7 +185,11 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data) default: if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || (pmc = get_fixed_pmc(pmu, msr))) { - *data = pmc_read_counter(pmc); + if (test_bit(pmc->idx, + (unsigned long *)&pmu->assigned_pmc_bitmap)) + rdmsrl(msr, *data); + else + *data = pmc->counter; return 0; } else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) { *data = pmc->eventsel; @@ -196,59 +200,210 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data) return 1; } +static void intel_pmu_update_fixed_ctrl_msr(u64 new_ctrl, u8 idx) +{ + u64 host_ctrl, mask; + + rdmsrl(MSR_ARCH_PERFMON_FIXED_CTR_CTRL, host_ctrl); + mask = 0xfULL << (idx * 4); + host_ctrl &= ~mask; + new_ctrl <<= (idx * 4); + host_ctrl |= new_ctrl; + wrmsrl(MSR_ARCH_PERFMON_FIXED_CTR_CTRL, host_ctrl); +} + +static void intel_pmu_save_pmc_counters(struct kvm_pmu *pmu, u32 idx) +{ + struct kvm_pmc *pmc; + + pmc = intel_pmc_idx_to_pmc(pmu, idx); + /* + * The control MSRs (pmc->eventsel and pmu->fixed_ctr_ctrl) always + * store the updated value, so we only need to save the counter value. + */ + if (pmc->type == KVM_PMC_GP) + rdmsrl(MSR_IA32_PERFCTR0 + idx, pmc->counter); + else + rdmsrl(MSR_CORE_PERF_FIXED_CTR0 + idx - INTEL_PMC_IDX_FIXED, + pmc->counter); +} + +static void intel_pmu_restore_pmc_counters(struct kvm_pmu *pmu, u32 idx) +{ + struct kvm_pmc *pmc; + + pmc = intel_pmc_idx_to_pmc(pmu, idx); + + if (pmc->type == KVM_PMC_GP) { + wrmsrl(MSR_IA32_PERFCTR0 + idx, pmc->counter); + wrmsrl(MSR_ARCH_PERFMON_EVENTSEL0 + idx, pmc->eventsel); + } else { + u8 ctrl; + + idx -= INTEL_PMC_IDX_FIXED; + ctrl = fixed_ctrl_field(pmu->fixed_ctr_ctrl, idx); + + wrmsrl(MSR_CORE_PERF_FIXED_CTR0 + idx, pmc->counter); + intel_pmu_update_fixed_ctrl_msr(ctrl, idx); + } +} + +/* Get the physical PMC from host and restore the vPMC states. */ +static inline void intel_pmu_get_pmc(struct kvm_pmu *pmu, unsigned int idx) +{ + /* Already assigned? */ + if (test_bit(idx, (unsigned long *)&pmu->assigned_pmc_bitmap)) + return; + + set_bit(idx, (unsigned long *)&pmu->assigned_pmc_bitmap); + x86_perf_mask_perf_counters(pmu->assigned_pmc_bitmap); + intel_pmu_restore_pmc_counters(pmu, idx); +} + +/* Save the physical PMC state and return it to host. 
*/ +static inline void intel_pmu_put_pmc(struct kvm_pmu *pmu, unsigned int idx) +{ + /* Already returned? */ + if (!test_bit(idx, (unsigned long *)&pmu->assigned_pmc_bitmap)) + return; + + intel_pmu_save_pmc_counters(pmu, idx); + clear_bit(idx, (unsigned long *)&pmu->assigned_pmc_bitmap); + x86_perf_mask_perf_counters(pmu->assigned_pmc_bitmap); +} + +static int intel_pmu_set_fixed_ctrl(struct kvm_pmu *pmu, + struct msr_data *msr_info) +{ + u8 old_ctrl, new_ctrl, pmc_idx, i; + u64 data = msr_info->data; + + if (pmu->fixed_ctr_ctrl == data) + return 0; + if (unlikely(data & 0xfffffffffffff444ull)) + return 1; + + for (i = 0; i < INTEL_PMC_MAX_FIXED; i++) { + old_ctrl = fixed_ctrl_field(pmu->fixed_ctr_ctrl, i); + new_ctrl = fixed_ctrl_field(data, i); + if (old_ctrl == new_ctrl) + continue; + + pmc_idx = INTEL_PMC_IDX_FIXED + i; + if (new_ctrl) { + /* Set the control after we own the pmc */ + intel_pmu_get_pmc(pmu, pmc_idx); + intel_pmu_update_fixed_ctrl_msr((u64)new_ctrl, i); + } else { + /* Zero the control before we return the pmc */ + intel_pmu_update_fixed_ctrl_msr((u64)new_ctrl, i); + intel_pmu_put_pmc(pmu, pmc_idx); + } + } + pmu->fixed_ctr_ctrl = data; + + return 0; +} + +static int intel_pmu_set_global_status(struct kvm_pmu *pmu, + struct msr_data *msr_info) +{ + /* RO to the guest */ + if (!msr_info->host_initiated) + return 1; + + pmu->global_status = msr_info->data; + return 0; +} + +static int intel_pmu_set_global_ctrl(struct kvm_pmu *pmu, + struct msr_data *msr_info) +{ + u64 data = msr_info->data; + + if (unlikely(data & pmu->global_ctrl_mask)) + return 1; + + pmu->global_ctrl = data; + + return 0; +} + +static int intel_pmu_set_ovf_ctrl(struct kvm_pmu *pmu, + struct msr_data *msr_info) +{ + u64 data = msr_info->data; + + if (unlikely(data & (pmu->global_ctrl_mask & ~(3ull<<62)))) + return 1; + + if (!msr_info->host_initiated) + pmu->global_status &= ~data; + + pmu->global_ovf_ctrl = data; + + return 0; +} + +static int intel_pmu_set_gp_eventsel(struct kvm_pmc *pmc, + struct msr_data *msr_info) +{ + struct kvm_pmu *pmu = pmc_to_pmu(pmc); + u64 data = msr_info->data; + + if (pmc->eventsel == data) + return 0; + + pmc->eventsel = data; + + if (data & ARCH_PERFMON_EVENTSEL_ENABLE) { + intel_pmu_get_pmc(pmu, pmc->idx); + wrmsrl(msr_info->index, pmc->eventsel); + } else { + wrmsrl(msr_info->index, pmc->eventsel); + intel_pmu_put_pmc(pmu, pmc->idx); + } + + return 0; +} + +static int intel_pmu_set_pmc_counter(struct kvm_pmc *pmc, + struct msr_data *msr_info) +{ + struct kvm_pmu *pmu = pmc_to_pmu(pmc); + + pmc->counter = msr_info->data; + if (test_bit(pmc->idx, (unsigned long *)&pmu->assigned_pmc_bitmap)) + wrmsrl(msr_info->index, pmc->counter); + + return 0; +} + static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct kvm_pmc *pmc; u32 msr = msr_info->index; - u64 data = msr_info->data; switch (msr) { case MSR_CORE_PERF_FIXED_CTR_CTRL: - if (pmu->fixed_ctr_ctrl == data) - return 0; - if (!(data & 0xfffffffffffff444ull)) { - reprogram_fixed_counters(pmu, data); - return 0; - } - break; + return intel_pmu_set_fixed_ctrl(pmu, msr_info); case MSR_CORE_PERF_GLOBAL_STATUS: - if (msr_info->host_initiated) { - pmu->global_status = data; - return 0; - } - break; /* RO MSR */ + return intel_pmu_set_global_status(pmu, msr_info); case MSR_CORE_PERF_GLOBAL_CTRL: - if (pmu->global_ctrl == data) - return 0; - if (!(data & pmu->global_ctrl_mask)) { - global_ctrl_changed(pmu, data); - return 0; - } - break; + return 
intel_pmu_set_global_ctrl(pmu, msr_info); case MSR_CORE_PERF_GLOBAL_OVF_CTRL: - if (!(data & (pmu->global_ctrl_mask & ~(3ull<<62)))) { - if (!msr_info->host_initiated) - pmu->global_status &= ~data; - pmu->global_ovf_ctrl = data; - return 0; - } - break; + return intel_pmu_set_ovf_ctrl(pmu, msr_info); default: - if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) || - (pmc = get_fixed_pmc(pmu, msr))) { - if (!msr_info->host_initiated) - data = (s64)(s32)data; - pmc->counter += data - pmc_read_counter(pmc); - return 0; - } else if ((pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0))) { - if (data == pmc->eventsel) - return 0; - if (!(data & pmu->reserved_bits)) { - reprogram_gp_counter(pmc, data); - return 0; - } - } + pmc = get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0); + if (pmc) + return intel_pmu_set_gp_eventsel(pmc, msr_info); + pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0); + if (pmc) + return intel_pmu_set_pmc_counter(pmc, msr_info); + pmc = get_fixed_pmc(pmu, msr); + if (pmc) + return intel_pmu_set_pmc_counter(pmc, msr_info); } return 1; @@ -326,20 +481,24 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu) static void intel_pmu_reset(struct kvm_vcpu *vcpu) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + struct kvm_pmc *pmc; int i; for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) { - struct kvm_pmc *pmc = &pmu->gp_counters[i]; - - pmc_stop_counter(pmc); + pmc = &pmu->gp_counters[i]; pmc->counter = pmc->eventsel = 0; } - for (i = 0; i < INTEL_PMC_MAX_FIXED; i++) - pmc_stop_counter(&pmu->fixed_counters[i]); + for (i = 0; i < INTEL_PMC_MAX_FIXED; i++) { + pmc = &pmu->fixed_counters[i]; + pmc->counter = 0; + } - pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = - pmu->global_ovf_ctrl = 0; + pmu->fixed_ctr_ctrl = 0; + pmu->global_ctrl = 0; + pmu->global_status = 0; + pmu->global_ovf_ctrl = 0; + pmu->assigned_pmc_bitmap = 0; } struct kvm_pmu_ops intel_pmu_ops = { From patchwork Thu Nov 1 10:04:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663831 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3232614E2 for ; Thu, 1 Nov 2018 10:37:05 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1B9D928397 for ; Thu, 1 Nov 2018 10:37:05 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 0F79D2B7BB; Thu, 1 Nov 2018 10:37:05 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5FA0928397 for ; Thu, 1 Nov 2018 10:37:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728326AbeKATiz (ORCPT ); Thu, 1 Nov 2018 15:38:55 -0400 Received: from mga06.intel.com ([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728199AbeKATiy (ORCPT ); Thu, 1 Nov 2018 15:38:54 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:32 
-0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100601974" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:28 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 4/8] KVM/x86/vPMU: support msr switch on vmx transitions Date: Thu, 1 Nov 2018 18:04:04 +0800 Message-Id: <1541066648-40690-5-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds support to intel vPMU to switch msrs on vmx transitions. Currenly only 1 msr (global ctrl) is switched. The number can be increased on demand in the future (e.g. pebs enable). The old method from the host perf subsystem is also removed. Signed-off-by: Wei Wang Cc: Paolo Bonzini Cc: Peter Zijlstra Cc: Andi Kleen --- arch/x86/events/intel/core.c | 60 --------------------------------------- arch/x86/events/perf_event.h | 6 ---- arch/x86/include/asm/kvm_host.h | 11 +++++++ arch/x86/include/asm/perf_event.h | 12 -------- arch/x86/kvm/pmu.h | 2 ++ arch/x86/kvm/pmu_intel.c | 19 +++++++++++++ arch/x86/kvm/vmx.c | 6 ++-- 7 files changed, 35 insertions(+), 81 deletions(-) diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c index f5e5191..33c156f 100644 --- a/arch/x86/events/intel/core.c +++ b/arch/x86/events/intel/core.c @@ -3160,15 +3160,6 @@ static int intel_pmu_hw_config(struct perf_event *event) return 0; } -struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr) -{ - if (x86_pmu.guest_get_msrs) - return x86_pmu.guest_get_msrs(nr); - *nr = 0; - return NULL; -} -EXPORT_SYMBOL_GPL(perf_guest_get_msrs); - void x86_perf_register_pmi_callback(pmi_callback_t callback, void *opaque) { struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); @@ -3189,55 +3180,6 @@ void x86_perf_unregister_pmi_callback(void) } EXPORT_SYMBOL_GPL(x86_perf_unregister_pmi_callback); -static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr) -{ - struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs; - - arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL; - arr[0].host = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_guest_mask; - arr[0].guest = x86_pmu.intel_ctrl & ~cpuc->intel_ctrl_host_mask; - /* - * If PMU counter has PEBS enabled it is not enough to disable counter - * on a guest entry since PEBS memory write can overshoot guest entry - * and corrupt guest memory. Disabling PEBS solves the problem. 
- */ - arr[1].msr = MSR_IA32_PEBS_ENABLE; - arr[1].host = cpuc->pebs_enabled; - arr[1].guest = 0; - - *nr = 2; - return arr; -} - -static struct perf_guest_switch_msr *core_guest_get_msrs(int *nr) -{ - struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events); - struct perf_guest_switch_msr *arr = cpuc->guest_switch_msrs; - int idx; - - for (idx = 0; idx < x86_pmu.num_counters; idx++) { - struct perf_event *event = cpuc->events[idx]; - - arr[idx].msr = x86_pmu_config_addr(idx); - arr[idx].host = arr[idx].guest = 0; - - if (!test_bit(idx, cpuc->active_mask)) - continue; - - arr[idx].host = arr[idx].guest = - event->hw.config | ARCH_PERFMON_EVENTSEL_ENABLE; - - if (event->attr.exclude_host) - arr[idx].host &= ~ARCH_PERFMON_EVENTSEL_ENABLE; - else if (event->attr.exclude_guest) - arr[idx].guest &= ~ARCH_PERFMON_EVENTSEL_ENABLE; - } - - *nr = x86_pmu.num_counters; - return arr; -} - static void core_pmu_enable_event(struct perf_event *event) { if (!event->attr.exclude_host) @@ -3641,7 +3583,6 @@ static __initconst const struct x86_pmu core_pmu = { .get_event_constraints = intel_get_event_constraints, .put_event_constraints = intel_put_event_constraints, .event_constraints = intel_core_event_constraints, - .guest_get_msrs = core_guest_get_msrs, .format_attrs = intel_arch_formats_attr, .events_sysfs_show = intel_event_sysfs_show, @@ -3694,7 +3635,6 @@ static __initconst const struct x86_pmu intel_pmu = { .cpu_prepare = intel_pmu_cpu_prepare, .cpu_starting = intel_pmu_cpu_starting, .cpu_dying = intel_pmu_cpu_dying, - .guest_get_msrs = intel_guest_get_msrs, .sched_task = intel_pmu_sched_task, }; diff --git a/arch/x86/events/perf_event.h b/arch/x86/events/perf_event.h index 06404eb..9f818d5 100644 --- a/arch/x86/events/perf_event.h +++ b/arch/x86/events/perf_event.h @@ -227,7 +227,6 @@ struct cpu_hw_events { */ u64 intel_ctrl_guest_mask; u64 intel_ctrl_host_mask; - struct perf_guest_switch_msr guest_switch_msrs[X86_PMC_IDX_MAX]; /* * Intel checkpoint mask @@ -645,11 +644,6 @@ struct x86_pmu { */ struct extra_reg *extra_regs; unsigned int flags; - - /* - * Intel host/guest support (KVM) - */ - struct perf_guest_switch_msr *(*guest_get_msrs)(int *nr); }; struct x86_perf_task_context { diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f8bc46d..b66d164 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -453,6 +453,16 @@ struct kvm_pmc { struct kvm_vcpu *vcpu; }; +/* + * Below MSRs are currently switched on VMX transitions: + * - MSR_CORE_PERF_GLOBAL_CTRL + */ +#define KVM_PERF_SWITCH_MSR_NUM 1 +struct kvm_perf_switch_msr { + unsigned int msr; + u64 host, guest; +}; + struct kvm_pmu { unsigned nr_arch_gp_counters; unsigned nr_arch_fixed_counters; @@ -470,6 +480,7 @@ struct kvm_pmu { struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED]; struct irq_work irq_work; u64 reprogram_pmi; + struct kvm_perf_switch_msr switch_msrs[KVM_PERF_SWITCH_MSR_NUM]; }; struct kvm_pmu_ops; diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index fd33688..e9cff88 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -270,24 +270,12 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs); ); \ } -struct perf_guest_switch_msr { - unsigned msr; - u64 host, guest; -}; - typedef void (*pmi_callback_t)(void *opaque, u64 status); -extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr); extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap); extern void 
perf_check_microcode(void); extern int x86_perf_rdpmc_index(struct perf_event *event); #else -static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr) -{ - *nr = 0; - return NULL; -} - static inline void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap) { memset(cap, 0, sizeof(*cap)); diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index ba8898e..b3b0238 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -119,6 +119,8 @@ void kvm_pmu_init(struct kvm_vcpu *vcpu); void kvm_pmu_destroy(struct kvm_vcpu *vcpu); bool is_vmware_backdoor_pmc(u32 pmc_idx); +struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, + u32 *nr_msrs); extern struct kvm_pmu_ops intel_pmu_ops; extern struct kvm_pmu_ops amd_pmu_ops; diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 8c2d37f..2be31ad 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -501,6 +501,25 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) pmu->assigned_pmc_bitmap = 0; } +struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, + u32 *nr_msrs) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + struct kvm_perf_switch_msr *arr = pmu->switch_msrs; + + arr[0].msr = MSR_CORE_PERF_GLOBAL_CTRL; + rdmsrl(MSR_CORE_PERF_GLOBAL_CTRL, arr[0].host); + arr[0].host &= ~pmu->assigned_pmc_bitmap; + /* + * The guest value will be written to the hardware msr when entering + * the guest, and the bits of unassigned pmcs are not enabled. + */ + arr[0].guest = pmu->global_ctrl & pmu->assigned_pmc_bitmap; + *nr_msrs = KVM_PERF_SWITCH_MSR_NUM; + + return arr; +} + struct kvm_pmu_ops intel_pmu_ops = { .find_arch_event = intel_find_arch_event, .find_fixed_event = intel_find_fixed_event, diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 4555077..714d1f3 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -11073,10 +11073,10 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu) static void atomic_switch_perf_msrs(struct vcpu_vmx *vmx) { - int i, nr_msrs; - struct perf_guest_switch_msr *msrs; + u32 i, nr_msrs; + struct kvm_perf_switch_msr *msrs; - msrs = perf_guest_get_msrs(&nr_msrs); + msrs = intel_pmu_get_switch_msrs(&vmx->vcpu, &nr_msrs); if (!msrs) return; From patchwork Thu Nov 1 10:04:05 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663829 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E908013B5 for ; Thu, 1 Nov 2018 10:36:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D2B3328397 for ; Thu, 1 Nov 2018 10:36:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id C6CBB2B7BB; Thu, 1 Nov 2018 10:36:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 631CD28397 for ; Thu, 1 Nov 2018 10:36:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728349AbeKATi4 (ORCPT ); Thu, 1 Nov 2018 15:38:56 -0400 Received: from mga06.intel.com 
([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728339AbeKATi4 (ORCPT ); Thu, 1 Nov 2018 15:38:56 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100601994" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:30 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 5/8] KVM/x86/vPMU: intel_pmu_read_pmc Date: Thu, 1 Nov 2018 18:04:05 +0800 Message-Id: <1541066648-40690-6-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds the handling of guest rdpmc. If the physical counter has been assigned, directly reads the value from the hardware. Otherwise, return to the guest the virtual counter value. Signed-off-by: Wei Wang Cc: Paolo Bonzini Cc: Andi Kleen Cc: Peter Zijlstra --- arch/x86/kvm/pmu.c | 3 +++ arch/x86/kvm/pmu.h | 1 + arch/x86/kvm/pmu_intel.c | 27 +++++++++++++++++++++++++++ 3 files changed, 31 insertions(+) diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 58ead7d..7f2f63e 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -284,6 +284,9 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data) struct kvm_pmc *pmc; u64 ctr_val; + if (kvm_x86_ops->pmu_ops->read_pmc) + return kvm_x86_ops->pmu_ops->read_pmc(vcpu, idx, data); + if (is_vmware_backdoor_pmc(idx)) return kvm_pmu_rdpmc_vmware(vcpu, idx, data); diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index b3b0238..7ab85bf 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -30,6 +30,7 @@ struct kvm_pmu_ops { bool (*is_valid_msr)(struct kvm_vcpu *vcpu, u32 msr); int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr, u64 *data); int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info); + int (*read_pmc)(struct kvm_vcpu *vcpu, unsigned int idx, u64 *data); void (*refresh)(struct kvm_vcpu *vcpu); void (*init)(struct kvm_vcpu *vcpu); void (*reset)(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 2be31ad..4c6183b 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -20,6 +20,9 @@ #include "lapic.h" #include "pmu.h" +#define INTEL_PMU_RDPMC_FIXED_PMC (1ULL << 30) +#define INTEL_PMU_RDPMC_COUNTER_MASK (INTEL_PMU_RDPMC_FIXED_PMC - 1) + static struct kvm_event_hw_type_mapping intel_arch_events[] = { /* Index must match CPUID 0x0A.EBX bit vector */ [0] = { 0x3c, 0x00, PERF_COUNT_HW_CPU_CYCLES }, @@ -409,6 +412,29 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; } +static int intel_pmu_read_pmc(struct kvm_vcpu *vcpu, unsigned int idx, + u64 *data) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + unsigned int pmc_idx; + + if (idx & INTEL_PMU_RDPMC_FIXED_PMC) + pmc_idx = INTEL_PMC_IDX_FIXED + + (idx & INTEL_PMU_RDPMC_COUNTER_MASK); + else + pmc_idx = 
idx & INTEL_PMU_RDPMC_COUNTER_MASK; + + if (test_bit(pmc_idx, + (unsigned long *)&pmu->assigned_pmc_bitmap)) + rdpmcl(idx, *data); + else { + struct kvm_pmc *pmc = intel_pmc_idx_to_pmc(pmu, pmc_idx); + *data = pmc->counter; + } + + return 0; +} + static void intel_pmu_refresh(struct kvm_vcpu *vcpu) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -530,6 +556,7 @@ struct kvm_pmu_ops intel_pmu_ops = { .is_valid_msr = intel_is_valid_msr, .get_msr = intel_pmu_get_msr, .set_msr = intel_pmu_set_msr, + .read_pmc = intel_pmu_read_pmc, .refresh = intel_pmu_refresh, .init = intel_pmu_init, .reset = intel_pmu_reset, From patchwork Thu Nov 1 10:04:06 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663827 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CFAED13B5 for ; Thu, 1 Nov 2018 10:36:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B94E128397 for ; Thu, 1 Nov 2018 10:36:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id AB86C2B7BB; Thu, 1 Nov 2018 10:36:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 39E3528397 for ; Thu, 1 Nov 2018 10:36:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728382AbeKATi6 (ORCPT ); Thu, 1 Nov 2018 15:38:58 -0400 Received: from mga06.intel.com ([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728339AbeKATi6 (ORCPT ); Thu, 1 Nov 2018 15:38:58 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100602007" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:32 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 6/8] KVM/x86/vPMU: remove some unused functions Date: Thu, 1 Nov 2018 18:04:06 +0800 Message-Id: <1541066648-40690-7-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Those functions are used in the old intel vPMU implementation and are not needed now. 
Signed-off-by: Wei Wang Cc: Paolo Bonzini Cc: Andi Kleen Cc: Peter Zijlstra --- arch/x86/kvm/pmu_intel.c | 103 ----------------------------------------------- 1 file changed, 103 deletions(-) diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 4c6183b..66d7c09 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -23,87 +23,6 @@ #define INTEL_PMU_RDPMC_FIXED_PMC (1ULL << 30) #define INTEL_PMU_RDPMC_COUNTER_MASK (INTEL_PMU_RDPMC_FIXED_PMC - 1) -static struct kvm_event_hw_type_mapping intel_arch_events[] = { - /* Index must match CPUID 0x0A.EBX bit vector */ - [0] = { 0x3c, 0x00, PERF_COUNT_HW_CPU_CYCLES }, - [1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS }, - [2] = { 0x3c, 0x01, PERF_COUNT_HW_BUS_CYCLES }, - [3] = { 0x2e, 0x4f, PERF_COUNT_HW_CACHE_REFERENCES }, - [4] = { 0x2e, 0x41, PERF_COUNT_HW_CACHE_MISSES }, - [5] = { 0xc4, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS }, - [6] = { 0xc5, 0x00, PERF_COUNT_HW_BRANCH_MISSES }, - [7] = { 0x00, 0x30, PERF_COUNT_HW_REF_CPU_CYCLES }, -}; - -/* mapping between fixed pmc index and intel_arch_events array */ -static int fixed_pmc_events[] = {1, 0, 7}; - -static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data) -{ - int i; - - for (i = 0; i < pmu->nr_arch_fixed_counters; i++) { - u8 new_ctrl = fixed_ctrl_field(data, i); - u8 old_ctrl = fixed_ctrl_field(pmu->fixed_ctr_ctrl, i); - struct kvm_pmc *pmc; - - pmc = get_fixed_pmc(pmu, MSR_CORE_PERF_FIXED_CTR0 + i); - - if (old_ctrl == new_ctrl) - continue; - - reprogram_fixed_counter(pmc, new_ctrl, i); - } - - pmu->fixed_ctr_ctrl = data; -} - -/* function is called when global control register has been updated. */ -static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data) -{ - int bit; - u64 diff = pmu->global_ctrl ^ data; - - pmu->global_ctrl = data; - - for_each_set_bit(bit, (unsigned long *)&diff, X86_PMC_IDX_MAX) - reprogram_counter(pmu, bit); -} - -static unsigned intel_find_arch_event(struct kvm_pmu *pmu, - u8 event_select, - u8 unit_mask) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) - if (intel_arch_events[i].eventsel == event_select - && intel_arch_events[i].unit_mask == unit_mask - && (pmu->available_event_types & (1 << i))) - break; - - if (i == ARRAY_SIZE(intel_arch_events)) - return PERF_COUNT_HW_MAX; - - return intel_arch_events[i].event_type; -} - -static unsigned intel_find_fixed_event(int idx) -{ - if (idx >= ARRAY_SIZE(fixed_pmc_events)) - return PERF_COUNT_HW_MAX; - - return intel_arch_events[fixed_pmc_events[idx]].event_type; -} - -/* check if a PMC is enabled by comparing it with globl_ctrl bits. */ -static bool intel_pmc_is_enabled(struct kvm_pmc *pmc) -{ - struct kvm_pmu *pmu = pmc_to_pmu(pmc); - - return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl); -} - static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx) { if (pmc_idx < INTEL_PMC_IDX_FIXED) @@ -128,23 +47,6 @@ static int intel_is_valid_msr_idx(struct kvm_vcpu *vcpu, unsigned idx) (fixed && idx >= pmu->nr_arch_fixed_counters); } -static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, - unsigned idx) -{ - struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); - bool fixed = idx & (1u << 30); - struct kvm_pmc *counters; - - idx &= ~(3u << 30); - if (!fixed && idx >= pmu->nr_arch_gp_counters) - return NULL; - if (fixed && idx >= pmu->nr_arch_fixed_counters) - return NULL; - counters = fixed ? 
pmu->fixed_counters : pmu->gp_counters; - - return &counters[idx]; -} - static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); @@ -547,11 +449,6 @@ struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, } struct kvm_pmu_ops intel_pmu_ops = { - .find_arch_event = intel_find_arch_event, - .find_fixed_event = intel_find_fixed_event, - .pmc_is_enabled = intel_pmc_is_enabled, - .pmc_idx_to_pmc = intel_pmc_idx_to_pmc, - .msr_idx_to_pmc = intel_msr_idx_to_pmc, .is_valid_msr_idx = intel_is_valid_msr_idx, .is_valid_msr = intel_is_valid_msr, .get_msr = intel_pmu_get_msr, From patchwork Thu Nov 1 10:04:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663823 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9B99714E2 for ; Thu, 1 Nov 2018 10:36:47 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8617A28397 for ; Thu, 1 Nov 2018 10:36:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 77E852B7BB; Thu, 1 Nov 2018 10:36:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E24E228397 for ; Thu, 1 Nov 2018 10:36:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728427AbeKATjC (ORCPT ); Thu, 1 Nov 2018 15:39:02 -0400 Received: from mga06.intel.com ([134.134.136.31]:52047 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728339AbeKATjB (ORCPT ); Thu, 1 Nov 2018 15:39:01 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100602018" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:33 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 7/8] KVM/x86/vPMU: save/restore guest perf counters on vCPU switching Date: Thu, 1 Nov 2018 18:04:07 +0800 Message-Id: <1541066648-40690-8-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP When the vCPU is scheduled in, restore the assigned perf counter states and register the guest pmi callback. When the vCPU is scheduled out, save the assigned perf counter states and unregister the guest PMI callback. 
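In code, this pairing looks roughly like the following sketch, condensed from the intel_pmu_sched_out/intel_pmu_sched_in hooks that the diff below adds (comments are editorial; locking and the pmu_ops wiring are omitted).

/*
 * Condensed sketch of the two hooks added by this patch.  Both the PMI
 * callback and the counter reservation live in per-CPU perf state, so
 * they must be torn down before the vCPU leaves this CPU and set up
 * again after it is scheduled in, possibly on another CPU.
 */
static void intel_pmu_sched_out(struct kvm_vcpu *vcpu)
{
	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
	u32 bit;

	/* Remember which counters the guest owned, then give them back. */
	pmu->restore_pmc_bitmap = pmu->assigned_pmc_bitmap;
	for_each_set_bit(bit, (unsigned long *)&pmu->restore_pmc_bitmap,
			 X86_PMC_IDX_MAX)
		intel_pmu_put_pmc(pmu, bit);	/* saves the counter state */

	/* Stop routing this CPU's PMIs to the now-descheduled guest. */
	x86_perf_unregister_pmi_callback();
}

static void intel_pmu_sched_in(struct kvm_vcpu *vcpu)
{
	struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
	u32 bit;

	/* Re-take the counters recorded at sched-out and restore their state. */
	for_each_set_bit(bit, (unsigned long *)&pmu->restore_pmc_bitmap,
			 X86_PMC_IDX_MAX)
		intel_pmu_get_pmc(pmu, bit);
	pmu->restore_pmc_bitmap = 0;

	/* PMIs hitting guest-owned counters are injected as virtual PMIs. */
	x86_perf_register_pmi_callback(intel_pmu_inject_guest_pmi, vcpu);
}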
Signed-off-by: Wei Wang Cc: Andi Kleen Cc: Paolo Bonzini Cc: Peter Zijlstra --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/pmu.c | 12 ++++++++++++ arch/x86/kvm/pmu.h | 4 ++++ arch/x86/kvm/pmu_intel.c | 38 ++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/x86.c | 6 ++++++ include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 3 +++ 7 files changed, 65 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index b66d164..cb1c0bf 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -474,6 +474,7 @@ struct kvm_pmu { u64 counter_bitmask[2]; u64 global_ctrl_mask; u64 assigned_pmc_bitmap; + u64 restore_pmc_bitmap; u64 reserved_bits; u8 version; struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC]; diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index 7f2f63e..4448a88 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -340,6 +340,18 @@ void kvm_pmu_reset(struct kvm_vcpu *vcpu) kvm_x86_ops->pmu_ops->reset(vcpu); } +void kvm_pmu_sched_in(struct kvm_vcpu *vcpu) +{ + if (kvm_x86_ops->pmu_ops->sched_in) + kvm_x86_ops->pmu_ops->sched_in(vcpu); +} + +void kvm_pmu_sched_out(struct kvm_vcpu *vcpu) +{ + if (kvm_x86_ops->pmu_ops->sched_out) + kvm_x86_ops->pmu_ops->sched_out(vcpu); +} + void kvm_pmu_init(struct kvm_vcpu *vcpu) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h index 7ab85bf..77fc973 100644 --- a/arch/x86/kvm/pmu.h +++ b/arch/x86/kvm/pmu.h @@ -34,6 +34,8 @@ struct kvm_pmu_ops { void (*refresh)(struct kvm_vcpu *vcpu); void (*init)(struct kvm_vcpu *vcpu); void (*reset)(struct kvm_vcpu *vcpu); + void (*sched_in)(struct kvm_vcpu *vcpu); + void (*sched_out)(struct kvm_vcpu *vcpu); }; static inline u64 pmc_bitmask(struct kvm_pmc *pmc) @@ -118,6 +120,8 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu); void kvm_pmu_reset(struct kvm_vcpu *vcpu); void kvm_pmu_init(struct kvm_vcpu *vcpu); void kvm_pmu_destroy(struct kvm_vcpu *vcpu); +void kvm_pmu_sched_in(struct kvm_vcpu *vcpu); +void kvm_pmu_sched_out(struct kvm_vcpu *vcpu); bool is_vmware_backdoor_pmc(u32 pmc_idx); struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 66d7c09..9eb5230 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -427,6 +427,7 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) pmu->global_status = 0; pmu->global_ovf_ctrl = 0; pmu->assigned_pmc_bitmap = 0; + pmu->restore_pmc_bitmap = 0; } struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, @@ -448,6 +449,41 @@ struct kvm_perf_switch_msr *intel_pmu_get_switch_msrs(struct kvm_vcpu *vcpu, return arr; } +static void intel_pmu_inject_guest_pmi(void *opaque, u64 status) +{ + struct kvm_vcpu *vcpu = opaque; + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + + pmu->global_status |= status; + kvm_make_request(KVM_REQ_PMI, vcpu); +} + +static void intel_pmu_sched_out(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + u32 bit; + + pmu->restore_pmc_bitmap = pmu->assigned_pmc_bitmap; + for_each_set_bit(bit, (unsigned long *)&pmu->restore_pmc_bitmap, + X86_PMC_IDX_MAX) + intel_pmu_put_pmc(pmu, bit); + + x86_perf_unregister_pmi_callback(); +} + +static void intel_pmu_sched_in(struct kvm_vcpu *vcpu) +{ + struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); + u32 bit; + + for_each_set_bit(bit, (unsigned long *)&pmu->restore_pmc_bitmap, + X86_PMC_IDX_MAX) + intel_pmu_get_pmc(pmu, bit); + pmu->restore_pmc_bitmap = 0; + + 
x86_perf_register_pmi_callback(intel_pmu_inject_guest_pmi, vcpu); +} + struct kvm_pmu_ops intel_pmu_ops = { .is_valid_msr_idx = intel_is_valid_msr_idx, .is_valid_msr = intel_is_valid_msr, @@ -457,4 +493,6 @@ struct kvm_pmu_ops intel_pmu_ops = { .refresh = intel_pmu_refresh, .init = intel_pmu_init, .reset = intel_pmu_reset, + .sched_out = intel_pmu_sched_out, + .sched_in = intel_pmu_sched_in, }; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 66d66d7..47308bd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8986,9 +8986,15 @@ void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) { vcpu->arch.l1tf_flush_l1d = true; + kvm_pmu_sched_in(vcpu); kvm_x86_ops->sched_in(vcpu, cpu); } +void kvm_arch_sched_out(struct kvm_vcpu *vcpu) +{ + kvm_pmu_sched_out(vcpu); +} + int kvm_arch_init_vm(struct kvm *kvm, unsigned long type) { if (type) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index c926698..478d602 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -795,6 +795,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu); void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu); void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu); +void kvm_arch_sched_out(struct kvm_vcpu *vcpu); void kvm_arch_vcpu_free(struct kvm_vcpu *vcpu); void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2679e47..24c2be2 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3990,6 +3990,9 @@ static void kvm_sched_out(struct preempt_notifier *pn, if (current->state == TASK_RUNNING) vcpu->preempted = true; + + kvm_arch_sched_out(vcpu); + kvm_arch_vcpu_put(vcpu); } From patchwork Thu Nov 1 10:04:08 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10663825 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 474F714E2 for ; Thu, 1 Nov 2018 10:36:53 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 32E6E28397 for ; Thu, 1 Nov 2018 10:36:53 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 27A522B7BB; Thu, 1 Nov 2018 10:36:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id D072928397 for ; Thu, 1 Nov 2018 10:36:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728403AbeKATjC (ORCPT ); Thu, 1 Nov 2018 15:39:02 -0400 Received: from mga06.intel.com ([134.134.136.31]:52050 "EHLO mga06.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728370AbeKATjB (ORCPT ); Thu, 1 Nov 2018 15:39:01 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 01 Nov 2018 03:36:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,451,1534834800"; d="scan'208";a="100602029" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by 
fmsmga002.fm.intel.com with ESMTP; 01 Nov 2018 03:36:35 -0700 From: Wei Wang To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, pbonzini@redhat.com, ak@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, rkrcmar@redhat.com, like.xu@intel.com, wei.w.wang@intel.com Subject: [PATCH v1 8/8] KVM/x86/vPMU: return the counters to host if guest is torn down Date: Thu, 1 Nov 2018 18:04:08 +0800 Message-Id: <1541066648-40690-9-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> References: <1541066648-40690-1-git-send-email-wei.w.wang@intel.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Return the assigned counters to host in the case the guest is torn down unexpectedly. Signed-off-by: Wei Wang Cc: Andi Kleen Cc: Paolo Bonzini Cc: Peter Zijlstra --- arch/x86/kvm/pmu_intel.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/pmu_intel.c b/arch/x86/kvm/pmu_intel.c index 9eb5230..30fa8eb 100644 --- a/arch/x86/kvm/pmu_intel.c +++ b/arch/x86/kvm/pmu_intel.c @@ -410,7 +410,7 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) { struct kvm_pmu *pmu = vcpu_to_pmu(vcpu); struct kvm_pmc *pmc; - int i; + u32 i, bit; for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) { pmc = &pmu->gp_counters[i]; @@ -422,6 +422,10 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) pmc->counter = 0; } + for_each_set_bit(bit, (unsigned long *)&pmu->assigned_pmc_bitmap, + X86_PMC_IDX_MAX) + intel_pmu_put_pmc(pmu, bit); + pmu->fixed_ctr_ctrl = 0; pmu->global_ctrl = 0; pmu->global_status = 0;