From patchwork Wed Apr 13 13:48:15 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Shishkin X-Patchwork-Id: 8822481 Return-Path: X-Original-To: patchwork-kvm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 9BD72C0553 for ; Wed, 13 Apr 2016 13:48:44 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 885B3201F5 for ; Wed, 13 Apr 2016 13:48:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4C6A320274 for ; Wed, 13 Apr 2016 13:48:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S934391AbcDMNsW (ORCPT ); Wed, 13 Apr 2016 09:48:22 -0400 Received: from mga02.intel.com ([134.134.136.20]:42317 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S934296AbcDMNsU (ORCPT ); Wed, 13 Apr 2016 09:48:20 -0400 Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga101.jf.intel.com with ESMTP; 13 Apr 2016 06:48:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.24,479,1455004800"; d="scan'208";a="685596388" Received: from um.fi.intel.com (HELO localhost) ([10.237.72.212]) by FMSMGA003.fm.intel.com with ESMTP; 13 Apr 2016 06:48:18 -0700 From: Alexander Shishkin To: Ingo Molnar , Peter Zijlstra Cc: Gleb Natapov , Paolo Bonzini , x86@kernel.org, kvm@vger.kernel.org, Ingo Molnar , linux-kernel@vger.kernel.org, tglx@linutronix.de, hpa@zytor.com, Arnaldo Carvalho de Melo , Borislav Petkov Subject: Re: [PATCH] perf/x86/intel/pt: Don't die on VMXON In-Reply-To: <20160413105331.GA30907@gmail.com> References: <1459527854-5899-1-git-send-email-alexander.shishkin@linux.intel.com> <20160406085227.GO3448@twins.programming.kicks-ass.net> <87vb3v56x2.fsf@ashishki-desk.ger.corp.intel.com> <20160406115215.GK2906@worktop> <871t6izv9z.fsf@ashishki-desk.ger.corp.intel.com> <20160413094951.GC6430@gmail.com> <87r3e9x0hi.fsf@ashishki-desk.ger.corp.intel.com> <20160413105331.GA30907@gmail.com> User-Agent: Notmuch/0.21 (http://notmuchmail.org) Emacs/24.5.1 (x86_64-pc-linux-gnu) Date: Wed, 13 Apr 2016 16:48:15 +0300 Message-ID: <87oa9dwrfk.fsf@ashishki-desk.ger.corp.intel.com> MIME-Version: 1.0 Sender: kvm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org X-Spam-Status: No, score=-7.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Ingo Molnar writes: > * Alexander Shishkin wrote: > >> Ingo Molnar writes: >> >> >> @@ -3144,6 +3146,8 @@ static void vmclear_local_loaded_vmcss(void) >> >> static void kvm_cpu_vmxoff(void) >> >> { >> >> asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc"); >> >> + >> >> + intel_pt_vmx(0); >> >> } >> > >> > Yeah so the name intel_pt_vmx() is pretty information-free because it has no verb, >> > only nouns - please name new functions descriptively to after what they do! >> >> I do agree that it can use a better name (and this is a second attempt >> already). >> >> > Something like intel_pt_set_vmx_state() or so? >> >> Hmm how about intel_pt_handle_vmx()? Ideally, akin to the VMXON/VMXOFF insns, >> this could be two functions (intel_pt_handle_vmx{on,off}()) if the global >> namespace can take it. > > Sure, intel_pt_handle_vmx(0/1) sounds good too. I wouldn't split it into two > functions ... Ok, so I'll use this one then. Thanks! Peter, here's an updated patch against perf/urgent. It also doesn't introduce new ACCESS_ONCE()s any more. From 89b85412c46270a675cf77918fc9fda520a7bdd2 Mon Sep 17 00:00:00 2001 From: Alexander Shishkin Date: Tue, 29 Mar 2016 17:43:10 +0300 Subject: [PATCH] perf/x86/intel/pt: Don't die on VMXON Some versions of Intel PT do not support tracing across VMXON, more specifically, VMXON will clear TraceEn control bit and any attempt to set it before VMXOFF will throw a #GP, which in the current state of things will crash the kernel. Namely, $ perf record -e intel_pt// kvm -nographic on such a machine will kill it. To avoid this, notify the intel_pt driver before VMXON and after VMXOFF so that it knows when not to enable itself. Signed-off-by: Alexander Shishkin --- arch/x86/events/intel/pt.c | 75 +++++++++++++++++++++++++++++++++------ arch/x86/events/intel/pt.h | 3 ++ arch/x86/include/asm/perf_event.h | 4 +++ arch/x86/kvm/vmx.c | 4 +++ 4 files changed, 75 insertions(+), 11 deletions(-) diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c index 6af7cf71d6..09a77dbc73 100644 --- a/arch/x86/events/intel/pt.c +++ b/arch/x86/events/intel/pt.c @@ -136,9 +136,21 @@ static int __init pt_pmu_hw_init(void) struct dev_ext_attribute *de_attrs; struct attribute **attrs; size_t size; + u64 reg; int ret; long i; + if (boot_cpu_has(X86_FEATURE_VMX)) { + /* + * Intel SDM, 36.5 "Tracing post-VMXON" says that + * "IA32_VMX_MISC[bit 14]" being 1 means PT can trace + * post-VMXON. + */ + rdmsrl(MSR_IA32_VMX_MISC, reg); + if (reg & BIT(14)) + pt_pmu.vmx = true; + } + attrs = NULL; for (i = 0; i < PT_CPUID_LEAVES; i++) { @@ -269,20 +281,23 @@ static void pt_config(struct perf_event *event) reg |= (event->attr.config & PT_CONFIG_MASK); + event->hw.config = reg; wrmsrl(MSR_IA32_RTIT_CTL, reg); } -static void pt_config_start(bool start) +static void pt_config_stop(struct perf_event *event) { - u64 ctl; + u64 ctl = READ_ONCE(event->hw.config); + + /* may be already stopped by a PMI */ + if (!(ctl & RTIT_CTL_TRACEEN)) + return; - rdmsrl(MSR_IA32_RTIT_CTL, ctl); - if (start) - ctl |= RTIT_CTL_TRACEEN; - else - ctl &= ~RTIT_CTL_TRACEEN; + ctl &= ~RTIT_CTL_TRACEEN; wrmsrl(MSR_IA32_RTIT_CTL, ctl); + WRITE_ONCE(event->hw.config, ctl); + /* * A wrmsr that disables trace generation serializes other PT * registers and causes all data packets to be written to memory, @@ -291,8 +306,7 @@ static void pt_config_start(bool start) * The below WMB, separating data store and aux_head store matches * the consumer's RMB that separates aux_head load and data load. */ - if (!start) - wmb(); + wmb(); } static void pt_config_buffer(void *buf, unsigned int topa_idx, @@ -942,11 +956,17 @@ void intel_pt_interrupt(void) if (!ACCESS_ONCE(pt->handle_nmi)) return; - pt_config_start(false); + /* + * If VMX is on and PT does not support it, don't touch anything. + */ + if (READ_ONCE(pt->vmx_on)) + return; if (!event) return; + pt_config_stop(event); + buf = perf_get_aux(&pt->handle); if (!buf) return; @@ -983,6 +1003,35 @@ void intel_pt_interrupt(void) } } +void intel_pt_handle_vmx(int on) +{ + struct pt *pt = this_cpu_ptr(&pt_ctx); + struct perf_event *event; + unsigned long flags; + + /* PT plays nice with VMX, do nothing */ + if (pt_pmu.vmx) + return; + + /* + * VMXON will clear RTIT_CTL.TraceEn; we need to make + * sure to not try to set it while VMX is on. Disable + * interrupts to avoid racing with pmu callbacks; + * concurrent PMI should be handled fine. + */ + local_irq_save(flags); + WRITE_ONCE(pt->vmx_on, on); + + if (on) { + /* prevent pt_config_stop() from writing RTIT_CTL */ + event = pt->handle.event; + if (event) + event->hw.config = 0; + } + local_irq_restore(flags); +} +EXPORT_SYMBOL_GPL(intel_pt_handle_vmx); + /* * PMU callbacks */ @@ -992,6 +1041,9 @@ static void pt_event_start(struct perf_event *event, int mode) struct pt *pt = this_cpu_ptr(&pt_ctx); struct pt_buffer *buf = perf_get_aux(&pt->handle); + if (READ_ONCE(pt->vmx_on)) + return; + if (!buf || pt_buffer_is_full(buf, pt)) { event->hw.state = PERF_HES_STOPPED; return; @@ -1014,7 +1066,8 @@ static void pt_event_stop(struct perf_event *event, int mode) * see comment in intel_pt_interrupt(). */ ACCESS_ONCE(pt->handle_nmi) = 0; - pt_config_start(false); + + pt_config_stop(event); if (event->hw.state == PERF_HES_STOPPED) return; diff --git a/arch/x86/events/intel/pt.h b/arch/x86/events/intel/pt.h index 336878a5d2..3abb5f5ccc 100644 --- a/arch/x86/events/intel/pt.h +++ b/arch/x86/events/intel/pt.h @@ -65,6 +65,7 @@ enum pt_capabilities { struct pt_pmu { struct pmu pmu; u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES]; + bool vmx; }; /** @@ -107,10 +108,12 @@ struct pt_buffer { * struct pt - per-cpu pt context * @handle: perf output handle * @handle_nmi: do handle PT PMI on this cpu, there's an active event + * @vmx_on: 1 if VMX is ON on this cpu */ struct pt { struct perf_output_handle handle; int handle_nmi; + int vmx_on; }; #endif /* __INTEL_PT_H__ */ diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h index 5a2ed3ed2f..f353061bba 100644 --- a/arch/x86/include/asm/perf_event.h +++ b/arch/x86/include/asm/perf_event.h @@ -285,6 +285,10 @@ static inline void perf_events_lapic_init(void) { } static inline void perf_check_microcode(void) { } #endif +#ifdef CONFIG_CPU_SUP_INTEL + extern void intel_pt_handle_vmx(int on); +#endif + #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_AMD) extern void amd_pmu_enable_virt(void); extern void amd_pmu_disable_virt(void); diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 1735ae9d68..143648c056 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -3075,6 +3075,8 @@ static __init int vmx_disabled_by_bios(void) static void kvm_cpu_vmxon(u64 addr) { + intel_pt_handle_vmx(1); + asm volatile (ASM_VMX_VMXON_RAX : : "a"(&addr), "m"(addr) : "memory", "cc"); @@ -3144,6 +3146,8 @@ static void vmclear_local_loaded_vmcss(void) static void kvm_cpu_vmxoff(void) { asm volatile (__ex(ASM_VMX_VMXOFF) : : : "cc"); + + intel_pt_handle_vmx(0); } static void hardware_disable(void)