From patchwork Fri Sep 23 00:13:52 2022
Subject: [PATCH 1/4] KVM: x86/pmu: Force reprogramming of all counters on PMU filter change
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Aaron Lewis, Like Xu, Wanpeng Li
Date: Fri, 23 Sep 2022 00:13:52 +0000
Message-ID: <20220923001355.3741194-2-seanjc@google.com>
In-Reply-To: <20220923001355.3741194-1-seanjc@google.com>
References: <20220923001355.3741194-1-seanjc@google.com>
X-Mailing-List: kvm@vger.kernel.org

Force vCPUs to reprogram all counters on a PMU filter change to provide
a sane ABI for userspace.  Use the existing KVM_REQ_PMU to do the
programming, and take advantage of the fact that the reprogram_pmi
bitmap fits in a u64 to set all bits in a single atomic update.  Note,
setting the bitmap and making the request needs to be done _after_ the
SRCU synchronization to ensure that vCPUs will reprogram using the new
filter.

KVM's current "lazy" approach is confusing and non-deterministic.  It's
confusing because, from a developer perspective, the code is buggy as
it makes zero sense to let userspace modify the filter but then not
actually enforce the new filter.  The lazy approach is non-deterministic
because KVM enforces the filter whenever a counter is reprogrammed, not
just on guest WRMSRs, i.e. a guest might gain/lose access to an event at
random times depending on what is going on in the host.

Note, the resulting behavior is still non-deterministic while the filter
is in flux.  If userspace wants to guarantee deterministic behavior, all
vCPUs should be paused during the filter update.

Fixes: 66bb8a065f5a ("KVM: x86: PMU Event Filter")
Cc: Aaron Lewis
Cc: Jim Mattson
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h | 11 ++++++++++-
 arch/x86/kvm/pmu.c              | 15 +++++++++++++--
 2 files changed, 23 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b3ce723efb43..462f041ede9f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -519,7 +519,16 @@ struct kvm_pmu {
 	struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
 	struct kvm_pmc fixed_counters[KVM_PMC_MAX_FIXED];
 	struct irq_work irq_work;
-	DECLARE_BITMAP(reprogram_pmi, X86_PMC_IDX_MAX);
+
+	/*
+	 * Overlay the bitmap with a 64-bit atomic so that all bits can be
+	 * set in a single access, e.g. to reprogram all counters when the PMU
+	 * filter changes.
+	 */
+	union {
+		DECLARE_BITMAP(reprogram_pmi, X86_PMC_IDX_MAX);
+		atomic64_t __reprogram_pmi;
+	};
 	DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
 	DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index d9b9a0f0db17..4504987cbbe2 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -577,6 +577,8 @@ EXPORT_SYMBOL_GPL(kvm_pmu_trigger_event);
 int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_pmu_event_filter tmp, *filter;
+	struct kvm_vcpu *vcpu;
+	unsigned long i;
 	size_t size;
 	int r;

@@ -613,9 +615,18 @@ int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
 	mutex_lock(&kvm->lock);
 	filter = rcu_replace_pointer(kvm->arch.pmu_event_filter, filter,
 				     mutex_is_locked(&kvm->lock));
-	mutex_unlock(&kvm->lock);
-
 	synchronize_srcu_expedited(&kvm->srcu);
+
+	BUILD_BUG_ON(sizeof(((struct kvm_pmu *)0)->reprogram_pmi) >
+		     sizeof(((struct kvm_pmu *)0)->__reprogram_pmi));
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		atomic64_set(&vcpu_to_pmu(vcpu)->__reprogram_pmi, -1ull);
+
+	kvm_make_all_cpus_request(kvm, KVM_REQ_PMU);
+
+	mutex_unlock(&kvm->lock);
+
 	r = 0;
 cleanup:
 	kfree(filter);
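
For readers unfamiliar with the union trick above: the bitmap and the
atomic occupy the same storage, so one atomic store marks every counter
at once.  A minimal standalone sketch of the same idea, using C11
atomics in place of the kernel's atomic64_t (the struct and names below
are illustrative, not kernel API), assuming a 64-bit target where
unsigned long is 64 bits -- the same assumption the BUILD_BUG_ON above
guards in the kernel:

	#include <stdatomic.h>
	#include <stdint.h>
	#include <stdio.h>

	#define X86_PMC_IDX_MAX 64

	struct toy_pmu {
		union {
			/* DECLARE_BITMAP(reprogram_pmi, 64) expands to this: */
			unsigned long reprogram_pmi[X86_PMC_IDX_MAX / 64];
			_Atomic uint64_t __reprogram_pmi;
		};
	};

	int main(void)
	{
		struct toy_pmu pmu = { .reprogram_pmi = { 0 } };

		/* One atomic store marks all 64 counters for reprogramming... */
		atomic_store(&pmu.__reprogram_pmi, -1ull);

		/* ...versus looping over 64 individual set_bit() calls. */
		printf("reprogram_pmi = %#lx\n", pmu.reprogram_pmi[0]);
		return 0;
	}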
From patchwork Fri Sep 23 00:13:53 2022
Subject: [PATCH 2/4] KVM: x86/pmu: Clear "reprogram" bit if counter is disabled or disallowed
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Aaron Lewis, Like Xu, Wanpeng Li
Date: Fri, 23 Sep 2022 00:13:53 +0000
Message-ID: <20220923001355.3741194-3-seanjc@google.com>
In-Reply-To: <20220923001355.3741194-1-seanjc@google.com>
References: <20220923001355.3741194-1-seanjc@google.com>
X-Mailing-List: kvm@vger.kernel.org

When reprogramming a counter, clear the counter's "reprogram pending"
bit if the counter is disabled (by the guest) or is disallowed (by the
userspace filter).  In both cases, there's no need to re-attempt
programming on the next coincident KVM_REQ_PMU, as enabling the counter
by either method will trigger reprogramming.

Signed-off-by: Sean Christopherson
---
 arch/x86/kvm/pmu.c | 38 ++++++++++++++++++++++++--------------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 4504987cbbe2..4cd99320019b 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -150,9 +150,9 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 	__kvm_perf_overflow(pmc, true);
 }

-static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
-				  u64 config, bool exclude_user,
-				  bool exclude_kernel, bool intr)
+static int pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type, u64 config,
+				 bool exclude_user, bool exclude_kernel,
+				 bool intr)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	struct perf_event *event;
@@ -204,14 +204,14 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
 	if (IS_ERR(event)) {
 		pr_debug_ratelimited("kvm_pmu: event creation failed %ld for pmc->idx = %d\n",
 				     PTR_ERR(event), pmc->idx);
-		return;
+		return PTR_ERR(event);
 	}

 	pmc->perf_event = event;
 	pmc_to_pmu(pmc)->event_count++;
-	clear_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi);
 	pmc->is_paused = false;
 	pmc->intr = intr || pebs;
+	return 0;
 }

 static void pmc_pause_counter(struct kvm_pmc *pmc)
@@ -245,7 +245,6 @@ static bool pmc_resume_counter(struct kvm_pmc *pmc)
 	perf_event_enable(pmc->perf_event);
 	pmc->is_paused = false;

-	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
 	return true;
 }

@@ -303,10 +302,10 @@ void reprogram_counter(struct kvm_pmc *pmc)
 	pmc_pause_counter(pmc);

 	if (!pmc_speculative_in_use(pmc) || !pmc_is_enabled(pmc))
-		return;
+		goto reprogram_complete;

 	if (!check_pmu_event_filter(pmc))
-		return;
+		goto reprogram_complete;

 	if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
 		printk_once("kvm pmu: pin control bit is ignored\n");
@@ -324,16 +323,27 @@ void reprogram_counter(struct kvm_pmc *pmc)
 	}

 	if (pmc->current_config == new_config && pmc_resume_counter(pmc))
-		return;
+		goto reprogram_complete;

 	pmc_release_perf_event(pmc);

 	pmc->current_config = new_config;
-	pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
-			      (eventsel & pmu->raw_event_mask),
-			      !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
-			      !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
-			      eventsel & ARCH_PERFMON_EVENTSEL_INT);
+
+	/*
+	 * If reprogramming fails, e.g. due to contention, leave the counter's
+	 * reprogram bit set, i.e. opportunistically try again on the next PMU
+	 * refresh.  Don't make a new request as doing so can stall the guest
+	 * if reprogramming repeatedly fails.
+	 */
+	if (pmc_reprogram_counter(pmc, PERF_TYPE_RAW,
+				  (eventsel & pmu->raw_event_mask),
+				  !(eventsel & ARCH_PERFMON_EVENTSEL_USR),
+				  !(eventsel & ARCH_PERFMON_EVENTSEL_OS),
+				  eventsel & ARCH_PERFMON_EVENTSEL_INT))
+		return;
+
+reprogram_complete:
+	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
 }
 EXPORT_SYMBOL_GPL(reprogram_counter);
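
The resulting control flow is worth spelling out: the pending bit is
cleared both when programming succeeds and when the counter is disabled
or filtered (nothing to retry), and is left set only when programming
itself fails.  A minimal sketch of that pattern, with stand-in names
(toy_*, 'pending') rather than the real KVM helpers:

	#include <stdbool.h>

	static bool pending;	/* stands in for the counter's reprogram_pmi bit */

	static int toy_program(bool fail)
	{
		return fail ? -1 : 0;	/* mimics pmc_reprogram_counter()'s return */
	}

	static void toy_reprogram(bool disabled, bool fail)
	{
		if (disabled)
			goto complete;	/* no retry: re-enabling reprograms anyway */

		if (toy_program(fail))
			return;		/* failure: leave 'pending' set, retry on
					 * the next KVM_REQ_PMU refresh */
	complete:
		pending = false;	/* success or "nothing to do": clear the bit */
	}

	int main(void)
	{
		pending = true;
		toy_reprogram(false, true);	/* fails: pending stays set */
		toy_reprogram(false, false);	/* succeeds: pending cleared */
		return pending;			/* exits 0 */
	}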
From patchwork Fri Sep 23 00:13:54 2022
Subject: [PATCH 3/4] KVM: x86/pmu: Defer reprogram_counter() to kvm_pmu_handle_event()
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Aaron Lewis, Like Xu, Wanpeng Li
Date: Fri, 23 Sep 2022 00:13:54 +0000
Message-ID: <20220923001355.3741194-4-seanjc@google.com>
In-Reply-To: <20220923001355.3741194-1-seanjc@google.com>
References: <20220923001355.3741194-1-seanjc@google.com>
X-Mailing-List: kvm@vger.kernel.org

From: Like Xu

Batch reprogramming PMU counters by setting KVM_REQ_PMU and thus
deferring reprogramming to kvm_pmu_handle_event() to avoid reprogramming
a counter multiple times during a single VM-Exit.

Deferring programming will also allow KVM to fix a bug where immediately
reprogramming a counter can result in sleeping (taking a mutex) while
interrupts are disabled in the VM-Exit fastpath.

Introduce kvm_pmu_request_counter_reprogam() to make it obvious that KVM
is _requesting_ a reprogram and not actually doing the reprogram.

Opportunistically refine related comments to avoid misunderstandings.

Signed-off-by: Like Xu
Link: https://lore.kernel.org/r/20220831085328.45489-5-likexu@tencent.com
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/pmu.c              | 17 ++++++++++++-----
 arch/x86/kvm/pmu.h              |  6 +++++-
 arch/x86/kvm/svm/pmu.c          |  2 +-
 arch/x86/kvm/vmx/pmu_intel.c    |  6 +++---
 5 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 462f041ede9f..12dcfc9330e7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -493,6 +493,7 @@ struct kvm_pmc {
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
 	/*
+	 * only for creating or reusing perf_event,
 	 * eventsel value for general purpose counters,
 	 * ctrl value for fixed counters.
 	 */

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 4cd99320019b..d8330e6064ab 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -101,7 +101,11 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	bool skip_pmi = false;

-	/* Ignore counters that have been reprogrammed already. */
+	/*
+	 * Ignore overflow events for counters that are scheduled to be
+	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
+	 * handling of a related guest WRMSR.
+	 */
 	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
 		return;

@@ -292,7 +296,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
 	return allow_event;
 }

-void reprogram_counter(struct kvm_pmc *pmc)
+static void reprogram_counter(struct kvm_pmc *pmc)
 {
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	u64 eventsel = pmc->eventsel;
@@ -345,7 +349,6 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 reprogram_complete:
 	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
 }
-EXPORT_SYMBOL_GPL(reprogram_counter);

 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 {
@@ -355,10 +358,11 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
 	for_each_set_bit(bit, pmu->reprogram_pmi, X86_PMC_IDX_MAX) {
 		struct kvm_pmc *pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, bit);

-		if (unlikely(!pmc || !pmc->perf_event)) {
+		if (unlikely(!pmc)) {
 			clear_bit(bit, pmu->reprogram_pmi);
 			continue;
 		}
+
 		reprogram_counter(pmc);
 	}

@@ -552,12 +556,15 @@ static inline bool eventsel_match_perf_hw_id(struct kvm_pmc *pmc,
 static inline bool cpl_is_matched(struct kvm_pmc *pmc)
 {
 	bool select_os, select_user;
-	u64 config = pmc->current_config;
+	u64 config;

 	if (pmc_is_gp(pmc)) {
+		config = pmc->eventsel;
 		select_os = config & ARCH_PERFMON_EVENTSEL_OS;
 		select_user = config & ARCH_PERFMON_EVENTSEL_USR;
 	} else {
+		config = fixed_ctrl_field(pmc_to_pmu(pmc)->fixed_ctr_ctrl,
+					  pmc->idx - INTEL_PMC_IDX_FIXED);
 		select_os = config & 0x1;
 		select_user = config & 0x2;
 	}

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 5cc5721f260b..85ff3c0588ba 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -183,7 +183,11 @@ static inline void kvm_init_pmu_capability(void)
 					     KVM_PMC_MAX_FIXED);
 }

-void reprogram_counter(struct kvm_pmc *pmc);
+static inline void kvm_pmu_request_counter_reprogam(struct kvm_pmc *pmc)
+{
+	set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi);
+	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
+}

 void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index b68956299fa8..041aa898e1bc 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -159,7 +159,7 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 		data &= ~pmu->reserved_bits;
 		if (data != pmc->eventsel) {
 			pmc->eventsel = data;
-			reprogram_counter(pmc);
+			kvm_pmu_request_counter_reprogam(pmc);
 		}
 		return 0;
 	}

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 25b70a85bef5..e38518afc265 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -52,7 +52,7 @@ static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
 		pmc = get_fixed_pmc(pmu, MSR_CORE_PERF_FIXED_CTR0 + i);

 		__set_bit(INTEL_PMC_IDX_FIXED + i, pmu->pmc_in_use);
-		reprogram_counter(pmc);
+		kvm_pmu_request_counter_reprogam(pmc);
 	}
 }
@@ -76,7 +76,7 @@ static void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
 	for_each_set_bit(bit, (unsigned long *)&diff, X86_PMC_IDX_MAX) {
 		pmc = intel_pmc_idx_to_pmc(pmu, bit);
 		if (pmc)
-			reprogram_counter(pmc);
+			kvm_pmu_request_counter_reprogam(pmc);
 	}
 }
@@ -477,7 +477,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			reserved_bits ^= HSW_IN_TX_CHECKPOINTED;
 			if (!(data & reserved_bits)) {
 				pmc->eventsel = data;
-				reprogram_counter(pmc);
+				kvm_pmu_request_counter_reprogam(pmc);
 				return 0;
 			}
 		} else if (intel_pmu_handle_lbr_msrs_access(vcpu, msr_info, false))
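
The payoff of the request pattern is coalescing: any number of
kvm_pmu_request_counter_reprogam() calls during one VM-Exit fold into a
single reprogram per counter when the request is serviced.  A toy
single-threaded model of the set-bit-plus-request scheme (stand-in
names, not KVM API; the real bitmap and request are per-vCPU and
atomic):

	#include <stdint.h>
	#include <stdio.h>

	static uint64_t reprogram_pmi;	/* models pmu->reprogram_pmi */

	static void toy_request_reprogram(int idx)
	{
		reprogram_pmi |= 1ull << idx;	/* set_bit() + KVM_REQ_PMU */
	}

	static void toy_handle_event(void)	/* models kvm_pmu_handle_event() */
	{
		while (reprogram_pmi) {
			int idx = __builtin_ctzll(reprogram_pmi);

			reprogram_pmi &= ~(1ull << idx);
			printf("reprogramming counter %d once\n", idx);
		}
	}

	int main(void)
	{
		toy_request_reprogram(3);	/* e.g. guest WRMSR to an eventsel */
		toy_request_reprogram(3);	/* e.g. global ctrl change, same exit */
		toy_handle_event();		/* prints one line for counter 3 */
		return 0;
	}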
From patchwork Fri Sep 23 00:13:55 2022
Subject: [PATCH 4/4] KVM: x86/pmu: Defer counter emulated overflow via pmc->prev_counter
From: Sean Christopherson
To: Sean Christopherson, Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Aaron Lewis, Like Xu, Wanpeng Li
Date: Fri, 23 Sep 2022 00:13:55 +0000
Message-ID: <20220923001355.3741194-5-seanjc@google.com>
In-Reply-To: <20220923001355.3741194-1-seanjc@google.com>
References: <20220923001355.3741194-1-seanjc@google.com>
X-Mailing-List: kvm@vger.kernel.org

From: Like Xu

Defer reprogramming counters and handling overflow via KVM_REQ_PMU when
incrementing counters.
KVM skips emulated WRMSR in the VM-Exit
fastpath, the fastpath runs with IRQs disabled, skipping instructions
can increment and reprogram counters, reprogramming counters can sleep,
and sleeping is disallowed while IRQs are disabled.

 [*] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
 [*] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 2981888, name: CPU 15/KVM
 [*] preempt_count: 1, expected: 0
 [*] RCU nest depth: 0, expected: 0
 [*] INFO: lockdep is turned off.
 [*] irq event stamp: 0
 [*] hardirqs last enabled at (0): [<0000000000000000>] 0x0
 [*] hardirqs last disabled at (0): [] copy_process+0x146a/0x62d0
 [*] softirqs last enabled at (0): [] copy_process+0x14a9/0x62d0
 [*] softirqs last disabled at (0): [<0000000000000000>] 0x0
 [*] Preemption disabled at:
 [*] [] vcpu_enter_guest+0x1001/0x3dc0 [kvm]
 [*] CPU: 17 PID: 2981888 Comm: CPU 15/KVM Kdump: 5.19.0-rc1-g239111db364c-dirty #2
 [*] Call Trace:
 [*]  <TASK>
 [*]  dump_stack_lvl+0x6c/0x9b
 [*]  __might_resched.cold+0x22e/0x297
 [*]  __mutex_lock+0xc0/0x23b0
 [*]  perf_event_ctx_lock_nested+0x18f/0x340
 [*]  perf_event_pause+0x1a/0x110
 [*]  reprogram_counter+0x2af/0x1490 [kvm]
 [*]  kvm_pmu_trigger_event+0x429/0x950 [kvm]
 [*]  kvm_skip_emulated_instruction+0x48/0x90 [kvm]
 [*]  handle_fastpath_set_msr_irqoff+0x349/0x3b0 [kvm]
 [*]  vmx_vcpu_run+0x268e/0x3b80 [kvm_intel]
 [*]  vcpu_enter_guest+0x1d22/0x3dc0 [kvm]

Add a field to kvm_pmc to track the previous counter value in order to
defer overflow detection to kvm_pmu_handle_event() (the counter must be
paused before handling overflow, and that may increment the counter).

Opportunistically shrink sizeof(struct kvm_pmc) a bit.

Suggested-by: Wanpeng Li
Fixes: 9cd803d496e7 ("KVM: x86: Update vPMCs when retiring instructions")
Signed-off-by: Like Xu
Link: https://lore.kernel.org/r/20220831085328.45489-6-likexu@tencent.com
[sean: avoid re-triggering KVM_REQ_PMU on overflow, tweak changelog]
Signed-off-by: Sean Christopherson
---
 arch/x86/include/asm/kvm_host.h |  5 +++--
 arch/x86/kvm/pmu.c              | 32 ++++++++++++++++----------------
 arch/x86/kvm/svm/pmu.c          |  2 +-
 arch/x86/kvm/vmx/pmu_intel.c    |  4 ++--
 4 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 12dcfc9330e7..9639404f2856 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -488,7 +488,10 @@ enum pmc_type {
 struct kvm_pmc {
 	enum pmc_type type;
 	u8 idx;
+	bool is_paused;
+	bool intr;
 	u64 counter;
+	u64 prev_counter;
 	u64 eventsel;
 	struct perf_event *perf_event;
 	struct kvm_vcpu *vcpu;
@@ -498,8 +501,6 @@ struct kvm_pmc {
 	 * ctrl value for fixed counters.
 	 */
 	u64 current_config;
-	bool is_paused;
-	bool intr;
 };

 #define KVM_PMC_MAX_FIXED 3

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index d8330e6064ab..935c9d80ab50 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -101,14 +101,6 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 	struct kvm_pmu *pmu = pmc_to_pmu(pmc);
 	bool skip_pmi = false;

-	/*
-	 * Ignore overflow events for counters that are scheduled to be
-	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
-	 * handling of a related guest WRMSR.
-	 */
-	if (test_and_set_bit(pmc->idx, pmu->reprogram_pmi))
-		return;
-
 	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
 		if (!in_pmi) {
 			/*
@@ -126,7 +118,6 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
 	} else {
 		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
 	}
-	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);

 	if (!pmc->intr || skip_pmi)
 		return;
@@ -151,7 +142,17 @@ static void kvm_perf_overflow(struct perf_event *perf_event,
 {
 	struct kvm_pmc *pmc = perf_event->overflow_handler_context;

+	/*
+	 * Ignore overflow events for counters that are scheduled to be
+	 * reprogrammed, e.g. if a PMI for the previous event races with KVM's
+	 * handling of a related guest WRMSR.
+	 */
+	if (test_and_set_bit(pmc->idx, pmc_to_pmu(pmc)->reprogram_pmi))
+		return;
+
 	__kvm_perf_overflow(pmc, true);
+
+	kvm_make_request(KVM_REQ_PMU, pmc->vcpu);
 }

@@ -311,6 +312,9 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 	if (!check_pmu_event_filter(pmc))
 		goto reprogram_complete;

+	if (pmc->counter < pmc->prev_counter)
+		__kvm_perf_overflow(pmc, false);
+
 	if (eventsel & ARCH_PERFMON_EVENTSEL_PIN_CONTROL)
 		printk_once("kvm pmu: pin control bit is ignored\n");

@@ -348,6 +352,7 @@ static void reprogram_counter(struct kvm_pmc *pmc)
 reprogram_complete:
 	clear_bit(pmc->idx, (unsigned long *)&pmc_to_pmu(pmc)->reprogram_pmi);
+	pmc->prev_counter = 0;
 }

 void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
@@ -536,14 +541,9 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)

 static void kvm_pmu_incr_counter(struct kvm_pmc *pmc)
 {
-	u64 prev_count;
-
-	prev_count = pmc->counter;
+	pmc->prev_counter = pmc->counter;
 	pmc->counter = (pmc->counter + 1) & pmc_bitmask(pmc);
-
-	reprogram_counter(pmc);
-	if (pmc->counter < prev_count)
-		__kvm_perf_overflow(pmc, false);
+	kvm_pmu_request_counter_reprogam(pmc);
 }

 static inline bool eventsel_match_perf_hw_id(struct kvm_pmc *pmc,

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 041aa898e1bc..2ec420b85d6a 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -211,7 +211,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
 		struct kvm_pmc *pmc = &pmu->gp_counters[i];

 		pmc_stop_counter(pmc);
-		pmc->counter = pmc->eventsel = 0;
+		pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
 	}
 }

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e38518afc265..1bf5d4b00296 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -647,14 +647,14 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu)
 		pmc = &pmu->gp_counters[i];

 		pmc_stop_counter(pmc);
-		pmc->counter = pmc->eventsel = 0;
+		pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
 	}

 	for (i = 0; i < KVM_PMC_MAX_FIXED; i++) {
 		pmc = &pmu->fixed_counters[i];

 		pmc_stop_counter(pmc);
-		pmc->counter = 0;
+		pmc->counter = pmc->prev_counter = 0;
 	}

 	pmu->fixed_ctr_ctrl = pmu->global_ctrl = pmu->global_status = 0;
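
To make the deferred overflow check concrete: kvm_pmu_incr_counter()
only records the pre-increment value, and the wrap test runs later when
the counter is reprogrammed.  A standalone sketch of that detection (the
48-bit width below is an illustrative assumption; the real width comes
from pmc_bitmask()):

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	#define TOY_MASK ((1ull << 48) - 1)	/* assumed counter width */

	struct toy_pmc {
		uint64_t counter, prev_counter;
	};

	static void toy_incr(struct toy_pmc *pmc)
	{
		pmc->prev_counter = pmc->counter;
		pmc->counter = (pmc->counter + 1) & TOY_MASK;
		/* real code: kvm_pmu_request_counter_reprogam(pmc) */
	}

	static bool toy_check_overflow(struct toy_pmc *pmc)
	{
		/* wrapped past the mask iff the new value went backwards */
		bool overflow = pmc->counter < pmc->prev_counter;

		pmc->prev_counter = 0;	/* as reprogram_complete does */
		return overflow;
	}

	int main(void)
	{
		struct toy_pmc pmc = { .counter = TOY_MASK };

		toy_incr(&pmc);	/* wraps to 0 */
		printf("overflow = %d\n", toy_check_overflow(&pmc));
		return 0;
	}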