From patchwork Wed Aug 2 05:16:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 13337648 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF22AC41513 for ; Wed, 2 Aug 2023 05:17:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232356AbjHBFR1 (ORCPT ); Wed, 2 Aug 2023 01:17:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41596 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231378AbjHBFRY (ORCPT ); Wed, 2 Aug 2023 01:17:24 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7013F1BFD; Tue, 1 Aug 2023 22:17:22 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b9c5e07c1bso54640345ad.2; Tue, 01 Aug 2023 22:17:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1690953442; x=1691558242; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5aA5JIT6ardwan6m5SFrBRELKgm9+/graZUkPPo07yI=; b=q7uCK5VordhnMcT6Tz24S/dN/Vn2wrSzLgX2dpILJ6wRV6TK6oN2TyUS6RqCEyV7hL nntU07Yf24yserlGDfyafGLY9SzFAEsKDmfQHHEDv9RqIeqOkmAer9sRMOcONoOdzqpp nOlHhvW5PXBQb62KXzZZZx+QL7xwtHw+88lyK+MRRl70SEKwGGihs/H7LU5KyZimLcuQ G/Lny3nf6gW0DdDIprZEFHbuwDJWIsA/KCtAZnw5+nIe+YXV/PWWRSqqwukDSB9VB+Sq BvQ4VI/gyuSHWg/15j1wrojTrs+hE+kIUS9Adqj1vazvrRYP70DsrTMNUfQYv1Y99+TT +wgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690953442; x=1691558242; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5aA5JIT6ardwan6m5SFrBRELKgm9+/graZUkPPo07yI=; b=MDCbwh42FETez1Fh3+KR3Y7CoU4NBDzbFuW0+qDMRjHKzvkbw13bAeY9cHItkowCrr eBCTsIGjgHAXoG3gPE9us7gZl+LhWukPdf7xUCHy7Nmt7l2QL+zlYgOxtXUKmrsLldSR f/ZQcQVlPBHQyF2QEEdPP963YQiI/O/1VJ4AOaQpJjB5SHy01bOCV8DbEmpA8IgLYWrW YW3QCfzHzobpn7QycSUm91fsMEqLJEbxZPGJ2oVBpbu34Eapd4s+cBhLFOYQ1Qt+vpml kjcWrfSpGb1m4GYAj9Rlg/8avfwXMIRgvjoYofj2jd9huOjYAHovRzThBCLt0D9TPHxl AOwg== X-Gm-Message-State: ABy/qLYFR4LWHz0ZTbYdIharnItlk3PnhK9czg5xJA3He8sGdSD6+0RO 1wzfVqICUNBUVMr8fZCUlK4= X-Google-Smtp-Source: APBJJlHjwCEcvdsQuQrQOVoQmjE+0k0ALlSHusRTo5K46jnkSLwJUSYTWzmsbQWefU2k+AdsFSv5tQ== X-Received: by 2002:a17:903:245:b0:1b8:abe7:5a80 with SMTP id j5-20020a170903024500b001b8abe75a80mr17356754plh.40.1690953441838; Tue, 01 Aug 2023 22:17:21 -0700 (PDT) Received: from localhost.localdomain ([103.7.29.32]) by smtp.gmail.com with ESMTPSA id f8-20020a17090274c800b001bba7002132sm11330446plt.33.2023.08.01.22.17.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Aug 2023 22:17:21 -0700 (PDT) From: Like Xu X-Google-Original-From: Like Xu To: Alex Williamson , Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH v2 1/2] KVM: eventfd: Fix NULL deref irqbypass producer Date: Wed, 2 Aug 2023 13:16:59 +0800 Message-ID: <20230802051700.52321-2-likexu@tencent.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230802051700.52321-1-likexu@tencent.com> References: <20230802051700.52321-1-likexu@tencent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Like Xu Adding guard logic to make irq_bypass_register/unregister_producer() looks for the producer entry based on producer pointer itself instead of pure token matching. As was attempted commit 4f3dbdf47e15 ("KVM: eventfd: fix NULL deref irqbypass consumer"), two different producers may occasionally have two identical eventfd's. In this case, the later producer may unregister the previous one after the registration fails (since they share the same token), then NULL deref incurres in the path of deleting producer from the producers list. Registration should also fail if a registered producer changes its token and registers again via the same producer pointer. Cc: Alex Williamson Signed-off-by: Like Xu Reviewed-by: Alex Williamson --- virt/lib/irqbypass.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c index 28fda42e471b..e0aabbbf27ec 100644 --- a/virt/lib/irqbypass.c +++ b/virt/lib/irqbypass.c @@ -98,7 +98,7 @@ int irq_bypass_register_producer(struct irq_bypass_producer *producer) mutex_lock(&lock); list_for_each_entry(tmp, &producers, node) { - if (tmp->token == producer->token) { + if (tmp->token == producer->token || tmp == producer) { ret = -EBUSY; goto out_err; } @@ -148,7 +148,7 @@ void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) mutex_lock(&lock); list_for_each_entry(tmp, &producers, node) { - if (tmp->token != producer->token) + if (tmp != producer) continue; list_for_each_entry(consumer, &consumers, node) { From patchwork Wed Aug 2 05:17:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Like Xu X-Patchwork-Id: 13337649 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 597B4C001DE for ; Wed, 2 Aug 2023 05:17:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232366AbjHBFR3 (ORCPT ); Wed, 2 Aug 2023 01:17:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232349AbjHBFR0 (ORCPT ); Wed, 2 Aug 2023 01:17:26 -0400 Received: from mail-pf1-x432.google.com (mail-pf1-x432.google.com [IPv6:2607:f8b0:4864:20::432]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4BCD1FCB; Tue, 1 Aug 2023 22:17:24 -0700 (PDT) Received: by mail-pf1-x432.google.com with SMTP id d2e1a72fcca58-68336d06620so6236633b3a.1; Tue, 01 Aug 2023 22:17:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1690953444; x=1691558244; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Hk6fUj5vrqB1StR0ZBHfn/V9Hh3UNf+M83guAs+/pPU=; b=s051tE8aOWfi36qsg4UtJ/jS2IwEtTquP+HqeLv1SyyvNFgh1efdWEYWt68U2bYT6n Pbi21ssR0ZI33uk2gtZnEn3a0zmDec2V1kxUK2fKOVVSRoW+4t6sRJO+ZwTRmebkT/pf 8go+8BWM6QKUDiT88xAcIsK0JjTrX3gqwYpfG25Ffykv2EK5FikmoyvLTc3jMIRkSE3y 7ofHcTMoyoyeyMsAJ5NyOWeXW3TDPZp7Dt1DSrVr8IoyV5hh3nfo9redD5SOx7D14Iuz WXzc00a3Ptneu8vc6IW8G8DaB5f4wDArWK21XkvqzS429qJUtFE5FNH/WEpWPhspg1PM xhYQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690953444; x=1691558244; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Hk6fUj5vrqB1StR0ZBHfn/V9Hh3UNf+M83guAs+/pPU=; b=b7YVtKkyXMLWW+NOzfYYUpPeVhqZjKV5J40DYiL50su8j83x/uNUCe9s1DRBFV1wUA giLLjtLr/VmHBAQeck5k3dRNyQ0/0/DfRMrx/JW+Esl/zpfVZz+rd1noXblF5dNCs173 JU5wvWjOyxPHONus///C+CnbMEevoFxSM9o6YaXBF6R16c/3DixBlsCc7Q0JOVIEHeVQ MTnxNzHPmnxu4Jb8WhtVPypswh4czV6pMQXtuPbNmncywGnnSCwT7AQ55ELQNA5AZjhS VOCVPHjI4JJus3pf946inL8bx6rjqKHoVjWxBQ6PckkCdyhrmi9HwVVn/zaUT6nE2rKe 31Hg== X-Gm-Message-State: ABy/qLYHn5ClNEB48JOU4jRGnEWXoxY/GZ9JJMbla5yKSt4VJTPpDsr9 G1ZOIrb0drh8SkTQrswrZO8= X-Google-Smtp-Source: APBJJlFDGk1Dpg2sxfixYBpiz+h6GbfLLHuV0CQXxQNbKALUWzt3DQmZAIuEaR0avRh5latF1eajmw== X-Received: by 2002:a05:6a20:3d02:b0:13c:a02a:9804 with SMTP id y2-20020a056a203d0200b0013ca02a9804mr19722001pzi.32.1690953443713; Tue, 01 Aug 2023 22:17:23 -0700 (PDT) Received: from localhost.localdomain ([103.7.29.32]) by smtp.gmail.com with ESMTPSA id f8-20020a17090274c800b001bba7002132sm11330446plt.33.2023.08.01.22.17.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Aug 2023 22:17:23 -0700 (PDT) From: Like Xu X-Google-Original-From: Like Xu To: Alex Williamson , Paolo Bonzini Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org Subject: [PATCH v2 2/2] KVM: irqbypass: Convert producers/consumers single linked list to XArray Date: Wed, 2 Aug 2023 13:17:00 +0800 Message-ID: <20230802051700.52321-3-likexu@tencent.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230802051700.52321-1-likexu@tencent.com> References: <20230802051700.52321-1-likexu@tencent.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Like Xu Replace producers/consumers linked list with XArray. There are no changes in functionality, but lookup performance has been improved. The producers and consumers in current IRQ bypass manager are stored in simple linked lists, and a single mutex is held while traversing the lists and connecting a consumer to a producer (and vice versa). With this design and implementation, if there are a large number of KVM agents concurrently creating irqfds and all requesting to register their irqfds in the global consumers list, the global mutex contention will exponentially increase the avg wait latency, which is no longer tolerable in modern systems with a large number of CPU cores. For example: the wait time latency to acquire the mutex in a stress test where 174000 irqfds were created concurrently on an 2.70GHz ICX w/ 144 cores: - avg = 117.855314 ms - min = 20 ns - max = 11428.340858 ms To reduce latency introduced by the irq_bypass_register_consumer() in the above usage scenario, the data structure XArray and its normal API is applied to track the producers and consumers so that lookups don't require a linear walk since the "tokens" used to match producers and consumers are just kernel pointers. Thanks to the nature of XArray (more memory-efficient, parallelisable and cache friendly), the latecny is significantly reduced (compared to list and hlist proposal) under the same environment and testing: - avg = 314 ns - min = 124 ns - max = 47637 ns In this conversion, the non-NULL opaque token to match between producer and consumer () is used as the XArray index. The list_for_each_entry() is replaced by xa_load(), and list_add/del() is replaced by xa_store/erase(). The list_head member for linked list is removed, along with comments. Cc: Alex Williamson Reported-by: Yong He Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217379 Suggested-by: Sean Christopherson Suggested-by: Paolo Bonzini Signed-off-by: Like Xu --- include/linux/irqbypass.h | 8 +-- virt/lib/irqbypass.c | 127 +++++++++++++++++++------------------- 2 files changed, 63 insertions(+), 72 deletions(-) diff --git a/include/linux/irqbypass.h b/include/linux/irqbypass.h index 9bdb2a781841..dbcc1b4d0ccf 100644 --- a/include/linux/irqbypass.h +++ b/include/linux/irqbypass.h @@ -8,14 +8,12 @@ #ifndef IRQBYPASS_H #define IRQBYPASS_H -#include - struct irq_bypass_consumer; /* * Theory of operation * - * The IRQ bypass manager is a simple set of lists and callbacks that allows + * The IRQ bypass manager is a simple set of xarrays and callbacks that allows * IRQ producers (ex. physical interrupt sources) to be matched to IRQ * consumers (ex. virtualization hardware that allows IRQ bypass or offload) * via a shared token (ex. eventfd_ctx). Producers and consumers register @@ -30,7 +28,6 @@ struct irq_bypass_consumer; /** * struct irq_bypass_producer - IRQ bypass producer definition - * @node: IRQ bypass manager private list management * @token: opaque token to match between producer and consumer (non-NULL) * @irq: Linux IRQ number for the producer device * @add_consumer: Connect the IRQ producer to an IRQ consumer (optional) @@ -43,7 +40,6 @@ struct irq_bypass_consumer; * for a physical device assigned to a VM. */ struct irq_bypass_producer { - struct list_head node; void *token; int irq; int (*add_consumer)(struct irq_bypass_producer *, @@ -56,7 +52,6 @@ struct irq_bypass_producer { /** * struct irq_bypass_consumer - IRQ bypass consumer definition - * @node: IRQ bypass manager private list management * @token: opaque token to match between producer and consumer (non-NULL) * @add_producer: Connect the IRQ consumer to an IRQ producer * @del_producer: Disconnect the IRQ consumer from an IRQ producer @@ -69,7 +64,6 @@ struct irq_bypass_producer { * portions of the interrupt handling to the VM. */ struct irq_bypass_consumer { - struct list_head node; void *token; int (*add_producer)(struct irq_bypass_consumer *, struct irq_bypass_producer *); diff --git a/virt/lib/irqbypass.c b/virt/lib/irqbypass.c index e0aabbbf27ec..3f8736951e92 100644 --- a/virt/lib/irqbypass.c +++ b/virt/lib/irqbypass.c @@ -15,15 +15,15 @@ */ #include -#include #include #include +#include MODULE_LICENSE("GPL v2"); MODULE_DESCRIPTION("IRQ bypass manager utility module"); -static LIST_HEAD(producers); -static LIST_HEAD(consumers); +static DEFINE_XARRAY(producers); +static DEFINE_XARRAY(consumers); static DEFINE_MUTEX(lock); /* @lock must be held when calling connect */ @@ -78,11 +78,12 @@ static void __disconnect(struct irq_bypass_producer *prod, * irq_bypass_register_producer - register IRQ bypass producer * @producer: pointer to producer structure * - * Add the provided IRQ producer to the list of producers and connect - * with any matching token found on the IRQ consumers list. + * Add the provided IRQ producer to the xarray of producers and connect + * with any matching token found on the IRQ consumers xarray. */ int irq_bypass_register_producer(struct irq_bypass_producer *producer) { + unsigned long token = (unsigned long)producer->token; struct irq_bypass_producer *tmp; struct irq_bypass_consumer *consumer; int ret; @@ -97,24 +98,23 @@ int irq_bypass_register_producer(struct irq_bypass_producer *producer) mutex_lock(&lock); - list_for_each_entry(tmp, &producers, node) { - if (tmp->token == producer->token || tmp == producer) { - ret = -EBUSY; + tmp = xa_load(&producers, token); + if (tmp || tmp == producer) { + ret = -EBUSY; + goto out_err; + } + + ret = xa_err(xa_store(&producers, token, producer, GFP_KERNEL)); + if (ret) + goto out_err; + + consumer = xa_load(&consumers, token); + if (consumer) { + ret = __connect(producer, consumer); + if (ret) goto out_err; - } } - list_for_each_entry(consumer, &consumers, node) { - if (consumer->token == producer->token) { - ret = __connect(producer, consumer); - if (ret) - goto out_err; - break; - } - } - - list_add(&producer->node, &producers); - mutex_unlock(&lock); return 0; @@ -129,11 +129,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_register_producer); * irq_bypass_unregister_producer - unregister IRQ bypass producer * @producer: pointer to producer structure * - * Remove a previously registered IRQ producer from the list of producers + * Remove a previously registered IRQ producer from the xarray of producers * and disconnect it from any connected IRQ consumer. */ void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) { + unsigned long token = (unsigned long)producer->token; struct irq_bypass_producer *tmp; struct irq_bypass_consumer *consumer; @@ -143,24 +144,18 @@ void irq_bypass_unregister_producer(struct irq_bypass_producer *producer) might_sleep(); if (!try_module_get(THIS_MODULE)) - return; /* nothing in the list anyway */ + return; /* nothing in the xarray anyway */ mutex_lock(&lock); - list_for_each_entry(tmp, &producers, node) { - if (tmp != producer) - continue; + tmp = xa_load(&producers, token); + if (tmp == producer) { + consumer = xa_load(&consumers, token); + if (consumer) + __disconnect(producer, consumer); - list_for_each_entry(consumer, &consumers, node) { - if (consumer->token == producer->token) { - __disconnect(producer, consumer); - break; - } - } - - list_del(&producer->node); + xa_erase(&producers, token); module_put(THIS_MODULE); - break; } mutex_unlock(&lock); @@ -173,11 +168,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_unregister_producer); * irq_bypass_register_consumer - register IRQ bypass consumer * @consumer: pointer to consumer structure * - * Add the provided IRQ consumer to the list of consumers and connect - * with any matching token found on the IRQ producer list. + * Add the provided IRQ consumer to the xarray of consumers and connect + * with any matching token found on the IRQ producer xarray. */ int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer) { + unsigned long token = (unsigned long)consumer->token; struct irq_bypass_consumer *tmp; struct irq_bypass_producer *producer; int ret; @@ -193,24 +189,23 @@ int irq_bypass_register_consumer(struct irq_bypass_consumer *consumer) mutex_lock(&lock); - list_for_each_entry(tmp, &consumers, node) { - if (tmp->token == consumer->token || tmp == consumer) { - ret = -EBUSY; + tmp = xa_load(&consumers, token); + if (tmp || tmp == consumer) { + ret = -EBUSY; + goto out_err; + } + + ret = xa_err(xa_store(&consumers, token, consumer, GFP_KERNEL)); + if (ret) + goto out_err; + + producer = xa_load(&producers, token); + if (producer) { + ret = __connect(producer, consumer); + if (ret) goto out_err; - } } - list_for_each_entry(producer, &producers, node) { - if (producer->token == consumer->token) { - ret = __connect(producer, consumer); - if (ret) - goto out_err; - break; - } - } - - list_add(&consumer->node, &consumers); - mutex_unlock(&lock); return 0; @@ -225,11 +220,12 @@ EXPORT_SYMBOL_GPL(irq_bypass_register_consumer); * irq_bypass_unregister_consumer - unregister IRQ bypass consumer * @consumer: pointer to consumer structure * - * Remove a previously registered IRQ consumer from the list of consumers + * Remove a previously registered IRQ consumer from the xarray of consumers * and disconnect it from any connected IRQ producer. */ void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) { + unsigned long token = (unsigned long)consumer->token; struct irq_bypass_consumer *tmp; struct irq_bypass_producer *producer; @@ -239,24 +235,18 @@ void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) might_sleep(); if (!try_module_get(THIS_MODULE)) - return; /* nothing in the list anyway */ + return; /* nothing in the xarray anyway */ mutex_lock(&lock); - list_for_each_entry(tmp, &consumers, node) { - if (tmp != consumer) - continue; + tmp = xa_load(&consumers, token); + if (tmp == consumer) { + producer = xa_load(&producers, token); + if (producer) + __disconnect(producer, consumer); - list_for_each_entry(producer, &producers, node) { - if (producer->token == consumer->token) { - __disconnect(producer, consumer); - break; - } - } - - list_del(&consumer->node); + xa_erase(&consumers, token); module_put(THIS_MODULE); - break; } mutex_unlock(&lock); @@ -264,3 +254,10 @@ void irq_bypass_unregister_consumer(struct irq_bypass_consumer *consumer) module_put(THIS_MODULE); } EXPORT_SYMBOL_GPL(irq_bypass_unregister_consumer); + +static void __exit irqbypass_exit(void) +{ + xa_destroy(&producers); + xa_destroy(&consumers); +} +module_exit(irqbypass_exit);