From patchwork Fri Feb 2 21:53:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7995C48291 for ; Fri, 2 Feb 2024 21:54:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1Tp-00088b-L1; Fri, 02 Feb 2024 16:53:49 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1To-00088R-Ds for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:53:48 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1Tm-0004pw-Kp for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:53:48 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910825; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+sTK7SxvFmjL3/R7XiBJpkSqtBZ/tOmfZKghGUy0GP0=; b=FKZkq/KTqu5o7A592LszV/G96PrVndBOV43KurKeXG9rq/6OtXIYPKQKW8JIVdAcd+fqVf 89tmwE8ijFjFVNz0AeIJUR1RU58gIodFh7VdeEjYZXgGCHNpe7cVvlSqETd19ztDXxgpSQ sbEx9/IZNUJJ3E8laGRBxOjOjkVg900= Received: from mimecast-mx02.redhat.com (mx-ext.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-56-v1chVkfNOXmrSDfhcaI1TA-1; Fri, 02 Feb 2024 
16:53:43 -0500 X-MC-Unique: v1chVkfNOXmrSDfhcaI1TA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5B0243806712; Fri, 2 Feb 2024 21:53:43 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 30A162166B31; Fri, 2 Feb 2024 21:53:41 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 01/15] libvhost-user: Fix msg_region->userspace_addr computation Date: Fri, 2 Feb 2024 22:53:18 +0100 Message-ID: <20240202215332.118728-2-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We barely had mmap_offset set in the past. 
With virtio-mem and dynamic-memslots that will change.

In vu_add_mem_reg() and vu_set_mem_table_exec_postcopy(), we are
performing pointer arithmetic, which is wrong. Let's simply
use dev_region->mmap_addr instead of "void *mmap_addr".

Fixes: ec94c8e621de ("Support adding individual regions in libvhost-user")
Fixes: 9bb38019942c ("vhost+postcopy: Send address back to qemu")
Cc: Raphael Norwitz
Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index a3b158c671..7e515ed15d 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -800,8 +800,8 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
          * Return the address to QEMU so that it can translate the ufd
          * fault addresses back.
          */
-        msg_region->userspace_addr = (uintptr_t)(mmap_addr +
-                                                 dev_region->mmap_offset);
+        msg_region->userspace_addr = dev_region->mmap_addr +
+                                     dev_region->mmap_offset;
 
         /* Send the message back to qemu with the addresses filled in. */
         vmsg->fd_num = 0;
@@ -969,8 +969,8 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
         /* Return the address to QEMU so that it can translate the ufd
          * fault addresses back.
          */
-        msg_region->userspace_addr = (uintptr_t)(mmap_addr +
-                                                 dev_region->mmap_offset);
+        msg_region->userspace_addr = dev_region->mmap_addr +
+                                     dev_region->mmap_offset;
         close(vmsg->fds[i]);
     }
 

From patchwork Fri Feb 2 21:53:19 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543482
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang, Stefan Hajnoczi,
 Stefano Garzarella, Germano Veit Michel, Raphael Norwitz
Subject: [PATCH v1 02/15] libvhost-user: Dynamically allocate memory for memory slots
Date: Fri, 2 Feb 2024 22:53:19 +0100
Message-ID: <20240202215332.118728-3-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's prepare for increasing VHOST_USER_MAX_RAM_SLOTS by dynamically
allocating dev->regions. We don't have any ABI guarantees (not
dynamically linked), so we can simply change the layout of VuDev.

Let's zero out the memory, just as we used to do.

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 11 +++++++++++
 subprojects/libvhost-user/libvhost-user.h |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 7e515ed15d..8a5a7a2295 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -2171,6 +2171,8 @@ vu_deinit(VuDev *dev)
 
     free(dev->vq);
     dev->vq = NULL;
+    free(dev->regions);
+    dev->regions = NULL;
 }
 
 bool
@@ -2205,9 +2207,18 @@ vu_init(VuDev *dev,
     dev->backend_fd = -1;
     dev->max_queues = max_queues;
 
+    dev->regions = malloc(VHOST_USER_MAX_RAM_SLOTS * sizeof(dev->regions[0]));
+    if (!dev->regions) {
+        DPRINT("%s: failed to malloc mem regions\n", __func__);
+        return false;
+    }
+    memset(dev->regions, 0, VHOST_USER_MAX_RAM_SLOTS * sizeof(dev->regions[0]));
+
     dev->vq = malloc(max_queues * sizeof(dev->vq[0]));
     if (!dev->vq) {
         DPRINT("%s: failed to malloc virtqueues\n", __func__);
+        free(dev->regions);
+        dev->regions = NULL;
         return false;
     }
 
diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index c2352904f0..c882b4e3a2 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -398,7 +398,7 @@ typedef struct VuDevInflightInfo {
 struct VuDev {
     int sock;
     uint32_t nregions;
-    VuDevRegion regions[VHOST_USER_MAX_RAM_SLOTS];
+    VuDevRegion *regions;
     VuVirtq *vq;
     VuDevInflightInfo inflight_info;
     int log_call_fd;

From patchwork Fri Feb 2 21:53:20 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543478
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang, Stefan Hajnoczi,
 Stefano Garzarella, Germano Veit Michel, Raphael Norwitz
Subject: [PATCH v1 03/15] libvhost-user: Bump up VHOST_USER_MAX_RAM_SLOTS to 509
Date: Fri, 2 Feb 2024 22:53:20 +0100
Message-ID: <20240202215332.118728-4-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's support up to 509 mem slots, just like vhost in the kernel usually
does and the
rust vhost-user implementation recently [1] started doing. This is
required to properly support memory hotplug, either using multiple DIMMs
(ACPI supports up to 256) or using virtio-mem.

509 used to be the KVM limit: KVM supported 512 slots, but 3 were used
for internal purposes. Currently, KVM supports more than 512, but it
usually doesn't make use of more than ~260 (i.e., 256 DIMMs + boot memory),
except when other memory devices like PCI devices with BARs are used.
So, 509 seems to work well for vhost in the kernel.

Details can be found in the QEMU change that made virtio-mem consume
up to 256 mem slots across all virtio-mem devices. [2]

509 mem slots imply 509 VMAs/mappings in the worst case (even though,
in practice, with virtio-mem we won't be seeing more than ~260 in most
setups).

With max_map_count under Linux defaulting to 64k, 509 mem slots
still correspond to less than 1% of the maximum number of mappings.
There are plenty left for the application to consume.

[1] https://github.com/rust-vmm/vhost/pull/224
[2] https://lore.kernel.org/all/20230926185738.277351-1-david@redhat.com/

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h
index c882b4e3a2..deb40e77b3 100644
--- a/subprojects/libvhost-user/libvhost-user.h
+++ b/subprojects/libvhost-user/libvhost-user.h
@@ -31,10 +31,12 @@
 #define VHOST_MEMORY_BASELINE_NREGIONS 8
 
 /*
- * Set a reasonable maximum number of ram slots, which will be supported by
- * any architecture.
+ * vhost in the kernel usually supports 509 mem slots. 509 used to be the
+ * KVM limit, it supported 512, but 3 were used for internal purposes. This
+ * limit is sufficient to support many DIMMs and virtio-mem in
+ * "dynamic-memslots" mode.
  */
-#define VHOST_USER_MAX_RAM_SLOTS 32
+#define VHOST_USER_MAX_RAM_SLOTS 509
 
 #define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
 

From patchwork Fri Feb 2 21:53:21 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543474
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang, Stefan Hajnoczi,
 Stefano Garzarella, Germano Veit Michel, Raphael Norwitz
Subject: [PATCH v1 04/15] libvhost-user: Factor out removing all mem regions
Date: Fri, 2 Feb 2024 22:53:21 +0100
Message-ID: <20240202215332.118728-5-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's factor it out. Note that the check for MAP_FAILED was wrong as
we never set mmap_addr if mmap() failed. We'll remove the NULL check
separately.

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 34 ++++++++++++-----------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 8a5a7a2295..d5b3468e43 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -240,6 +240,22 @@ qva_to_va(VuDev *dev, uint64_t qemu_addr)
     return NULL;
 }
 
+static void
+vu_remove_all_mem_regs(VuDev *dev)
+{
+    unsigned int i;
+
+    for (i = 0; i < dev->nregions; i++) {
+        VuDevRegion *r = &dev->regions[i];
+        void *ma = (void *)(uintptr_t)r->mmap_addr;
+
+        if (ma) {
+            munmap(ma, r->size + r->mmap_offset);
+        }
+    }
+    dev->nregions = 0;
+}
+
 static void
 vmsg_close_fds(VhostUserMsg *vmsg)
 {
@@ -1003,14 +1019,7 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
     unsigned int i;
     VhostUserMemory m = vmsg->payload.memory, *memory = &m;
 
-    for (i = 0; i < dev->nregions; i++) {
-        VuDevRegion *r = &dev->regions[i];
-        void *ma = (void *) (uintptr_t) r->mmap_addr;
-
-        if (ma) {
-            munmap(ma, r->size + r->mmap_offset);
-        }
-    }
+    vu_remove_all_mem_regs(dev);
     dev->nregions = memory->nregions;
 
     if (dev->postcopy_listening) {
@@ -2112,14 +2121,7 @@ vu_deinit(VuDev *dev)
 {
     unsigned int i;
 
-    for (i = 0; i < dev->nregions; i++) {
-        VuDevRegion *r = &dev->regions[i];
-        void *m = (void *) (uintptr_t) r->mmap_addr;
-        if (m != MAP_FAILED) {
-            munmap(m, r->size + r->mmap_offset);
-        }
-    }
-    dev->nregions = 0;
+    vu_remove_all_mem_regs(dev);
 
     for (i = 0; i < dev->max_queues; i++) {
         VuVirtq *vq = &dev->vq[i];

From patchwork Fri Feb 2 21:53:22 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543483
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang, Stefan Hajnoczi,
 Stefano Garzarella, Germano Veit Michel, Raphael Norwitz
Subject: [PATCH v1 05/15] libvhost-user: Merge vu_set_mem_table_exec_postcopy() into vu_set_mem_table_exec()
Date: Fri, 2 Feb 2024 22:53:22 +0100
Message-ID: <20240202215332.118728-6-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's reduce some code duplication and prepare for further changes.

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 119 +++++++---------------
 1 file changed, 39 insertions(+), 80 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index d5b3468e43..d9e2214ad2 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -937,95 +937,23 @@ vu_get_shared_object(VuDev *dev, VhostUserMsg *vmsg)
 }
 
 static bool
-vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg *vmsg)
+vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg)
 {
-    unsigned int i;
     VhostUserMemory m = vmsg->payload.memory, *memory = &m;
-    dev->nregions = memory->nregions;
-
-    DPRINT("Nregions: %u\n", memory->nregions);
-    for (i = 0; i < dev->nregions; i++) {
-        void *mmap_addr;
-        VhostUserMemoryRegion *msg_region = &memory->regions[i];
-        VuDevRegion *dev_region = &dev->regions[i];
-
-        DPRINT("Region %d\n", i);
-        DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n",
-               msg_region->guest_phys_addr);
-        DPRINT(" memory_size: 0x%016"PRIx64"\n",
-               msg_region->memory_size);
-        DPRINT(" userspace_addr 0x%016"PRIx64"\n",
-               msg_region->userspace_addr);
-        DPRINT(" mmap_offset 0x%016"PRIx64"\n",
-               msg_region->mmap_offset);
-
-        dev_region->gpa = msg_region->guest_phys_addr;
-        dev_region->size = msg_region->memory_size;
-        dev_region->qva = msg_region->userspace_addr;
-        dev_region->mmap_offset = msg_region->mmap_offset;
+    int prot = PROT_READ | PROT_WRITE;
+    unsigned int i;
 
-        /* We don't use offset argument of mmap() since the
-         * mapped address has to be page aligned, and we use huge
-         * pages.
+ if (dev->postcopy_listening) { + /* * In postcopy we're using PROT_NONE here to catch anyone * accessing it before we userfault */ - mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - PROT_NONE, MAP_SHARED | MAP_NORESERVE, - vmsg->fds[i], 0); - - if (mmap_addr == MAP_FAILED) { - vu_panic(dev, "region mmap error: %s", strerror(errno)); - } else { - dev_region->mmap_addr = (uint64_t)(uintptr_t)mmap_addr; - DPRINT(" mmap_addr: 0x%016"PRIx64"\n", - dev_region->mmap_addr); - } - - /* Return the address to QEMU so that it can translate the ufd - * fault addresses back. - */ - msg_region->userspace_addr = dev_region->mmap_addr + - dev_region->mmap_offset; - close(vmsg->fds[i]); - } - - /* Send the message back to qemu with the addresses filled in */ - vmsg->fd_num = 0; - if (!vu_send_reply(dev, dev->sock, vmsg)) { - vu_panic(dev, "failed to respond to set-mem-table for postcopy"); - return false; - } - - /* Wait for QEMU to confirm that it's registered the handler for the - * faults. - */ - if (!dev->read_msg(dev, dev->sock, vmsg) || - vmsg->size != sizeof(vmsg->payload.u64) || - vmsg->payload.u64 != 0) { - vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table"); - return false; + prot = PROT_NONE; } - /* OK, now we can go and register the memory and generate faults */ - (void)generate_faults(dev); - - return false; -} - -static bool -vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg) -{ - unsigned int i; - VhostUserMemory m = vmsg->payload.memory, *memory = &m; - vu_remove_all_mem_regs(dev); dev->nregions = memory->nregions; - if (dev->postcopy_listening) { - return vu_set_mem_table_exec_postcopy(dev, vmsg); - } - DPRINT("Nregions: %u\n", memory->nregions); for (i = 0; i < dev->nregions; i++) { void *mmap_addr; @@ -1051,8 +979,7 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg) * mapped address has to be page aligned, and we use huge * pages. 
*/ mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, - vmsg->fds[i], 0); + prot, MAP_SHARED | MAP_NORESERVE, vmsg->fds[i], 0); if (mmap_addr == MAP_FAILED) { vu_panic(dev, "region mmap error: %s", strerror(errno)); @@ -1062,9 +989,41 @@ vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg) dev_region->mmap_addr); } + if (dev->postcopy_listening) { + /* + * Return the address to QEMU so that it can translate the ufd + * fault addresses back. + */ + msg_region->userspace_addr = dev_region->mmap_addr + + dev_region->mmap_offset; + } close(vmsg->fds[i]); } + if (dev->postcopy_listening) { + /* Send the message back to qemu with the addresses filled in */ + vmsg->fd_num = 0; + if (!vu_send_reply(dev, dev->sock, vmsg)) { + vu_panic(dev, "failed to respond to set-mem-table for postcopy"); + return false; + } + + /* + * Wait for QEMU to confirm that it's registered the handler for the + * faults. + */ + if (!dev->read_msg(dev, dev->sock, vmsg) || + vmsg->size != sizeof(vmsg->payload.u64) || + vmsg->payload.u64 != 0) { + vu_panic(dev, "failed to receive valid ack for postcopy set-mem-table"); + return false; + } + + /* OK, now we can go and register the memory and generate faults */ + (void)generate_faults(dev); + return false; + } + for (i = 0; i < dev->max_queues; i++) { if (dev->vq[i].vring.desc) { if (map_ring(dev, &dev->vq[i])) { From patchwork Fri Feb 2 21:53:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 78535C48291 for ; Fri, 2 Feb 2024 21:55:17 +0000 (UTC) 
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1U3-0008DJ-2m; Fri, 02 Feb 2024 16:54:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U1-0008Ct-1v for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:01 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1Tz-0004sz-9k for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:00 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910838; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UjO81Bw6iEHfFtOoTrtiZ4pC6BhmP65sUHBmbECDG38=; b=Adoz6RpliJRzssFDKh/P62V0+5muIekqtG15RN0VhFPwOUCEqHs8eItmTgoE4Fq+h76JDJ Oi+N8cD+dCVU7W53T8kmXHqQBxSqHCrTC3zmo3YKR/bZcRlxgLFVNfAXRIuNfk5RNTOtrJ 56PbdCH3dbjG/irMFuYXtLFWcAEzMsg= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-120-fUP7D4M9NUSUuujoq_jgNQ-1; Fri, 02 Feb 2024 16:53:55 -0500 X-MC-Unique: fUP7D4M9NUSUuujoq_jgNQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 13F59800074; Fri, 2 Feb 2024 21:53:55 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 727D52166B31; Fri, 2 Feb 2024 21:53:53 +0000 
(UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 06/15] libvhost-user: Factor out adding a memory region Date: Fri, 2 Feb 2024 22:53:23 +0100 Message-ID: <20240202215332.118728-7-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Let's factor it out, removing quite a bit of code duplication and preparing for further changes. If we fail to mmap a region and panic, we now simply don't add that (broken) region. Note that we now increment dev->nregions only after a memory region has been added successfully, and leave it untouched if anything went wrong.
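For illustration only (this is not part of the patch), the "only count a region once it is mapped" rule can be sketched in isolation. The DemoRegion/DemoDev types, DEMO_MAX_REGIONS, demo_make_backing_fd() and demo_add_region() are hypothetical stand-ins invented for this sketch; they are not libvhost-user API, and the real helper reports failure through vu_panic() rather than a return value.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_MAX_REGIONS 8 /* hypothetical capacity, not the library's limit */

typedef struct DemoRegion {
    uint64_t gpa;
    uint64_t size;
    uint64_t mmap_offset;
    uint64_t mmap_addr;
} DemoRegion;

typedef struct DemoDev {
    DemoRegion regions[DEMO_MAX_REGIONS];
    unsigned int nregions;
} DemoDev;

/* Hypothetical helper: fd of an unlinked temporary file of the given size. */
static int demo_make_backing_fd(size_t size)
{
    FILE *f = tmpfile();
    int fd;

    if (!f) {
        return -1;
    }
    fd = dup(fileno(f));
    fclose(f);
    if (fd >= 0 && ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Map the region first and touch the table only on success, so that
 * entries [0, nregions) always describe valid mappings.
 * Returns 0 on success, -1 if nothing was added.
 */
static int demo_add_region(DemoDev *dev, uint64_t gpa, uint64_t size,
                           uint64_t mmap_offset, int fd)
{
    DemoRegion *r;
    void *addr;

    assert(dev != NULL);
    if (dev->nregions >= DEMO_MAX_REGIONS) {
        return -1;
    }

    /* Like the patch, map from file offset 0 and cover size + mmap_offset. */
    addr = mmap(NULL, size + mmap_offset, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        return -1; /* broken region: not recorded, nregions unchanged */
    }

    r = &dev->regions[dev->nregions];
    r->gpa = gpa;
    r->size = size;
    r->mmap_offset = mmap_offset;
    r->mmap_addr = (uint64_t)(uintptr_t)addr;
    dev->nregions++;
    return 0;
}
```

A caller that passes an invalid fd gets -1 back and sees dev->nregions unchanged, which is the property later patches in this series rely on when they drop the NULL checks.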
Signed-off-by: David Hildenbrand Reviewed-by: Raphael Norwitz --- subprojects/libvhost-user/libvhost-user.c | 168 ++++++++-------------- 1 file changed, 60 insertions(+), 108 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index d9e2214ad2..a2baefe84b 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -256,6 +256,61 @@ vu_remove_all_mem_regs(VuDev *dev) dev->nregions = 0; } +static void +_vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd) +{ + int prot = PROT_READ | PROT_WRITE; + VuDevRegion *r; + void *mmap_addr; + + DPRINT("Adding region %d\n", dev->nregions); + DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n", + msg_region->guest_phys_addr); + DPRINT(" memory_size: 0x%016"PRIx64"\n", + msg_region->memory_size); + DPRINT(" userspace_addr 0x%016"PRIx64"\n", + msg_region->userspace_addr); + DPRINT(" mmap_offset 0x%016"PRIx64"\n", + msg_region->mmap_offset); + + if (dev->postcopy_listening) { + /* + * In postcopy we're using PROT_NONE here to catch anyone + * accessing it before we userfault + */ + prot = PROT_NONE; + } + + /* + * We don't use offset argument of mmap() since the mapped address has + * to be page aligned, and we use huge pages. + */ + mmap_addr = mmap(0, msg_region->memory_size + msg_region->mmap_offset, + prot, MAP_SHARED | MAP_NORESERVE, fd, 0); + if (mmap_addr == MAP_FAILED) { + vu_panic(dev, "region mmap error: %s", strerror(errno)); + return; + } + DPRINT(" mmap_addr: 0x%016"PRIx64"\n", + (uint64_t)(uintptr_t)mmap_addr); + + r = &dev->regions[dev->nregions]; + r->gpa = msg_region->guest_phys_addr; + r->size = msg_region->memory_size; + r->qva = msg_region->userspace_addr; + r->mmap_addr = (uint64_t)(uintptr_t)mmap_addr; + r->mmap_offset = msg_region->mmap_offset; + dev->nregions++; + + if (dev->postcopy_listening) { + /* + * Return the address to QEMU so that it can translate the ufd + * fault addresses back. 
+ */ + msg_region->userspace_addr = r->mmap_addr + r->mmap_offset; + } +} + static void vmsg_close_fds(VhostUserMsg *vmsg) { @@ -727,10 +782,7 @@ generate_faults(VuDev *dev) { static bool vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { int i; - bool track_ramblocks = dev->postcopy_listening; VhostUserMemoryRegion m = vmsg->payload.memreg.region, *msg_region = &m; - VuDevRegion *dev_region = &dev->regions[dev->nregions]; - void *mmap_addr; if (vmsg->fd_num != 1) { vmsg_close_fds(vmsg); @@ -760,69 +812,20 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { * we know all the postcopy client bases have been received, and we * should start generating faults. */ - if (track_ramblocks && + if (dev->postcopy_listening && vmsg->size == sizeof(vmsg->payload.u64) && vmsg->payload.u64 == 0) { (void)generate_faults(dev); return false; } - DPRINT("Adding region: %u\n", dev->nregions); - DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n", - msg_region->guest_phys_addr); - DPRINT(" memory_size: 0x%016"PRIx64"\n", - msg_region->memory_size); - DPRINT(" userspace_addr 0x%016"PRIx64"\n", - msg_region->userspace_addr); - DPRINT(" mmap_offset 0x%016"PRIx64"\n", - msg_region->mmap_offset); - - dev_region->gpa = msg_region->guest_phys_addr; - dev_region->size = msg_region->memory_size; - dev_region->qva = msg_region->userspace_addr; - dev_region->mmap_offset = msg_region->mmap_offset; - - /* - * We don't use offset argument of mmap() since the - * mapped address has to be page aligned, and we use huge - * pages. - */ - if (track_ramblocks) { - /* - * In postcopy we're using PROT_NONE here to catch anyone - * accessing it before we userfault. 
- */ - mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - PROT_NONE, MAP_SHARED | MAP_NORESERVE, - vmsg->fds[0], 0); - } else { - mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - PROT_READ | PROT_WRITE, MAP_SHARED | MAP_NORESERVE, - vmsg->fds[0], 0); - } - - if (mmap_addr == MAP_FAILED) { - vu_panic(dev, "region mmap error: %s", strerror(errno)); - } else { - dev_region->mmap_addr = (uint64_t)(uintptr_t)mmap_addr; - DPRINT(" mmap_addr: 0x%016"PRIx64"\n", - dev_region->mmap_addr); - } - + _vu_add_mem_reg(dev, msg_region, vmsg->fds[0]); close(vmsg->fds[0]); - if (track_ramblocks) { - /* - * Return the address to QEMU so that it can translate the ufd - * fault addresses back. - */ - msg_region->userspace_addr = dev_region->mmap_addr + - dev_region->mmap_offset; - + if (dev->postcopy_listening) { /* Send the message back to qemu with the addresses filled in. */ vmsg->fd_num = 0; DPRINT("Successfully added new region in postcopy\n"); - dev->nregions++; return true; } else { for (i = 0; i < dev->max_queues; i++) { @@ -835,7 +838,6 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { } DPRINT("Successfully added new region\n"); - dev->nregions++; return false; } } @@ -940,63 +942,13 @@ static bool vu_set_mem_table_exec(VuDev *dev, VhostUserMsg *vmsg) { VhostUserMemory m = vmsg->payload.memory, *memory = &m; - int prot = PROT_READ | PROT_WRITE; unsigned int i; - if (dev->postcopy_listening) { - /* - * In postcopy we're using PROT_NONE here to catch anyone - * accessing it before we userfault - */ - prot = PROT_NONE; - } - vu_remove_all_mem_regs(dev); - dev->nregions = memory->nregions; DPRINT("Nregions: %u\n", memory->nregions); - for (i = 0; i < dev->nregions; i++) { - void *mmap_addr; - VhostUserMemoryRegion *msg_region = &memory->regions[i]; - VuDevRegion *dev_region = &dev->regions[i]; - - DPRINT("Region %d\n", i); - DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n", - msg_region->guest_phys_addr); - DPRINT(" memory_size: 0x%016"PRIx64"\n", - 
msg_region->memory_size); - DPRINT(" userspace_addr 0x%016"PRIx64"\n", - msg_region->userspace_addr); - DPRINT(" mmap_offset 0x%016"PRIx64"\n", - msg_region->mmap_offset); - - dev_region->gpa = msg_region->guest_phys_addr; - dev_region->size = msg_region->memory_size; - dev_region->qva = msg_region->userspace_addr; - dev_region->mmap_offset = msg_region->mmap_offset; - - /* We don't use offset argument of mmap() since the - * mapped address has to be page aligned, and we use huge - * pages. */ - mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset, - prot, MAP_SHARED | MAP_NORESERVE, vmsg->fds[i], 0); - - if (mmap_addr == MAP_FAILED) { - vu_panic(dev, "region mmap error: %s", strerror(errno)); - } else { - dev_region->mmap_addr = (uint64_t)(uintptr_t)mmap_addr; - DPRINT(" mmap_addr: 0x%016"PRIx64"\n", - dev_region->mmap_addr); - } - - if (dev->postcopy_listening) { - /* - * Return the address to QEMU so that it can translate the ufd - * fault addresses back. - */ - msg_region->userspace_addr = dev_region->mmap_addr + - dev_region->mmap_offset; - } + for (i = 0; i < memory->nregions; i++) { + _vu_add_mem_reg(dev, &memory->regions[i], vmsg->fds[i]); close(vmsg->fds[i]); } From patchwork Fri Feb 2 21:53:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69E50C4828E for ; Fri, 2 Feb 2024 21:55:31 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1U4-0008EC-VQ; Fri, 02 Feb 2024 16:54:04 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) 
by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U3-0008DK-5I for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:03 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U1-0004tD-MV for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:02 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910841; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3Xz/DNogoDrAuG8jKm4UbtdAbPzzBhEQKM/Cy3B57Pw=; b=GFp+Dwe/m4iPorFbt2VBWyZF/EJQPX2rKhWZfIGBhEZgMW75mOWfP7u06HpTBQecXqOplr D0iNBpPc/PjIwN7UDCRTCeGx9/Vx7hzMjIkQl0bKxS+fqm+3u7R262mz/i5CvtYms8UeK/ VKJUVXXq6JJ9SH0jYakN882rR9UbAsU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-153-kwvOuFcxPzWBtZevY2vGoQ-1; Fri, 02 Feb 2024 16:53:57 -0500 X-MC-Unique: kwvOuFcxPzWBtZevY2vGoQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 16AC5800074; Fri, 2 Feb 2024 21:53:57 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 4E7192166B31; Fri, 2 Feb 2024 21:53:55 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . 
Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 07/15] libvhost-user: No need to check for NULL when unmapping Date: Fri, 2 Feb 2024 22:53:24 +0100 Message-ID: <20240202215332.118728-8-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We never add a memory region if mmap() failed. Therefore, no need to check for NULL. 
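As a rough sketch of why the check can go (illustration only, not code from this patch): if every entry below nregions was recorded only after a successful mmap(), teardown can call munmap() unconditionally. SketchRegion, SketchDev, sketch_add_anon() and sketch_remove_all() are hypothetical names, and anonymous private mappings stand in for the fd-backed shared mappings the library uses.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define SKETCH_MAX_REGIONS 4 /* hypothetical capacity for this sketch */

typedef struct SketchRegion {
    uint64_t size;
    uint64_t mmap_offset;
    uint64_t mmap_addr;
} SketchRegion;

typedef struct SketchDev {
    SketchRegion regions[SKETCH_MAX_REGIONS];
    unsigned int nregions;
} SketchDev;

/* Record a region only after its mapping succeeded. */
static int sketch_add_anon(SketchDev *dev, size_t size)
{
    void *addr;
    SketchRegion *r;

    assert(dev != NULL);
    if (dev->nregions >= SKETCH_MAX_REGIONS) {
        return -1;
    }
    addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        return -1;
    }
    r = &dev->regions[dev->nregions++];
    r->size = size;
    r->mmap_offset = 0;
    r->mmap_addr = (uint64_t)(uintptr_t)addr;
    return 0;
}

/*
 * Unmap every recorded region without testing mmap_addr first: entries
 * below nregions are valid mappings by construction.
 * Returns the number of munmap() calls that failed.
 */
static int sketch_remove_all(SketchDev *dev)
{
    unsigned int i;
    int failures = 0;

    for (i = 0; i < dev->nregions; i++) {
        SketchRegion *r = &dev->regions[i];

        if (munmap((void *)(uintptr_t)r->mmap_addr,
                   r->size + r->mmap_offset) != 0) {
            failures++;
        }
    }
    dev->nregions = 0;
    return failures;
}
```

Note that mmap() signals failure with MAP_FAILED rather than NULL, so the old check only worked because a failed region left mmap_addr at its zero-initialized value.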
Signed-off-by: David Hildenbrand Reviewed-by: Raphael Norwitz --- subprojects/libvhost-user/libvhost-user.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index a2baefe84b..f99c888b48 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -247,11 +247,8 @@ vu_remove_all_mem_regs(VuDev *dev) for (i = 0; i < dev->nregions; i++) { VuDevRegion *r = &dev->regions[i]; - void *ma = (void *)(uintptr_t)r->mmap_addr; - if (ma) { - munmap(ma, r->size + r->mmap_offset); - } + munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset); } dev->nregions = 0; } @@ -888,11 +885,8 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { for (i = 0; i < dev->nregions; i++) { if (reg_equal(&dev->regions[i], msg_region)) { VuDevRegion *r = &dev->regions[i]; - void *ma = (void *) (uintptr_t) r->mmap_addr; - if (ma) { - munmap(ma, r->size + r->mmap_offset); - } + munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset); /* * Shift all affected entries by 1 to close the hole at index i and From patchwork Fri Feb 2 21:53:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 626A1C4828F for ; Fri, 2 Feb 2024 21:55:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1U4-0008E5-JR; Fri, 02 Feb 2024 16:54:04 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U3-0008DN-AA for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:03 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U1-0004tF-PD for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:03 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910841; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PiM3izwPA+L0qhccpaX2lE+3batb7kEf/yeHcfoCS8E=; b=Cl37GK+kyRlcq54KEWzSWVNO17ux8DOZaTji04jQhrzWKZO4iyJVwt4+Yb1FmiJmHT5gq6 9NUlIvwpjsZJ5xlkaC1yDWwBCE9fQWB+EU4Z68JQ0NGiMMMxXmodRM2jQCDT9xhFREW3CG TQrvXVNGg3n6C/E1fAUoitCgQ+l6F8o= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-357-l0-zmGLHMY2YWwUJQmbZSQ-1; Fri, 02 Feb 2024 16:53:59 -0500 X-MC-Unique: l0-zmGLHMY2YWwUJQmbZSQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 18622800074; Fri, 2 Feb 2024 21:53:59 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 7700E2166B31; Fri, 2 Feb 2024 21:53:57 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . 
Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 08/15] libvhost-user: Don't zero out memory for memory regions Date: Fri, 2 Feb 2024 22:53:25 +0100 Message-ID: <20240202215332.118728-9-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org dev->nregions always covers only valid entries. Stop zeroing out other array elements that are unused. Signed-off-by: David Hildenbrand Reviewed-by: Raphael Norwitz --- subprojects/libvhost-user/libvhost-user.c | 7 +------ 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index f99c888b48..e1a1b9df88 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -888,13 +888,9 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset); - /* - * Shift all affected entries by 1 to close the hole at index i and - * zero out the last entry. 
- */ + /* Shift all affected entries by 1 to close the hole at index. */ memmove(dev->regions + i, dev->regions + i + 1, sizeof(VuDevRegion) * (dev->nregions - i - 1)); - memset(dev->regions + dev->nregions - 1, 0, sizeof(VuDevRegion)); DPRINT("Successfully removed a region\n"); dev->nregions--; i--; @@ -2119,7 +2115,6 @@ vu_init(VuDev *dev, DPRINT("%s: failed to malloc mem regions\n", __func__); return false; } - memset(dev->regions, 0, VHOST_USER_MAX_RAM_SLOTS * sizeof(dev->regions[0])); dev->vq = malloc(max_queues * sizeof(dev->vq[0])); if (!dev->vq) { From patchwork Fri Feb 2 21:53:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 463EDC4828F for ; Fri, 2 Feb 2024 21:56:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1U7-0008Ev-JI; Fri, 02 Feb 2024 16:54:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U5-0008EG-Hl for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:05 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U4-0004tp-4g for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:05 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910843; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XQuXq7QcBaKkDuX1oNtcfRrVvti48Kk6OJQDMaKZxmg=; b=fZIW9Ejq8CCmDEtzTk1K3iTctYU9rYAiJKN1rJfbzWHlPu0UooSpYLHKdw4A/yfXHX7Uc3 kgwEvOYvvf7qa+OwkSp7oVv6v9rIAQxYNFQg5UfXP6SPKAzZQ/tj+Xnb2UrknfDcr8Yeoh +ita6h33iMtmAxU8ajdWQMmYH1CGMY4= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-683-MGaoz8YBNJWYX4FpNcsKBQ-1; Fri, 02 Feb 2024 16:54:01 -0500 X-MC-Unique: MGaoz8YBNJWYX4FpNcsKBQ-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 465B885A589; Fri, 2 Feb 2024 21:54:01 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 50E4E2166B31; Fri, 2 Feb 2024 21:53:59 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . 
Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 09/15] libvhost-user: Don't search for duplicates when removing memory regions Date: Fri, 2 Feb 2024 22:53:26 +0100 Message-ID: <20240202215332.118728-10-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We cannot have duplicate memory regions, something would be deeply flawed elsewhere. Let's just stop the search once we found an entry. We'll add more sanity checks when adding memory regions later. Signed-off-by: David Hildenbrand Reviewed-by: Raphael Norwitz --- subprojects/libvhost-user/libvhost-user.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index e1a1b9df88..22154b217f 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -896,8 +896,7 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) { i--; found = true; - - /* Continue the search for eventual duplicates. 
*/ + break; } } From patchwork Fri Feb 2 21:53:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 13543475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 282D8C48298 for ; Fri, 2 Feb 2024 21:54:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rW1UB-0008FG-6R; Fri, 02 Feb 2024 16:54:11 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U9-0008F0-Kk for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:09 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rW1U7-0004uH-Vl for qemu-devel@nongnu.org; Fri, 02 Feb 2024 16:54:09 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706910847; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QGmenbdCAUIMi0LDoLYDRKXhVIbagKvxeclGQjh1d48=; b=KkUVZqKV9uIxTNN9gjCn5MnqWFRDYrmAE5KqrZY1z1zBDzMXf6DRHXFY7LY/7msJuXR9qu +dpeJRcpDFiMvDIlhbgzOFPV3NmqzYd2rRG+1TaaAxZ0nUxwD3DKlHaIp/czhHh5BpLOJG sT8SIrH5H7f35jORupIi+VwfXhuHqP0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id 
us-mta-86-GzfXMkQDPqipNrByfOagQg-1; Fri, 02 Feb 2024 16:54:03 -0500 X-MC-Unique: GzfXMkQDPqipNrByfOagQg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 85DA6185A786; Fri, 2 Feb 2024 21:54:03 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id A6A4E2166B31; Fri, 2 Feb 2024 21:54:01 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 10/15] libvhost-user: Factor out search for memory region by GPA and simplify Date: Fri, 2 Feb 2024 22:53:27 +0100 Message-ID: <20240202215332.118728-11-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.133.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: 
Memory regions cannot overlap, and if we ever hit that case something
would be really flawed.

For example, when vhost code in QEMU decides to increase the size of
memory regions to cover full huge pages, it makes sure to never create
overlaps, and if there would be overlaps, it would bail out.

QEMU commits 48d7c9757749 ("vhost: Merge sections added to temporary
list"), c1ece84e7c93 ("vhost: Huge page align and merge") and
e7b94a84b6cb ("vhost: Allow adjoining regions") added and clarified that
handling and how overlaps are impossible.

Consequently, each GPA can belong to at most one memory region, and
everything else doesn't make sense. Let's factor out our search to
prepare for further changes.

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 79 +++++++++++++----------
 1 file changed, 45 insertions(+), 34 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 22154b217f..d036b54ed0 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -195,30 +195,47 @@ vu_panic(VuDev *dev, const char *msg, ...)
      */
 }
 
+/* Search for a memory region that covers this guest physical address. */
+static VuDevRegion *
+vu_gpa_to_mem_region(VuDev *dev, uint64_t guest_addr)
+{
+    unsigned int i;
+
+    /*
+     * Memory regions cannot overlap in guest physical address space. Each
+     * GPA belongs to exactly one memory region, so there can only be one
+     * match.
+     */
+    for (i = 0; i < dev->nregions; i++) {
+        VuDevRegion *cur = &dev->regions[i];
+
+        if (guest_addr >= cur->gpa && guest_addr < cur->gpa + cur->size) {
+            return cur;
+        }
+    }
+    return NULL;
+}
+
 /* Translate guest physical address to our virtual address. */
 void *
 vu_gpa_to_va(VuDev *dev, uint64_t *plen, uint64_t guest_addr)
 {
-    unsigned int i;
+    VuDevRegion *r;
 
     if (*plen == 0) {
         return NULL;
     }
 
-    /* Find matching memory region. */
-    for (i = 0; i < dev->nregions; i++) {
-        VuDevRegion *r = &dev->regions[i];
-
-        if ((guest_addr >= r->gpa) && (guest_addr < (r->gpa + r->size))) {
-            if ((guest_addr + *plen) > (r->gpa + r->size)) {
-                *plen = r->gpa + r->size - guest_addr;
-            }
-            return (void *)(uintptr_t)
-                guest_addr - r->gpa + r->mmap_addr + r->mmap_offset;
-        }
+    r = vu_gpa_to_mem_region(dev, guest_addr);
+    if (!r) {
+        return NULL;
     }
 
-    return NULL;
+    if ((guest_addr + *plen) > (r->gpa + r->size)) {
+        *plen = r->gpa + r->size - guest_addr;
+    }
+    return (void *)(uintptr_t)guest_addr - r->gpa + r->mmap_addr +
+           r->mmap_offset;
 }
 
 /* Translate qemu virtual address to our virtual address. */
@@ -854,8 +871,8 @@ static inline bool reg_equal(VuDevRegion *vudev_reg,
 static bool
 vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
     VhostUserMemoryRegion m = vmsg->payload.memreg.region, *msg_region = &m;
-    unsigned int i;
-    bool found = false;
+    unsigned int idx;
+    VuDevRegion *r;
 
     if (vmsg->fd_num > 1) {
         vmsg_close_fds(vmsg);
@@ -882,28 +899,22 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
     DPRINT(" mmap_offset 0x%016"PRIx64"\n",
            msg_region->mmap_offset);
 
-    for (i = 0; i < dev->nregions; i++) {
-        if (reg_equal(&dev->regions[i], msg_region)) {
-            VuDevRegion *r = &dev->regions[i];
-
-            munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset);
-
-            /* Shift all affected entries by 1 to close the hole at index. */
-            memmove(dev->regions + i, dev->regions + i + 1,
-                    sizeof(VuDevRegion) * (dev->nregions - i - 1));
-            DPRINT("Successfully removed a region\n");
-            dev->nregions--;
-            i--;
-
-            found = true;
-            break;
-        }
-    }
-
-    if (!found) {
+    r = vu_gpa_to_mem_region(dev, msg_region->guest_phys_addr);
+    if (!r || !reg_equal(r, msg_region)) {
+        vmsg_close_fds(vmsg);
         vu_panic(dev, "Specified region not found\n");
+        return false;
     }
 
+    munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset);
+
+    idx = r - dev->regions;
+    assert(idx < dev->nregions);
+    /* Shift all affected entries by 1 to close the hole. */
+    memmove(r, r + 1, sizeof(VuDevRegion) * (dev->nregions - idx - 1));
+    DPRINT("Successfully removed a region\n");
+    dev->nregions--;
+
     vmsg_close_fds(vmsg);
 
     return false;

From patchwork Fri Feb 2 21:53:28 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543476
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang,
 Stefan Hajnoczi, Stefano Garzarella, Germano Veit Michel,
 Raphael Norwitz
Subject: [PATCH v1 11/15] libvhost-user: Speedup gpa_to_mem_region() and vu_gpa_to_va()
Date: Fri, 2 Feb 2024 22:53:28 +0100
Message-ID: <20240202215332.118728-12-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's speed up GPA to memory region / virtual address lookup. Store the
memory regions ordered by guest physical addresses, and use binary search
for address translation, as well as when adding/removing memory regions.

Most importantly, this will speed up GPA->VA address translation when we
have many memslots.
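
For illustration only, a standalone sketch of the lookup described above.
This is not the code from this patch: the Region type and the
find_region() name are made up for the example, and it uses a half-open
[low, high) interval with size_t indices rather than the signed low/high
pair in the diff. It assumes the array is sorted by gpa and that regions
never overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VuDevRegion; only the fields the search needs. */
typedef struct {
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length of the region in bytes */
} Region;

/*
 * Return the region that contains addr, or NULL if there is none.
 * Assumes regs[] is sorted by gpa and that no two regions overlap.
 */
static const Region *
find_region(const Region *regs, size_t n, uint64_t addr)
{
    size_t low = 0, high = n;   /* search the half-open range [low, high) */

    assert(regs != NULL || n == 0);

    while (low < high) {
        size_t mid = low + (high - low) / 2;
        const Region *cur = &regs[mid];

        if (addr < cur->gpa) {
            /* addr lies below this region: keep the lower half. */
            high = mid;
        } else if (addr - cur->gpa >= cur->size) {
            /* addr lies at or past the end: keep the upper half. */
            low = mid + 1;
        } else {
            return cur;
        }
    }
    return NULL;
}
```

Because regions never overlap, comparing against a single region is
enough to decide which half of the array to discard. Writing the upper
bound check as addr - cur->gpa >= cur->size avoids computing
gpa + size, which could wrap for a region at the very top of the
address space.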

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 49 +++++++++++++++++++++--
 1 file changed, 45 insertions(+), 4 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index d036b54ed0..75e47b7bb3 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -199,19 +199,30 @@ vu_panic(VuDev *dev, const char *msg, ...)
 static VuDevRegion *
 vu_gpa_to_mem_region(VuDev *dev, uint64_t guest_addr)
 {
-    unsigned int i;
+    int low = 0;
+    int high = dev->nregions - 1;
 
     /*
      * Memory regions cannot overlap in guest physical address space. Each
      * GPA belongs to exactly one memory region, so there can only be one
      * match.
+     *
+     * We store our memory regions ordered by GPA and can simply perform a
+     * binary search.
      */
-    for (i = 0; i < dev->nregions; i++) {
-        VuDevRegion *cur = &dev->regions[i];
+    while (low <= high) {
+        unsigned int mid = low + (high - low) / 2;
+        VuDevRegion *cur = &dev->regions[mid];
 
         if (guest_addr >= cur->gpa && guest_addr < cur->gpa + cur->size) {
             return cur;
         }
+        if (guest_addr >= cur->gpa + cur->size) {
+            low = mid + 1;
+        }
+        if (guest_addr < cur->gpa) {
+            high = mid - 1;
+        }
     }
     return NULL;
 }
@@ -273,9 +284,14 @@ vu_remove_all_mem_regs(VuDev *dev)
 static void
 _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
 {
+    const uint64_t start_gpa = msg_region->guest_phys_addr;
+    const uint64_t end_gpa = start_gpa + msg_region->memory_size;
     int prot = PROT_READ | PROT_WRITE;
     VuDevRegion *r;
     void *mmap_addr;
+    int low = 0;
+    int high = dev->nregions - 1;
+    unsigned int idx;
 
     DPRINT("Adding region %d\n", dev->nregions);
     DPRINT(" guest_phys_addr: 0x%016"PRIx64"\n",
@@ -295,6 +311,29 @@ _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
         prot = PROT_NONE;
     }
 
+    /*
+     * We will add memory regions into the array sorted by GPA. Perform a
+     * binary search to locate the insertion point: it will be at the low
+     * index.
+     */
+    while (low <= high) {
+        unsigned int mid = low + (high - low) / 2;
+        VuDevRegion *cur = &dev->regions[mid];
+
+        /* Overlap of GPA addresses. */
+        if (start_gpa < cur->gpa + cur->size && cur->gpa < end_gpa) {
+            vu_panic(dev, "regions with overlapping guest physical addresses");
+            return;
+        }
+        if (start_gpa >= cur->gpa + cur->size) {
+            low = mid + 1;
+        }
+        if (start_gpa < cur->gpa) {
+            high = mid - 1;
+        }
+    }
+    idx = low;
+
     /*
      * We don't use offset argument of mmap() since the mapped address has
      * to be page aligned, and we use huge pages.
@@ -308,7 +347,9 @@ _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
     DPRINT(" mmap_addr: 0x%016"PRIx64"\n",
            (uint64_t)(uintptr_t)mmap_addr);
 
-    r = &dev->regions[dev->nregions];
+    /* Shift all affected entries by 1 to open a hole at idx. */
+    r = &dev->regions[idx];
+    memmove(r + 1, r, sizeof(VuDevRegion) * (dev->nregions - idx));
     r->gpa = msg_region->guest_phys_addr;
     r->size = msg_region->memory_size;
     r->qva = msg_region->userspace_addr;

From patchwork Fri Feb 2 21:53:29 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543485
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang,
 Stefan Hajnoczi, Stefano Garzarella, Germano Veit Michel,
 Raphael Norwitz
Subject: [PATCH v1 12/15] libvhost-user: Use most of mmap_offset as fd_offset
Date: Fri, 2 Feb 2024 22:53:29 +0100
Message-ID: <20240202215332.118728-13-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

In the past, QEMU would create memory regions that could partially cover
hugetlb pages, making mmap() fail if we would use the mmap_offset as an
fd_offset. For that reason, we never used the mmap_offset as an offset
into the fd and instead always mapped the fd from the very start.

However, that can easily result in us mmap'ing a lot of unnecessary
parts of an fd, possibly repeatedly.

QEMU nowadays does not create memory regions that partially cover huge
pages -- it never really worked with postcopy. QEMU handles merging of
regions that partially cover huge pages (due to holes in boot memory)
since 2018 in c1ece84e7c93 ("vhost: Huge page align and merge").

Let's be a bit careful and not unconditionally convert the mmap_offset
into an fd_offset.
Instead, let's simply detect the hugetlb size and pass as much as we
can as fd_offset, making sure that we call mmap() with a properly
aligned offset.

With QEMU and a virtio-mem device that is fully plugged (50GiB using 50
memslots) the qemu-storage daemon process consumes in the VA space
1281GiB before this change and 58GiB after this change.

Example debug output:
  ================ Vhost user message ================
  Request: VHOST_USER_ADD_MEM_REG (37)
  Flags: 0x9
  Size: 40
  Fds: 59
  Adding region 50
      guest_phys_addr: 0x0000000d80000000
      memory_size: 0x0000000040000000
      userspace_addr 0x00007f54ebffe000
      mmap_offset 0x0000000c00000000
      fd_offset: 0x0000000c00000000
      new mmap_offset: 0x0000000000000000
      mmap_addr: 0x00007f7ecc000000
  Successfully added new region
  ================ Vhost user message ================
  Request: VHOST_USER_ADD_MEM_REG (37)
  Flags: 0x9
  Size: 40
  Fds: 59
  Adding region 51
      guest_phys_addr: 0x0000000dc0000000
      memory_size: 0x0000000040000000
      userspace_addr 0x00007f552bffe000
      mmap_offset 0x0000000c40000000
      fd_offset: 0x0000000c40000000
      new mmap_offset: 0x0000000000000000
      mmap_addr: 0x00007f7e8c000000
  Successfully added new region

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 50 ++++++++++++++++++++---
 1 file changed, 45 insertions(+), 5 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 75e47b7bb3..7d8293dc84 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -43,6 +43,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #ifdef __NR_userfaultfd
 #include
@@ -281,12 +283,36 @@ vu_remove_all_mem_regs(VuDev *dev)
     dev->nregions = 0;
 }
 
+static size_t
+get_fd_pagesize(int fd)
+{
+    static size_t pagesize;
+#if defined(__linux__)
+    struct statfs fs;
+    int ret;
+
+    do {
+        ret = fstatfs(fd, &fs);
+    } while (ret != 0 && errno == EINTR);
+
+    if (!ret && fs.f_type == HUGETLBFS_MAGIC) {
+        return fs.f_bsize;
+    }
+#endif
+
+    if (!pagesize) {
+        pagesize = getpagesize();
+    }
+    return pagesize;
+}
+
 static void
 _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
 {
     const uint64_t start_gpa = msg_region->guest_phys_addr;
     const uint64_t end_gpa = start_gpa + msg_region->memory_size;
     int prot = PROT_READ | PROT_WRITE;
+    uint64_t mmap_offset, fd_offset;
     VuDevRegion *r;
     void *mmap_addr;
     int low = 0;
@@ -335,11 +361,25 @@ _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
     idx = low;
 
     /*
-     * We don't use offset argument of mmap() since the mapped address has
-     * to be page aligned, and we use huge pages.
+     * Convert most of msg_region->mmap_offset to fd_offset. In almost all
+     * cases, this will leave us with mmap_offset == 0, mmap()'ing only
+     * what we really need. Only if a memory region would partially cover
+     * hugetlb pages, we'd get mmap_offset != 0, which usually doesn't happen
+     * anymore (i.e., modern QEMU).
+     *
+     * Note that mmap() with hugetlb would fail if the offset into the file
+     * is not aligned to the huge page size.
      */
-    mmap_addr = mmap(0, msg_region->memory_size + msg_region->mmap_offset,
-                     prot, MAP_SHARED | MAP_NORESERVE, fd, 0);
+    fd_offset = ALIGN_DOWN(msg_region->mmap_offset, get_fd_pagesize(fd));
+    mmap_offset = msg_region->mmap_offset - fd_offset;
+
+    DPRINT(" fd_offset: 0x%016"PRIx64"\n",
+           fd_offset);
+    DPRINT(" adj mmap_offset: 0x%016"PRIx64"\n",
+           mmap_offset);
+
+    mmap_addr = mmap(0, msg_region->memory_size + mmap_offset,
+                     prot, MAP_SHARED | MAP_NORESERVE, fd, fd_offset);
     if (mmap_addr == MAP_FAILED) {
         vu_panic(dev, "region mmap error: %s", strerror(errno));
         return;
@@ -354,7 +394,7 @@ _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd)
     r->size = msg_region->memory_size;
     r->qva = msg_region->userspace_addr;
     r->mmap_addr = (uint64_t)(uintptr_t)mmap_addr;
-    r->mmap_offset = msg_region->mmap_offset;
+    r->mmap_offset = mmap_offset;
     dev->nregions++;
 
     if (dev->postcopy_listening) {

From patchwork Fri Feb 2 21:53:30 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543484
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang,
 Stefan Hajnoczi, Stefano Garzarella, Germano Veit Michel,
 Raphael Norwitz
Subject: [PATCH v1 13/15] libvhost-user: Factor out vq usability check
Date: Fri, 2 Feb 2024 22:53:30 +0100
Message-ID: <20240202215332.118728-14-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Let's factor it out to prepare for further changes.

Signed-off-by: David Hildenbrand
Reviewed-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 24 +++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index 7d8293dc84..febeb2eb89 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -283,6 +283,12 @@ vu_remove_all_mem_regs(VuDev *dev)
     dev->nregions = 0;
 }
 
+static bool
+vu_is_vq_usable(VuDev *dev, VuVirtq *vq)
+{
+    return likely(!dev->broken) && likely(vq->vring.avail);
+}
+
 static size_t
 get_fd_pagesize(int fd)
 {
@@ -2378,8 +2384,7 @@ vu_queue_get_avail_bytes(VuDev *dev, VuVirtq *vq, unsigned int *in_bytes,
     idx = vq->last_avail_idx;
 
     total_bufs = in_total = out_total = 0;
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         goto done;
     }
 
@@ -2494,8 +2499,7 @@ vu_queue_avail_bytes(VuDev *dev, VuVirtq *vq, unsigned int in_bytes,
 bool
 vu_queue_empty(VuDev *dev, VuVirtq *vq)
 {
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         return true;
     }
 
@@ -2534,8 +2538,7 @@ vring_notify(VuDev *dev, VuVirtq *vq)
 
 static void _vu_queue_notify(VuDev *dev, VuVirtq *vq, bool sync)
 {
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         return;
     }
 
@@ -2860,8 +2863,7 @@ vu_queue_pop(VuDev *dev, VuVirtq *vq, size_t sz)
     unsigned int head;
     VuVirtqElement *elem;
 
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         return NULL;
     }
 
@@ -3018,8 +3020,7 @@ vu_queue_fill(VuDev *dev, VuVirtq *vq,
 {
     struct vring_used_elem uelem;
 
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         return;
     }
 
@@ -3048,8 +3049,7 @@ vu_queue_flush(VuDev *dev, VuVirtq *vq, unsigned int count)
 {
     uint16_t old, new;
 
-    if (unlikely(dev->broken) ||
-        unlikely(!vq->vring.avail)) {
+    if (!vu_is_vq_usable(dev, vq)) {
         return;
     }
 

From patchwork Fri Feb 2 21:53:31 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543486
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S . Tsirkin", Jason Wang,
 Stefan Hajnoczi, Stefano Garzarella, Germano Veit Michel,
 Raphael Norwitz
Subject: [PATCH v1 14/15] libvhost-user: Dynamically remap rings after (temporarily?) removing memory regions
Date: Fri, 2 Feb 2024 22:53:31 +0100
Message-ID: <20240202215332.118728-15-david@redhat.com>
In-Reply-To: <20240202215332.118728-1-david@redhat.com>
References: <20240202215332.118728-1-david@redhat.com>

Currently, we try to remap all rings whenever we add a single new memory
region. That doesn't quite make sense, because we already map rings when
setting the ring address, and panic if that goes wrong. Likely, that
handling was simply copied from set_mem_table code, where we actually
have to remap all rings.

Remapping all rings might require us to walk quite a lot of memory
regions to perform the address translations. Ideally, we'd simply remove
that remapping.

However, let's be a bit careful. There might be some weird corner cases
where we might temporarily remove a single memory region (e.g., resize
it), that would have worked for now. Further, a ring might be located on
hotplugged memory, and as the VM reboots, we might unplug that memory, to
hotplug memory before resetting the ring addresses.

So let's unmap affected rings as we remove a memory region, and try
dynamically mapping the ring again when required.

Signed-off-by: David Hildenbrand
Acked-by: Raphael Norwitz
---
 subprojects/libvhost-user/libvhost-user.c | 107 ++++++++++++++++------
 1 file changed, 78 insertions(+), 29 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c
index febeb2eb89..738e84ab63 100644
--- a/subprojects/libvhost-user/libvhost-user.c
+++ b/subprojects/libvhost-user/libvhost-user.c
@@ -283,10 +283,75 @@ vu_remove_all_mem_regs(VuDev *dev)
     dev->nregions = 0;
 }
 
+static bool
+map_ring(VuDev *dev, VuVirtq *vq)
+{
+    vq->vring.desc = qva_to_va(dev, vq->vra.desc_user_addr);
+    vq->vring.used = qva_to_va(dev, vq->vra.used_user_addr);
+    vq->vring.avail = qva_to_va(dev, vq->vra.avail_user_addr);
+
+    DPRINT("Setting virtq addresses:\n");
+    DPRINT(" vring_desc at %p\n", vq->vring.desc);
+    DPRINT(" vring_used at %p\n", vq->vring.used);
+    DPRINT(" vring_avail at %p\n", vq->vring.avail);
+
+    return !(vq->vring.desc && vq->vring.used && vq->vring.avail);
+}
+
 static bool
 vu_is_vq_usable(VuDev *dev, VuVirtq *vq)
 {
-    return likely(!dev->broken) && likely(vq->vring.avail);
+    if (unlikely(dev->broken)) {
+        return false;
+    }
+
+    if (likely(vq->vring.avail)) {
+        return true;
+    }
+
+    /*
+     * In corner cases, we might temporarily remove a memory region that
+     * mapped a ring. When removing a memory region we make sure to
+     * unmap any rings that would be impacted. Let's try to remap if we
+     * already succeeded mapping this ring once.
+     */
+    if (!vq->vra.desc_user_addr || !vq->vra.used_user_addr ||
+        !vq->vra.avail_user_addr) {
+        return false;
+    }
+    if (map_ring(dev, vq)) {
+        vu_panic(dev, "remapping queue on access");
+        return false;
+    }
+    return true;
+}
+
+static void
+unmap_rings(VuDev *dev, VuDevRegion *r)
+{
+    int i;
+
+    for (i = 0; i < dev->max_queues; i++) {
+        VuVirtq *vq = &dev->vq[i];
+        const uintptr_t desc = (uintptr_t)vq->vring.desc;
+        const uintptr_t used = (uintptr_t)vq->vring.used;
+        const uintptr_t avail = (uintptr_t)vq->vring.avail;
+
+        if (desc < r->mmap_addr || desc >= r->mmap_addr + r->size) {
+            continue;
+        }
+        if (used < r->mmap_addr || used >= r->mmap_addr + r->size) {
+            continue;
+        }
+        if (avail < r->mmap_addr || avail >= r->mmap_addr + r->size) {
+            continue;
+        }
+
+        DPRINT("Unmapping rings of queue %d\n", i);
+        vq->vring.desc = NULL;
+        vq->vring.used = NULL;
+        vq->vring.avail = NULL;
+    }
 }
 
 static size_t
@@ -784,21 +849,6 @@ vu_reset_device_exec(VuDev *dev, VhostUserMsg *vmsg)
     return false;
 }
 
-static bool
-map_ring(VuDev *dev, VuVirtq *vq)
-{
-    vq->vring.desc = qva_to_va(dev, vq->vra.desc_user_addr);
-    vq->vring.used = qva_to_va(dev, vq->vra.used_user_addr);
-    vq->vring.avail = qva_to_va(dev, vq->vra.avail_user_addr);
-
-    DPRINT("Setting virtq addresses:\n");
-    DPRINT(" vring_desc at %p\n", vq->vring.desc);
-    DPRINT(" vring_used at %p\n", vq->vring.used);
-    DPRINT(" vring_avail at %p\n", vq->vring.avail);
-
-    return !(vq->vring.desc && vq->vring.used && vq->vring.avail);
-}
-
 static bool
 generate_faults(VuDev *dev) {
     unsigned int i;
@@ -882,7 +932,6 @@ generate_faults(VuDev *dev) {
 
 static bool
 vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
-    int i;
     VhostUserMemoryRegion m = vmsg->payload.memreg.region, *msg_region = &m;
 
     if (vmsg->fd_num != 1) {
@@ -928,19 +977,9 @@ vu_add_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
         vmsg->fd_num = 0;
         DPRINT("Successfully added new region in postcopy\n");
         return true;
-    } else {
-        for (i = 0; i < dev->max_queues; i++) {
-            if (dev->vq[i].vring.desc) {
-                if (map_ring(dev, &dev->vq[i])) {
-                    vu_panic(dev, "remapping queue %d for new memory region",
-                             i);
-                }
-            }
-        }
-
-        DPRINT("Successfully added new region\n");
-        return false;
     }
+    DPRINT("Successfully added new region\n");
+    return false;
 }
 
 static inline bool reg_equal(VuDevRegion *vudev_reg,
@@ -993,6 +1032,16 @@ vu_rem_mem_reg(VuDev *dev, VhostUserMsg *vmsg) {
         return false;
     }
 
+    /*
+     * There might be valid cases where we temporarily remove memory regions
+     * to readd them again, or remove memory regions and don't use the rings
+     * anymore before we set the ring addresses and restart the device.
+     *
+     * Unmap all affected rings, remapping them on demand later. This should
+     * be a corner case.
+     */
+    unmap_rings(dev, r);
+
     munmap((void *)(uintptr_t)r->mmap_addr, r->size + r->mmap_offset);
 
     idx = r - dev->regions;

From patchwork Fri Feb 2 21:53:32 2024
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13543473
with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-227-M3N-RdjZOwGzC0RD1jkMuA-1; Fri, 02 Feb 2024 16:54:15 -0500 X-MC-Unique: M3N-RdjZOwGzC0RD1jkMuA-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 651703C0ED44; Fri, 2 Feb 2024 21:54:15 +0000 (UTC) Received: from t14s.redhat.com (unknown [10.39.192.47]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3CF402166B31; Fri, 2 Feb 2024 21:54:13 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , "Michael S . Tsirkin" , Jason Wang , Stefan Hajnoczi , Stefano Garzarella , Germano Veit Michel , Raphael Norwitz Subject: [PATCH v1 15/15] libvhost-user: Mark mmap'ed region memory as MADV_DONTDUMP Date: Fri, 2 Feb 2024 22:53:32 +0100 Message-ID: <20240202215332.118728-16-david@redhat.com> In-Reply-To: <20240202215332.118728-1-david@redhat.com> References: <20240202215332.118728-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.6 Received-SPF: pass client-ip=170.10.129.124; envelope-from=david@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-2.276, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: 
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org We already use MADV_NORESERVE to deal with sparse memory regions. Let's also set madvise(MADV_DONTDUMP), otherwise a crash of the process can result in us allocating all memory in the mmap'ed region for dumping purposes. This change implies that the mmap'ed rings won't be included in a coredump. If ever required for debugging purposes, we could mark only the mapped rings MADV_DODUMP. Ignore errors during madvise() for now. Signed-off-by: David Hildenbrand Reviewed-by: Raphael Norwitz --- subprojects/libvhost-user/libvhost-user.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/subprojects/libvhost-user/libvhost-user.c b/subprojects/libvhost-user/libvhost-user.c index 738e84ab63..26c289518c 100644 --- a/subprojects/libvhost-user/libvhost-user.c +++ b/subprojects/libvhost-user/libvhost-user.c @@ -458,6 +458,12 @@ _vu_add_mem_reg(VuDev *dev, VhostUserMemoryRegion *msg_region, int fd) DPRINT(" mmap_addr: 0x%016"PRIx64"\n", (uint64_t)(uintptr_t)mmap_addr); +#if defined(__linux__) + /* Don't include all guest memory in a coredump. */ + madvise(mmap_addr, msg_region->memory_size + mmap_offset, + MADV_DONTDUMP); +#endif + /* Shift all affected entries by 1 to open a hole at idx. */ r = &dev->regions[idx]; memmove(r + 1, r, sizeof(VuDevRegion) * (dev->nregions - idx));
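
For readers who want to try the MAP_NORESERVE plus MADV_DONTDUMP combination outside of libvhost-user, the following is a minimal standalone sketch. It is not taken from the patch: the helper names map_sparse_region() and exclude_from_coredump() are hypothetical, and it uses a private anonymous mapping for simplicity, whereas the patch applies madvise() to the mapping of the memory region fd it receives. Unlike the patch, which ignores madvise() errors, the sketch returns the result to the caller.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS, MAP_NORESERVE and madvise() on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of libvhost-user): map a sparse anonymous
 * region without reserving swap space for it. Returns NULL on failure.
 */
static void *map_sparse_region(size_t len)
{
    void *addr;

    assert(len > 0); /* mmap() rejects zero-length mappings */
    addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return addr == MAP_FAILED ? NULL : addr;
}

/*
 * Hypothetical helper: ask the kernel to leave [addr, addr + len) out of
 * core dumps. Returns the madvise() result (0 on success, -1 with errno
 * set on failure) and 0 where MADV_DONTDUMP is not available.
 */
static int exclude_from_coredump(void *addr, size_t len)
{
#if defined(__linux__) && defined(MADV_DONTDUMP)
    return madvise(addr, len, MADV_DONTDUMP);
#else
    (void)addr;
    (void)len;
    return 0;
#endif
}
```

A caller would map the region, call exclude_from_coredump() on the same address and length, and later munmap() it. If the rings are ever needed in a coredump, madvise(..., MADV_DODUMP) on just the ring ranges would be the counterpart, as the commit message suggests.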