From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Igor Mammedov, Xiao Guangrong,
 "Michael S. Tsirkin", Peter Xu, Philippe Mathieu-Daudé, Eduardo Habkost,
 Marcel Apfelbaum, Yanan Wang, Michal Privoznik, Daniel P. Berrangé,
 Gavin Shan, Alex Williamson, kvm@vger.kernel.org
Subject: [PATCH v1 09/15] memory-device, vhost: Support memory devices that
 dynamically consume multiple memslots
Date: Fri, 16 Jun 2023 11:26:48 +0200
Message-Id: <20230616092654.175518-10-david@redhat.com>
In-Reply-To: <20230616092654.175518-1-david@redhat.com>
References: <20230616092654.175518-1-david@redhat.com>
X-Patchwork-Id: 13282370

We want to support memory devices that use a dynamically managed memory
region container as their device memory region. This device memory region
maps multiple RAM memory subregions (e.g., aliases to the same RAM memory
region), and these subregions can be (un)mapped on demand.

Each RAM subregion will consume a memslot in KVM and vhost, so such a
device consumes memslots dynamically, usually starting with 0. We already
track the number of used vs. required memslots for all memory devices.
From that, we can derive the number of reserved memslots that must not be
used by anything else.

We only have to add a way for memory devices to expose how many memslots
they require, such that we can properly consider them as required (and as
reserved until actually used). Let's properly document what's supported
and what's not.

The target use case is virtio-mem, which will dynamically map parts of a
source RAM memory region into the container device region using aliases,
consuming one memslot per alias.

Extend the vhost memslot check accordingly and give a hint that adding
vhost devices before adding memory devices might make it work (especially
virtio-mem devices, once they determine the number of memslots to use
at runtime).

Signed-off-by: David Hildenbrand
---
 hw/mem/memory-device.c         | 36 +++++++++++++++++++++++++++++++++-
 hw/virtio/vhost.c              | 18 +++++++++++++----
 include/hw/mem/memory-device.h |  7 +++++++
 stubs/qmp_memory_device.c      |  5 +++++
 4 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 752258333b..2e6536c841 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -88,6 +88,40 @@ static unsigned int get_free_memslots(void)
     return MIN(vhost_get_free_memslots(), kvm_get_free_memslots());
 }
 
+/* Memslots that are reserved by memory devices (required but still unused). */
+static unsigned int get_reserved_memslots(MachineState *ms)
+{
+    if (ms->device_memory->used_memslots >
+        ms->device_memory->required_memslots) {
+        /* This is unexpected, and we warned already in the memory notifier. */
+        return 0;
+    }
+    return ms->device_memory->required_memslots -
+           ms->device_memory->used_memslots;
+}
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    if (!current_machine->device_memory) {
+        return 0;
+    }
+    return get_reserved_memslots(current_machine);
+}
+
+/* Memslots that are still free but not reserved by memory devices yet. */
+static unsigned int get_available_memslots(MachineState *ms)
+{
+    const unsigned int free = get_free_memslots();
+    const unsigned int reserved = get_reserved_memslots(ms);
+
+    if (free < reserved) {
+        warn_report_once("The reserved memory slots (%u) exceed the free"
+                         " memory slots (%u)", reserved, free);
+        return 0;
+    }
+    return free - reserved;
+}
+
 /*
  * The memslot soft limit for memory devices. The soft limit might change at
  * runtime in corner cases (that should certainly be avoided), for example, when
@@ -146,7 +180,7 @@ static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md,
                                         MemoryRegion *mr, Error **errp)
 {
     const uint64_t used_region_size = ms->device_memory->used_region_size;
-    const unsigned int available_memslots = get_free_memslots();
+    const unsigned int available_memslots = get_available_memslots(ms);
     const uint64_t size = memory_region_size(mr);
     unsigned int required_memslots;
 
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 472ccba4ab..b1e2eca55d 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1422,7 +1422,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    VhostBackendType backend_type, uint32_t busyloop_timeout,
                    Error **errp)
 {
-    unsigned int used;
+    unsigned int used, reserved, limit;
     uint64_t features;
     int i, r, n_initialized_vqs = 0;
 
@@ -1528,9 +1528,19 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     } else {
         used = used_memslots;
     }
-    if (used > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
-        error_setg(errp, "vhost backend memory slots limit is less"
-                   " than current number of present memory slots");
+    /*
+     * We simplify by assuming that reserved memslots are compatible with used
+     * vhost devices (if vhost only supports shared memory, the memory devices
+     * had better use shared memory) and that reserved memslots are not used
+     * for ROM.
+     */
+    reserved = memory_devices_get_reserved_memslots();
+    limit = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
+    if (used + reserved > limit) {
+        error_setg(errp, "vhost backend memory slots limit (%u) is less"
+                   " than current number of used (%u) and reserved (%u)"
+                   " memory slots. Try adding vhost devices before memory"
+                   " devices.", limit, used, reserved);
         r = -EINVAL;
         goto fail_busyloop;
     }
diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h
index 755f6304c6..7e8e4452cb 100644
--- a/include/hw/mem/memory-device.h
+++ b/include/hw/mem/memory-device.h
@@ -47,6 +47,12 @@ typedef struct MemoryDeviceState MemoryDeviceState;
  * single RAM/ROM memory region or a memory region container with subregions
  * that are RAM/ROM memory regions or aliases to RAM/ROM memory regions. Other
  * memory regions or subregions are not supported.
+ *
+ * If the device memory region returned via @get_memory_region is a
+ * memory region container, subregions may be (un)mapped dynamically as
+ * long as the number of memslots returned by @get_memslots() is not
+ * exceeded and as long as all memory regions are of the same kind (e.g.,
+ * all RAM or all ROM).
  */
 struct MemoryDeviceClass {
     /* private */
@@ -127,6 +133,7 @@ struct MemoryDeviceClass {
 MemoryDeviceInfoList *qmp_memory_device_list(void);
 uint64_t get_plugged_memory_size(void);
 void memory_devices_notify_vhost_device_added(void);
+unsigned int memory_devices_get_reserved_memslots(void);
 void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
                             const uint64_t *legacy_align, Error **errp);
 void memory_device_plug(MemoryDeviceState *md, MachineState *ms);
diff --git a/stubs/qmp_memory_device.c b/stubs/qmp_memory_device.c
index b0e3e34f85..74707ed9fd 100644
--- a/stubs/qmp_memory_device.c
+++ b/stubs/qmp_memory_device.c
@@ -14,3 +14,8 @@ uint64_t get_plugged_memory_size(void)
 void memory_devices_notify_vhost_device_added(void)
 {
 }
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    return 0;
+}
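
Illustration only, not part of the patch: the memslot accounting described
in the commit message (reserved = required - used, available = free -
reserved, and the vhost check used + reserved <= limit) can be sketched as
standalone C. Everything named here is an assumption made for the example:
struct memslot_counters, reserved_memslots(), available_memslots() and
vhost_limit_ok() are hypothetical and do not exist in QEMU; they only
mirror the intent of get_reserved_memslots(), get_available_memslots() and
the extended check in vhost_dev_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the counters tracked per machine in the patch. */
struct memslot_counters {
    unsigned int used;      /* memslots memory devices consume right now */
    unsigned int required;  /* memslots memory devices may consume at most */
};

/* Required-but-unused memslots; 0 if used unexpectedly exceeds required. */
static inline unsigned int reserved_memslots(const struct memslot_counters *c)
{
    if (c->used > c->required) {
        return 0;
    }
    return c->required - c->used;
}

/* Free memslots that remain once the reservation is honored. */
static inline unsigned int available_memslots(unsigned int free_slots,
                                              const struct memslot_counters *c)
{
    const unsigned int reserved = reserved_memslots(c);
    unsigned int avail;

    if (free_slots < reserved) {
        /* More reserved than free: nothing is available for new devices. */
        return 0;
    }
    avail = free_slots - reserved;
    assert(avail <= free_slots);
    return avail;
}

/*
 * Sketch of the vhost-side check: used plus reserved memslots must fit into
 * the backend limit. Written without computing used + reserved so that the
 * unsigned addition cannot wrap around.
 */
static inline bool vhost_limit_ok(unsigned int used, unsigned int reserved,
                                  unsigned int limit)
{
    return reserved <= limit && used <= limit - reserved;
}
```

With 4 required and 1 used memslots, 3 memslots are reserved; given 10 free
memslots, 7 remain available for other memory devices. A vhost backend
limited to 8 memslots would still accept 5 used plus 3 reserved memslots,
while a limit of 7 would not.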