From patchwork Wed Oct 27 12:45:20 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587231
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
    "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
    "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
    Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 01/12] kvm: Return number of free memslots
Date: Wed, 27 Oct 2021 14:45:20 +0200
Message-Id: <20211027124531.57561-2-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

Let's return the number of free slots instead of only checking if there
is a free slot. Required to support memory devices that consume multiple
memslots.
Signed-off-by: David Hildenbrand
---
 accel/kvm/kvm-all.c    | 24 +++++++++++-------------
 accel/stubs/kvm-stub.c |  4 ++--
 hw/mem/memory-device.c |  2 +-
 include/sysemu/kvm.h   |  2 +-
 4 files changed, 15 insertions(+), 17 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index db8d83b137..0846be835e 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -103,6 +103,7 @@ struct KVMState
     AccelState parent_obj;

     int nr_slots;
+    int nr_free_slots;
     int fd;
     int vmfd;
     int coalesced_mmio;
@@ -245,6 +246,13 @@ int kvm_get_max_memslots(void)
     return s->nr_slots;
 }

+unsigned int kvm_get_free_memslots(void)
+{
+    KVMState *s = kvm_state;
+
+    return s->nr_free_slots;
+}
+
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
 {
@@ -260,19 +268,6 @@ static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml)
     return NULL;
 }

-bool kvm_has_free_slot(MachineState *ms)
-{
-    KVMState *s = KVM_STATE(ms->accelerator);
-    bool result;
-    KVMMemoryListener *kml = &s->memory_listener;
-
-    kvm_slots_lock();
-    result = !!kvm_get_free_slot(kml);
-    kvm_slots_unlock();
-
-    return result;
-}
-
 /* Called with KVMMemoryListener.slots_lock held */
 static KVMSlot *kvm_alloc_slot(KVMMemoryListener *kml)
 {
@@ -1410,6 +1405,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
             }
             start_addr += slot_size;
             size -= slot_size;
+            kvm_state->nr_free_slots++;
         } while (size);
         goto out;
     }
@@ -1435,6 +1431,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml,
         ram_start_offset += slot_size;
         ram += slot_size;
         size -= slot_size;
+        kvm_state->nr_free_slots--;
     } while (size);

 out:
@@ -2364,6 +2361,7 @@ static int kvm_init(MachineState *ms)
     if (!s->nr_slots) {
         s->nr_slots = 32;
     }
+    s->nr_free_slots = s->nr_slots;

     s->nr_as = kvm_check_extension(s, KVM_CAP_MULTI_ADDRESS_SPACE);
     if (s->nr_as <= 1) {
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 5b1d00a222..cbaeb7c656 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -133,9 +133,9 @@ int kvm_irqchip_remove_irqfd_notifier_gsi(KVMState *s, EventNotifier *n,
     return -ENOSYS;
 }

-bool kvm_has_free_slot(MachineState *ms)
+unsigned int kvm_get_free_memslots(void)
 {
-    return false;
+    return 0;
 }

 void kvm_init_cpu_signals(CPUState *cpu)
diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index d9f8301711..9045ead33e 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -73,7 +73,7 @@ static void memory_device_check_addable(MachineState *ms, uint64_t size,
     uint64_t used_region_size = 0;

     /* we will need a new memory slot for kvm and vhost */
-    if (kvm_enabled() && !kvm_has_free_slot(ms)) {
+    if (kvm_enabled() && !kvm_get_free_memslots()) {
         error_setg(errp, "hypervisor has no free memory slots left");
         return;
     }
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index a1ab1ee12d..c18be3cbd5 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -211,7 +211,7 @@ typedef struct Notifier Notifier;

 /* external API */

-bool kvm_has_free_slot(MachineState *ms);
+unsigned int kvm_get_free_memslots(void);
 bool kvm_has_sync_mmu(void);
 int kvm_has_vcpu_events(void);
 int kvm_has_robust_singlestep(void);

From patchwork Wed Oct 27 12:45:21 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587233
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
    "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
    "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
    Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 02/12] vhost: Return number of free memslots
Date: Wed, 27 Oct 2021 14:45:21 +0200
Message-Id: <20211027124531.57561-3-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>

Let's return the number of free slots instead of only checking if there
is a free slot. Required to support memory devices that consume multiple
memslots.

Signed-off-by: David Hildenbrand
Reviewed-by: Philippe Mathieu-Daudé
---
 hw/mem/memory-device.c    | 2 +-
 hw/virtio/vhost-stub.c    | 2 +-
 hw/virtio/vhost.c         | 4 ++--
 include/hw/virtio/vhost.h | 2 +-
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 9045ead33e..7f76a09e57 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -77,7 +77,7 @@ static void memory_device_check_addable(MachineState *ms, uint64_t size,
         error_setg(errp, "hypervisor has no free memory slots left");
         return;
     }
-    if (!vhost_has_free_slot()) {
+    if (!vhost_get_free_memslots()) {
         error_setg(errp, "a used vhost backend has no free memory slots left");
         return;
     }
diff --git a/hw/virtio/vhost-stub.c b/hw/virtio/vhost-stub.c
index c175148fce..fe111e5e45 100644
--- a/hw/virtio/vhost-stub.c
+++ b/hw/virtio/vhost-stub.c
@@ -2,7 +2,7 @@
 #include "hw/virtio/vhost.h"
 #include "hw/virtio/vhost-user.h"

-bool vhost_has_free_slot(void)
+unsigned int vhost_get_free_memslots(void)
 {
     return true;
 }
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 437347ad01..2707972870 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -48,7 +48,7 @@ static unsigned int used_memslots;
 static QLIST_HEAD(, vhost_dev) vhost_devices =
     QLIST_HEAD_INITIALIZER(vhost_devices);

-bool vhost_has_free_slot(void)
+unsigned int vhost_get_free_memslots(void)
 {
     unsigned int slots_limit = ~0U;
     struct vhost_dev *hdev;
@@ -57,7 +57,7 @@ bool vhost_has_free_slot(void)
         unsigned int r = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
         slots_limit = MIN(slots_limit, r);
     }
-    return slots_limit > used_memslots;
+    return slots_limit - used_memslots;
 }

 static void vhost_dev_sync_region(struct vhost_dev *dev,
diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 3fa0b554ef..9d59fc1404 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -130,7 +130,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits,
                             uint64_t features);
 void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits,
                         uint64_t features);
-bool vhost_has_free_slot(void);
+unsigned int vhost_get_free_memslots(void);
 int vhost_net_set_backend(struct vhost_dev *hdev,
                           struct vhost_vring_file *file);

From patchwork Wed Oct 27 12:45:22 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587253
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
    "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
    "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
    Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 03/12] memory: Allow for marking memory region aliases unmergeable
Date: Wed, 27 Oct 2021 14:45:22 +0200
Message-Id: <20211027124531.57561-4-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>

Let's allow for marking memory region aliases unmergeable, to teach
flatview code (and memory listeners like vhost-user) not to merge
adjacent aliases to the same memory region into a larger memory section;
instead, we want separate aliases to stay separate, such that we can
atomically map/unmap aliases without affecting other aliases.

This is a preparation for virtio-mem mapping device memory located on a
RAM memory region via separate aliases into a memory region container,
resulting in separate memslots that can get (un)mapped atomically.

As an example with virtio-mem, the layout looks something like this:

  [...]
  0000000180000000-000002007fffffff (prio 0, i/o): device-memory
    0000000180000000-000001017fffffff (prio 0, i/o): virtio-mem-memslots
      0000000180000000-00000001bfffffff (prio 0, ram): alias virtio-mem-memslot-0 @mem0 0000000000000000-000000003fffffff
      00000001c0000000-00000001ffffffff (prio 0, ram): alias virtio-mem-memslot-1 @mem0 0000000040000000-000000007fffffff
      0000000200000000-000000023fffffff (prio 0, ram): alias virtio-mem-memslot-2 @mem0 0000000080000000-00000000bfffffff
  [...]

What happens right now is that flatview code merges all three aliases
into a single memory section.
When mapping another alias (e.g., virtio-mem-memslot-3), or when
unmapping any of the mapped aliases, memory listeners are first notified
about the removal of the big memory section, and only then notified
about the re-adding of the new (differently merged) memory section(s).

In an ideal world, memory listeners would handle such an update
atomically; however, that is not the case for the most important memory
listeners used in the context of virtio-mem (KVM, vhost-user, vfio), and
supporting atomic updates is quite hard (e.g., for KVM, where we cannot
simply resize or split memory slots due to metadata allocated per slot,
or in virtiofsd, where we cannot simply resize or split an active mmap
mapping). While a memslot is temporarily removed, active users (e.g.,
KVM VCPUs) can stumble over the missing memslot and essentially crash
the VM.

Further, merged chunks consume fewer memslots at first, but we might end
up consuming more later, when unmapping chunks and splitting the
remaining bigger chunks into smaller ones -- making memslot accounting
for memory devices problematic as well.

Let's allow for marking a memory region alias unmergeable, such that we
can atomically (un)map aliases to the same memory region, similar to
(un)mapping individual DIMMs.
Signed-off-by: David Hildenbrand
---
 include/exec/memory.h | 23 +++++++++++++++++++++++
 softmmu/memory.c      | 33 +++++++++++++++++++++++++++------
 2 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 75b4f600e3..d877b80e6e 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -82,6 +82,7 @@ struct ReservedRegion {
  *     relative to the region's address space
  * @readonly: writes to this section are ignored
  * @nonvolatile: this section is non-volatile
+ * @unmergeable: this section should not get merged with adjacent sections
  */
 struct MemoryRegionSection {
     Int128 size;
@@ -91,6 +92,7 @@ struct MemoryRegionSection {
     hwaddr offset_within_address_space;
     bool readonly;
     bool nonvolatile;
+    bool unmergeable;
 };

 typedef struct IOMMUTLBEntry IOMMUTLBEntry;
@@ -720,6 +722,7 @@ struct MemoryRegion {
     bool nonvolatile;
     bool rom_device;
     bool flush_coalesced_mmio;
+    bool unmergeable;
     uint8_t dirty_log_mask;
     bool is_iommu;
     RAMBlock *ram_block;
@@ -2272,6 +2275,26 @@ void memory_region_set_size(MemoryRegion *mr, uint64_t size);
 void memory_region_set_alias_offset(MemoryRegion *mr,
                                     hwaddr offset);

+/*
+ * memory_region_set_alias_unmergeable: Mark a memory region alias unmergeable
+ *
+ * Mark a memory region alias unmergeable, resulting in multiple adjacent
+ * aliases to the same memory region not getting merged into one memory section
+ * when simplifying the address space and notifying memory listeners.
+ *
+ * Primarily useful on aliases to RAM regions; the target use case is
+ * splitting a RAM memory region via aliases into multiple memslots and
+ * dynamically (un)mapping the aliases into another container memory region.
+ * As resulting memory sections don't cover multiple aliases, memory listeners
+ * will be notified about adding/removing separate aliases, resulting in
+ * individual memslots in KVM, vhost, vfio, ... that can be added/removed
+ * atomically when mapping/unmapping the corresponding alias.
+ *
+ * @mr: the #MemoryRegion to be updated
+ * @unmergeable: whether to mark the #MemoryRegion unmergeable
+ */
+void memory_region_set_alias_unmergeable(MemoryRegion *mr, bool unmergeable);
+
 /**
  * memory_region_present: checks if an address relative to a @container
  * translates into #MemoryRegion within @container
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 3a2613bd3c..d8a584ca80 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -223,6 +223,7 @@ struct FlatRange {
     bool romd_mode;
     bool readonly;
     bool nonvolatile;
+    bool unmergeable;
 };

 #define FOR_EACH_FLAT_RANGE(var, view) \
@@ -239,6 +240,7 @@ section_from_flat_range(FlatRange *fr, FlatView *fv)
         .offset_within_address_space = int128_get64(fr->addr.start),
         .readonly = fr->readonly,
         .nonvolatile = fr->nonvolatile,
+        .unmergeable = fr->unmergeable,
     };
 }

@@ -249,7 +251,8 @@ static bool flatrange_equal(FlatRange *a, FlatRange *b)
         && a->offset_in_region == b->offset_in_region
         && a->romd_mode == b->romd_mode
         && a->readonly == b->readonly
-        && a->nonvolatile == b->nonvolatile;
+        && a->nonvolatile == b->nonvolatile
+        && a->unmergeable == b->unmergeable;
 }

 static FlatView *flatview_new(MemoryRegion *mr_root)
@@ -322,7 +325,8 @@ static bool can_merge(FlatRange *r1, FlatRange *r2)
         && r1->dirty_log_mask == r2->dirty_log_mask
         && r1->romd_mode == r2->romd_mode
         && r1->readonly == r2->readonly
-        && r1->nonvolatile == r2->nonvolatile;
+        && r1->nonvolatile == r2->nonvolatile
+        && !r1->unmergeable && !r2->unmergeable;
 }

 /* Attempt to simplify a view by merging adjacent ranges */
@@ -581,7 +585,8 @@ static void render_memory_region(FlatView *view,
                                  Int128 base,
                                  AddrRange clip,
                                  bool readonly,
-                                 bool nonvolatile)
+                                 bool nonvolatile,
+                                 bool unmergeable)
 {
     MemoryRegion *subregion;
     unsigned i;
@@ -598,6 +603,7 @@ static void render_memory_region(FlatView *view,
     int128_addto(&base, int128_make64(mr->addr));
     readonly |= mr->readonly;
     nonvolatile |= mr->nonvolatile;
+    unmergeable |= mr->unmergeable;

     tmp = addrrange_make(base, mr->size);
@@ -611,14 +617,14 @@ static void render_memory_region(FlatView *view,
         int128_subfrom(&base, int128_make64(mr->alias->addr));
         int128_subfrom(&base, int128_make64(mr->alias_offset));
         render_memory_region(view, mr->alias, base, clip,
-                             readonly, nonvolatile);
+                             readonly, nonvolatile, unmergeable);
         return;
     }

     /* Render subregions in priority order. */
     QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
         render_memory_region(view, subregion, base, clip,
-                             readonly, nonvolatile);
+                             readonly, nonvolatile, unmergeable);
     }

     if (!mr->terminates) {
@@ -634,6 +640,7 @@ static void render_memory_region(FlatView *view,
     fr.romd_mode = mr->romd_mode;
     fr.readonly = readonly;
     fr.nonvolatile = nonvolatile;
+    fr.unmergeable = unmergeable;

     /* Render the region itself into any gaps left by the current view. */
     for (i = 0; i < view->nr && int128_nz(remain); ++i) {
@@ -735,7 +742,7 @@ static FlatView *generate_memory_topology(MemoryRegion *mr)
     if (mr) {
         render_memory_region(view, mr, int128_zero(),
                              addrrange_make(int128_zero(), int128_2_64()),
-                             false, false);
+                             false, false, false);
     }
     flatview_simplify(view);
@@ -2634,6 +2641,20 @@ void memory_region_set_alias_offset(MemoryRegion *mr, hwaddr offset)
     memory_region_transaction_commit();
 }

+void memory_region_set_alias_unmergeable(MemoryRegion *mr, bool unmergeable)
+{
+    assert(mr->alias);
+
+    if (unmergeable == mr->unmergeable) {
+        return;
+    }
+
+    memory_region_transaction_begin();
+    mr->unmergeable = unmergeable;
+    memory_region_update_pending |= mr->enabled;
+    memory_region_transaction_commit();
+}
+
 uint64_t memory_region_get_alignment(const MemoryRegion *mr)
 {
     return mr->align;

From patchwork Wed Oct 27 12:45:23 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587235
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
    "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
    "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
    Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 04/12] vhost: Don't merge unmergeable memory sections
Date: Wed, 27 Oct 2021 14:45:23 +0200
Message-Id: <20211027124531.57561-5-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>

Memory sections that are marked unmergeable should not be merged, to
allow for atomic removal later.

Signed-off-by: David Hildenbrand
---
 hw/virtio/vhost.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 2707972870..49a1074097 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -620,7 +620,7 @@ static void vhost_region_add_section(struct vhost_dev *dev,
                                      mrs_size, mrs_host);
     }

-    if (dev->n_tmp_sections) {
+    if (dev->n_tmp_sections && !section->unmergeable) {
         /* Since we already have at least one section, lets see if
          * this extends it; since we're scanning in order, we only
          * have to look at the last one, and the FlatView that calls
@@ -653,7 +653,7 @@ static void vhost_region_add_section(struct vhost_dev *dev,
             size_t offset = mrs_gpa - prev_gpa_start;

             if (prev_host_start + offset == mrs_host &&
-                section->mr == prev_sec->mr &&
+                section->mr == prev_sec->mr && !prev_sec->unmergeable &&
                 (!dev->vhost_ops->vhost_backend_can_merge ||
                  dev->vhost_ops->vhost_backend_can_merge(dev, mrs_host, mrs_size,

From patchwork Wed Oct 27 12:45:24 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587237
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
    "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
    "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
    Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 05/12] memory-device: Move memory_device_check_addable() directly into memory_device_pre_plug()
Date: Wed, 27 Oct 2021 14:45:24 +0200
Message-Id: <20211027124531.57561-6-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>

Move it out of memory_device_get_free_addr(), which is cleaner and
prepares for future changes.

Signed-off-by: David Hildenbrand
---
 hw/mem/memory-device.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 7f76a09e57..68a2c3dbcc 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -67,9 +67,10 @@ static int memory_device_used_region_size(Object *obj, void *opaque)
     return 0;
 }

-static void memory_device_check_addable(MachineState *ms, uint64_t size,
+static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
                                         Error **errp)
 {
+    const uint64_t size = memory_region_size(mr);
     uint64_t used_region_size = 0;

     /* we will need a new memory slot for kvm and vhost */
@@ -99,7 +100,6 @@ static uint64_t memory_device_get_free_addr(MachineState *ms,
                                             uint64_t align, uint64_t size,
                                             Error **errp)
 {
-    Error *err = NULL;
     GSList *list = NULL, *item;
     Range as, new = range_empty;

@@ -125,12 +125,6 @@ static uint64_t memory_device_get_free_addr(MachineState *ms,
                     align);
     }

-    memory_device_check_addable(ms, size, &err);
-    if (err) {
-        error_propagate(errp, err);
-        return 0;
-    }
-
     if (hint && !QEMU_IS_ALIGNED(*hint, align)) {
         error_setg(errp, "address must be aligned to 0x%" PRIx64 " bytes",
                    align);
@@ -259,6 +253,11 @@ void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
         goto out;
     }

+    memory_device_check_addable(ms, mr, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     if (legacy_align) {
         align = *legacy_align;
     } else {

From patchwork Wed Oct 27 12:45:25 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587239
From: David Hildenbrand
Subject: [PATCH v1 06/12] memory-device: Generalize memory_device_used_region_size()
Date: Wed, 27 Oct 2021 14:45:25 +0200
Message-Id: <20211027124531.57561-7-david@redhat.com>

Let's generalize traversal of all plugged memory devices to collect
information in the context of memory_device_check_addable(), to prepare
for future changes.
Signed-off-by: David Hildenbrand
---
 hw/mem/memory-device.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 68a2c3dbcc..a915894819 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -50,20 +50,24 @@ static int memory_device_build_list(Object *obj, void *opaque)
     return 0;
 }
 
-static int memory_device_used_region_size(Object *obj, void *opaque)
+struct memory_devices_info {
+    uint64_t region_size;
+};
+
+static int memory_devices_collect_info(Object *obj, void *opaque)
 {
-    uint64_t *size = opaque;
+    struct memory_devices_info *i = opaque;
 
     if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) {
         const DeviceState *dev = DEVICE(obj);
         const MemoryDeviceState *md = MEMORY_DEVICE(obj);
 
         if (dev->realized) {
-            *size += memory_device_get_region_size(md, &error_abort);
+            i->region_size += memory_device_get_region_size(md, &error_abort);
         }
     }
 
-    object_child_foreach(obj, memory_device_used_region_size, opaque);
+    object_child_foreach(obj, memory_devices_collect_info, opaque);
     return 0;
 }
 
@@ -71,7 +75,7 @@ static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
                                         Error **errp)
 {
     const uint64_t size = memory_region_size(mr);
-    uint64_t used_region_size = 0;
+    struct memory_devices_info info = {};
 
     /* we will need a new memory slot for kvm and vhost */
     if (kvm_enabled() && !kvm_get_free_memslots()) {
@@ -84,12 +88,12 @@ static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
     }
 
     /* will we exceed the total amount of memory specified */
-    memory_device_used_region_size(OBJECT(ms), &used_region_size);
-    if (used_region_size + size < used_region_size ||
-        used_region_size + size > ms->maxram_size - ms->ram_size) {
+    memory_devices_collect_info(OBJECT(ms), &info);
+    if (info.region_size + size < info.region_size ||
+        info.region_size + size > ms->maxram_size - ms->ram_size) {
         error_setg(errp, "not enough space, currently 0x%" PRIx64
                    " in use of total space for memory devices 0x" RAM_ADDR_FMT,
-                   used_region_size, ms->maxram_size - ms->ram_size);
+                   info.region_size, ms->maxram_size - ms->ram_size);
         return;
     }

From patchwork Wed Oct 27 12:45:26 2021
From: David Hildenbrand
Subject: [PATCH v1 07/12] memory-device: Support memory devices that dynamically consume multiple memslots
Date: Wed, 27 Oct 2021 14:45:26 +0200
Message-Id: <20211027124531.57561-8-david@redhat.com>

We want to support memory devices that have a container as their device
memory region and (dynamically) map individual chunks into that container,
resulting in multiple memslots getting consumed by such a device. So we
want to support memory devices that a) require multiple memslots and
b) make use of these memslots dynamically.

We already have one device that uses a container as device memory region:
NVDIMM. However, an NVDIMM still ends up consuming exactly one memslot.

The target use case will be virtio-mem, which will dynamically map parts
of a source RAM memory region into the container device region using
aliases, consuming one memslot per alias.

We need a way to query from a memory device:
* The number of currently used memslots.
* The total number of memslots that might get used across the device's
  lifetime.

Expose one helper function -- memory_devices_get_reserved_memslots() --
that will be used by vhost code to respect the current memslot reservation
when realizing a vhost device.

Limit the number of memslots usable by memory devices to something sane --
for now 2048 -- as we really don't want to create tens of thousands of
memslots that can degrade performance when traversing an address space,
just because it would be possible (for example, without KVM and without
vhost there is no real limit ...). This does not affect existing setups
(for example, more than 256 DIMMs+NVDIMMs is not supported).

Signed-off-by: David Hildenbrand
---
 hw/mem/memory-device.c         | 84 ++++++++++++++++++++++++++++++----
 include/hw/mem/memory-device.h | 33 +++++++++++++
 stubs/qmp_memory_device.c      |  5 ++
 3 files changed, 113 insertions(+), 9 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index a915894819..bb02eb410a 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -18,6 +18,8 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 
+#define MEMORY_DEVICES_MAX_MEMSLOTS 2048
+
 static gint memory_device_addr_sort(gconstpointer a, gconstpointer b)
 {
     const MemoryDeviceState *md_a = MEMORY_DEVICE(a);
@@ -50,8 +52,28 @@ static int memory_device_build_list(Object *obj, void *opaque)
     return 0;
 }
 
+static unsigned int memory_device_get_used_memslots(const MemoryDeviceState *md)
+{
+    const MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(md);
+
+    if (!mdc->get_used_memslots)
+        return 1;
+    return mdc->get_used_memslots(md, &error_abort);
+}
+
+static unsigned int memory_device_get_memslots(const MemoryDeviceState *md)
+{
+    const MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(md);
+
+    if (!mdc->get_memslots)
+        return 1;
+    return mdc->get_memslots(md, &error_abort);
+}
+
 struct memory_devices_info {
     uint64_t region_size;
+    unsigned int used_memslots;
+    unsigned int reserved_memslots;
 };
 
 static int memory_devices_collect_info(Object *obj, void *opaque)
@@ -61,9 +83,15 @@ static int memory_devices_collect_info(Object *obj, void *opaque)
     if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) {
         const DeviceState *dev = DEVICE(obj);
         const MemoryDeviceState *md = MEMORY_DEVICE(obj);
+        unsigned int used, total;
 
         if (dev->realized) {
             i->region_size += memory_device_get_region_size(md, &error_abort);
+
+            used = memory_device_get_used_memslots(md);
+            total = memory_device_get_memslots(md);
+            i->used_memslots += used;
+            i->reserved_memslots += total - used;
         }
     }
 
@@ -71,24 +99,62 @@ static int memory_devices_collect_info(Object *obj, void *opaque)
     return 0;
 }
 
-static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
-                                        Error **errp)
+/*
+ * Get the number of memslots that are reserved (not used yet but will get used
+ * dynamically in the future without further checks) by all memory devices.
+ */
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    struct memory_devices_info info = {};
+
+    memory_devices_collect_info(qdev_get_machine(), &info);
+    return info.reserved_memslots;
+}
+
+static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md,
+                                        MemoryRegion *mr, Error **errp)
 {
     const uint64_t size = memory_region_size(mr);
     struct memory_devices_info info = {};
+    unsigned int required, reserved;
+
+    memory_devices_collect_info(OBJECT(ms), &info);
+    reserved = info.reserved_memslots;
+    required = memory_device_get_memslots(md);
 
-    /* we will need a new memory slot for kvm and vhost */
-    if (kvm_enabled() && !kvm_get_free_memslots()) {
-        error_setg(errp, "hypervisor has no free memory slots left");
+    /*
+     * Limit the maximum number of memslots used by memory devices to something
+     * sane.
+     */
+    if (info.used_memslots + reserved + required >
+        MEMORY_DEVICES_MAX_MEMSLOTS) {
+        error_setg(errp, "The maximum number of memory slots to be consumed by"
+                   " memory devices (%u) would be exceeded. Used: %u,"
+                   " Reserved: %u, Required: %u",
+                   MEMORY_DEVICES_MAX_MEMSLOTS, info.used_memslots,
+                   reserved, required);
         return;
     }
-    if (!vhost_get_free_memslots()) {
-        error_setg(errp, "a used vhost backend has no free memory slots left");
+
+    /*
+     * All memslots used by memory devices are already subtracted from
+     * the free memslots as reported by kvm and vhost.
+     */
+    if (kvm_enabled() && kvm_get_free_memslots() < reserved + required) {
+        error_setg(errp, "KVM does not have enough free, unreserved memory"
+                   " slots left. Free: %u, Reserved: %u, Required: %u",
+                   kvm_get_free_memslots(), reserved, required);
+        return;
+    }
+    if (vhost_get_free_memslots() < reserved + required) {
+        error_setg(errp, "a used vhost backend does not have enough free,"
+                   " unreserved memory slots left. Free: %u, Reserved: %u,"
+                   " Required: %u", vhost_get_free_memslots(), reserved,
+                   required);
         return;
     }
 
     /* will we exceed the total amount of memory specified */
-    memory_devices_collect_info(OBJECT(ms), &info);
     if (info.region_size + size < info.region_size ||
         info.region_size + size > ms->maxram_size - ms->ram_size) {
         error_setg(errp, "not enough space, currently 0x%" PRIx64
@@ -257,7 +323,7 @@ void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
         goto out;
     }
 
-    memory_device_check_addable(ms, mr, &local_err);
+    memory_device_check_addable(ms, md, mr, &local_err);
     if (local_err) {
         goto out;
     }
diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h
index 48d2611fc5..c811415ce6 100644
--- a/include/hw/mem/memory-device.h
+++ b/include/hw/mem/memory-device.h
@@ -98,6 +98,38 @@ struct MemoryDeviceClass {
      */
     uint64_t (*get_min_alignment)(const MemoryDeviceState *md);
 
+    /*
+     * Optional: Return the number of used individual memslots (i.e.,
+     * individual RAM mappings) the device has created in the memory region of
+     * the device. The device has to make sure that memslots won't get merged
+     * internally (e.g., by disabling merging of memory region aliases) if the
+     * memory region layout could allow for that.
+     *
+     * If this function is not implemented, we assume the device memory region
+     * is not a container and that there is exactly one memslot getting used
+     * after realizing and plugging the device.
+     *
+     * Called when plugging the memory device or when iterating over
+     * all realized memory devices to calculate used/reserved/available
+     * memslots.
+     */
+    unsigned int (*get_used_memslots)(const MemoryDeviceState *md,
+                                      Error **errp);
+
+    /*
+     * Optional: Return the total number of individual memslots
+     * (i.e., individual RAM mappings) the device may create in the memory
+     * region of the device over its lifetime. The result must never change.
+     *
+     * If this function is not implemented, we assume the device memory region
+     * is not a container and that there will be exactly one memslot getting
+     * used after realizing and plugging the device.
+     *
+     * Called when plugging the memory device or when iterating over
+     * all realized memory devices to calculate used/reserved/available
+     * memslots.
+     */
+    unsigned int (*get_memslots)(const MemoryDeviceState *md, Error **errp);
+
     /*
      * Translate the memory device into #MemoryDeviceInfo.
      */
@@ -113,5 +145,6 @@ void memory_device_plug(MemoryDeviceState *md, MachineState *ms);
 void memory_device_unplug(MemoryDeviceState *md, MachineState *ms);
 uint64_t memory_device_get_region_size(const MemoryDeviceState *md,
                                        Error **errp);
+unsigned int memory_devices_get_reserved_memslots(void);
 
 #endif
diff --git a/stubs/qmp_memory_device.c b/stubs/qmp_memory_device.c
index e75cac62dc..318a5d4187 100644
--- a/stubs/qmp_memory_device.c
+++ b/stubs/qmp_memory_device.c
@@ -10,3 +10,8 @@ uint64_t get_plugged_memory_size(void)
 {
     return (uint64_t)-1;
 }
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    return 0;
+}

From patchwork Wed Oct 27 12:45:27 2021
From: David Hildenbrand
Subject: [PATCH v1 08/12] vhost: Respect reserved memslots for memory devices when realizing a vhost device
Date: Wed, 27 Oct 2021 14:45:27 +0200
Message-Id: <20211027124531.57561-9-david@redhat.com>

Make sure that the current reservations can be fulfilled, otherwise we
might run out of memslots later when memory devices start actually using
the reserved memslots, and crash.
Signed-off-by: David Hildenbrand
---
 hw/virtio/vhost.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 49a1074097..a8f0d0bdf7 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -23,6 +23,7 @@
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "hw/mem/memory-device.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file-types.h"
 #include "sysemu/dma.h"
@@ -1319,7 +1320,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    Error **errp)
 {
     uint64_t features;
-    int i, r, n_initialized_vqs = 0;
+    int i, r, reserved_memslots, backend_memslots, n_initialized_vqs = 0;
 
     hdev->vdev = NULL;
     hdev->migration_blocker = NULL;
@@ -1415,9 +1416,13 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);
 
-    if (used_memslots > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
-        error_setg(errp, "vhost backend memory slots limit is less"
-                   " than current number of present memory slots");
+    reserved_memslots = memory_devices_get_reserved_memslots();
+    backend_memslots = hdev->vhost_ops->vhost_backend_memslots_limit(hdev);
+    if (used_memslots + reserved_memslots > backend_memslots) {
+        error_setg(errp, "vhost backend memory slots limit (%d) is less"
+                   " than current number of used (%d) and reserved (%d)"
+                   " memory slots", backend_memslots, used_memslots,
+                   reserved_memslots);
         r = -EINVAL;
         goto fail_busyloop;
     }

From patchwork Wed Oct 27 12:45:28 2021
From: David Hildenbrand
Subject: [PATCH v1 09/12] memory: Drop mapping check from memory_region_get_ram_discard_manager()
Date: Wed, 27 Oct 2021 14:45:28 +0200
Message-Id: <20211027124531.57561-10-david@redhat.com>

It's sufficient to check whether a memory region is RAM; the region
doesn't necessarily have to be mapped into another memory region. For
example, RAM memory regions mapped via an alias will never make
memory_region_is_mapped() succeed.

Signed-off-by: David Hildenbrand
---
 softmmu/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/softmmu/memory.c b/softmmu/memory.c
index d8a584ca80..2496bff729 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2045,7 +2045,7 @@ int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr)
 
 RamDiscardManager *memory_region_get_ram_discard_manager(MemoryRegion *mr)
 {
-    if (!memory_region_is_mapped(mr) || !memory_region_is_ram(mr)) {
+    if (!memory_region_is_ram(mr)) {
         return NULL;
     }
     return mr->rdm;

From patchwork Wed Oct 27 12:45:29 2021
From: David Hildenbrand
Subject: [PATCH v1 10/12] virtio-mem: Fix typo in virito_mem_intersect_memory_section() function name
Date: Wed, 27 Oct 2021 14:45:29 +0200
Message-Id: <20211027124531.57561-11-david@redhat.com>

It's "virtio", not "virito".

Signed-off-by: David Hildenbrand
Reviewed-by: Philippe Mathieu-Daudé
---
 hw/virtio/virtio-mem.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index d5a578142b..0f5eae4a7e 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -177,7 +177,7 @@ static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
  *
  * Returns false if the intersection is empty, otherwise returns true.
  */
-static bool virito_mem_intersect_memory_section(MemoryRegionSection *s,
+static bool virtio_mem_intersect_memory_section(MemoryRegionSection *s,
                                                 uint64_t offset, uint64_t size)
 {
     uint64_t start = MAX(s->offset_within_region, offset);
@@ -215,7 +215,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
                                       first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -247,7 +247,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
                                 first_bit + 1) - 1;
         size = (last_bit - first_bit + 1) * vmem->block_size;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             break;
         }
         ret = cb(&tmp, arg);
@@ -283,7 +283,7 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             continue;
         }
         rdl->notify_discard(rdl, &tmp);
@@ -299,7 +299,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
     QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
         MemoryRegionSection tmp = *rdl->section;
 
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             continue;
         }
         ret = rdl->notify_populate(rdl, &tmp);
@@ -316,7 +316,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
         if (rdl2 == rdl) {
             break;
         }
-        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+        if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) {
             continue;
         }
         rdl2->notify_discard(rdl2, &tmp);

From patchwork Wed Oct 27 12:45:30 2021
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12587249 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCB56C433F5 for ; Wed, 27 Oct 2021 12:46:26 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C2EDF61052 for ; Wed, 27 Oct 2021 12:46:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240341AbhJ0Msv (ORCPT ); Wed, 27 Oct 2021 08:48:51 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:36432 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241984AbhJ0Mst (ORCPT ); Wed, 27 Oct 2021 08:48:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1635338784; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=neMEa5P1k0NumSDTzcgvk/F1dlF3Z1+sCn+TIT1vP/g=; b=GqN087P+eg779sEQ3hQG3uqb8QzNfplYPhMA4keJw0x93hTDIpFFcEDZGGbZaNqLqZBrAz znPP6vS8js0uZVa7T9EL8CeipCtNZQuX2zBA8BPyKyONucSlsLSbYXTSuTo/dQC6SzH6KG xAQT1dsEZZrZYLcrdQcZRgsH9tMPrL0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-476-atr5WQL6O9qTp40vTVKzJA-1; Wed, 27 Oct 2021 08:46:21 -0400 X-MC-Unique: atr5WQL6O9qTp40vTVKzJA-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 759AF19200CA; Wed, 27 Oct 2021 12:46:19 +0000 (UTC) Received: from 
t480s.redhat.com (unknown [10.39.193.176]) by smtp.corp.redhat.com (Postfix) with ESMTP id AC80119724; Wed, 27 Oct 2021 12:46:15 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr . David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , Hui Zhu , Sebastien Boeuf , kvm@vger.kernel.org Subject: [PATCH v1 11/12] virtio-mem: Set the RamDiscardManager for the RAM memory region earlier Date: Wed, 27 Oct 2021 14:45:30 +0200 Message-Id: <20211027124531.57561-12-david@redhat.com> In-Reply-To: <20211027124531.57561-1-david@redhat.com> References: <20211027124531.57561-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's set the RamDiscardManager earlier, logically before we expose the RAM memory region to the system. This is a preparation for further changes and is logically cleaner: before we expose the RAM memory region to migration code, make sure we have the RamDiscardManager setup. Signed-off-by: David Hildenbrand --- hw/virtio/virtio-mem.c | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 0f5eae4a7e..1e29706798 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -773,16 +773,17 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp) sizeof(struct virtio_mem_config)); vmem->vq = virtio_add_queue(vdev, 128, virtio_mem_handle_request); - host_memory_backend_set_mapped(vmem->memdev, true); - vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem)); - qemu_register_reset(virtio_mem_system_reset, vmem); - /* - * Set ourselves as RamDiscardManager before the plug handler maps the - * memory region and exposes it via an address space. 
+     * Set ourselves as RamDiscardManager before we expose the memory region
+     * to the system (e.g., marking the RAMBlock migratable, mapping the
+     * region).
      */
     memory_region_set_ram_discard_manager(&vmem->memdev->mr,
                                           RAM_DISCARD_MANAGER(vmem));
+
+    host_memory_backend_set_mapped(vmem->memdev, true);
+    vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
+    qemu_register_reset(virtio_mem_system_reset, vmem);
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -790,14 +791,10 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
-    /*
-     * The unplug handler unmapped the memory region, it cannot be
-     * found via an address space anymore. Unset ourselves.
-     */
-    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
     host_memory_backend_set_mapped(vmem->memdev, false);
+    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
     g_free(vmem->bitmap);

From patchwork Wed Oct 27 12:45:31 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12587251
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
 "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
 "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
 Philippe Mathieu-Daudé, Hui Zhu, Sebastien Boeuf, kvm@vger.kernel.org
Subject: [PATCH v1 12/12] virtio-mem: Expose device memory via multiple
 memslots
Date: Wed, 27 Oct 2021 14:45:31 +0200
Message-Id: <20211027124531.57561-13-david@redhat.com>
In-Reply-To: <20211027124531.57561-1-david@redhat.com>
References: <20211027124531.57561-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

We want to expose virtio-mem device memory to the guest via multiple
memslots on demand, essentially reducing the total size of the KVM
slots significantly (and thereby the metadata in KVM and in QEMU for
KVM memory slots), especially when initially exposing only a small
amount of memory via a virtio-mem device and hotplugging more memory
later. Further, not always exposing the full device memory region to
the guest reduces the attack surface in many setups without requiring
other mechanisms, like userfaultfd, for the protection of unplugged
memory.

So we split the original RAM region via memory region aliases into
separate memory slots and dynamically map the required memory slots
into the container. For now, we always map the memslots covered by the
usable region. In the future, with VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE,
we'll be able to map memslots on actual demand and optimize further.

The user is in charge of setting the number of memslots the device
should use. In the future, we might add an auto mode, specified via
"memslots=0", for user convenience.

There are two new properties:

1) "used-memslots" is read-only and contains how many memslots are
   currently used. It is used internally, but can also be used for
   debugging/introspection purposes.

2) "memslots" specifies how many memslots the device should use.

   * "1" is the default and corresponds mostly to the old behavior.
     The only exception is that with a usable region size of 0, the
     single memslot won't get used and nothing will get mapped.

   * "> 1" tells the device to use the given number of memslots. There
     are a couple of restrictions:

     * It cannot be bigger than 1024 or equal to 0.
     * It cannot be bigger than the number of device blocks.
     * It must not result in memslots that are smaller than the
       minimum memslot size.

This parameter doesn't have to be migrated and can differ between
source and destination.

The minimum memslot size (and thereby the alignment of memslots in
guest physical address space) is currently defined to be 128 MiB -- a
memory slot size we know works (due to x86-64 DIMMs) without confusing
devices that might not be able to handle I/O that crosses memory
regions in an address space.

Signed-off-by: David Hildenbrand
---
 hw/virtio/virtio-mem-pci.c     |  23 +++++
 hw/virtio/virtio-mem.c         | 183 ++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-mem.h |  25 ++++-
 3 files changed, 226 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c
index be2383b0c5..1dc4078941 100644
--- a/hw/virtio/virtio-mem-pci.c
+++ b/hw/virtio/virtio-mem-pci.c
@@ -82,6 +82,21 @@ static uint64_t virtio_mem_pci_get_min_alignment(const MemoryDeviceState *md)
                                     &error_abort);
 }
 
+static unsigned int virtio_mem_pci_get_used_memslots(
+                                                  const MemoryDeviceState *md,
+                                                  Error **errp)
+{
+    return object_property_get_uint(OBJECT(md), VIRTIO_MEM_USED_MEMSLOTS_PROP,
+                                    &error_abort);
+}
+
+static unsigned int virtio_mem_pci_get_memslots(const MemoryDeviceState *md,
+                                                Error **errp)
+{
+    return object_property_get_uint(OBJECT(md), VIRTIO_MEM_MEMSLOTS_PROP,
+                                    &error_abort);
+}
+
 static void virtio_mem_pci_size_change_notify(Notifier *notifier, void *data)
 {
     VirtIOMEMPCI *pci_mem = container_of(notifier, VirtIOMEMPCI,
@@ -115,6 +130,8 @@ static void virtio_mem_pci_class_init(ObjectClass *klass, void *data)
     mdc->get_memory_region = virtio_mem_pci_get_memory_region;
     mdc->fill_device_info = virtio_mem_pci_fill_device_info;
     mdc->get_min_alignment = virtio_mem_pci_get_min_alignment;
+    mdc->get_used_memslots = virtio_mem_pci_get_used_memslots;
+    mdc->get_memslots = virtio_mem_pci_get_memslots;
 }
 
 static void virtio_mem_pci_instance_init(Object *obj)
@@ -142,6 +159,12 @@ static void virtio_mem_pci_instance_init(Object *obj)
     object_property_add_alias(obj, VIRTIO_MEM_REQUESTED_SIZE_PROP,
                               OBJECT(&dev->vdev),
                               VIRTIO_MEM_REQUESTED_SIZE_PROP);
+    object_property_add_alias(obj, VIRTIO_MEM_MEMSLOTS_PROP,
+                              OBJECT(&dev->vdev),
+                              VIRTIO_MEM_MEMSLOTS_PROP);
+    object_property_add_alias(obj, VIRTIO_MEM_USED_MEMSLOTS_PROP,
+                              OBJECT(&dev->vdev),
+                              VIRTIO_MEM_USED_MEMSLOTS_PROP);
 }
 
 static const VirtioPCIDeviceTypeInfo virtio_mem_pci_info = {
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 1e29706798..f0ad365b91 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -23,6 +23,7 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 #include "hw/virtio/virtio-mem.h"
+#include "hw/mem/memory-device.h"
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "exec/ram_addr.h"
@@ -46,6 +47,13 @@
 #define VIRTIO_MEM_DEFAULT_THP_SIZE VIRTIO_MEM_MIN_BLOCK_SIZE
 #endif
 
+/*
+ * Let's not allow a crazy number of memslots for a single virtio-mem device
+ * and try to size memslots reasonably large.
+ */
+#define VIRTIO_MEM_MAX_MEMSLOTS 1024
+#define VIRTIO_MEM_MIN_MEMSLOT_SIZE (128 * MiB)
+
 /*
  * We want to have a reasonable default block size such that
  * 1. We avoid splitting THPs when unplugging memory, which degrades
@@ -500,6 +508,7 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem,
 {
     uint64_t newsize = MIN(memory_region_size(&vmem->memdev->mr),
                            requested_size + VIRTIO_MEM_USABLE_EXTENT);
+    int i;
 
     /* The usable region size always has to be multiples of the block size.
      */
    newsize = QEMU_ALIGN_UP(newsize, vmem->block_size);
@@ -514,6 +523,25 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem,
     trace_virtio_mem_resized_usable_region(vmem->usable_region_size, newsize);
     vmem->usable_region_size = newsize;
+
+    /*
+     * Map all unmapped memslots that cover the usable region and unmap all
+     * remaining mapped ones.
+     */
+    for (i = 0; i < vmem->nb_memslots; i++) {
+        if (vmem->memslot_size * i < vmem->usable_region_size) {
+            if (!memory_region_is_mapped(&vmem->memslots[i])) {
+                memory_region_add_subregion(vmem->mr, vmem->memslot_size * i,
+                                            &vmem->memslots[i]);
+                vmem->nb_used_memslots++;
+            }
+        } else {
+            if (memory_region_is_mapped(&vmem->memslots[i])) {
+                memory_region_del_subregion(vmem->mr, &vmem->memslots[i]);
+                vmem->nb_used_memslots--;
+            }
+        }
+    }
 }
 
 static int virtio_mem_unplug_all(VirtIOMEM *vmem)
@@ -674,6 +702,83 @@ static void virtio_mem_system_reset(void *opaque)
     virtio_mem_unplug_all(vmem);
 }
 
+static void virtio_mem_alloc_mr(VirtIOMEM *vmem)
+{
+    const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+
+    vmem->mr = g_new0(MemoryRegion, 1);
+    memory_region_init(vmem->mr, OBJECT(vmem), "virtio-mem-memslots",
+                       region_size);
+    vmem->mr->align = memory_region_get_alignment(&vmem->memdev->mr);
+}
+
+static int virtio_mem_prepare_memslots(VirtIOMEM *vmem, Error **errp)
+{
+    const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+    const uint64_t device_blocks = region_size / vmem->block_size;
+
+    if (vmem->nb_memslots == 1) {
+        vmem->memslot_size = region_size;
+        return 0;
+    }
+
+    /* We cannot have more memslots than device blocks.
+     */
+    if (vmem->nb_memslots > device_blocks) {
+        error_setg(errp, "'%s' property exceeds the total number of device"
+                   " blocks (%" PRIu64 ")", VIRTIO_MEM_MEMSLOTS_PROP,
+                   device_blocks);
+        return -EINVAL;
+    }
+
+    /*
+     * We'll make sure the memslots are multiples of the minimum memslot size
+     * and multiples of the device block size; this can make the last memslot
+     * larger than the others.
+     */
+    vmem->memslot_size = QEMU_ALIGN_UP(region_size, vmem->nb_memslots) /
+                         vmem->nb_memslots;
+    vmem->memslot_size = QEMU_ALIGN_DOWN(vmem->memslot_size, vmem->block_size);
+    vmem->memslot_size = QEMU_ALIGN_DOWN(vmem->memslot_size,
+                                         VIRTIO_MEM_MIN_MEMSLOT_SIZE);
+    if (!vmem->memslot_size) {
+        error_setg(errp, "'%s' property would create memory slots smaller than"
+                   " the minimum supported memory slot size (%lu MiB)",
+                   VIRTIO_MEM_MEMSLOTS_PROP, VIRTIO_MEM_MIN_MEMSLOT_SIZE / MiB);
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static void virtio_mem_alloc_memslots(VirtIOMEM *vmem)
+{
+    const uint64_t region_size = memory_region_size(&vmem->memdev->mr);
+    int i;
+
+    /* Create our memslots but don't map them yet -- we'll map dynamically. */
+    vmem->memslots = g_new0(MemoryRegion, vmem->nb_memslots);
+    for (i = 0; i < vmem->nb_memslots; i++) {
+        uint64_t size;
+        char name[80];
+
+        /* The size of the last memslot might differ. */
+        size = vmem->memslot_size;
+        if (i + 1 == vmem->nb_memslots) {
+            size = region_size - i * vmem->memslot_size;
+        }
+
+        snprintf(name, sizeof(name), "virtio-mem-memslot-%u", i);
+        memory_region_init_alias(&vmem->memslots[i], OBJECT(vmem), name,
+                                 &vmem->memdev->mr, vmem->memslot_size * i,
+                                 size);
+        /*
+         * We want our aliases to result in separate memory sections and
+         * thereby separate memslots.
+         */
+        memory_region_set_alias_unmergeable(&vmem->memslots[i], true);
+    }
+}
+
 static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
@@ -751,6 +856,10 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
+    if (virtio_mem_prepare_memslots(vmem, errp)) {
+        return;
+    }
+
     if (ram_block_coordinated_discard_require(true)) {
         error_setg(errp, "Discarding RAM is disabled");
         return;
@@ -763,12 +872,15 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
-
     vmem->bitmap_size = memory_region_size(&vmem->memdev->mr) /
                         vmem->block_size;
     vmem->bitmap = bitmap_new(vmem->bitmap_size);
 
+    if (!vmem->mr) {
+        virtio_mem_alloc_mr(vmem);
+    }
+    virtio_mem_alloc_memslots(vmem);
+
     virtio_init(vdev, TYPE_VIRTIO_MEM, VIRTIO_ID_MEM,
                 sizeof(struct virtio_mem_config));
     vmem->vq = virtio_add_queue(vdev, 128, virtio_mem_handle_request);
@@ -780,7 +892,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
      */
     memory_region_set_ram_discard_manager(&vmem->memdev->mr,
                                           RAM_DISCARD_MANAGER(vmem));
-
+    virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
     host_memory_backend_set_mapped(vmem->memdev, true);
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
@@ -794,9 +906,12 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
     host_memory_backend_set_mapped(vmem->memdev, false);
+    virtio_mem_resize_usable_region(vmem, 0, true);
     memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
+    g_free(vmem->memslots);
+    g_free(vmem->mr);
     g_free(vmem->bitmap);
     ram_block_coordinated_discard_require(false);
 }
@@ -955,7 +1070,10 @@ static MemoryRegion
*virtio_mem_get_memory_region(VirtIOMEM *vmem, Error **errp)
         return NULL;
     }
 
-    return &vmem->memdev->mr;
+    if (!vmem->mr) {
+        virtio_mem_alloc_mr(vmem);
+    }
+    return vmem->mr;
 }
 
 static void virtio_mem_add_size_change_notifier(VirtIOMEM *vmem,
@@ -1084,10 +1202,62 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name,
     vmem->block_size = value;
 }
 
+static void virtio_mem_get_used_memslots(Object *obj, Visitor *v,
+                                         const char *name,
+                                         void *opaque, Error **errp)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(obj);
+    uint16_t value = vmem->nb_used_memslots;
+
+    visit_type_uint16(v, name, &value, errp);
+}
+
+static void virtio_mem_get_memslots(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(obj);
+    uint16_t value = vmem->nb_memslots;
+
+    visit_type_uint16(v, name, &value, errp);
+}
+
+static void virtio_mem_set_memslots(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(obj);
+    Error *err = NULL;
+    uint16_t value;
+
+    if (DEVICE(obj)->realized) {
+        error_setg(errp, "'%s' cannot be changed", name);
+        return;
+    }
+
+    visit_type_uint16(v, name, &value, &err);
+    if (err) {
+        error_propagate(errp, err);
+        return;
+    }
+    if (value > VIRTIO_MEM_MAX_MEMSLOTS) {
+        error_setg(errp, "'%s' property must not exceed '%d'", name,
+                   VIRTIO_MEM_MAX_MEMSLOTS);
+        return;
+    } else if (!value) {
+        error_setg(errp, "'%s' property must not be '0'", name);
+        return;
+    }
+    vmem->nb_memslots = value;
+}
+
 static void virtio_mem_instance_init(Object *obj)
 {
     VirtIOMEM *vmem = VIRTIO_MEM(obj);
 
+    /*
+     * Default to a single memslot, the old default; users have to opt in for
+     * more by setting the "memslots" property accordingly.
+     */
+    vmem->nb_memslots = 1;
+
     notifier_list_init(&vmem->size_change_notifiers);
     QLIST_INIT(&vmem->rdl_list);
@@ -1099,6 +1269,11 @@ static void virtio_mem_instance_init(Object *obj)
     object_property_add(obj, VIRTIO_MEM_BLOCK_SIZE_PROP, "size",
                         virtio_mem_get_block_size, virtio_mem_set_block_size,
                         NULL, NULL);
+    object_property_add(obj, VIRTIO_MEM_MEMSLOTS_PROP, "uint16",
+                        virtio_mem_get_memslots, virtio_mem_set_memslots, NULL,
+                        NULL);
+    object_property_add(obj, VIRTIO_MEM_USED_MEMSLOTS_PROP, "uint16",
+                        virtio_mem_get_used_memslots, NULL, NULL, NULL);
 }
 
 static Property virtio_mem_properties[] = {
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index a5dd6a493b..8d72427be2 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -30,6 +30,8 @@ OBJECT_DECLARE_TYPE(VirtIOMEM, VirtIOMEMClass,
 #define VIRTIO_MEM_REQUESTED_SIZE_PROP "requested-size"
 #define VIRTIO_MEM_BLOCK_SIZE_PROP "block-size"
 #define VIRTIO_MEM_ADDR_PROP "memaddr"
+#define VIRTIO_MEM_MEMSLOTS_PROP "memslots"
+#define VIRTIO_MEM_USED_MEMSLOTS_PROP "used-memslots"
 
 struct VirtIOMEM {
     VirtIODevice parent_obj;
@@ -41,9 +43,30 @@ struct VirtIOMEM {
     int32_t bitmap_size;
     unsigned long *bitmap;
 
-    /* assigned memory backend and memory region */
+    /* Device memory region in which we dynamically map memslots */
+    MemoryRegion *mr;
+
+    /*
+     * Assigned memory backend with the RAM memory region we will split
+     * into memslots to dynamically map them into the device memory region.
+     */
     HostMemoryBackend *memdev;
 
+    /*
+     * Individual memslots we dynamically map that are aliases to the
+     * assigned RAM memory region
+     */
+    MemoryRegion *memslots;
+
+    /* Total number of memslots we're going to use. */
+    uint16_t nb_memslots;
+
+    /* Current number of memslots we're using. */
+    uint16_t nb_used_memslots;
+
+    /* Size of one memslot (the last one can differ) */
+    uint64_t memslot_size;
+
     /* NUMA node */
     uint32_t node;