From patchwork Wed Oct 13 10:33:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BE1A6C433EF for ; Wed, 13 Oct 2021 10:34:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9B60460F4A for ; Wed, 13 Oct 2021 10:34:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232148AbhJMKgX (ORCPT ); Wed, 13 Oct 2021 06:36:23 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:57472 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229495AbhJMKgW (ORCPT ); Wed, 13 Oct 2021 06:36:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1634121259; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=eiq5vh3dCNHXTt2L9OzYv4MKu2jV8o9+6xs4ComEFdU=; b=QzS3emDpHFgkWmd/4DDKKacnAUqdmaKevDn6yK0mJGYhteJfRNgXtqvGkxgKKij7GNrEBp dzAHmBv4vMEx4iEY/V25ZTEMYUCajn2+9ewC4iCUYGA9Sf3Xat2btL2j438PLHjazYL7Cr g0x+KaMoM8W9DdfimzS0EDcqI5q1Er4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-291-HnIIzyQSMuCRRQdXCqwBtA-1; Wed, 13 Oct 2021 06:34:16 -0400 X-MC-Unique: HnIIzyQSMuCRRQdXCqwBtA-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com 
(Postfix) with ESMTPS id BDCB4100CCC1; Wed, 13 Oct 2021 10:34:14 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.194.27]) by smtp.corp.redhat.com (Postfix) with ESMTP id C2D3F5D9D5; Wed, 13 Oct 2021 10:33:50 +0000 (UTC) From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr . David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , kvm@vger.kernel.org Subject: [PATCH RFC 01/15] memory: Drop mapping check from memory_region_get_ram_discard_manager() Date: Wed, 13 Oct 2021 12:33:16 +0200 Message-Id: <20211013103330.26869-2-david@redhat.com> In-Reply-To: <20211013103330.26869-1-david@redhat.com> References: <20211013103330.26869-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org It's sufficient to check whether a memory region is RAM; the region doesn't necessarily have to be mapped into another memory region. For example, RAM memory regions mapped via an alias will never make "memory_region_is_mapped()" succeed.
Signed-off-by: David Hildenbrand --- softmmu/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/softmmu/memory.c b/softmmu/memory.c index 3bcfc3899b..8669f78395 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -2038,7 +2038,7 @@ int memory_region_iommu_num_indexes(IOMMUMemoryRegion *iommu_mr) RamDiscardManager *memory_region_get_ram_discard_manager(MemoryRegion *mr) { - if (!memory_region_is_mapped(mr) || !memory_region_is_ram(mr)) { + if (!memory_region_is_ram(mr)) { return NULL; } return mr->rdm; From patchwork Wed Oct 13 10:33:17 2021 X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555429 From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr . David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , kvm@vger.kernel.org Subject: [PATCH RFC 02/15] kvm: Return number of free memslots Date: Wed, 13 Oct 2021 12:33:17 +0200 Message-Id: <20211013103330.26869-3-david@redhat.com> In-Reply-To: <20211013103330.26869-1-david@redhat.com> References: <20211013103330.26869-1-david@redhat.com> Let's return the number of free slots instead of only checking if there is a free slot. Required to support memory devices that consume multiple memslots.
Signed-off-by: David Hildenbrand --- accel/kvm/kvm-all.c | 24 +++++++++++------------- accel/stubs/kvm-stub.c | 4 ++-- hw/mem/memory-device.c | 2 +- include/sysemu/kvm.h | 2 +- 4 files changed, 15 insertions(+), 17 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index db8d83b137..0846be835e 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -103,6 +103,7 @@ struct KVMState AccelState parent_obj; int nr_slots; + int nr_free_slots; int fd; int vmfd; int coalesced_mmio; @@ -245,6 +246,13 @@ int kvm_get_max_memslots(void) return s->nr_slots; } +unsigned int kvm_get_free_memslots(void) +{ + KVMState *s = kvm_state; + + return s->nr_free_slots; +} + /* Called with KVMMemoryListener.slots_lock held */ static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml) { @@ -260,19 +268,6 @@ static KVMSlot *kvm_get_free_slot(KVMMemoryListener *kml) return NULL; } -bool kvm_has_free_slot(MachineState *ms) -{ - KVMState *s = KVM_STATE(ms->accelerator); - bool result; - KVMMemoryListener *kml = &s->memory_listener; - - kvm_slots_lock(); - result = !!kvm_get_free_slot(kml); - kvm_slots_unlock(); - - return result; -} - /* Called with KVMMemoryListener.slots_lock held */ static KVMSlot *kvm_alloc_slot(KVMMemoryListener *kml) { @@ -1410,6 +1405,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, } start_addr += slot_size; size -= slot_size; + kvm_state->nr_free_slots++; } while (size); goto out; } @@ -1435,6 +1431,7 @@ static void kvm_set_phys_mem(KVMMemoryListener *kml, ram_start_offset += slot_size; ram += slot_size; size -= slot_size; + kvm_state->nr_free_slots--; } while (size); out: @@ -2364,6 +2361,7 @@ static int kvm_init(MachineState *ms) if (!s->nr_slots) { s->nr_slots = 32; } + s->nr_free_slots = s->nr_slots; s->nr_as = kvm_check_extension(s, KVM_CAP_MULTI_ADDRESS_SPACE); if (s->nr_as <= 1) { diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c index 5b1d00a222..cbaeb7c656 100644 --- a/accel/stubs/kvm-stub.c +++ 
b/accel/stubs/kvm-stub.c @@ -133,9 +133,9 @@ int kvm_irqchip_remove_irqfd_notifier_gsi(KVMState *s, EventNotifier *n, return -ENOSYS; } -bool kvm_has_free_slot(MachineState *ms) +unsigned int kvm_get_free_memslots(void) { - return false; + return 0; } void kvm_init_cpu_signals(CPUState *cpu) diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c index d9f8301711..9045ead33e 100644 --- a/hw/mem/memory-device.c +++ b/hw/mem/memory-device.c @@ -73,7 +73,7 @@ static void memory_device_check_addable(MachineState *ms, uint64_t size, uint64_t used_region_size = 0; /* we will need a new memory slot for kvm and vhost */ - if (kvm_enabled() && !kvm_has_free_slot(ms)) { + if (kvm_enabled() && !kvm_get_free_memslots()) { error_setg(errp, "hypervisor has no free memory slots left"); return; } diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h index a1ab1ee12d..c18be3cbd5 100644 --- a/include/sysemu/kvm.h +++ b/include/sysemu/kvm.h @@ -211,7 +211,7 @@ typedef struct Notifier Notifier; /* external API */ -bool kvm_has_free_slot(MachineState *ms); +unsigned int kvm_get_free_memslots(void); bool kvm_has_sync_mmu(void); int kvm_has_vcpu_events(void); int kvm_has_robust_singlestep(void); From patchwork Wed Oct 13 10:33:18 2021 X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555431 From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr .
David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , kvm@vger.kernel.org Subject: [PATCH RFC 03/15] vhost: Return number of free memslots Date: Wed, 13 Oct 2021 12:33:18 +0200 Message-Id: <20211013103330.26869-4-david@redhat.com> In-Reply-To: <20211013103330.26869-1-david@redhat.com> References: <20211013103330.26869-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Let's return the number of free slots instead of only checking if there is a free slot. Required to support memory devices that consume multiple memslots. Signed-off-by: David Hildenbrand --- hw/mem/memory-device.c | 2 +- hw/virtio/vhost-stub.c | 2 +- hw/virtio/vhost.c | 4 ++-- include/hw/virtio/vhost.h | 2 +- 4 files changed, 5 insertions(+), 5 deletions(-) diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c index 9045ead33e..7f76a09e57 100644 --- a/hw/mem/memory-device.c +++ b/hw/mem/memory-device.c @@ -77,7 +77,7 @@ static void memory_device_check_addable(MachineState *ms, uint64_t size, error_setg(errp, "hypervisor has no free memory slots left"); return; } - if (!vhost_has_free_slot()) { + if (!vhost_get_free_memslots()) { error_setg(errp, "a used vhost backend has no free memory slots left"); return; } diff --git a/hw/virtio/vhost-stub.c b/hw/virtio/vhost-stub.c index c175148fce..fe111e5e45 100644 --- a/hw/virtio/vhost-stub.c +++ b/hw/virtio/vhost-stub.c @@ -2,7 +2,7 @@ #include "hw/virtio/vhost.h" #include "hw/virtio/vhost-user.h" -bool vhost_has_free_slot(void) +unsigned int vhost_get_free_memslots(void) { return true; } diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 437347ad01..2707972870 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -48,7 +48,7 @@ static unsigned int used_memslots; static QLIST_HEAD(, vhost_dev) vhost_devices = QLIST_HEAD_INITIALIZER(vhost_devices); -bool vhost_has_free_slot(void) +unsigned 
int vhost_get_free_memslots(void) { unsigned int slots_limit = ~0U; struct vhost_dev *hdev; @@ -57,7 +57,7 @@ bool vhost_has_free_slot(void) unsigned int r = hdev->vhost_ops->vhost_backend_memslots_limit(hdev); slots_limit = MIN(slots_limit, r); } - return slots_limit > used_memslots; + return slots_limit - used_memslots; } static void vhost_dev_sync_region(struct vhost_dev *dev, diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h index 1a9fc65089..1613af8855 100644 --- a/include/hw/virtio/vhost.h +++ b/include/hw/virtio/vhost.h @@ -128,7 +128,7 @@ uint64_t vhost_get_features(struct vhost_dev *hdev, const int *feature_bits, uint64_t features); void vhost_ack_features(struct vhost_dev *hdev, const int *feature_bits, uint64_t features); -bool vhost_has_free_slot(void); +unsigned int vhost_get_free_memslots(void); int vhost_net_set_backend(struct vhost_dev *hdev, struct vhost_vring_file *file); From patchwork Wed Oct 13 10:33:19 2021 X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555439 From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini , Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr .
David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , kvm@vger.kernel.org Subject: [PATCH RFC 04/15] memory: Allow for marking memory region aliases unmergeable Date: Wed, 13 Oct 2021 12:33:19 +0200 Message-Id: <20211013103330.26869-5-david@redhat.com> In-Reply-To: <20211013103330.26869-1-david@redhat.com> References: <20211013103330.26869-1-david@redhat.com> Let's allow for marking memory region aliases unmergeable, to teach flatview code (and memory listeners like vhost-user) not to merge adjacent aliases to the same memory region into a larger memory section; instead, we want separate aliases to stay separate such that we can atomically map/unmap aliases without affecting other aliases. This is a preparation for virtio-mem mapping device memory located on a RAM memory region via separate aliases into a memory region container, resulting in separate memslots that can get (un)mapped atomically. As an example with virtio-mem, the layout looks something like this: [...] 0000000180000000-000002007fffffff (prio 0, i/o): device-memory 0000000180000000-000001017fffffff (prio 0, i/o): virtio-mem-memslots 0000000180000000-00000001bfffffff (prio 0, ram): alias virtio-mem-memslot-0 @mem0 0000000000000000-000000003fffffff 00000001c0000000-00000001ffffffff (prio 0, ram): alias virtio-mem-memslot-1 @mem0 0000000040000000-000000007fffffff 0000000200000000-000000023fffffff (prio 0, ram): alias virtio-mem-memslot-2 @mem0 0000000080000000-00000000bfffffff [...] What would happen right now is that flatview code merged all 3 aliases into a single memory section. When mapping another alias (e.g., virtio-mem-memslot-3) or when unmapping any of the mapped aliases, memory listeners will first get notified about the removal of the big memory section to then get notified about re-adding of the new (differently merged) memory section(s). In an ideal world, memory listeners would be able to deal with that atomically; however, that is not the case for the most important memory listeners used in the context of virtio-mem (KVM, vhost-user, vfio), and supporting atomic updates is quite hard (e.g., for KVM where we cannot simply resize or split memory slots due to allocated metadata per slot, or in virtiofsd where we cannot simply resize or split an active mmap mapping). While temporarily removing memslots, active users (e.g., KVM VCPUs) can stumble over the missing memslot and essentially crash the VM. Further, merged chunks will consume fewer memslots, but we might end up consuming more later, when unmapping chunks and splitting the bigger chunks into smaller ones -- making memslot accounting for memory devices problematic as well. Let's allow for marking a memory region alias unmergeable, such that we can atomically (un)map aliases to the same memory region, similar to (un)mapping individual DIMMs.
Signed-off-by: David Hildenbrand --- include/exec/memory.h | 23 +++++++++++++++++++++++ softmmu/memory.c | 33 +++++++++++++++++++++++++++------ 2 files changed, 50 insertions(+), 6 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 75b4f600e3..d877b80e6e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -82,6 +82,7 @@ struct ReservedRegion { * relative to the region's address space * @readonly: writes to this section are ignored * @nonvolatile: this section is non-volatile + * @unmergeable: this section should not get merged with adjacent sections */ struct MemoryRegionSection { Int128 size; @@ -91,6 +92,7 @@ struct MemoryRegionSection { hwaddr offset_within_address_space; bool readonly; bool nonvolatile; + bool unmergeable; }; typedef struct IOMMUTLBEntry IOMMUTLBEntry; @@ -720,6 +722,7 @@ struct MemoryRegion { bool nonvolatile; bool rom_device; bool flush_coalesced_mmio; + bool unmergeable; uint8_t dirty_log_mask; bool is_iommu; RAMBlock *ram_block; @@ -2272,6 +2275,26 @@ void memory_region_set_size(MemoryRegion *mr, uint64_t size); void memory_region_set_alias_offset(MemoryRegion *mr, hwaddr offset); +/* + * memory_region_set_alias_unmergeable: Mark a memory region alias unmergeable + * + * Mark a memory region alias unmergeable, resulting in multiple adjacent + * aliases to the same memory region not getting merged into one memory section + * when simplifying the address space and notifying memory listeners. + * + * Primarily useful on aliases to RAM regions; the target use case is + * splitting a RAM memory region via aliases into multiple memslots and + * dynamically (un)mapping the aliases into another container memory region. + * As resulting memory sections don't cover multiple aliases, memory listeners + * will be notified about adding/removing separate aliases, resulting in + * individual memslots in KVM, vhost, vfio,... that can be added/removed + * atomically when mapping/unmapping the corresponding alias.
+ * + * @mr: the #MemoryRegion to be updated + * @unmergeable: whether to mark the #MemoryRegion unmergeable + */ +void memory_region_set_alias_unmergeable(MemoryRegion *mr, bool unmergeable); + /** * memory_region_present: checks if an address relative to a @container * translates into #MemoryRegion within @container diff --git a/softmmu/memory.c b/softmmu/memory.c index 8669f78395..c92ec8372f 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -223,6 +223,7 @@ struct FlatRange { bool romd_mode; bool readonly; bool nonvolatile; + bool unmergeable; }; #define FOR_EACH_FLAT_RANGE(var, view) \ @@ -239,6 +240,7 @@ section_from_flat_range(FlatRange *fr, FlatView *fv) .offset_within_address_space = int128_get64(fr->addr.start), .readonly = fr->readonly, .nonvolatile = fr->nonvolatile, + .unmergeable = fr->unmergeable, }; } @@ -249,7 +251,8 @@ static bool flatrange_equal(FlatRange *a, FlatRange *b) && a->offset_in_region == b->offset_in_region && a->romd_mode == b->romd_mode && a->readonly == b->readonly - && a->nonvolatile == b->nonvolatile; + && a->nonvolatile == b->nonvolatile + && a->unmergeable == b->unmergeable; } static FlatView *flatview_new(MemoryRegion *mr_root) @@ -322,7 +325,8 @@ static bool can_merge(FlatRange *r1, FlatRange *r2) && r1->dirty_log_mask == r2->dirty_log_mask && r1->romd_mode == r2->romd_mode && r1->readonly == r2->readonly - && r1->nonvolatile == r2->nonvolatile; + && r1->nonvolatile == r2->nonvolatile + && !r1->unmergeable && !r2->unmergeable; } /* Attempt to simplify a view by merging adjacent ranges */ @@ -581,7 +585,8 @@ static void render_memory_region(FlatView *view, Int128 base, AddrRange clip, bool readonly, - bool nonvolatile) + bool nonvolatile, + bool unmergeable) { MemoryRegion *subregion; unsigned i; @@ -598,6 +603,7 @@ static void render_memory_region(FlatView *view, int128_addto(&base, int128_make64(mr->addr)); readonly |= mr->readonly; nonvolatile |= mr->nonvolatile; + unmergeable |= mr->unmergeable; tmp = 
addrrange_make(base, mr->size); @@ -611,14 +617,14 @@ static void render_memory_region(FlatView *view, int128_subfrom(&base, int128_make64(mr->alias->addr)); int128_subfrom(&base, int128_make64(mr->alias_offset)); render_memory_region(view, mr->alias, base, clip, - readonly, nonvolatile); + readonly, nonvolatile, unmergeable); return; } /* Render subregions in priority order. */ QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) { render_memory_region(view, subregion, base, clip, - readonly, nonvolatile); + readonly, nonvolatile, unmergeable); } if (!mr->terminates) { @@ -634,6 +640,7 @@ static void render_memory_region(FlatView *view, fr.romd_mode = mr->romd_mode; fr.readonly = readonly; fr.nonvolatile = nonvolatile; + fr.unmergeable = unmergeable; /* Render the region itself into any gaps left by the current view. */ for (i = 0; i < view->nr && int128_nz(remain); ++i) { @@ -735,7 +742,7 @@ static FlatView *generate_memory_topology(MemoryRegion *mr) if (mr) { render_memory_region(view, mr, int128_zero(), addrrange_make(int128_zero(), int128_2_64()), - false, false); + false, false, false); } flatview_simplify(view); @@ -2634,6 +2641,20 @@ void memory_region_set_alias_offset(MemoryRegion *mr, hwaddr offset) memory_region_transaction_commit(); } +void memory_region_set_alias_unmergeable(MemoryRegion *mr, bool unmergeable) +{ + assert(mr->alias); + + if (unmergeable == mr->unmergeable) { + return; + } + + memory_region_transaction_begin(); + mr->unmergeable = unmergeable; + memory_region_update_pending |= mr->enabled; + memory_region_transaction_commit(); +} + uint64_t memory_region_get_alignment(const MemoryRegion *mr) { return mr->align; From patchwork Wed Oct 13 10:33:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555433 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: 
from mail.kernel.org From: David Hildenbrand To: qemu-devel@nongnu.org Cc: David Hildenbrand , Paolo Bonzini ,
Eduardo Habkost , Marcel Apfelbaum , "Michael S. Tsirkin" , Igor Mammedov , Ani Sinha , Peter Xu , "Dr . David Alan Gilbert" , Stefan Hajnoczi , Richard Henderson , =?utf-8?q?Philippe_Mathie?= =?utf-8?q?u-Daud=C3=A9?= , kvm@vger.kernel.org Subject: [PATCH RFC 05/15] vhost: Don't merge unmergeable memory sections Date: Wed, 13 Oct 2021 12:33:20 +0200 Message-Id: <20211013103330.26869-6-david@redhat.com> In-Reply-To: <20211013103330.26869-1-david@redhat.com> References: <20211013103330.26869-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Memory sections that are marked unmergeable should not be merged, to allow for atomic removal later. Signed-off-by: David Hildenbrand --- hw/virtio/vhost.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 2707972870..49a1074097 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -620,7 +620,7 @@ static void vhost_region_add_section(struct vhost_dev *dev, mrs_size, mrs_host); } - if (dev->n_tmp_sections) { + if (dev->n_tmp_sections && !section->unmergeable) { /* Since we already have at least one section, lets see if * this extends it; since we're scanning in order, we only * have to look at the last one, and the FlatView that calls @@ -653,7 +653,7 @@ static void vhost_region_add_section(struct vhost_dev *dev, size_t offset = mrs_gpa - prev_gpa_start; if (prev_host_start + offset == mrs_host && - section->mr == prev_sec->mr && + section->mr == prev_sec->mr && !prev_sec->unmergeable && (!dev->vhost_ops->vhost_backend_can_merge || dev->vhost_ops->vhost_backend_can_merge(dev, mrs_host, mrs_size, From patchwork Wed Oct 13 10:33:21 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 12555435 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org From: David Hildenbrand To:
qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
"Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 06/15] memory-device: Move memory_device_check_addable() directly into memory_device_pre_plug()
Date: Wed, 13 Oct 2021 12:33:21 +0200
Message-Id: <20211013103330.26869-7-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

Move it out of memory_device_get_free_addr(), which is cleaner and
prepares for future changes.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/mem/memory-device.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 7f76a09e57..68a2c3dbcc 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -67,9 +67,10 @@ static int memory_device_used_region_size(Object *obj, void *opaque)
     return 0;
 }

-static void memory_device_check_addable(MachineState *ms, uint64_t size,
+static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
                                         Error **errp)
 {
+    const uint64_t size = memory_region_size(mr);
     uint64_t used_region_size = 0;

     /* we will need a new memory slot for kvm and vhost */
@@ -99,7 +100,6 @@ static uint64_t memory_device_get_free_addr(MachineState *ms,
                                             uint64_t align, uint64_t size,
                                             Error **errp)
 {
-    Error *err = NULL;
     GSList *list = NULL, *item;
     Range as, new = range_empty;

@@ -125,12 +125,6 @@ static uint64_t memory_device_get_free_addr(MachineState *ms,
                align);
     }

-    memory_device_check_addable(ms, size, &err);
-    if (err) {
-        error_propagate(errp, err);
-        return 0;
-    }
-
     if (hint && !QEMU_IS_ALIGNED(*hint, align)) {
         error_setg(errp,
"address must be aligned to 0x%" PRIx64 " bytes",
                   align);
@@ -259,6 +253,11 @@ void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
         goto out;
     }

+    memory_device_check_addable(ms, mr, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     if (legacy_align) {
         align = *legacy_align;
     } else {

From patchwork Wed Oct 13 10:33:22 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555437
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
"Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 07/15] memory-device: Generalize memory_device_used_region_size()
Date: Wed, 13 Oct 2021 12:33:22 +0200
Message-Id: <20211013103330.26869-8-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

Let's generalize the traversal of all plugged memory devices, so it can
collect arbitrary information, preparing for future changes.
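The generalized traversal follows QEMU's object_child_foreach() pattern: visit every child of the machine object, and for each realized memory device accumulate its properties into an opaque info structure. A minimal standalone sketch of that pattern follows; the `struct obj` tree and its field names are illustrative stand-ins for QEMU's QOM types, not the in-tree code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for QEMU's object tree; names are illustrative. */
struct obj {
    int is_memory_device;     /* object_dynamic_cast(..., TYPE_MEMORY_DEVICE) */
    int realized;
    uint64_t region_size;
    struct obj *children[4];
    size_t n_children;
};

struct memory_devices_info {
    uint64_t region_size;
};

/* Mirrors memory_devices_collect_info(): visit this node, then recurse
 * into all children, accumulating into the opaque info structure. */
static int collect_info(struct obj *obj, void *opaque)
{
    struct memory_devices_info *i = opaque;
    size_t n;

    if (obj->is_memory_device && obj->realized) {
        i->region_size += obj->region_size;
    }
    for (n = 0; n < obj->n_children; n++) {
        collect_info(obj->children[n], opaque);
    }
    return 0;
}
```

Because the accumulator is a struct rather than a bare `uint64_t *`, later patches can add further counters (e.g., memslot counts) without changing the traversal.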
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/mem/memory-device.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index 68a2c3dbcc..a915894819 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -50,20 +50,24 @@ static int memory_device_build_list(Object *obj, void *opaque)
     return 0;
 }

-static int memory_device_used_region_size(Object *obj, void *opaque)
+struct memory_devices_info {
+    uint64_t region_size;
+};
+
+static int memory_devices_collect_info(Object *obj, void *opaque)
 {
-    uint64_t *size = opaque;
+    struct memory_devices_info *i = opaque;

     if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) {
         const DeviceState *dev = DEVICE(obj);
         const MemoryDeviceState *md = MEMORY_DEVICE(obj);

         if (dev->realized) {
-            *size += memory_device_get_region_size(md, &error_abort);
+            i->region_size += memory_device_get_region_size(md, &error_abort);
         }
     }

-    object_child_foreach(obj, memory_device_used_region_size, opaque);
+    object_child_foreach(obj, memory_devices_collect_info, opaque);
     return 0;
 }

@@ -71,7 +75,7 @@ static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
                                         Error **errp)
 {
     const uint64_t size = memory_region_size(mr);
-    uint64_t used_region_size = 0;
+    struct memory_devices_info info = {};

     /* we will need a new memory slot for kvm and vhost */
     if (kvm_enabled() && !kvm_get_free_memslots()) {
@@ -84,12 +88,12 @@ static void memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
     }

     /* will we exceed the total amount of memory specified */
-    memory_device_used_region_size(OBJECT(ms), &used_region_size);
-    if (used_region_size + size < used_region_size ||
-        used_region_size + size > ms->maxram_size - ms->ram_size) {
+    memory_devices_collect_info(OBJECT(ms), &info);
+    if (info.region_size + size < info.region_size ||
+        info.region_size + size > ms->maxram_size - ms->ram_size) {
         error_setg(errp, "not enough space, currently 0x%" PRIx64
                    " in use of
total space for memory devices 0x" RAM_ADDR_FMT,
-                   used_region_size, ms->maxram_size - ms->ram_size);
+                   info.region_size, ms->maxram_size - ms->ram_size);
         return;
     }

From patchwork Wed Oct 13 10:33:23 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555441
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
"Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 08/15] memory-device: Support memory devices that consume a variable number of memslots
Date: Wed, 13 Oct 2021 12:33:23 +0200
Message-Id: <20211013103330.26869-9-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

We want to support memory devices that have a container as device memory
region, and (dynamically) map individual chunks into that container,
resulting in multiple memslots getting consumed by such a device.

We already have one such device: NVDIMM. However, an NVDIMM still ends up
consuming exactly one memslot.

The target use case will be virtio-mem, which will dynamically map parts
of a source RAM memory region into the container device region using
aliases, consuming one memslot per alias.

We need a way to query from a memory device:
* The number of currently used memslots.
* The total number of memslots that might get used across device
  lifetime.
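The used/reserved accounting and the memslot-limit heuristic this patch introduces can be sketched as standalone C. The function below is a simplified model of memory_devices_calc_memslot_limit() (distribute available slots proportionally to region size, hand out only 90% of the share, and keep 16 slots in reserve); the parameters are hypothetical stand-ins for values QEMU derives from KVM and vhost, not the in-tree code:

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Illustrative sketch: how many memslots may a memory device of
 * region_size use, given device_mem_size hotpluggable memory in total,
 * used/reserved slots of already-plugged devices, and free_slots still
 * available from KVM/vhost? */
static unsigned int calc_memslot_limit(uint64_t region_size,
                                       uint64_t device_mem_size,
                                       unsigned int used, unsigned int reserved,
                                       unsigned int free_slots)
{
    unsigned int total, limit;

    /* Slots usable for memory devices: used + reserved + currently free,
     * capped to something reasonable. */
    total = MIN(4096u, used + reserved + free_slots);
    free_slots = total > used + reserved ? total - (used + reserved) : 0;

    /* Equally distribute the total slots over the whole device region... */
    limit = (unsigned int)(total * ((double)region_size / device_mem_size));
    /* ...but only hand out 90% of the proportional share. */
    limit = (unsigned int)(0.9 * limit);

    /* Leave 16 slots around for vhost devices that appear later. */
    if (limit > free_slots) {
        free_slots = free_slots > 16 ? free_slots - 16 : 0;
        limit = MIN(limit, free_slots);
    }
    return limit ? limit : 1;
}
```

For example, a single device spanning all hotpluggable memory with 509 free slots would be limited to 458 slots, while a device competing with already-plugged devices for very few free slots degrades to the minimum of 1.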
Expose some helper functions that will be used by vhost code to respect
the current memslot reservation when realizing vhost devices, and by
virtio-mem to dynamically figure out how many memslots it can use.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/mem/memory-device.c         | 136 ++++++++++++++++++++++++++++++---
 include/hw/mem/memory-device.h |  32 ++++++++
 stubs/qmp_memory_device.c      |   5 ++
 3 files changed, 164 insertions(+), 9 deletions(-)

diff --git a/hw/mem/memory-device.c b/hw/mem/memory-device.c
index a915894819..5876c90a59 100644
--- a/hw/mem/memory-device.c
+++ b/hw/mem/memory-device.c
@@ -50,8 +50,28 @@ static int memory_device_build_list(Object *obj, void *opaque)
     return 0;
 }

+static unsigned int memory_device_get_used_memslots(const MemoryDeviceState *md)
+{
+    const MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(md);
+
+    if (!mdc->get_used_memslots)
+        return 1;
+    return mdc->get_used_memslots(md, &error_abort);
+}
+
+static unsigned int memory_device_get_memslots(const MemoryDeviceState *md)
+{
+    const MemoryDeviceClass *mdc = MEMORY_DEVICE_GET_CLASS(md);
+
+    if (!mdc->get_memslots)
+        return 1;
+    return mdc->get_memslots(md, &error_abort);
+}
+
 struct memory_devices_info {
     uint64_t region_size;
+    unsigned int used_memslots;
+    unsigned int reserved_memslots;
 };

 static int memory_devices_collect_info(Object *obj, void *opaque)
@@ -61,9 +81,15 @@ static int memory_devices_collect_info(Object *obj, void *opaque)
     if (object_dynamic_cast(obj, TYPE_MEMORY_DEVICE)) {
         const DeviceState *dev = DEVICE(obj);
         const MemoryDeviceState *md = MEMORY_DEVICE(obj);
+        unsigned int used, total;

         if (dev->realized) {
             i->region_size += memory_device_get_region_size(md, &error_abort);
+
+            used = memory_device_get_used_memslots(md);
+            total = memory_device_get_memslots(md);
+            i->used_memslots += used;
+            i->reserved_memslots += total - used;
         }
     }

@@ -71,24 +97,116 @@ static int memory_devices_collect_info(Object *obj, void *opaque)
     return 0;
 }

-static void
memory_device_check_addable(MachineState *ms, MemoryRegion *mr,
-                                        Error **errp)
+/*
+ * Get the number of memslots that are reserved (not used yet but will get used
+ * dynamically in the future without further checks) by all memory devices.
+ */
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    struct memory_devices_info info = {};
+
+    object_child_foreach(qdev_get_machine(), memory_devices_collect_info,
+                         &info);
+    return info.reserved_memslots;
+}
+
+/*
+ * Calculate, using a heuristic, the maximum number of memslots a memory
+ * device with the given region size may use. Called before/while plugging and
+ * realizing a memory device that can determine the number of memslots to use
+ * dynamically depending on the actual number of available memslots.
+ */
+unsigned int memory_devices_calc_memslot_limit(uint64_t region_size)
+{
+    struct memory_devices_info info = {};
+    MachineState *ms = current_machine;
+    unsigned int total, free, limit;
+    double percent;
+
+    free = vhost_get_free_memslots();
+    if (kvm_enabled()) {
+        free = MIN(free, kvm_get_free_memslots());
+    }
+    object_child_foreach(OBJECT(ms), memory_devices_collect_info, &info);
+
+    /*
+     * Consider all memslots that are used+reserved by memory devices and
+     * can be used for memory devices. This leaves any memslots used for
+     * something else (e.g., initial memory) out of the picture.
+     */
+    total = info.used_memslots + info.reserved_memslots + free;
+
+    /*
+     * Cap the total to something reasonable for now. We don't want to have
+     * infinite memslots or max out the KVM limit ...
+     */
+    total = MIN(4096, total);
+    if (total > info.used_memslots + info.reserved_memslots) {
+        free = total - (info.used_memslots + info.reserved_memslots);
+    } else {
+        free = 0;
+    }
+
+    /*
+     * Simple heuristic: equally distribute the total slots over the whole
+     * device region.
+     */
+    percent = (double)region_size / (ms->maxram_size - ms->ram_size);
+    limit = total * percent;
+
+    /*
+     * However, let's be conservative and prepare for some smaller devices
+     * that consume more memslots-per-byte. Only use 90% of the assigned
+     * percentage.
+     */
+    limit = 0.9 * limit;
+
+    /*
+     * In rare corner cases (especially, appearance of vhost devices after
+     * already plugging memory devices), we might still run into trouble.
+     * Let's try to leave 16 slots around "just in case".
+     */
+    if (limit > free) {
+        if (free > 16) {
+            free = free - 16;
+        } else {
+            free = 0;
+        }
+        limit = MIN(limit, free);
+    }
+    return !limit ? 1 : limit;
+}
+
+static void memory_device_check_addable(MachineState *ms, MemoryDeviceState *md,
+                                        MemoryRegion *mr, Error **errp)
 {
     const uint64_t size = memory_region_size(mr);
     struct memory_devices_info info = {};
+    unsigned int required, reserved;
+
+    memory_devices_collect_info(OBJECT(ms), &info);
+    reserved = info.reserved_memslots;
+    required = memory_device_get_memslots(md);

-    /* we will need a new memory slot for kvm and vhost */
-    if (kvm_enabled() && !kvm_get_free_memslots()) {
-        error_setg(errp, "hypervisor has no free memory slots left");
+    /*
+     * All memslots used by memory devices are already subtracted from
+     * the free memslots as reported by kvm and vhost. Memory devices that
+     * use multiple memslots are expected to take proper care (disabling
+     * merging of memory regions) such that used memslots don't end up
+     * actually consuming less right now and might consume more later.
+     */
+    if (kvm_enabled() && kvm_get_free_memslots() < reserved + required) {
+        error_setg(errp, "KVM does not have enough free memory slots left (%u vs.
%u)",
+                   required, kvm_get_free_memslots() - reserved);
         return;
     }
-    if (!vhost_get_free_memslots()) {
-        error_setg(errp, "a used vhost backend has no free memory slots left");
+    if (vhost_get_free_memslots() < reserved + required) {
+        error_setg(errp,
+                   "a used vhost backend does not have enough free memory slots left (%u vs. %u)",
+                   required, vhost_get_free_memslots() - reserved);
         return;
     }

     /* will we exceed the total amount of memory specified */
-    memory_devices_collect_info(OBJECT(ms), &info);
     if (info.region_size + size < info.region_size ||
         info.region_size + size > ms->maxram_size - ms->ram_size) {
         error_setg(errp, "not enough space, currently 0x%" PRIx64
@@ -257,7 +375,7 @@ void memory_device_pre_plug(MemoryDeviceState *md, MachineState *ms,
         goto out;
     }

-    memory_device_check_addable(ms, mr, &local_err);
+    memory_device_check_addable(ms, md, mr, &local_err);
     if (local_err) {
         goto out;
     }

diff --git a/include/hw/mem/memory-device.h b/include/hw/mem/memory-device.h
index 48d2611fc5..fe4387438c 100644
--- a/include/hw/mem/memory-device.h
+++ b/include/hw/mem/memory-device.h
@@ -98,6 +98,36 @@ struct MemoryDeviceClass {
      */
     uint64_t (*get_min_alignment)(const MemoryDeviceState *md);

+    /*
+     * Optional: Return the number of used individual memslots (i.e.,
+     * individual RAM mappings) the device has created in the memory region of
+     * the device. The device has to make sure that memslots won't get merged
+     * internally (e.g., by disabling merging of memory region aliases) if the
+     * memory region layout could allow for that.
+     *
+     * If this function is not implemented, we assume the device memory region
+     * is not a container and that there is exactly one memslot.
+     *
+     * Called when plugging the memory device or when iterating over
+     * all realized memory devices to calculate used/reserved/available
+     * memslots.
+     */
+    unsigned int (*get_used_memslots)(const MemoryDeviceState *md,
+                                      Error **errp);
+
+    /*
+     * Optional: Return the total number of individual memslots
+     * (i.e., individual RAM mappings) the device may create in the memory
+     * region of the device over its lifetime. The result must never change.
+     *
+     * If this function is not implemented, we assume the device memory region
+     * is not a container and that there will be exactly one memslot.
+     *
+     * Called when plugging the memory device or when iterating over
+     * all realized memory devices to calculate used/reserved/available
+     * memslots.
+     */
+    unsigned int (*get_memslots)(const MemoryDeviceState *md, Error **errp);
+
     /*
      * Translate the memory device into #MemoryDeviceInfo.
      */
@@ -113,5 +143,7 @@ void memory_device_plug(MemoryDeviceState *md, MachineState *ms);
 void memory_device_unplug(MemoryDeviceState *md, MachineState *ms);
 uint64_t memory_device_get_region_size(const MemoryDeviceState *md,
                                        Error **errp);
+unsigned int memory_devices_get_reserved_memslots(void);
+unsigned int memory_devices_calc_memslot_limit(uint64_t region_size);

 #endif

diff --git a/stubs/qmp_memory_device.c b/stubs/qmp_memory_device.c
index e75cac62dc..318a5d4187 100644
--- a/stubs/qmp_memory_device.c
+++ b/stubs/qmp_memory_device.c
@@ -10,3 +10,8 @@ uint64_t get_plugged_memory_size(void)
 {
     return (uint64_t)-1;
 }
+
+unsigned int memory_devices_get_reserved_memslots(void)
+{
+    return 0;
+}

From patchwork Wed Oct 13 10:33:24 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555475
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu, "Dr.
David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 09/15] vhost: Respect reserved memslots for memory devices when realizing a vhost device
Date: Wed, 13 Oct 2021 12:33:24 +0200
Message-Id: <20211013103330.26869-10-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

Make sure that the current reservations can be fulfilled, otherwise we
might run out of memslots later when memory devices start actually using
the reserved memslots and crash.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/vhost.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 49a1074097..b3fa814393 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -23,6 +23,7 @@
 #include "standard-headers/linux/vhost_types.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
+#include "hw/mem/memory-device.h"
 #include "migration/blocker.h"
 #include "migration/qemu-file-types.h"
 #include "sysemu/dma.h"
@@ -1319,7 +1320,7 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
                    Error **errp)
 {
     uint64_t features;
-    int i, r, n_initialized_vqs = 0;
+    int i, r, reserved_slots, n_initialized_vqs = 0;

     hdev->vdev = NULL;
     hdev->migration_blocker = NULL;
@@ -1415,9 +1416,11 @@ int vhost_dev_init(struct vhost_dev *hdev, void *opaque,
     memory_listener_register(&hdev->memory_listener, &address_space_memory);
     QLIST_INSERT_HEAD(&vhost_devices, hdev, entry);

-    if (used_memslots > hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
+    reserved_slots = memory_devices_get_reserved_memslots();
+    if (used_memslots + reserved_slots >
+        hdev->vhost_ops->vhost_backend_memslots_limit(hdev)) {
         error_setg(errp, "vhost backend memory slots limit
is less"
-                   " than current number of present memory slots");
+                   " than current number of used and reserved memory slots");
         r = -EINVAL;
         goto fail_busyloop;
     }

From patchwork Wed Oct 13 10:33:25 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555477
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
"Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 10/15] virtio-mem: Set the RamDiscardManager for the RAM memory region earlier
Date: Wed, 13 Oct 2021 12:33:25 +0200
Message-Id: <20211013103330.26869-11-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

Let's set the RamDiscardManager earlier, logically before we expose the
RAM memory region to the system. This is a preparation for further
changes and is logically cleaner: before we expose the RAM memory region
to migration code, make sure we have the RamDiscardManager set up.
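The invariant this reordering establishes, with unrealize tearing things down in exactly the reverse order that realize set them up and the RamDiscardManager bracketing everything, can be illustrated with a minimal self-contained sketch; the step names below are illustrative labels, not QEMU APIs:

```c
#include <assert.h>
#include <string.h>

/* Record the order in which realize/unrealize steps run, so the
 * bracketing invariant can be checked. */
enum { MAX_STEPS = 8 };
static const char *step_log[MAX_STEPS];
static int n_steps;

static void step(const char *name)
{
    step_log[n_steps++] = name;
}

static void device_realize(void)
{
    step("set_ram_discard_manager");   /* first: before exposing the region */
    step("set_mapped");
    step("vmstate_register_ram");
    step("register_reset");
}

static void device_unrealize(void)
{
    /* exact reverse of realize */
    step("unregister_reset");
    step("vmstate_unregister_ram");
    step("unset_mapped");
    step("unset_ram_discard_manager"); /* last: region no longer exposed */
}
```

The point is symmetry: once the RamDiscardManager is set before any step that exposes the region, it must only be unset after every such step has been undone.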
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index d5a578142b..b2ad27ed7f 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -773,16 +773,17 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
                 sizeof(struct virtio_mem_config));
     vmem->vq = virtio_add_queue(vdev, 128, virtio_mem_handle_request);

-    host_memory_backend_set_mapped(vmem->memdev, true);
-    vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
-    qemu_register_reset(virtio_mem_system_reset, vmem);
-
     /*
-     * Set ourselves as RamDiscardManager before the plug handler maps the
-     * memory region and exposes it via an address space.
+     * Set ourselves as RamDiscardManager before we expose the memory region
+     * to the system (e.g., marking the RAMBlock migratable, mapping the
+     * region).
      */
     memory_region_set_ram_discard_manager(&vmem->memdev->mr,
                                           RAM_DISCARD_MANAGER(vmem));
+
+    host_memory_backend_set_mapped(vmem->memdev, true);
+    vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
+    qemu_register_reset(virtio_mem_system_reset, vmem);
 }

 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -790,14 +791,10 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);

-    /*
-     * The unplug handler unmapped the memory region, it cannot be
-     * found via an address space anymore. Unset ourselves.
-     */
-    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
     host_memory_backend_set_mapped(vmem->memdev, false);
+    memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL);
     virtio_del_queue(vdev, 0);
     virtio_cleanup(vdev);
     g_free(vmem->bitmap);

From patchwork Wed Oct 13 10:33:26 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555479
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum,
"Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu,
"Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson,
Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 11/15] virtio-mem: Fix typo in virito_mem_intersect_memory_section() function name
Date: Wed, 13 Oct 2021 12:33:26 +0200
Message-Id: <20211013103330.26869-12-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
List-ID: kvm@vger.kernel.org

It's "virtio", not "virito".

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-mem.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index b2ad27ed7f..1e29706798 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -177,7 +177,7 @@ static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
  *
  * Returns false if the intersection is empty, otherwise returns true.
*/ -static bool virito_mem_intersect_memory_section(MemoryRegionSection *s, +static bool virtio_mem_intersect_memory_section(MemoryRegionSection *s, uint64_t offset, uint64_t size) { uint64_t start = MAX(s->offset_within_region, offset); @@ -215,7 +215,7 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem, first_bit + 1) - 1; size = (last_bit - first_bit + 1) * vmem->block_size; - if (!virito_mem_intersect_memory_section(&tmp, offset, size)) { + if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) { break; } ret = cb(&tmp, arg); @@ -247,7 +247,7 @@ static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem, first_bit + 1) - 1; size = (last_bit - first_bit + 1) * vmem->block_size; - if (!virito_mem_intersect_memory_section(&tmp, offset, size)) { + if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) { break; } ret = cb(&tmp, arg); @@ -283,7 +283,7 @@ static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset, QLIST_FOREACH(rdl, &vmem->rdl_list, next) { MemoryRegionSection tmp = *rdl->section; - if (!virito_mem_intersect_memory_section(&tmp, offset, size)) { + if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) { continue; } rdl->notify_discard(rdl, &tmp); @@ -299,7 +299,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset, QLIST_FOREACH(rdl, &vmem->rdl_list, next) { MemoryRegionSection tmp = *rdl->section; - if (!virito_mem_intersect_memory_section(&tmp, offset, size)) { + if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) { continue; } ret = rdl->notify_populate(rdl, &tmp); @@ -316,7 +316,7 @@ static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset, if (rdl2 == rdl) { break; } - if (!virito_mem_intersect_memory_section(&tmp, offset, size)) { + if (!virtio_mem_intersect_memory_section(&tmp, offset, size)) { continue; } rdl2->notify_discard(rdl2, &tmp); From patchwork Wed Oct 13 10:33:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555481
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum, "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu, "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson, Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 12/15] virtio-mem: Expose device memory via separate memslots
Date: Wed, 13 Oct 2021 12:33:27 +0200
Message-Id: <20211013103330.26869-13-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

KVM nowadays supports a lot of memslots. We want to exploit that in virtio-mem, exposing device memory to the guest via separate memslots on demand. This significantly reduces the total size of mapped KVM memory slots (and thereby the memslot metadata in KVM and in QEMU), especially when a virtio-mem device initially exposes only a small amount of memory to the guest and more is hotplugged later. Further, not always exposing the full device memory region to the guest reduces the attack surface in many setups without requiring other mechanisms like uffd for protection of unplugged memory.

So split the original RAM region via memory region aliases into separate chunks (ending up as individual memslots), and dynamically map the required chunks (those falling into the usable region) into the container. For now, we always map the memslots covered by the usable region. In the future, with VIRTIO_MEM_F_UNPLUGGED_INACCESSIBLE, we'll be able to map memslots on actual demand and optimize further.

Users can specify via the "max-memslots" property how many memslots the virtio-mem device is allowed to use at most.
"0" translates to "auto, no limit" and is determinded automatically using a heuristic. When a maximum (> 1) is specified, that auto-determined value is capped. The parameter doesn't have to be migrated and can differ between source and destination. The only reason the parameter exists is not make some corner case setups (multiple large virtio-mem devices assigned to a single virtual NUMA node with only very limited available memslots, hotplug of vhost devices) work. The parameter will be set to be "0" as default soon, whereby it will remain to be "1" for compat machines. The properties "memslots" and "used-memslots" are read-only. Signed-off-by: David Hildenbrand --- hw/virtio/virtio-mem-pci.c | 22 +++++ hw/virtio/virtio-mem.c | 173 ++++++++++++++++++++++++++++++++- include/hw/virtio/virtio-mem.h | 29 +++++- 3 files changed, 220 insertions(+), 4 deletions(-) diff --git a/hw/virtio/virtio-mem-pci.c b/hw/virtio/virtio-mem-pci.c index be2383b0c5..2c1be2afb7 100644 --- a/hw/virtio/virtio-mem-pci.c +++ b/hw/virtio/virtio-mem-pci.c @@ -82,6 +82,20 @@ static uint64_t virtio_mem_pci_get_min_alignment(const MemoryDeviceState *md) &error_abort); } +static unsigned int virtio_mem_pci_get_used_memslots(const MemoryDeviceState *md, + Error **errp) +{ + return object_property_get_uint(OBJECT(md), VIRTIO_MEM_USED_MEMSLOTS_PROP, + &error_abort); +} + +static unsigned int virtio_mem_pci_get_memslots(const MemoryDeviceState *md, + Error **errp) +{ + return object_property_get_uint(OBJECT(md), VIRTIO_MEM_MEMSLOTS_PROP, + &error_abort); +} + static void virtio_mem_pci_size_change_notify(Notifier *notifier, void *data) { VirtIOMEMPCI *pci_mem = container_of(notifier, VirtIOMEMPCI, @@ -115,6 +129,8 @@ static void virtio_mem_pci_class_init(ObjectClass *klass, void *data) mdc->get_memory_region = virtio_mem_pci_get_memory_region; mdc->fill_device_info = virtio_mem_pci_fill_device_info; mdc->get_min_alignment = virtio_mem_pci_get_min_alignment; + mdc->get_used_memslots = 
virtio_mem_pci_get_used_memslots; + mdc->get_memslots = virtio_mem_pci_get_memslots; } static void virtio_mem_pci_instance_init(Object *obj) @@ -142,6 +158,12 @@ static void virtio_mem_pci_instance_init(Object *obj) object_property_add_alias(obj, VIRTIO_MEM_REQUESTED_SIZE_PROP, OBJECT(&dev->vdev), VIRTIO_MEM_REQUESTED_SIZE_PROP); + object_property_add_alias(obj, VIRTIO_MEM_MEMSLOTS_PROP, + OBJECT(&dev->vdev), + VIRTIO_MEM_MEMSLOTS_PROP); + object_property_add_alias(obj, VIRTIO_MEM_USED_MEMSLOTS_PROP, + OBJECT(&dev->vdev), + VIRTIO_MEM_USED_MEMSLOTS_PROP); } static const VirtioPCIDeviceTypeInfo virtio_mem_pci_info = { diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 1e29706798..f7e8f1db83 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -23,6 +23,7 @@ #include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-access.h" #include "hw/virtio/virtio-mem.h" +#include "hw/mem/memory-device.h" #include "qapi/error.h" #include "qapi/visitor.h" #include "exec/ram_addr.h" @@ -500,6 +501,7 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem, { uint64_t newsize = MIN(memory_region_size(&vmem->memdev->mr), requested_size + VIRTIO_MEM_USABLE_EXTENT); + int i; /* The usable region size always has to be multiples of the block size. */ newsize = QEMU_ALIGN_UP(newsize, vmem->block_size); @@ -514,6 +516,25 @@ static void virtio_mem_resize_usable_region(VirtIOMEM *vmem, trace_virtio_mem_resized_usable_region(vmem->usable_region_size, newsize); vmem->usable_region_size = newsize; + + /* + * Map all unmapped memslots that cover the usable region and unmap all + * remaining mapped ones. 
+ */ + for (i = 0; i < vmem->nb_memslots; i++) { + if (vmem->memslot_size * i < vmem->usable_region_size) { + if (!memory_region_is_mapped(&vmem->memslots[i])) { + memory_region_add_subregion(vmem->mr, vmem->memslot_size * i, + &vmem->memslots[i]); + vmem->nb_used_memslots++; + } + } else { + if (memory_region_is_mapped(&vmem->memslots[i])) { + memory_region_del_subregion(vmem->mr, &vmem->memslots[i]); + vmem->nb_used_memslots--; + } + } + } } static int virtio_mem_unplug_all(VirtIOMEM *vmem) @@ -674,6 +695,92 @@ static void virtio_mem_system_reset(void *opaque) virtio_mem_unplug_all(vmem); } +static void virtio_mem_prepare_mr(VirtIOMEM *vmem) +{ + const uint64_t region_size = memory_region_size(&vmem->memdev->mr); + + if (vmem->mr) { + return; + } + + vmem->mr = g_malloc0(sizeof(*vmem->mr)); + memory_region_init(vmem->mr, OBJECT(vmem), "virtio-mem-memslots", + region_size); + vmem->mr->align = memory_region_get_alignment(&vmem->memdev->mr); +} + +/* + * Calculate the number of memslots we'll use based on device properties and + * available + already used+reserved memslots for other devices. + * + * Must not get called after realizing the device. + */ +static unsigned int virtio_mem_calc_nb_memslots(uint64_t region_size, + uint64_t block_size, + unsigned int user_limit) +{ + unsigned int limit = memory_devices_calc_memslot_limit(region_size); + uint64_t memslot_size; + + /* + * We never use more than 1024 memslots for a single device (relevant only + * for devices > 1 TiB). + */ + limit = MIN(limit, 1024); + + /* + * We'll never use memslots that are smaller than 1 GiB or smaller than + * the block size (and thereby the page size). memslots are always a power + * of two. + */ + memslot_size = MAX(1 * GiB, block_size); + while (ROUND_UP(region_size, memslot_size) / memslot_size > limit) { + memslot_size *= 2; + } + limit = ROUND_UP(region_size, memslot_size) / memslot_size; + + return !user_limit ? 
limit : MIN(user_limit, limit); +} + +static void virtio_mem_prepare_memslots(VirtIOMEM *vmem) +{ + const uint64_t region_size = memory_region_size(&vmem->memdev->mr); + int i; + + if (!vmem->nb_memslots) { + vmem->nb_memslots = virtio_mem_calc_nb_memslots(region_size, + vmem->block_size, + vmem->nb_max_memslots); + } + if (vmem->nb_memslots == 1) { + vmem->memslot_size = region_size; + } else { + vmem->memslot_size = 1 * GiB; + while (ROUND_UP(region_size, vmem->memslot_size) / vmem->memslot_size > + vmem->nb_memslots) { + vmem->memslot_size *= 2; + } + } + + /* Create our memslots but don't map them yet -- we'll map dynamically. */ + vmem->memslots = g_malloc0_n(vmem->nb_memslots, sizeof(*vmem->memslots)); + for (i = 0; i < vmem->nb_memslots; i++) { + const uint64_t size = MIN(vmem->memslot_size, + region_size - i * vmem->memslot_size); + char name[80]; + + snprintf(name, sizeof(name), "virtio-mem-memslot-%u", i); + memory_region_init_alias(&vmem->memslots[i], OBJECT(vmem), name, + &vmem->memdev->mr, vmem->memslot_size * i, + size); + /* + * We want our aliases to result in separate memory sections and thereby + * separate memslots. 
+ */ + memory_region_set_alias_unmergeable(&vmem->memslots[i], true); + } +} + static void virtio_mem_device_realize(DeviceState *dev, Error **errp) { MachineState *ms = MACHINE(qdev_get_machine()); @@ -763,7 +870,7 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp) return; } - virtio_mem_resize_usable_region(vmem, vmem->requested_size, true); + virtio_mem_prepare_mr(vmem); vmem->bitmap_size = memory_region_size(&vmem->memdev->mr) / vmem->block_size; @@ -780,9 +887,11 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp) */ memory_region_set_ram_discard_manager(&vmem->memdev->mr, RAM_DISCARD_MANAGER(vmem)); + virtio_mem_prepare_memslots(vmem); - host_memory_backend_set_mapped(vmem->memdev, true); + virtio_mem_resize_usable_region(vmem, vmem->requested_size, true); vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem)); + host_memory_backend_set_mapped(vmem->memdev, true); qemu_register_reset(virtio_mem_system_reset, vmem); } @@ -794,10 +903,14 @@ static void virtio_mem_device_unrealize(DeviceState *dev) qemu_unregister_reset(virtio_mem_system_reset, vmem); vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem)); host_memory_backend_set_mapped(vmem->memdev, false); + /* Unmap all memslots. 
*/ + virtio_mem_resize_usable_region(vmem, 0, true); + g_free(vmem->memslots); memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL); virtio_del_queue(vdev, 0); virtio_cleanup(vdev); g_free(vmem->bitmap); + g_free(vmem->mr); ram_block_coordinated_discard_require(false); } @@ -955,7 +1068,8 @@ static MemoryRegion *virtio_mem_get_memory_region(VirtIOMEM *vmem, Error **errp) return NULL; } - return &vmem->memdev->mr; + virtio_mem_prepare_mr(vmem); + return vmem->mr; } static void virtio_mem_add_size_change_notifier(VirtIOMEM *vmem, @@ -1084,6 +1198,53 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name, vmem->block_size = value; } +static void virtio_mem_get_used_memslots(Object *obj, Visitor *v, + const char *name, + void *opaque, Error **errp) +{ + const VirtIOMEM *vmem = VIRTIO_MEM(obj); + uint16_t value = vmem->nb_used_memslots; + + visit_type_uint16(v, name, &value, errp); +} + +static void virtio_mem_get_memslots(Object *obj, Visitor *v, const char *name, + void *opaque, Error **errp) +{ + VirtIOMEM *vmem = VIRTIO_MEM(obj); + uint16_t value = vmem->nb_memslots; + + /* Determine the final value now, we don't want it to change later. */ + if (!vmem->nb_memslots) { + uint64_t block_size = vmem->block_size; + uint64_t region_size; + RAMBlock *rb; + + if (!vmem->memdev || !memory_region_is_ram(&vmem->memdev->mr)) { + /* We'll fail realizing later ... */ + vmem->nb_memslots = 1; + goto out; + } + region_size = memory_region_size(&vmem->memdev->mr); + rb = vmem->memdev->mr.ram_block; + + if (!block_size) { + block_size = virtio_mem_default_block_size(rb); + } else if (block_size < qemu_ram_pagesize(rb)) { + /* We'll fail realizing later ... 
*/ + vmem->nb_memslots = 1; + goto out; + } + + vmem->nb_memslots = virtio_mem_calc_nb_memslots(region_size, + vmem->block_size, + vmem->nb_max_memslots); + } +out: + value = vmem->nb_memslots; + visit_type_uint16(v, name, &value, errp); +} + static void virtio_mem_instance_init(Object *obj) { VirtIOMEM *vmem = VIRTIO_MEM(obj); @@ -1099,6 +1260,10 @@ static void virtio_mem_instance_init(Object *obj) object_property_add(obj, VIRTIO_MEM_BLOCK_SIZE_PROP, "size", virtio_mem_get_block_size, virtio_mem_set_block_size, NULL, NULL); + object_property_add(obj, VIRTIO_MEM_MEMSLOTS_PROP, "uint16", + virtio_mem_get_memslots, NULL, NULL, NULL); + object_property_add(obj, VIRTIO_MEM_USED_MEMSLOTS_PROP, "uint16", + virtio_mem_get_used_memslots, NULL, NULL, NULL); } static Property virtio_mem_properties[] = { @@ -1106,6 +1271,8 @@ static Property virtio_mem_properties[] = { DEFINE_PROP_UINT32(VIRTIO_MEM_NODE_PROP, VirtIOMEM, node, 0), DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev, TYPE_MEMORY_BACKEND, HostMemoryBackend *), + DEFINE_PROP_UINT16(VIRTIO_MEM_MAX_MEMSLOTS_PROP, VirtIOMEM, nb_max_memslots, + 1), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h index a5dd6a493b..3589865871 100644 --- a/include/hw/virtio/virtio-mem.h +++ b/include/hw/virtio/virtio-mem.h @@ -30,6 +30,9 @@ OBJECT_DECLARE_TYPE(VirtIOMEM, VirtIOMEMClass, #define VIRTIO_MEM_REQUESTED_SIZE_PROP "requested-size" #define VIRTIO_MEM_BLOCK_SIZE_PROP "block-size" #define VIRTIO_MEM_ADDR_PROP "memaddr" +#define VIRTIO_MEM_MEMSLOTS_PROP "memslots" +#define VIRTIO_MEM_USED_MEMSLOTS_PROP "used-memslots" +#define VIRTIO_MEM_MAX_MEMSLOTS_PROP "max-memslots" struct VirtIOMEM { VirtIODevice parent_obj; @@ -41,9 +44,33 @@ struct VirtIOMEM { int32_t bitmap_size; unsigned long *bitmap; - /* assigned memory backend and memory region */ + /* Device memory region in which we dynamically map memslots */ + MemoryRegion *mr; + + /* + * Assigned memory backend 
with the RAM memory region we will split + * into memslots to dynamically map them into the device memory region. + */ HostMemoryBackend *memdev; + /* + * Individual memslots we dynamically map that are aliases to the + * assigned RAM memory region + */ + MemoryRegion *memslots; + + /* User defined maximum number of memslots we may ever use. */ + uint16_t nb_max_memslots; + + /* Total number of memslots we're going to use. */ + uint16_t nb_memslots; + + /* Current number of memslots we're using. */ + uint16_t nb_used_memslots; + + /* Size of one memslot (the last one might be smaller) */ + uint64_t memslot_size; + /* NUMA node */ uint32_t node;

From patchwork Wed Oct 13 10:33:28 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555483
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum, "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu, "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson, Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 13/15] vhost-user: Increase VHOST_USER_MAX_RAM_SLOTS to 4096 with CONFIG_VIRTIO_MEM
Date: Wed, 13 Oct 2021 12:33:28 +0200
Message-Id: <20211013103330.26869-14-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

Let's increase the number of slots to 4096 to allow for increased flexibility with virtio-mem when dealing with large virtio-mem devices that start out small.

In the future, we might want to look into some performance improvements, but for now there isn't really anything stopping us from raising the limit.
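For intuition on why so many slots can be needed, the per-device memslot sizing heuristic described in patch 12 can be sketched in isolation. This is a minimal, illustrative Python sketch of the algorithm as described in that commit message (not QEMU code; the function name and the fixed limits passed in are assumptions for the example):

```python
GiB = 1024 ** 3

def ceil_div(a, b):
    return -(-a // b)

def calc_memslots(region_size, block_size, limit):
    """Sketch of virtio-mem's sizing heuristic: start from a power-of-two
    memslot size of at least 1 GiB (and at least the block size), then
    double it until the region fits into at most `limit` memslots.
    A single device never uses more than 1024 memslots."""
    limit = min(limit, 1024)
    memslot_size = max(GiB, block_size)
    while ceil_div(region_size, memslot_size) > limit:
        memslot_size *= 2
    return memslot_size, ceil_div(region_size, memslot_size)

# A 1 TiB device that starts out exposing little memory still wants up to
# 1024 slots of 1 GiB each -- far beyond the old 32-slot vhost-user limit:
print(calc_memslots(1024 * GiB, 2 * 1024 ** 2, 1024))  # (1 GiB, 1024 slots)
```

With a tighter limit (say 509 available slots), the sketch doubles the slot size until the device fits, e.g. 256 slots of 4 GiB for the same 1 TiB device, which is why a single large device can plausibly consume hundreds of slots and the vhost-user limit must grow accordingly.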
Signed-off-by: David Hildenbrand
---
 hw/virtio/vhost-user.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 2c8556237f..1c6a720728 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -24,6 +24,7 @@ #include "sysemu/cryptodev.h" #include "migration/migration.h" #include "migration/postcopy-ram.h" +#include CONFIG_DEVICES #include "trace.h" #include @@ -45,8 +46,10 @@ * the maximum number supported by the target * hardware plaform. */ -#if defined(TARGET_X86) || defined(TARGET_X86_64) || \ - defined(TARGET_ARM) || defined(TARGET_ARM_64) +#if defined(CONFIG_VIRTIO_MEM) +#define VHOST_USER_MAX_RAM_SLOTS 4096 +#elif defined(TARGET_X86) || defined(TARGET_X86_64) || \ + defined(TARGET_ARM) || defined(TARGET_ARM_64) #include "hw/acpi/acpi.h" #define VHOST_USER_MAX_RAM_SLOTS ACPI_MAX_RAM_SLOTS

From patchwork Wed Oct 13 10:33:29 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555485
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum, "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu, "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson, Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 14/15] libvhost-user: Increase VHOST_USER_MAX_RAM_SLOTS to 4096
Date: Wed, 13 Oct 2021 12:33:29 +0200
Message-Id: <20211013103330.26869-15-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

4096 is the maximum we can have right now in QEMU with vhost-user, so increase the libvhost-user limit as well.
Signed-off-by: David Hildenbrand
---
 subprojects/libvhost-user/libvhost-user.h | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/subprojects/libvhost-user/libvhost-user.h b/subprojects/libvhost-user/libvhost-user.h index 3d13dfadde..d9628ed9f0 100644 --- a/subprojects/libvhost-user/libvhost-user.h +++ b/subprojects/libvhost-user/libvhost-user.h @@ -30,11 +30,8 @@ #define VHOST_MEMORY_BASELINE_NREGIONS 8 -/* - * Set a reasonable maximum number of ram slots, which will be supported by - * any architecture. - */ -#define VHOST_USER_MAX_RAM_SLOTS 32 +/* Set the RAM slots based on the maximum supported by QEMU vhost-user. */ +#define VHOST_USER_MAX_RAM_SLOTS 4096 #define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)

From patchwork Wed Oct 13 10:33:30 2021
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 12555487
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, Paolo Bonzini, Eduardo Habkost, Marcel Apfelbaum, "Michael S. Tsirkin", Igor Mammedov, Ani Sinha, Peter Xu, "Dr. David Alan Gilbert", Stefan Hajnoczi, Richard Henderson, Philippe Mathieu-Daudé, kvm@vger.kernel.org
Subject: [PATCH RFC 15/15] virtio-mem: Set "max-memslots" to 0 (auto) for the 6.2 machine
Date: Wed, 13 Oct 2021 12:33:30 +0200
Message-Id: <20211013103330.26869-16-david@redhat.com>
In-Reply-To: <20211013103330.26869-1-david@redhat.com>
References: <20211013103330.26869-1-david@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

Let's enable automatic detection of the number of memslots to use for the 6.2 machine, leaving the behavior of compat machines unchanged.
Signed-off-by: David Hildenbrand --- hw/core/machine.c | 1 + hw/virtio/virtio-mem.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/hw/core/machine.c b/hw/core/machine.c index b8d95eec32..25aa42cf9f 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -39,6 +39,7 @@ GlobalProperty hw_compat_6_1[] = { { "vhost-user-vsock-device", "seqpacket", "off" }, + { "virtio-mem", "max-memslots", "1" }, }; const size_t hw_compat_6_1_len = G_N_ELEMENTS(hw_compat_6_1); diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index f7e8f1db83..3de8ed94e6 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -1272,7 +1272,7 @@ static Property virtio_mem_properties[] = { DEFINE_PROP_LINK(VIRTIO_MEM_MEMDEV_PROP, VirtIOMEM, memdev, TYPE_MEMORY_BACKEND, HostMemoryBackend *), DEFINE_PROP_UINT16(VIRTIO_MEM_MAX_MEMSLOTS_PROP, VirtIOMEM, nb_max_memslots, - 1), + 0), DEFINE_PROP_END_OF_LIST(), };
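To summarize the series' core mechanism: patch 12 maps exactly those memslot aliases whose start offset falls below the usable region, and patch 15 turns on the automatic slot count by default. The following is a minimal, illustrative Python sketch of that mapping rule from virtio_mem_resize_usable_region() (not QEMU code; the function name is an assumption for the example):

```python
GiB = 1024 ** 3

def mapped_slots(usable_region_size, memslot_size, nb_memslots):
    """Sketch of the mapping rule: memslot i is (or stays) mapped iff its
    start offset lies inside the usable region; all others are unmapped."""
    return [i for i in range(nb_memslots)
            if memslot_size * i < usable_region_size]

# Growing the usable region maps more slots; shrinking it unmaps them:
print(mapped_slots(3 * GiB, GiB, 8))  # [0, 1, 2]
print(mapped_slots(0, GiB, 8))        # []
```

Note how a device whose usable region covers only 3 GiB of an 8 GiB backend exposes just three 1 GiB memslots to KVM and vhost, which is the memory-slot saving the cover letter argues for.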