From patchwork Thu Nov 19 15:39:12 2020
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 11917969
From: David Hildenbrand <david@redhat.com>
To: qemu-devel@nongnu.org
Subject: [PATCH v1 3/9] virtio-mem: Implement RamDiscardMgr interface
Date: Thu, 19 Nov 2020 16:39:12 +0100
Message-Id: <20201119153918.120976-4-david@redhat.com>
In-Reply-To: <20201119153918.120976-1-david@redhat.com>
References: <20201119153918.120976-1-david@redhat.com>
Cc: Pankaj Gupta, Wei Yang, David Hildenbrand, "Michael S. Tsirkin",
    "Dr. David Alan Gilbert", Peter Xu, Luiz Capitulino, Auger Eric,
    Alex Williamson, teawater, Igor Mammedov, Paolo Bonzini,
    Marek Kedzierski

Let's properly notify when (un)plugging blocks: after discarding memory
when unplugging, and before allowing the guest to consume memory when
plugging. Handle errors from notifiers gracefully (e.g., no remaining
VFIO mappings) when plugging, rolling back the change and telling the
guest that the VM is busy.

One special case to take care of is replaying all notifications after
restoring the vmstate: the device starts out with all memory discarded,
so after loading the vmstate we have to notify about all plugged blocks.

Cc: Paolo Bonzini
Cc: "Michael S. Tsirkin"
Cc: Alex Williamson
Cc: Dr. David Alan Gilbert
Cc: Igor Mammedov
Cc: Pankaj Gupta
Cc: Peter Xu
Cc: Auger Eric
Cc: Wei Yang
Cc: teawater
Cc: Marek Kedzierski
Signed-off-by: David Hildenbrand <david@redhat.com>
---
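A consumer of this interface would hook in roughly as sketched below. Only the
listener fields and class hooks visible in this patch are used; the callback
signatures are approximated from the calls in virtio_mem_notify_plug()/_unplug(),
and the lookup helpers (memory_region_get_ram_discard_mgr(),
RAM_DISCARD_MGR_GET_CLASS()) are assumed to come from the RamDiscardMgr
interface patch earlier in this series, so names may differ:

/* Sketch of a listener; signatures approximated, not taken from this patch. */
static int foo_notify_populate(RamDiscardListener *rdl, const MemoryRegion *mr,
                               uint64_t offset, uint64_t size)
{
    /*
     * Prepare the freshly plugged range for use, e.g., (re)create a DMA
     * mapping. Returning an error makes virtio-mem discard the range again,
     * roll back the plug and report "busy" to the guest.
     */
    return 0;
}

static void foo_notify_discard(RamDiscardListener *rdl, const MemoryRegion *mr,
                               uint64_t offset, uint64_t size)
{
    /* Tear down whatever foo_notify_populate() set up for this range. */
}

static RamDiscardListener foo_rdl = {
    .notify_populate = foo_notify_populate,
    .notify_discard = foo_notify_discard,
    /*
     * .notify_discard_all is optional: without it, unplug-all falls back to
     * per-range notify_discard() calls (see virtio_mem_notify_unplug_all()).
     */
};

static void foo_attach(MemoryRegion *mr)
{
    /* Both lookup helpers are assumptions, not part of this patch. */
    RamDiscardMgr *rdm = memory_region_get_ram_discard_mgr(mr);
    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm);

    rdmc->register_listener(rdm, mr, &foo_rdl);
}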
 hw/virtio/virtio-mem.c         | 231 ++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio-mem.h |   3 +
 2 files changed, 231 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index 471e464171..93257b6c26 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -172,7 +172,110 @@ static int virtio_mem_for_each_unplugged_range(const VirtIOMEM *vmem, void *arg,
     return ret;
 }
 
-static bool virtio_mem_test_bitmap(VirtIOMEM *vmem, uint64_t start_gpa,
+static int virtio_mem_for_each_plugged_range(const VirtIOMEM *vmem, void *arg,
+                                             virtio_mem_range_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    int ret = 0;
+
+    first_bit = find_first_bit(vmem->bitmap, vmem->bitmap_size);
+    while (first_bit < vmem->bitmap_size) {
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                      first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * vmem->block_size;
+
+        ret = cb(vmem, arg, offset, size);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                  last_bit + 2);
+    }
+    return ret;
+}
+
+static void virtio_mem_notify_unplug(VirtIOMEM *vmem, uint64_t offset,
+                                     uint64_t size)
+{
+    RamDiscardListener *rdl;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        rdl->notify_discard(rdl, &vmem->memdev->mr, offset, size);
+    }
+}
+
+static int virtio_mem_notify_plug(VirtIOMEM *vmem, uint64_t offset,
+                                  uint64_t size)
+{
+    RamDiscardListener *rdl, *rdl2;
+    int ret = 0, ret2;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        ret = rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+
+    if (ret) {
+        /* Could be a mapping attempt resulted in memory getting populated. */
+        ret2 = ram_block_discard_range(vmem->memdev->mr.ram_block, offset,
+                                       size);
+        if (ret2) {
+            error_report("Unexpected error discarding RAM: %s",
+                         strerror(-ret2));
+        }
+
+        /* Notify all already-notified listeners. */
+        QLIST_FOREACH(rdl2, &vmem->rdl_list, next) {
+            if (rdl2 == rdl) {
+                break;
+            }
+            rdl2->notify_discard(rdl2, &vmem->memdev->mr, offset, size);
+        }
+    }
+    return ret;
+}
+
+static int virtio_mem_notify_discard_range_cb(const VirtIOMEM *vmem, void *arg,
+                                              uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        if (!rdl->notify_discard_all) {
+            rdl->notify_discard(rdl, &vmem->memdev->mr, offset, size);
+        }
+    }
+    return 0;
+}
+
+static void virtio_mem_notify_unplug_all(VirtIOMEM *vmem)
+{
+    bool individual_calls = false;
+    RamDiscardListener *rdl;
+
+    if (!vmem->size) {
+        return;
+    }
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        if (rdl->notify_discard_all) {
+            rdl->notify_discard_all(rdl, &vmem->memdev->mr);
+        } else {
+            individual_calls = true;
+        }
+    }
+
+    if (individual_calls) {
+        virtio_mem_for_each_unplugged_range(vmem, NULL,
+                                            virtio_mem_notify_discard_range_cb);
+    }
+}
+
+static bool virtio_mem_test_bitmap(const VirtIOMEM *vmem, uint64_t start_gpa,
                                    uint64_t size, bool plugged)
 {
     const unsigned long first_bit = (start_gpa - vmem->addr) / vmem->block_size;
@@ -225,7 +328,8 @@ static void virtio_mem_send_response_simple(VirtIOMEM *vmem,
     virtio_mem_send_response(vmem, elem, &resp);
 }
 
-static bool virtio_mem_valid_range(VirtIOMEM *vmem, uint64_t gpa, uint64_t size)
+static bool virtio_mem_valid_range(const VirtIOMEM *vmem, uint64_t gpa,
+                                   uint64_t size)
 {
     if (!QEMU_IS_ALIGNED(gpa, vmem->block_size)) {
         return false;
@@ -259,6 +363,9 @@ static int virtio_mem_set_block_state(VirtIOMEM *vmem, uint64_t start_gpa,
                          strerror(-ret));
             return -EBUSY;
         }
+        virtio_mem_notify_unplug(vmem, offset, size);
+    } else if (virtio_mem_notify_plug(vmem, offset, size)) {
+        return -EBUSY;
     }
     virtio_mem_set_bitmap(vmem, start_gpa, size, plug);
     return 0;
@@ -356,6 +463,8 @@ static int virtio_mem_unplug_all(VirtIOMEM *vmem)
         error_report("Unexpected error discarding RAM: %s", strerror(-ret));
         return -EBUSY;
     }
+    virtio_mem_notify_unplug_all(vmem);
+
     bitmap_clear(vmem->bitmap, 0, vmem->bitmap_size);
     if (vmem->size) {
         vmem->size = 0;
@@ -604,6 +713,12 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
     vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem));
     qemu_register_reset(virtio_mem_system_reset, vmem);
     precopy_add_notifier(&vmem->precopy_notifier);
+
+    /*
+     * Set ourselves as RamDiscardMgr before the plug handler maps the memory
+     * region and exposes it via an address space.
+     */
+    memory_region_set_ram_discard_mgr(&vmem->memdev->mr, RAM_DISCARD_MGR(vmem));
 }
 
 static void virtio_mem_device_unrealize(DeviceState *dev)
@@ -611,6 +726,11 @@ static void virtio_mem_device_unrealize(DeviceState *dev)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOMEM *vmem = VIRTIO_MEM(dev);
 
+    /*
+     * The unplug handler unmapped the memory region, it cannot be
+     * found via an address space anymore. Unset ourselves.
+     */
+    memory_region_set_ram_discard_mgr(&vmem->memdev->mr, NULL);
     precopy_remove_notifier(&vmem->precopy_notifier);
     qemu_unregister_reset(virtio_mem_system_reset, vmem);
     vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem));
@@ -642,13 +762,41 @@ static int virtio_mem_restore_unplugged(VirtIOMEM *vmem)
                                                virtio_mem_discard_range_cb);
 }
 
+static int virtio_mem_post_load_replay_cb(const VirtIOMEM *vmem, void *arg,
+                                          uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl;
+    int ret = 0;
+
+    QLIST_FOREACH(rdl, &vmem->rdl_list, next) {
+        ret = rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+        if (ret) {
+            break;
+        }
+    }
+    return ret;
+}
+
 static int virtio_mem_post_load(void *opaque, int version_id)
 {
+    VirtIOMEM *vmem = VIRTIO_MEM(opaque);
+    int ret;
+
+    /*
+     * We started out with all memory discarded and our memory region is mapped
+     * into an address space. Replay, now that we updated the bitmap.
+     */
+    ret = virtio_mem_for_each_plugged_range(vmem, NULL,
+                                            virtio_mem_post_load_replay_cb);
+    if (ret) {
+        return ret;
+    }
+
     if (migration_in_incoming_postcopy()) {
         return 0;
     }
 
-    return virtio_mem_restore_unplugged(VIRTIO_MEM(opaque));
+    return virtio_mem_restore_unplugged(vmem);
 }
 
 typedef struct VirtIOMEMMigSanityChecks {
@@ -933,6 +1081,7 @@ static void virtio_mem_instance_init(Object *obj)
 
     notifier_list_init(&vmem->size_change_notifiers);
     vmem->precopy_notifier.notify = virtio_mem_precopy_notify;
+    QLIST_INIT(&vmem->rdl_list);
 
     object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size,
                         NULL, NULL, NULL);
@@ -952,11 +1101,77 @@ static Property virtio_mem_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+static uint64_t virtio_mem_rdm_get_min_granularity(const RamDiscardMgr *rdm,
+                                                   const MemoryRegion *mr)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return vmem->block_size;
+}
+
+static bool virtio_mem_rdm_is_populated(const RamDiscardMgr *rdm,
+                                        const MemoryRegion *mr,
+                                        ram_addr_t offset, ram_addr_t size)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+    uint64_t start_gpa = QEMU_ALIGN_DOWN(vmem->addr + offset, vmem->block_size);
+    uint64_t end_gpa = QEMU_ALIGN_UP(vmem->addr + offset + size,
+                                     vmem->block_size);
+
+    g_assert(mr == &vmem->memdev->mr);
+    if (!virtio_mem_valid_range(vmem, start_gpa, end_gpa - start_gpa)) {
+        return false;
+    }
+
+    return virtio_mem_test_bitmap(vmem, start_gpa, end_gpa - start_gpa, true);
+}
+
+static void virtio_mem_rdm_register_listener(RamDiscardMgr *rdm,
+                                             const MemoryRegion *mr,
+                                             RamDiscardListener *rdl)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_INSERT_HEAD(&vmem->rdl_list, rdl, next);
+}
+
+static void virtio_mem_rdm_unregister_listener(RamDiscardMgr *rdm,
+                                               const MemoryRegion *mr,
+                                               RamDiscardListener *rdl)
+{
+    VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    QLIST_REMOVE(rdl, next);
+}
+
+static int virtio_mem_notify_populate_range_cb(const VirtIOMEM *vmem, void *arg,
+                                               uint64_t offset, uint64_t size)
+{
+    RamDiscardListener *rdl = arg;
+
+    return rdl->notify_populate(rdl, &vmem->memdev->mr, offset, size);
+}
+
+static int virtio_mem_rdm_replay_populated(const RamDiscardMgr *rdm,
+                                           const MemoryRegion *mr,
+                                           RamDiscardListener *rdl)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+
+    g_assert(mr == &vmem->memdev->mr);
+    return virtio_mem_for_each_plugged_range(vmem, rdl,
+                                             virtio_mem_notify_populate_range_cb);
+}
+
 static void virtio_mem_class_init(ObjectClass *klass, void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(klass);
     VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
     VirtIOMEMClass *vmc = VIRTIO_MEM_CLASS(klass);
+    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_CLASS(klass);
 
     device_class_set_props(dc, virtio_mem_properties);
     dc->vmsd = &vmstate_virtio_mem;
@@ -972,6 +1187,12 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     vmc->get_memory_region = virtio_mem_get_memory_region;
     vmc->add_size_change_notifier = virtio_mem_add_size_change_notifier;
     vmc->remove_size_change_notifier = virtio_mem_remove_size_change_notifier;
+
+    rdmc->get_min_granularity = virtio_mem_rdm_get_min_granularity;
+    rdmc->is_populated = virtio_mem_rdm_is_populated;
+    rdmc->register_listener = virtio_mem_rdm_register_listener;
+    rdmc->unregister_listener = virtio_mem_rdm_unregister_listener;
+    rdmc->replay_populated = virtio_mem_rdm_replay_populated;
 }
 
 static const TypeInfo virtio_mem_info = {
@@ -981,6 +1202,10 @@ static const TypeInfo virtio_mem_info = {
     .instance_init = virtio_mem_instance_init,
     .class_init = virtio_mem_class_init,
     .class_size = sizeof(VirtIOMEMClass),
+    .interfaces = (InterfaceInfo[]) {
+        { TYPE_RAM_DISCARD_MGR },
+        { }
+    },
 };
 
 static void virtio_register_types(void)
diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h
index 4eeb82d5dd..9a6e348fa2 100644
--- a/include/hw/virtio/virtio-mem.h
+++ b/include/hw/virtio/virtio-mem.h
@@ -67,6 +67,9 @@ struct VirtIOMEM {
 
     /* don't migrate unplugged memory */
     NotifierWithReturn precopy_notifier;
+
+    /* listeners to notify on plug/unplug activity. */
+    QLIST_HEAD(, RamDiscardListener) rdl_list;
 };
 
 struct VirtIOMEMClass {
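The replay performed in virtio_mem_post_load() is also exposed to listeners via
the replay_populated hook, so a listener that registers while blocks are already
plugged can catch up on the current state. A rough sketch (only register_listener
and replay_populated are shown in this patch; RAM_DISCARD_MGR_GET_CLASS() is an
assumed helper from the interface patch):

/* Sketch: attach a listener late and replay all currently plugged ranges. */
static int foo_attach_and_replay(RamDiscardMgr *rdm, const MemoryRegion *mr,
                                 RamDiscardListener *rdl)
{
    RamDiscardMgrClass *rdmc = RAM_DISCARD_MGR_GET_CLASS(rdm); /* assumed */

    rdmc->register_listener(rdm, mr, rdl);
    /*
     * Calls rdl->notify_populate() for every plugged, block_size-aligned
     * range (see virtio_mem_rdm_replay_populated() above).
     */
    return rdmc->replay_populated(rdm, mr, rdl);
}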