From patchwork Mon Nov 1 22:08:53 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597379
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Li Zhijian
Subject: [PULL 01/20] migration/rdma: Fix out of order wrid
Date: Mon, 1 Nov 2021 23:08:53 +0100
Message-Id: <20211101220912.10039-2-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Li Zhijian

destination:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server-migration.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5902,disable-ticketing -incoming rdma:192.168.22.23:8888

qemu-system-x86_64: -spice streaming-video=filter,port=5902,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) dest_init RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
qemu_rdma_block_for_wrid_miss A Wanted wrid CONTROL SEND (2000) but got CONTROL RECV (4000)

source:
../qemu/build/qemu-system-x86_64 -enable-kvm -netdev tap,id=hn0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown -device e1000,netdev=hn0,mac=50:52:54:00:11:22 -boot c -drive if=none,file=./Fedora-rdma-server.qcow2,id=drive-virtio-disk0 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -m 2048 -smp 2 -device piix3-usb-uhci -device usb-tablet -monitor stdio -vga qxl -spice streaming-video=filter,port=5901,disable-ticketing -S

qemu-system-x86_64: -spice streaming-video=filter,port=5901,disable-ticketing: warning: short-form boolean option 'disable-ticketing' deprecated
Please use disable-ticketing=on instead
QEMU 6.0.50 monitor - type 'help' for more information
(qemu)
(qemu) trace-event qemu_rdma_block_for_wrid_miss on
(qemu) migrate -d rdma:192.168.22.23:8888
source_resolve_host RDMA Device opened: kernel name rxe_eth0 uverbs device name uverbs2, infiniband_verbs class device path /sys/class/infiniband_verbs/uverbs2, infiniband class device path /sys/class/infiniband/rxe_eth0, transport: (2) Ethernet
(qemu) qemu_rdma_block_for_wrid_miss A Wanted wrid WRITE RDMA (1) but got CONTROL RECV (4000)

NOTE: we use soft RoCE as the rdma device.
[root@iaas-rpma images]# rdma link show rxe_eth0/1
link rxe_eth0/1 state ACTIVE physical_state LINK_UP netdev eth0

This migration cannot complete when an out-of-order (OOO) CQ event occurs. The send queue and the receive queue share a single completion queue, and qemu_rdma_block_for_wrid() drops the CQ events it is not interested in. But a CQ event dropped by qemu_rdma_block_for_wrid() may be exactly the one it wants later, in which case qemu_rdma_block_for_wrid() blocks forever. OOO cases occur on both the source side and the destination side, and the permanent block happens only when SEND and RECV are out of order; OOO between 'WRITE RDMA' and 'RECV' does not matter. Below is the OOO sequence:

         source                              destination
         rdma_write_one()                    qemu_rdma_registration_handle()
  1.     S1: post_recv X                     D1: post_recv Y
  2.     wait for recv CQ event X
  3.                                         D2: post_send X  ---------------+
  4.                                         wait for send CQ event X (D2)   |
  5.     recv CQ event X reaches (D2)                                        |
  6.   +-S2: post_send Y                                                     |
  7.   | wait for send CQ event Y                                            |
  8.   | recv CQ event Y (S2) (drop it)                                      |
  9.   +-send CQ event Y reaches (S2)                                        |
 10.                                         send CQ event X reaches (D2) ---+
 11.                                         wait recv CQ event Y (dropped by (8))

Although hardware IB works fine in a hundred runs here, the IB specification does not guarantee the CQ order in such a case. Here we introduce an independent send completion queue to distinguish the ibv_post_send completion queue from the original mixed completion queue. It lets us poll the specific CQE we are really interested in.
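For illustration only (not part of the patch): a minimal libibverbs sketch of the split-CQ idea, assuming an already opened 'struct ibv_context *verbs' and eliding error handling and QP capability setup.

    /*
     * Sketch: one CQ per direction, so draining completions while
     * waiting for a SEND wrid can never consume (and silently drop)
     * a RECV completion.
     */
    struct ibv_comp_channel *send_ch = ibv_create_comp_channel(verbs);
    struct ibv_comp_channel *recv_ch = ibv_create_comp_channel(verbs);
    struct ibv_cq *send_cq = ibv_create_cq(verbs, 64, NULL, send_ch, 0);
    struct ibv_cq *recv_cq = ibv_create_cq(verbs, 64, NULL, recv_ch, 0);

    struct ibv_qp_init_attr attr = {
        .send_cq = send_cq,   /* SEND/RDMA WRITE completions land here */
        .recv_cq = recv_cq,   /* RECV completions land here */
        .qp_type = IBV_QPT_RC,
        /* .cap = ... unchanged from before */
    };
    /* rdma_create_qp(cm_id, pd, &attr); */

    /* Waiting for a SEND completion now touches only send_cq: */
    struct ibv_wc wc;
    while (ibv_poll_cq(send_cq, 1, &wc) == 0) {
        /* nothing yet: block on send_ch via ibv_get_cq_event() */
    }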
Signed-off-by: Li Zhijian
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/rdma.c | 138 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 101 insertions(+), 37 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 2a3c7889b9..f5d3bbe7e9 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -358,9 +358,11 @@ typedef struct RDMAContext {
     struct ibv_context *verbs;
     struct rdma_event_channel *channel;
     struct ibv_qp *qp; /* queue pair */
-    struct ibv_comp_channel *comp_channel; /* completion channel */
+    struct ibv_comp_channel *recv_comp_channel; /* recv completion channel */
+    struct ibv_comp_channel *send_comp_channel; /* send completion channel */
     struct ibv_pd *pd; /* protection domain */
-    struct ibv_cq *cq; /* completion queue */
+    struct ibv_cq *recv_cq; /* receive completion queue */
+    struct ibv_cq *send_cq; /* send completion queue */
 
     /*
      * If a previous write failed (perhaps because of a failed
@@ -1059,21 +1061,34 @@ static int qemu_rdma_alloc_pd_cq(RDMAContext *rdma)
         return -1;
     }
 
-    /* create completion channel */
-    rdma->comp_channel = ibv_create_comp_channel(rdma->verbs);
-    if (!rdma->comp_channel) {
-        error_report("failed to allocate completion channel");
+    /* create receive completion channel */
+    rdma->recv_comp_channel = ibv_create_comp_channel(rdma->verbs);
+    if (!rdma->recv_comp_channel) {
+        error_report("failed to allocate receive completion channel");
         goto err_alloc_pd_cq;
     }
 
     /*
-     * Completion queue can be filled by both read and write work requests,
-     * so must reflect the sum of both possible queue sizes.
+     * Completion queue can be filled by read work requests.
      */
-    rdma->cq = ibv_create_cq(rdma->verbs, (RDMA_SIGNALED_SEND_MAX * 3),
-                             NULL, rdma->comp_channel, 0);
-    if (!rdma->cq) {
-        error_report("failed to allocate completion queue");
+    rdma->recv_cq = ibv_create_cq(rdma->verbs, (RDMA_SIGNALED_SEND_MAX * 3),
+                                  NULL, rdma->recv_comp_channel, 0);
+    if (!rdma->recv_cq) {
+        error_report("failed to allocate receive completion queue");
+        goto err_alloc_pd_cq;
+    }
+
+    /* create send completion channel */
+    rdma->send_comp_channel = ibv_create_comp_channel(rdma->verbs);
+    if (!rdma->send_comp_channel) {
+        error_report("failed to allocate send completion channel");
+        goto err_alloc_pd_cq;
+    }
+
+    rdma->send_cq = ibv_create_cq(rdma->verbs, (RDMA_SIGNALED_SEND_MAX * 3),
+                                  NULL, rdma->send_comp_channel, 0);
+    if (!rdma->send_cq) {
+        error_report("failed to allocate send completion queue");
         goto err_alloc_pd_cq;
     }
 
@@ -1083,11 +1098,19 @@ err_alloc_pd_cq:
     if (rdma->pd) {
         ibv_dealloc_pd(rdma->pd);
     }
-    if (rdma->comp_channel) {
-        ibv_destroy_comp_channel(rdma->comp_channel);
+    if (rdma->recv_comp_channel) {
+        ibv_destroy_comp_channel(rdma->recv_comp_channel);
+    }
+    if (rdma->send_comp_channel) {
+        ibv_destroy_comp_channel(rdma->send_comp_channel);
+    }
+    if (rdma->recv_cq) {
+        ibv_destroy_cq(rdma->recv_cq);
+        rdma->recv_cq = NULL;
     }
     rdma->pd = NULL;
-    rdma->comp_channel = NULL;
+    rdma->recv_comp_channel = NULL;
+    rdma->send_comp_channel = NULL;
     return -1;
 }
 
@@ -1104,8 +1127,8 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
     attr.cap.max_recv_wr = 3;
     attr.cap.max_send_sge = 1;
     attr.cap.max_recv_sge = 1;
-    attr.send_cq = rdma->cq;
-    attr.recv_cq = rdma->cq;
+    attr.send_cq = rdma->send_cq;
+    attr.recv_cq = rdma->recv_cq;
     attr.qp_type = IBV_QPT_RC;
 
     ret = rdma_create_qp(rdma->cm_id, rdma->pd, &attr);
@@ -1496,14 +1519,14 @@ static void qemu_rdma_signal_unregister(RDMAContext *rdma, uint64_t index,
  * (of any kind) has completed.
  * Return the work request ID that completed.
  */
-static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
-                               uint32_t *byte_len)
+static uint64_t qemu_rdma_poll(RDMAContext *rdma, struct ibv_cq *cq,
+                               uint64_t *wr_id_out, uint32_t *byte_len)
 {
     int ret;
     struct ibv_wc wc;
     uint64_t wr_id;
 
-    ret = ibv_poll_cq(rdma->cq, 1, &wc);
+    ret = ibv_poll_cq(cq, 1, &wc);
 
     if (!ret) {
         *wr_id_out = RDMA_WRID_NONE;
@@ -1575,7 +1598,8 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
 
 /* Wait for activity on the completion channel.
  * Returns 0 on success, none-0 on error.
  */
-static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma,
+                                       struct ibv_comp_channel *comp_channel)
 {
     struct rdma_cm_event *cm_event;
     int ret = -1;
 
@@ -1586,7 +1610,7 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      */
     if (rdma->migration_started_on_destination &&
         migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
-        yield_until_fd_readable(rdma->comp_channel->fd);
+        yield_until_fd_readable(comp_channel->fd);
     } else {
         /* This is the source side, we're in a separate thread
          * or destination prior to migration_fd_process_incoming()
@@ -1597,7 +1621,7 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
          */
         while (!rdma->error_state && !rdma->received_error) {
             GPollFD pfds[2];
-            pfds[0].fd = rdma->comp_channel->fd;
+            pfds[0].fd = comp_channel->fd;
             pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
             pfds[0].revents = 0;
 
@@ -1655,6 +1679,17 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
     return rdma->error_state;
 }
 
+static struct ibv_comp_channel *to_channel(RDMAContext *rdma, int wrid)
+{
+    return wrid < RDMA_WRID_RECV_CONTROL ? rdma->send_comp_channel :
+           rdma->recv_comp_channel;
+}
+
+static struct ibv_cq *to_cq(RDMAContext *rdma, int wrid)
+{
+    return wrid < RDMA_WRID_RECV_CONTROL ? rdma->send_cq : rdma->recv_cq;
+}
+
 /*
  * Block until the next work request has completed.
  *
@@ -1675,13 +1710,15 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     struct ibv_cq *cq;
     void *cq_ctx;
     uint64_t wr_id = RDMA_WRID_NONE, wr_id_in;
+    struct ibv_comp_channel *ch = to_channel(rdma, wrid_requested);
+    struct ibv_cq *poll_cq = to_cq(rdma, wrid_requested);
 
-    if (ibv_req_notify_cq(rdma->cq, 0)) {
+    if (ibv_req_notify_cq(poll_cq, 0)) {
         return -1;
     }
     /* poll cq first */
     while (wr_id != wrid_requested) {
-        ret = qemu_rdma_poll(rdma, &wr_id_in, byte_len);
+        ret = qemu_rdma_poll(rdma, poll_cq, &wr_id_in, byte_len);
         if (ret < 0) {
             return ret;
         }
@@ -1702,12 +1739,12 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     }
 
     while (1) {
-        ret = qemu_rdma_wait_comp_channel(rdma);
+        ret = qemu_rdma_wait_comp_channel(rdma, ch);
         if (ret) {
             goto err_block_for_wrid;
         }
 
-        ret = ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx);
+        ret = ibv_get_cq_event(ch, &cq, &cq_ctx);
         if (ret) {
             perror("ibv_get_cq_event");
             goto err_block_for_wrid;
@@ -1721,7 +1758,7 @@ static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
         }
 
         while (wr_id != wrid_requested) {
-            ret = qemu_rdma_poll(rdma, &wr_id_in, byte_len);
+            ret = qemu_rdma_poll(rdma, poll_cq, &wr_id_in, byte_len);
             if (ret < 0) {
                 goto err_block_for_wrid;
             }
@@ -2437,13 +2474,21 @@ static void qemu_rdma_cleanup(RDMAContext *rdma)
         rdma_destroy_qp(rdma->cm_id);
         rdma->qp = NULL;
     }
-    if (rdma->cq) {
-        ibv_destroy_cq(rdma->cq);
-        rdma->cq = NULL;
+    if (rdma->recv_cq) {
+        ibv_destroy_cq(rdma->recv_cq);
+        rdma->recv_cq = NULL;
     }
-    if (rdma->comp_channel) {
-        ibv_destroy_comp_channel(rdma->comp_channel);
-        rdma->comp_channel = NULL;
+    if (rdma->send_cq) {
+        ibv_destroy_cq(rdma->send_cq);
+        rdma->send_cq = NULL;
+    }
+    if (rdma->recv_comp_channel) {
+        ibv_destroy_comp_channel(rdma->recv_comp_channel);
+        rdma->recv_comp_channel = NULL;
+    }
+    if (rdma->send_comp_channel) {
+        ibv_destroy_comp_channel(rdma->send_comp_channel);
+        rdma->send_comp_channel = NULL;
     }
     if (rdma->pd) {
         ibv_dealloc_pd(rdma->pd);
@@ -3115,10 +3160,14 @@ static void qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
 {
     QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
     if (io_read) {
-        aio_set_fd_handler(ctx, rioc->rdmain->comp_channel->fd,
+        aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+        aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd,
                            false, io_read, io_write, NULL, opaque);
     } else {
-        aio_set_fd_handler(ctx, rioc->rdmaout->comp_channel->fd,
+        aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd,
+                           false, io_read, io_write, NULL, opaque);
+        aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd,
                            false, io_read, io_write, NULL, opaque);
     }
 }
@@ -3332,7 +3381,22 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
      */
     while (1) {
         uint64_t wr_id, wr_id_in;
-        int ret = qemu_rdma_poll(rdma, &wr_id_in, NULL);
+        int ret = qemu_rdma_poll(rdma, rdma->recv_cq, &wr_id_in, NULL);
+        if (ret < 0) {
+            error_report("rdma migration: polling error! %d", ret);
+            goto err;
+        }
+
+        wr_id = wr_id_in & RDMA_WRID_TYPE_MASK;
+
+        if (wr_id == RDMA_WRID_NONE) {
+            break;
+        }
+    }
+
+    while (1) {
+        uint64_t wr_id, wr_id_in;
+        int ret = qemu_rdma_poll(rdma, rdma->send_cq, &wr_id_in, NULL);
         if (ret < 0) {
             error_report("rdma migration: polling error! %d", ret);
             goto err;

From patchwork Mon Nov 1 22:08:54 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597381
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Hyman Huang(黄勇)
Subject: [PULL 02/20] KVM: introduce dirty_pages and kvm_dirty_ring_enabled
Date: Mon, 1 Nov 2021 23:08:54 +0100
Message-Id: <20211101220912.10039-3-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Hyman Huang(黄勇)

dirty_pages is used to calculate the dirty rate via the dirty ring: when the ring is enabled, kvm-reaper increases dirty_pages after GFNs are dirtied. kvm_dirty_ring_enabled shows whether kvm-reaper is working, so the dirtyrate thread can use it to check whether the measurement can be based on the dirty ring feature.
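For illustration only (not part of this patch), a sketch of how a consumer might turn the new counter into a per-vcpu rate; the helper name is hypothetical and the waiting step is elided.

    /*
     * Hypothetical consumer (sketch): sample dirty_pages twice and
     * convert the delta into MB/s for one vcpu.  Assumes the caller
     * waits 'seconds' between the two reads while kvm-reaper keeps
     * reaping the ring.
     */
    static int64_t vcpu_dirty_rate_mb(CPUState *cpu, int64_t seconds)
    {
        uint64_t before, after;

        if (!kvm_dirty_ring_enabled()) {
            return -1; /* caller should fall back to page sampling */
        }
        before = cpu->dirty_pages;
        /* ... wait 'seconds' ... */
        after = cpu->dirty_pages;
        return ((after - before) * TARGET_PAGE_SIZE >> 20) / seconds;
    }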
Signed-off-by: Hyman Huang(黄勇)
Message-Id:
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 include/hw/core/cpu.h  | 1 +
 include/sysemu/kvm.h   | 1 +
 accel/kvm/kvm-all.c    | 7 +++++++
 accel/stubs/kvm-stub.c | 5 +++++
 4 files changed, 14 insertions(+)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 1a10497af3..e948e81f1a 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -381,6 +381,7 @@ struct CPUState {
     struct kvm_run *kvm_run;
     struct kvm_dirty_gfn *kvm_dirty_gfns;
     uint32_t kvm_fetch_index;
+    uint64_t dirty_pages;
 
     /* Used for events with 'vcpu' and *without* the 'disabled' properties */
     DECLARE_BITMAP(trace_dstate_delayed, CPU_TRACE_DSTATE_MAX_EVENTS);
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index a1ab1ee12d..7b22aeb6ae 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -547,4 +547,5 @@ bool kvm_cpu_check_are_resettable(void);
 
 bool kvm_arch_cpu_check_are_resettable(void);
 
+bool kvm_dirty_ring_enabled(void);
 #endif
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index db8d83b137..eecd8031cf 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -469,6 +469,7 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
     cpu->kvm_fd = ret;
     cpu->kvm_state = s;
     cpu->vcpu_dirty = true;
+    cpu->dirty_pages = 0;
 
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
@@ -743,6 +744,7 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, CPUState *cpu)
         count++;
     }
     cpu->kvm_fetch_index = fetch;
+    cpu->dirty_pages += count;
 
     return count;
 }
@@ -2296,6 +2298,11 @@ bool kvm_vcpu_id_is_valid(int vcpu_id)
     return vcpu_id >= 0 && vcpu_id < kvm_max_vcpu_id(s);
 }
 
+bool kvm_dirty_ring_enabled(void)
+{
+    return kvm_state->kvm_dirty_ring_size ? true : false;
+}
+
 static int kvm_init(MachineState *ms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index 5b1d00a222..5319573e00 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -147,4 +147,9 @@ bool kvm_arm_supports_user_irq(void)
 {
     return false;
 }
+
+bool kvm_dirty_ring_enabled(void)
+{
+    return false;
+}
 #endif

From patchwork Mon Nov 1 22:08:55 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597383
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Hyman Huang(黄勇)
Subject: [PULL 03/20] memory: make global_dirty_tracking a bitmask
Date: Mon, 1 Nov 2021 23:08:55 +0100
Message-Id: <20211101220912.10039-4-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Hyman Huang(黄勇)

Since the dirty ring has been introduced, there are two methods to track the dirty pages of a VM. "Logging" hints at the method, so renaming global_dirty_log to global_dirty_tracking makes the description more accurate.

Dirty rate measurement may start or stop dirty tracking during its calculation. This conflicts with migration: stopping dirty tracking while migration is running makes migration miss dirty pages, which is a problem. Making global_dirty_tracking a bitmask lets both migration and dirty rate measurement work fine at the same time. Introduce GLOBAL_DIRTY_MIGRATION and GLOBAL_DIRTY_DIRTY_RATE to distinguish what the current dirty tracking aims for: migration or dirty rate.
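For illustration only (not part of the diff), the intended bitmask semantics: both users can be active at once, and stopping one must not disturb the other.

    memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);  /* mask: 0x1 */
    memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE); /* mask: 0x3 */
    memory_global_dirty_log_stop(GLOBAL_DIRTY_DIRTY_RATE);  /* mask: 0x1 */
    /* migration keeps logging */
    assert(global_dirty_tracking == GLOBAL_DIRTY_MIGRATION);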
Signed-off-by: Hyman Huang(黄勇)
Message-Id: <9c9388657cfa0301bd2c1cfa36e7cf6da4aeca19.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 include/exec/memory.h   | 20 +++++++++++++++++---
 include/exec/ram_addr.h |  4 ++--
 hw/i386/xen/xen-hvm.c   |  4 ++--
 migration/ram.c         | 15 +++++++++++----
 softmmu/memory.c        | 32 +++++++++++++++++++++-----------
 softmmu/trace-events    |  1 +
 6 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index a185b6dcb8..04280450c9 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -61,7 +61,17 @@ static inline void fuzz_dma_read_cb(size_t addr,
 }
 #endif
 
-extern bool global_dirty_log;
+/* Possible bits for global_dirty_log_{start|stop} */
+
+/* Dirty tracking enabled because migration is running */
+#define GLOBAL_DIRTY_MIGRATION (1U << 0)
+
+/* Dirty tracking enabled because measuring dirty rate */
+#define GLOBAL_DIRTY_DIRTY_RATE (1U << 1)
+
+#define GLOBAL_DIRTY_MASK  (0x3)
+
+extern unsigned int global_dirty_tracking;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
 
@@ -2388,13 +2398,17 @@ void memory_listener_unregister(MemoryListener *listener);
 
 /**
  * memory_global_dirty_log_start: begin dirty logging for all regions
+ *
+ * @flags: purpose of starting dirty log, migration or dirty rate
  */
-void memory_global_dirty_log_start(void);
+void memory_global_dirty_log_start(unsigned int flags);
 
 /**
  * memory_global_dirty_log_stop: end dirty logging for all regions
+ *
+ * @flags: purpose of stopping dirty log, migration or dirty rate
  */
-void memory_global_dirty_log_stop(void);
+void memory_global_dirty_log_stop(unsigned int flags);
 
 void mtree_info(bool flatview, bool dispatch_tree, bool owner, bool disabled);
 
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 551876bed0..45c913264a 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -369,7 +369,7 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
 
                 qatomic_or(&blocks[DIRTY_MEMORY_VGA][idx][offset], temp);
 
-                if (global_dirty_log) {
+                if (global_dirty_tracking) {
                     qatomic_or(
                             &blocks[DIRTY_MEMORY_MIGRATION][idx][offset],
                             temp);
@@ -392,7 +392,7 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap,
     } else {
        uint8_t clients = tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE;
 
-        if (!global_dirty_log) {
+        if (!global_dirty_tracking) {
             clients &= ~(1 << DIRTY_MEMORY_MIGRATION);
         }
 
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index e3d3d5cf89..482be95415 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -1613,8 +1613,8 @@ void xen_hvm_modified_memory(ram_addr_t start, ram_addr_t length)
 void qmp_xen_set_global_dirty_log(bool enable, Error **errp)
 {
     if (enable) {
-        memory_global_dirty_log_start();
+        memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
     } else {
-        memory_global_dirty_log_stop();
+        memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
     }
 }
 
diff --git a/migration/ram.c b/migration/ram.c
index bb908822d5..ae2601bf3b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2216,7 +2216,14 @@ static void ram_save_cleanup(void *opaque)
     /* caller have hold iothread lock or is in a bh, so there is
      * no writing race against the migration bitmap
      */
-    memory_global_dirty_log_stop();
+    if (global_dirty_tracking & GLOBAL_DIRTY_MIGRATION) {
+        /*
+         * do not stop dirty log without starting it, since
+         * memory_global_dirty_log_stop will assert that
+         * memory_global_dirty_log_start/stop used in pairs
+         */
+        memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
+    }
 
     RAMBLOCK_FOREACH_NOT_IGNORED(block) {
@@ -2678,7 +2685,7 @@ static void ram_init_bitmaps(RAMState *rs)
     ram_list_init_bitmaps();
     /* We don't use dirty log with background snapshots */
     if (!migrate_background_snapshot()) {
-        memory_global_dirty_log_start();
+        memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
         migration_bitmap_sync_precopy(rs);
     }
 }
@@ -3434,7 +3441,7 @@ void colo_incoming_start_dirty_log(void)
             /* Discard this dirty bitmap record */
             bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
         }
-        memory_global_dirty_log_start();
+        memory_global_dirty_log_start(GLOBAL_DIRTY_MIGRATION);
     }
     ram_state->migration_dirty_pages = 0;
     qemu_mutex_unlock_ramlist();
@@ -3446,7 +3453,7 @@ void colo_release_ram_cache(void)
 {
     RAMBlock *block;
 
-    memory_global_dirty_log_stop();
+    memory_global_dirty_log_stop(GLOBAL_DIRTY_MIGRATION);
     RAMBLOCK_FOREACH_NOT_IGNORED(block) {
         g_free(block->bmap);
         block->bmap = NULL;
diff --git a/softmmu/memory.c b/softmmu/memory.c
index e5826faa0c..f2ac0d2e89 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -39,7 +39,7 @@ static unsigned memory_region_transaction_depth;
 static bool memory_region_update_pending;
 static bool ioeventfd_update_pending;
-bool global_dirty_log;
+unsigned int global_dirty_tracking;
 
 static QTAILQ_HEAD(, MemoryListener) memory_listeners
     = QTAILQ_HEAD_INITIALIZER(memory_listeners);
@@ -1821,7 +1821,7 @@ uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
     uint8_t mask = mr->dirty_log_mask;
     RAMBlock *rb = mr->ram_block;
 
-    if (global_dirty_log && ((rb && qemu_ram_is_migratable(rb)) ||
+    if (global_dirty_tracking && ((rb && qemu_ram_is_migratable(rb)) ||
                              memory_region_is_iommu(mr))) {
         mask |= (1 << DIRTY_MEMORY_MIGRATION);
     }
@@ -2760,14 +2760,18 @@ void memory_global_after_dirty_log_sync(void)
 
 static VMChangeStateEntry *vmstate_change;
 
-void memory_global_dirty_log_start(void)
+void memory_global_dirty_log_start(unsigned int flags)
 {
     if (vmstate_change) {
         qemu_del_vm_change_state_handler(vmstate_change);
         vmstate_change = NULL;
     }
 
-    global_dirty_log = true;
+    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
+    assert(!(global_dirty_tracking & flags));
+    global_dirty_tracking |= flags;
+
+    trace_global_dirty_changed(global_dirty_tracking);
 
     MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward);
 
@@ -2777,9 +2781,13 @@ void memory_global_dirty_log_start(void)
     memory_region_transaction_commit();
 }
 
-static void memory_global_dirty_log_do_stop(void)
+static void memory_global_dirty_log_do_stop(unsigned int flags)
 {
-    global_dirty_log = false;
+    assert(flags && !(flags & (~GLOBAL_DIRTY_MASK)));
+    assert((global_dirty_tracking & flags) == flags);
+    global_dirty_tracking &= ~flags;
+
+    trace_global_dirty_changed(global_dirty_tracking);
 
     /* Refresh DIRTY_MEMORY_MIGRATION bit. */
     memory_region_transaction_begin();
@@ -2792,8 +2800,9 @@ static void memory_global_dirty_log_do_stop(void)
 static void memory_vm_change_state_handler(void *opaque, bool running,
                                            RunState state)
 {
+    unsigned int flags = (unsigned int)(uintptr_t)opaque;
     if (running) {
-        memory_global_dirty_log_do_stop();
+        memory_global_dirty_log_do_stop(flags);
 
         if (vmstate_change) {
             qemu_del_vm_change_state_handler(vmstate_change);
@@ -2802,18 +2811,19 @@ static void memory_vm_change_state_handler(void *opaque, bool running,
     }
 }
 
-void memory_global_dirty_log_stop(void)
+void memory_global_dirty_log_stop(unsigned int flags)
 {
     if (!runstate_is_running()) {
         if (vmstate_change) {
             return;
         }
         vmstate_change = qemu_add_vm_change_state_handler(
-                                memory_vm_change_state_handler, NULL);
+                                memory_vm_change_state_handler,
+                                (void *)(uintptr_t)flags);
         return;
     }
 
-    memory_global_dirty_log_do_stop();
+    memory_global_dirty_log_do_stop(flags);
 }
 
 static void listener_add_address_space(MemoryListener *listener,
@@ -2825,7 +2835,7 @@ static void listener_add_address_space(MemoryListener *listener,
     if (listener->begin) {
         listener->begin(listener);
     }
-    if (global_dirty_log) {
+    if (global_dirty_tracking) {
         if (listener->log_global_start) {
             listener->log_global_start(listener);
         }
diff --git a/softmmu/trace-events b/softmmu/trace-events
index bf1469990e..9c88887b3c 100644
--- a/softmmu/trace-events
+++ b/softmmu/trace-events
@@ -19,6 +19,7 @@ memory_region_sync_dirty(const char *mr, const char *listener, int global) "mr '
 flatview_new(void *view, void *root) "%p (root %p)"
 flatview_destroy(void *view, void *root) "%p (root %p)"
 flatview_destroy_rcu(void *view, void *root) "%p (root %p)"
+global_dirty_changed(unsigned int bitmask) "bitmask 0x%"PRIx32
 
 # softmmu.c
 vm_stop_flush_all(int ret) "ret %d"

From patchwork Mon Nov 1 22:08:56 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597385
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Hyman Huang(黄勇)
Subject: [PULL 04/20] migration/dirtyrate: introduce struct and adjust DirtyRateStat
Date: Mon, 1 Nov 2021 23:08:56 +0100
Message-Id: <20211101220912.10039-5-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Hyman Huang(黄勇)

Introduce "DirtyRateMeasureMode" to specify which method should be used to calculate the dirty rate, and "DirtyRateVcpu" to store the dirty rate of each vcpu. Use a union to store the statistics of the specific mode.
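For illustration only (not part of the patch), a sketch of a reader of the new layout: since only the union member matching the configured mode is initialized, readers must dispatch on the mode first. The 'mode' parameter and the function itself are assumptions of the sketch, and QAPI 'int' members are assumed to map to int64_t in generated C.

    /* Sketch: print DirtyRateStat according to the mode that
     * initialized it.  Assumes <stdio.h> and <inttypes.h>. */
    static void dump_stat(DirtyRateMeasureMode mode,
                          const struct DirtyRateStat *s)
    {
        if (mode == DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING) {
            printf("sampled %" PRIu64 " pages\n",
                   s->page_sampling.total_sample_count);
        } else { /* DIRTY_RATE_MEASURE_MODE_DIRTY_RING */
            for (int i = 0; i < s->dirty_ring.nvcpu; i++) {
                printf("vcpu %" PRId64 ": %" PRId64 " MB/s\n",
                       s->dirty_ring.rates[i].id,
                       s->dirty_ring.rates[i].dirty_rate);
            }
        }
    }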
Signed-off-by: Hyman Huang(黄勇)
Message-Id: <661c98c40f40e163aa58334337af8f3ddf41316a.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 qapi/migration.json   | 30 +++++++++++++++++++++++++++
 migration/dirtyrate.h | 21 +++++++++++++++----
 migration/dirtyrate.c | 48 +++++++++++++++++++++++++------------------
 3 files changed, 75 insertions(+), 24 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 9aa8bc5759..94eece16e1 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1731,6 +1731,21 @@
 { 'event': 'UNPLUG_PRIMARY',
   'data': { 'device-id': 'str' } }
 
+##
+# @DirtyRateVcpu:
+#
+# Dirty rate of vcpu.
+#
+# @id: vcpu index.
+#
+# @dirty-rate: dirty rate.
+#
+# Since: 6.1
+#
+##
+{ 'struct': 'DirtyRateVcpu',
+  'data': { 'id': 'int', 'dirty-rate': 'int64' } }
+
 ##
 # @DirtyRateStatus:
 #
@@ -1748,6 +1763,21 @@
 { 'enum': 'DirtyRateStatus',
   'data': [ 'unstarted', 'measuring', 'measured'] }
 
+##
+# @DirtyRateMeasureMode:
+#
+# An enumeration of mode of measuring dirtyrate.
+#
+# @page-sampling: calculate dirtyrate by sampling pages.
+#
+# @dirty-ring: calculate dirtyrate via dirty ring.
+#
+# Since: 6.1
+#
+##
+{ 'enum': 'DirtyRateMeasureMode',
+  'data': ['page-sampling', 'dirty-ring'] }
+
 ##
 # @DirtyRateInfo:
 #
diff --git a/migration/dirtyrate.h b/migration/dirtyrate.h
index e1fd29089e..69d4c5b865 100644
--- a/migration/dirtyrate.h
+++ b/migration/dirtyrate.h
@@ -43,6 +43,7 @@
 struct DirtyRateConfig {
     uint64_t sample_pages_per_gigabytes; /* sample pages per GB */
     int64_t sample_period_seconds; /* time duration between two sampling */
+    DirtyRateMeasureMode mode; /* mode of dirtyrate measurement */
 };
 
 /*
@@ -58,17 +59,29 @@ struct RamblockDirtyInfo {
     uint32_t *hash_result; /* array of hash result for sampled pages */
 };
 
-/*
- * Store calculation statistics for each measure.
- */
-struct DirtyRateStat {
+typedef struct SampleVMStat {
     uint64_t total_dirty_samples; /* total dirty sampled page */
     uint64_t total_sample_count; /* total sampled pages */
     uint64_t total_block_mem_MB; /* size of total sampled pages in MB */
+} SampleVMStat;
+
+typedef struct VcpuStat {
+    int nvcpu; /* number of vcpu */
+    DirtyRateVcpu *rates; /* array of dirty rate for each vcpu */
+} VcpuStat;
+
+/*
+ * Store calculation statistics for each measure.
+ */
+struct DirtyRateStat {
     int64_t dirty_rate; /* dirty rate in MB/s */
     int64_t start_time; /* calculation start time in units of second */
     int64_t calc_time; /* time duration of two sampling in units of second */
     uint64_t sample_pages; /* sample pages per GB */
+    union {
+        SampleVMStat page_sampling;
+        VcpuStat dirty_ring;
+    };
 };
 
 void *get_dirtyrate_thread(void *arg);
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index 320c56ba2c..e0a27a992c 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -88,33 +88,44 @@ static struct DirtyRateInfo *query_dirty_rate_info(void)
     return info;
 }
 
-static void init_dirtyrate_stat(int64_t start_time, int64_t calc_time,
-                                uint64_t sample_pages)
+static void init_dirtyrate_stat(int64_t start_time,
+                                struct DirtyRateConfig config)
 {
-    DirtyStat.total_dirty_samples = 0;
-    DirtyStat.total_sample_count = 0;
-    DirtyStat.total_block_mem_MB = 0;
     DirtyStat.dirty_rate = -1;
     DirtyStat.start_time = start_time;
-    DirtyStat.calc_time = calc_time;
-    DirtyStat.sample_pages = sample_pages;
+    DirtyStat.calc_time = config.sample_period_seconds;
+    DirtyStat.sample_pages = config.sample_pages_per_gigabytes;
+
+    switch (config.mode) {
+    case DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING:
+        DirtyStat.page_sampling.total_dirty_samples = 0;
+        DirtyStat.page_sampling.total_sample_count = 0;
+        DirtyStat.page_sampling.total_block_mem_MB = 0;
+        break;
+    case DIRTY_RATE_MEASURE_MODE_DIRTY_RING:
+        DirtyStat.dirty_ring.nvcpu = -1;
+        DirtyStat.dirty_ring.rates = NULL;
+        break;
+    default:
+        break;
+    }
 }
 
 static void update_dirtyrate_stat(struct RamblockDirtyInfo *info)
 {
-    DirtyStat.total_dirty_samples += info->sample_dirty_count;
-    DirtyStat.total_sample_count += info->sample_pages_count;
+    DirtyStat.page_sampling.total_dirty_samples += info->sample_dirty_count;
+    DirtyStat.page_sampling.total_sample_count += info->sample_pages_count;
 
     /* size of total pages in MB */
-    DirtyStat.total_block_mem_MB += (info->ramblock_pages *
-                                     TARGET_PAGE_SIZE) >> 20;
+    DirtyStat.page_sampling.total_block_mem_MB += (info->ramblock_pages *
+                                                   TARGET_PAGE_SIZE) >> 20;
 }
 
 static void update_dirtyrate(uint64_t msec)
 {
     uint64_t dirtyrate;
-    uint64_t total_dirty_samples = DirtyStat.total_dirty_samples;
-    uint64_t total_sample_count = DirtyStat.total_sample_count;
-    uint64_t total_block_mem_MB = DirtyStat.total_block_mem_MB;
+    uint64_t total_dirty_samples = DirtyStat.page_sampling.total_dirty_samples;
+    uint64_t total_sample_count = DirtyStat.page_sampling.total_sample_count;
+    uint64_t total_block_mem_MB = DirtyStat.page_sampling.total_block_mem_MB;
 
     dirtyrate = total_dirty_samples * total_block_mem_MB *
                 1000 / (total_sample_count * msec);
@@ -327,7 +338,7 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
         update_dirtyrate_stat(block_dinfo);
     }
 
-    if (DirtyStat.total_sample_count == 0) {
+    if (DirtyStat.page_sampling.total_sample_count == 0) {
         return false;
     }
 
@@ -372,8 +383,6 @@ void *get_dirtyrate_thread(void *arg)
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
     int64_t start_time;
-    int64_t calc_time;
-    uint64_t sample_pages;
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
                               DIRTY_RATE_STATUS_MEASURING);
@@ -383,9 +392,7 @@ void *get_dirtyrate_thread(void *arg)
     }
 
     start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
-    calc_time = config.sample_period_seconds;
-    sample_pages = config.sample_pages_per_gigabytes;
-    init_dirtyrate_stat(start_time, calc_time, sample_pages);
+    init_dirtyrate_stat(start_time, config);
 
     calculate_dirtyrate(config);
 
@@ -442,6 +449,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
 
     config.sample_period_seconds = calc_time;
     config.sample_pages_per_gigabytes = sample_pages;
+    config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
     qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
                        (void *)&config, QEMU_THREAD_DETACHED);
 }

From patchwork Mon Nov 1 22:08:57 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597387
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Hyman Huang(黄勇)
Subject: [PULL 05/20] migration/dirtyrate: adjust order of registering thread
Date: Mon, 1 Nov 2021 23:08:57 +0100
Message-Id: <20211101220912.10039-6-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Hyman Huang(黄勇)

Register the get_dirtyrate thread with RCU up front so that both the page-sampling and the dirty-ring mode are covered.

Signed-off-by: Hyman Huang(黄勇)
Message-Id:
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/dirtyrate.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index e0a27a992c..a9bdd60034 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -352,7 +352,6 @@ static void calculate_dirtyrate(struct DirtyRateConfig config)
     int64_t msec = 0;
     int64_t initial_time;
 
-    rcu_register_thread();
     rcu_read_lock();
     initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
     if (!record_ramblock_hash_info(&block_dinfo, config, &block_count)) {
@@ -375,7 +374,6 @@ static void calculate_dirtyrate(struct DirtyRateConfig config)
 out:
     rcu_read_unlock();
     free_ramblock_dirty_info(block_dinfo, block_count);
-    rcu_unregister_thread();
 }
 
 void *get_dirtyrate_thread(void *arg)
@@ -383,6 +381,7 @@ void *get_dirtyrate_thread(void *arg)
 {
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
     int64_t start_time;
+    rcu_register_thread();
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
                               DIRTY_RATE_STATUS_MEASURING);
@@ -401,6 +400,8 @@ void *get_dirtyrate_thread(void *arg)
     if (ret == -1) {
         error_report("change dirtyrate state failed.");
     }
+
+    rcu_unregister_thread();
     return NULL;
 }
From patchwork Mon Nov 1 22:08:58 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597389
From: Juan Quintela
To: qemu-devel@nongnu.org
Cc: Markus Armbruster, David Hildenbrand, Eduardo Habkost,
    xen-devel@lists.xenproject.org, Richard Henderson, Stefano Stabellini,
    Marcel Apfelbaum, Eric Blake, Philippe Mathieu-Daudé, kvm@vger.kernel.org,
    Peter Xu, Marc-André Lureau, Paul Durrant, Paolo Bonzini,
    "Dr. David Alan Gilbert", Juan Quintela, "Michael S. Tsirkin",
    Anthony Perard, Hyman Huang(黄勇)
Subject: [PULL 06/20] migration/dirtyrate: move init step of calculation to main thread
Date: Mon, 1 Nov 2021 23:08:58 +0100
Message-Id: <20211101220912.10039-7-quintela@redhat.com>
In-Reply-To: <20211101220912.10039-1-quintela@redhat.com>
References: <20211101220912.10039-1-quintela@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

From: Hyman Huang(黄勇)

Since the main thread may "query dirty rate" at any time, it is better to move the init step into the main thread so that the synchronization overhead between "main" and "get_dirtyrate" can be reduced.
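For illustration only (the diff below is authoritative), the resulting pattern: everything the main thread may later serve to a query is initialized before the detached worker is spawned, so no extra handshake is needed.

    /* Sketch of the post-patch flow in qmp_calc_dirty_rate(): */
    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
    init_dirtyrate_stat(start_time, config);           /* main thread */
    qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
                       (void *)&config, QEMU_THREAD_DETACHED);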
Signed-off-by: Hyman Huang(黄勇)
Message-Id: <109f8077518ed2f13068e3bfb10e625e964780f1.1624040308.git.huangy81@chinatelecom.cn>
Reviewed-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/dirtyrate.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index a9bdd60034..b8f61cc650 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -380,7 +380,6 @@ void *get_dirtyrate_thread(void *arg)
 {
     struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg;
     int ret;
-    int64_t start_time;
     rcu_register_thread();
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_UNSTARTED,
@@ -390,9 +389,6 @@ void *get_dirtyrate_thread(void *arg)
         return NULL;
     }
 
-    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
-    init_dirtyrate_stat(start_time, config);
-
     calculate_dirtyrate(config);
 
     ret = dirtyrate_set_state(&CalculatingState, DIRTY_RATE_STATUS_MEASURING,
@@ -411,6 +407,7 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
     static struct DirtyRateConfig config;
     QemuThread thread;
     int ret;
+    int64_t start_time;
 
     /*
      * If the dirty rate is already being measured, don't attempt to start.
@@ -451,6 +448,10 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages,
     config.sample_period_seconds = calc_time;
     config.sample_pages_per_gigabytes = sample_pages;
     config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING;
+
+    start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000;
+    init_dirtyrate_stat(start_time, config);
+
     qemu_thread_create(&thread, "get_dirtyrate", get_dirtyrate_thread,
                        (void *)&config, QEMU_THREAD_DETACHED);
 }

From patchwork Mon Nov 1 22:08:59 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597393
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 07/20] migration/dirtyrate: implement dirty-ring dirtyrate calculation
Date: Mon, 1 Nov 2021 23:08:59 +0100
Message-Id: <20211101220912.10039-8-quintela@redhat.com>

From: Hyman Huang(黄勇)

Use the dirty ring feature to implement dirty rate calculation.

Introduce a "mode" option in the QMP command calc-dirty-rate to specify
which method should be used when calculating the dirty rate; either
"page-sampling" or "dirty-ring" can be passed.

Introduce a "dirty_ring:-r" option in the HMP command calc_dirty_rate to
indicate that the dirty ring method should be used for the calculation.
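The per-vCPU arithmetic that the new mode performs (do_calculate_dirtyrate_vcpu() in the diff below) reduces to: pages dirtied during the measurement window, times the page size, shifted down to MiB, divided by the window length in seconds. A standalone sketch of that formula, assuming 4 KiB target pages:

#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096          /* assumption: 4 KiB target pages */

/* Dirtied bytes over the measurement window, expressed in MB/s. */
static uint64_t vcpu_dirty_rate(uint64_t start_pages, uint64_t end_pages,
                                int64_t calc_time_s)
{
    uint64_t increased_pages = end_pages - start_pages;

    return (increased_pages * PAGE_SIZE >> 20) / calc_time_s;
}

int main(void)
{
    /* A vCPU that dirtied 262144 pages (1 GiB) in a 2 s window: */
    printf("%llu MB/s\n",
           (unsigned long long)vcpu_dirty_rate(0, 262144, 2));
    return 0;                   /* prints 512 MB/s */
}

The per-vCPU results are then summed into the overall rate, as calculate_dirtyrate_dirty_ring() does below.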
Signed-off-by: Hyman Huang(黄勇) Message-Id: <7db445109bd18125ce8ec86816d14f6ab5de6a7d.1624040308.git.huangy81@chinatelecom.cn> Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- qapi/migration.json | 16 +++- migration/dirtyrate.c | 208 +++++++++++++++++++++++++++++++++++++++-- hmp-commands.hx | 7 +- migration/trace-events | 2 + 4 files changed, 218 insertions(+), 15 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index 94eece16e1..fae4bc608c 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1796,6 +1796,12 @@ # @sample-pages: page count per GB for sample dirty pages # the default value is 512 (since 6.1) # +# @mode: mode containing method of calculate dirtyrate includes +# 'page-sampling' and 'dirty-ring' (Since 6.1) +# +# @vcpu-dirty-rate: dirtyrate for each vcpu if dirty-ring +# mode specified (Since 6.1) +# # Since: 5.2 # ## @@ -1804,7 +1810,9 @@ 'status': 'DirtyRateStatus', 'start-time': 'int64', 'calc-time': 'int64', - 'sample-pages': 'uint64'} } + 'sample-pages': 'uint64', + 'mode': 'DirtyRateMeasureMode', + '*vcpu-dirty-rate': [ 'DirtyRateVcpu' ] } } ## # @calc-dirty-rate: @@ -1816,6 +1824,9 @@ # @sample-pages: page count per GB for sample dirty pages # the default value is 512 (since 6.1) # +# @mode: mechanism of calculating dirtyrate includes +# 'page-sampling' and 'dirty-ring' (Since 6.1) +# # Since: 5.2 # # Example: @@ -1824,7 +1835,8 @@ # ## { 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64', - '*sample-pages': 'int'} } + '*sample-pages': 'int', + '*mode': 'DirtyRateMeasureMode'} } ## # @query-dirty-rate: diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c index b8f61cc650..f92c4b498e 100644 --- a/migration/dirtyrate.c +++ b/migration/dirtyrate.c @@ -16,6 +16,7 @@ #include "cpu.h" #include "exec/ramblock.h" #include "qemu/rcu_queue.h" +#include "qemu/main-loop.h" #include "qapi/qapi-commands-migration.h" #include "ram.h" #include "trace.h" @@ -23,9 +24,19 @@ #include "monitor/hmp.h" #include "monitor/monitor.h" #include "qapi/qmp/qdict.h" +#include "sysemu/kvm.h" +#include "sysemu/runstate.h" +#include "exec/memory.h" + +typedef struct DirtyPageRecord { + uint64_t start_pages; + uint64_t end_pages; +} DirtyPageRecord; static int CalculatingState = DIRTY_RATE_STATUS_UNSTARTED; static struct DirtyRateStat DirtyStat; +static DirtyRateMeasureMode dirtyrate_mode = + DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING; static int64_t set_sample_page_period(int64_t msec, int64_t initial_time) { @@ -70,18 +81,37 @@ static int dirtyrate_set_state(int *state, int old_state, int new_state) static struct DirtyRateInfo *query_dirty_rate_info(void) { + int i; int64_t dirty_rate = DirtyStat.dirty_rate; struct DirtyRateInfo *info = g_malloc0(sizeof(DirtyRateInfo)); - - if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) { - info->has_dirty_rate = true; - info->dirty_rate = dirty_rate; - } + DirtyRateVcpuList *head = NULL, **tail = &head; info->status = CalculatingState; info->start_time = DirtyStat.start_time; info->calc_time = DirtyStat.calc_time; info->sample_pages = DirtyStat.sample_pages; + info->mode = dirtyrate_mode; + + if (qatomic_read(&CalculatingState) == DIRTY_RATE_STATUS_MEASURED) { + info->has_dirty_rate = true; + info->dirty_rate = dirty_rate; + + if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { + /* + * set sample_pages with 0 to indicate page sampling + * isn't enabled + **/ + info->sample_pages = 0; + info->has_vcpu_dirty_rate = true; + for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) 
{ + DirtyRateVcpu *rate = g_malloc0(sizeof(DirtyRateVcpu)); + rate->id = DirtyStat.dirty_ring.rates[i].id; + rate->dirty_rate = DirtyStat.dirty_ring.rates[i].dirty_rate; + QAPI_LIST_APPEND(tail, rate); + } + info->vcpu_dirty_rate = head; + } + } trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState)); @@ -111,6 +141,15 @@ static void init_dirtyrate_stat(int64_t start_time, } } +static void cleanup_dirtyrate_stat(struct DirtyRateConfig config) +{ + /* last calc-dirty-rate qmp use dirty ring mode */ + if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { + free(DirtyStat.dirty_ring.rates); + DirtyStat.dirty_ring.rates = NULL; + } +} + static void update_dirtyrate_stat(struct RamblockDirtyInfo *info) { DirtyStat.page_sampling.total_dirty_samples += info->sample_dirty_count; @@ -345,7 +384,97 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info, return true; } -static void calculate_dirtyrate(struct DirtyRateConfig config) +static inline void record_dirtypages(DirtyPageRecord *dirty_pages, + CPUState *cpu, bool start) +{ + if (start) { + dirty_pages[cpu->cpu_index].start_pages = cpu->dirty_pages; + } else { + dirty_pages[cpu->cpu_index].end_pages = cpu->dirty_pages; + } +} + +static void dirtyrate_global_dirty_log_start(void) +{ + qemu_mutex_lock_iothread(); + memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE); + qemu_mutex_unlock_iothread(); +} + +static void dirtyrate_global_dirty_log_stop(void) +{ + qemu_mutex_lock_iothread(); + memory_global_dirty_log_sync(); + memory_global_dirty_log_stop(GLOBAL_DIRTY_DIRTY_RATE); + qemu_mutex_unlock_iothread(); +} + +static int64_t do_calculate_dirtyrate_vcpu(DirtyPageRecord dirty_pages) +{ + uint64_t memory_size_MB; + int64_t time_s; + uint64_t increased_dirty_pages = + dirty_pages.end_pages - dirty_pages.start_pages; + + memory_size_MB = (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20; + time_s = DirtyStat.calc_time; + + return memory_size_MB / time_s; +} + +static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config) +{ + CPUState *cpu; + int64_t msec = 0; + int64_t start_time; + uint64_t dirtyrate = 0; + uint64_t dirtyrate_sum = 0; + DirtyPageRecord *dirty_pages; + int nvcpu = 0; + int i = 0; + + CPU_FOREACH(cpu) { + nvcpu++; + } + + dirty_pages = malloc(sizeof(*dirty_pages) * nvcpu); + + DirtyStat.dirty_ring.nvcpu = nvcpu; + DirtyStat.dirty_ring.rates = malloc(sizeof(DirtyRateVcpu) * nvcpu); + + dirtyrate_global_dirty_log_start(); + + CPU_FOREACH(cpu) { + record_dirtypages(dirty_pages, cpu, true); + } + + start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME); + DirtyStat.start_time = start_time / 1000; + + msec = config.sample_period_seconds * 1000; + msec = set_sample_page_period(msec, start_time); + DirtyStat.calc_time = msec / 1000; + + dirtyrate_global_dirty_log_stop(); + + CPU_FOREACH(cpu) { + record_dirtypages(dirty_pages, cpu, false); + } + + for (i = 0; i < DirtyStat.dirty_ring.nvcpu; i++) { + dirtyrate = do_calculate_dirtyrate_vcpu(dirty_pages[i]); + trace_dirtyrate_do_calculate_vcpu(i, dirtyrate); + + DirtyStat.dirty_ring.rates[i].id = i; + DirtyStat.dirty_ring.rates[i].dirty_rate = dirtyrate; + dirtyrate_sum += dirtyrate; + } + + DirtyStat.dirty_rate = dirtyrate_sum; + free(dirty_pages); +} + +static void calculate_dirtyrate_sample_vm(struct DirtyRateConfig config) { struct RamblockDirtyInfo *block_dinfo = NULL; int block_count = 0; @@ -376,6 +505,17 @@ out: free_ramblock_dirty_info(block_dinfo, block_count); } +static void calculate_dirtyrate(struct DirtyRateConfig config) +{ + if 
(config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { + calculate_dirtyrate_dirty_ring(config); + } else { + calculate_dirtyrate_sample_vm(config); + } + + trace_dirtyrate_calculate(DirtyStat.dirty_rate); +} + void *get_dirtyrate_thread(void *arg) { struct DirtyRateConfig config = *(struct DirtyRateConfig *)arg; @@ -401,8 +541,12 @@ void *get_dirtyrate_thread(void *arg) return NULL; } -void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages, - int64_t sample_pages, Error **errp) +void qmp_calc_dirty_rate(int64_t calc_time, + bool has_sample_pages, + int64_t sample_pages, + bool has_mode, + DirtyRateMeasureMode mode, + Error **errp) { static struct DirtyRateConfig config; QemuThread thread; @@ -424,6 +568,15 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages, return; } + if (!has_mode) { + mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING; + } + + if (has_sample_pages && mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { + error_setg(errp, "either sample-pages or dirty-ring can be specified."); + return; + } + if (has_sample_pages) { if (!is_sample_pages_valid(sample_pages)) { error_setg(errp, "sample-pages is out of range[%d, %d].", @@ -435,6 +588,16 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages, sample_pages = DIRTYRATE_DEFAULT_SAMPLE_PAGES; } + /* + * dirty ring mode only works when kvm dirty ring is enabled. + */ + if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) && + !kvm_dirty_ring_enabled()) { + error_setg(errp, "dirty ring is disabled, use sample-pages method " + "or remeasure later."); + return; + } + /* * Init calculation state as unstarted. */ @@ -447,7 +610,15 @@ void qmp_calc_dirty_rate(int64_t calc_time, bool has_sample_pages, config.sample_period_seconds = calc_time; config.sample_pages_per_gigabytes = sample_pages; - config.mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING; + config.mode = mode; + + cleanup_dirtyrate_stat(config); + + /* + * update dirty rate mode so that we can figure out what mode has + * been used in last calculation + **/ + dirtyrate_mode = mode; start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) / 1000; init_dirtyrate_stat(start_time, config); @@ -473,12 +644,24 @@ void hmp_info_dirty_rate(Monitor *mon, const QDict *qdict) info->sample_pages); monitor_printf(mon, "Period: %"PRIi64" (sec)\n", info->calc_time); + monitor_printf(mon, "Mode: %s\n", + DirtyRateMeasureMode_str(info->mode)); monitor_printf(mon, "Dirty rate: "); if (info->has_dirty_rate) { monitor_printf(mon, "%"PRIi64" (MB/s)\n", info->dirty_rate); + if (info->has_vcpu_dirty_rate) { + DirtyRateVcpuList *rate, *head = info->vcpu_dirty_rate; + for (rate = head; rate != NULL; rate = rate->next) { + monitor_printf(mon, "vcpu[%"PRIi64"], Dirty rate: %"PRIi64 + " (MB/s)\n", rate->value->id, + rate->value->dirty_rate); + } + } } else { monitor_printf(mon, "(not ready)\n"); } + + qapi_free_DirtyRateVcpuList(info->vcpu_dirty_rate); g_free(info); } @@ -487,6 +670,10 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict) int64_t sec = qdict_get_try_int(qdict, "second", 0); int64_t sample_pages = qdict_get_try_int(qdict, "sample_pages_per_GB", -1); bool has_sample_pages = (sample_pages != -1); + bool dirty_ring = qdict_get_try_bool(qdict, "dirty_ring", false); + DirtyRateMeasureMode mode = + (dirty_ring ? 
DIRTY_RATE_MEASURE_MODE_DIRTY_RING :
+                     DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING);
     Error *err = NULL;
 
     if (!sec) {
@@ -494,7 +681,8 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict)
         return;
     }
 
-    qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, &err);
+    qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true,
+                        mode, &err);
     if (err) {
         hmp_handle_error(mon, err);
         return;
diff --git a/hmp-commands.hx b/hmp-commands.hx
index cf723c69ac..b6d47bd03f 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1737,8 +1737,9 @@ ERST
 
     {
         .name       = "calc_dirty_rate",
-        .args_type  = "second:l,sample_pages_per_GB:l?",
-        .params     = "second [sample_pages_per_GB]",
-        .help       = "start a round of guest dirty rate measurement",
+        .args_type  = "dirty_ring:-r,second:l,sample_pages_per_GB:l?",
+        .params     = "[-r] second [sample_pages_per_GB]",
+        .help       = "start a round of guest dirty rate measurement (using -r to"
+                      "\n\t\t\t specify dirty ring as the method of calculation)",
         .cmd        = hmp_calc_dirty_rate,
     },

diff --git a/migration/trace-events b/migration/trace-events
index a8ae163707..b48d873b8a 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -333,6 +333,8 @@ get_ramblock_vfn_hash(const char *idstr, uint64_t vfn, uint32_t crc) "ramblock n
 calc_page_dirty_rate(const char *idstr, uint32_t new_crc, uint32_t old_crc) "ramblock name: %s, new crc: %" PRIu32 ", old crc: %" PRIu32
 skip_sample_ramblock(const char *idstr, uint64_t ramblock_size) "ramblock name: %s, ramblock size: %" PRIu64
 find_page_matched(const char *idstr) "ramblock %s addr or size changed"
+dirtyrate_calculate(int64_t dirtyrate) "dirty rate: %" PRIi64 " MB/s"
+dirtyrate_do_calculate_vcpu(int idx, uint64_t rate) "vcpu[%d]: %"PRIu64 " MB/s"
 
 # block.c
 migration_block_init_shared(const char *blk_device_name) "Start migration for %s with shared base image"

From patchwork Mon Nov 1 22:09:00 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597391
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 08/20] migration: Make migration blocker work for snapshots too
Date: Mon, 1 Nov 2021 23:09:00 +0100
Message-Id: <20211101220912.10039-9-quintela@redhat.com>

From: Peter Xu

save_snapshot() checks the migration blockers, which looks sane. At the
same time, we should also teach the blocker add helper to fail if a
snapshot is in progress, just like it does for migrations.
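The reworked helper boils down to one early check that covers both cases before the blocker gets registered. A reduced standalone sketch of that control flow (the two flags stand in for QEMU's runstate_check(RUN_STATE_SAVE_VM) and migration_is_idle(); the blocker list handling is elided):

#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

static bool save_vm_running;        /* stand-in for runstate_check(RUN_STATE_SAVE_VM) */
static bool migration_idle = true;  /* stand-in for migration_is_idle() */

static int add_blocker(void *reason)
{
    /*
     * Snapshots are similar to migrations: refuse to add the blocker
     * while either one is in flight, not only during migrations.
     */
    if (save_vm_running || !migration_idle) {
        return -EBUSY;
    }
    (void)reason;                   /* a real implementation stores it */
    return 0;
}

int main(void)
{
    save_vm_running = true;
    return add_blocker(NULL) == -EBUSY ? 0 : 1;
}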
Reviewed-by: Marc-André Lureau
Signed-off-by: Peter Xu
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 migration/migration.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 9172686b89..e81e473f5a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2058,15 +2058,16 @@ int migrate_add_blocker(Error *reason, Error **errp)
         return -EACCES;
     }
 
-    if (migration_is_idle()) {
-        migration_blockers = g_slist_prepend(migration_blockers, reason);
-        return 0;
+    /* Snapshots are similar to migrations, so check RUN_STATE_SAVE_VM too. */
+    if (runstate_check(RUN_STATE_SAVE_VM) || !migration_is_idle()) {
+        error_propagate_prepend(errp, error_copy(reason),
+                                "disallowing migration blocker "
+                                "(migration/snapshot in progress) for: ");
+        return -EBUSY;
     }
 
-    error_propagate_prepend(errp, error_copy(reason),
-                            "disallowing migration blocker "
-                            "(migration in progress) for: ");
-    return -EBUSY;
+    migration_blockers = g_slist_prepend(migration_blockers, reason);
+    return 0;
 }
 
 void migrate_del_blocker(Error *reason)
From patchwork Mon Nov 1 22:09:01 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597395
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 09/20] migration: Add migrate_add_blocker_internal()
Date: Mon, 1 Nov 2021 23:09:01 +0100
Message-Id: <20211101220912.10039-10-quintela@redhat.com>

From: Peter Xu

Add an internal version that removes the -only-migratable implications.
It can be used for temporary migration blockers like dump-guest-memory.
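A sketch of how a caller with a short-lived blocker might use the new helper. The surrounding function and the blocker message are hypothetical; migrate_add_blocker_internal(), migrate_del_blocker() and error_setg() are the APIs from this patch and QEMU's error framework:

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "migration/blocker.h"

static Error *op_blocker;

static void short_lived_op(Error **errp)
{
    if (!op_blocker) {
        error_setg(&op_blocker,
                   "Live migration disabled: operation in progress");
    }

    /*
     * The _internal variant deliberately skips the -only-migratable
     * check: blocking migration for a few seconds shouldn't be fatal.
     */
    if (migrate_add_blocker_internal(op_blocker, errp)) {
        return;     /* a migration or snapshot is already running */
    }

    /* ... do the temporary work ... */

    migrate_del_blocker(op_blocker);
}

The dump-guest-memory patch that follows in this series is a real instance of this shape.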
Reviewed-by: Marc-André Lureau
Reviewed-by: Juan Quintela
Signed-off-by: Peter Xu
Signed-off-by: Juan Quintela
---
 include/migration/blocker.h | 16 ++++++++++++++++
 migration/migration.c       | 21 +++++++++++++--------
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/include/migration/blocker.h b/include/migration/blocker.h
index acd27018e9..9cebe2ba06 100644
--- a/include/migration/blocker.h
+++ b/include/migration/blocker.h
@@ -25,6 +25,22 @@
  */
 int migrate_add_blocker(Error *reason, Error **errp);
 
+/**
+ * @migrate_add_blocker_internal - prevent migration from proceeding without
+ *                                 only-migratable implications
+ *
+ * @reason - an error to be returned whenever migration is attempted
+ *
+ * @errp - [out] The reason (if any) we cannot block migration right now.
+ *
+ * @returns - 0 on success, -EBUSY on failure, with errp set.
+ *
+ * Some of the migration blockers can be temporary (e.g., for a few seconds),
+ * so it shouldn't need to conflict with "-only-migratable". For those cases,
+ * we can call this function rather than @migrate_add_blocker().
+ */
+int migrate_add_blocker_internal(Error *reason, Error **errp);
+
 /**
  * @migrate_del_blocker - remove a blocking error from migration
  *
diff --git a/migration/migration.c b/migration/migration.c
index e81e473f5a..e1c0082530 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2049,15 +2049,8 @@ void migrate_init(MigrationState *s)
     s->threshold_size = 0;
 }
 
-int migrate_add_blocker(Error *reason, Error **errp)
+int migrate_add_blocker_internal(Error *reason, Error **errp)
 {
-    if (only_migratable) {
-        error_propagate_prepend(errp, error_copy(reason),
-                                "disallowing migration blocker "
-                                "(--only-migratable) for: ");
-        return -EACCES;
-    }
-
     /* Snapshots are similar to migrations, so check RUN_STATE_SAVE_VM too. */
     if (runstate_check(RUN_STATE_SAVE_VM) || !migration_is_idle()) {
         error_propagate_prepend(errp, error_copy(reason),
@@ -2070,6 +2063,18 @@ int migrate_add_blocker(Error *reason, Error **errp)
     return 0;
 }
 
+int migrate_add_blocker(Error *reason, Error **errp)
+{
+    if (only_migratable) {
+        error_propagate_prepend(errp, error_copy(reason),
+                                "disallowing migration blocker "
+                                "(--only-migratable) for: ");
+        return -EACCES;
+    }
+
+    return migrate_add_blocker_internal(reason, errp);
+}
+
 void migrate_del_blocker(Error *reason)
 {
     migration_blockers = g_slist_remove(migration_blockers, reason);
From patchwork Mon Nov 1 22:09:02 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597397
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 10/20] dump-guest-memory: Block live migration
Date: Mon, 1 Nov 2021 23:09:02 +0100
Message-Id: <20211101220912.10039-11-quintela@redhat.com>

From: Peter Xu

Both dump-guest-memory and live migration cache the vm state at the
beginning. If either one starts while the other is running, they will
race on the vm state, with even more severe consequences (please refer
to the crash report in the bug link).

Let's block live migration in dump-guest-memory; that will also block
dump-guest-memory if it detects that a live migration is in progress.

Side note: migrate_del_blocker() can be called even if the blocker has
not been inserted yet, so it is safe to unconditionally delete the
blocker in dump_cleanup() (g_slist_remove() allows the no-entry-found
case).
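The side note rests on a GLib guarantee: g_slist_remove() on a list that does not contain the element (including the empty list) simply returns the list unchanged. A tiny runnable check of exactly that property:

#include <glib.h>

int main(void)
{
    GSList *blockers = NULL;
    int reason;

    /*
     * Removing an entry that was never added is a no-op, which is why
     * dump_cleanup() below may call migrate_del_blocker() unconditionally.
     */
    blockers = g_slist_remove(blockers, &reason);
    g_assert(blockers == NULL);
    return 0;
}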
Suggested-by: Dr. David Alan Gilbert
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1996609
Signed-off-by: Peter Xu
Reviewed-by: Marc-André Lureau
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 dump/dump.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/dump/dump.c b/dump/dump.c
index ab625909f3..662d0a62cd 100644
--- a/dump/dump.c
+++ b/dump/dump.c
@@ -29,6 +29,7 @@
 #include "qemu/error-report.h"
 #include "qemu/main-loop.h"
 #include "hw/misc/vmcoreinfo.h"
+#include "migration/blocker.h"
 
 #ifdef TARGET_X86_64
 #include "win_dump.h"
@@ -47,6 +48,8 @@
 
 #define MAX_GUEST_NOTE_SIZE (1 << 20) /* 1MB should be enough */
 
+static Error *dump_migration_blocker;
+
 #define ELF_NOTE_SIZE(hdr_size, name_size, desc_size)   \
     ((DIV_ROUND_UP((hdr_size), 4) +                     \
       DIV_ROUND_UP((name_size), 4) +                    \
@@ -101,6 +104,7 @@ static int dump_cleanup(DumpState *s)
             qemu_mutex_unlock_iothread();
         }
     }
+    migrate_del_blocker(dump_migration_blocker);
 
     return 0;
 }
@@ -2005,6 +2009,21 @@ void qmp_dump_guest_memory(bool paging, const char *file,
         return;
     }
 
+    if (!dump_migration_blocker) {
+        error_setg(&dump_migration_blocker,
+                   "Live migration disabled: dump-guest-memory in progress");
+    }
+
+    /*
+     * Allows even for -only-migratable, but forbid migration during the
+     * process of dump guest memory.
+     */
+    if (migrate_add_blocker_internal(dump_migration_blocker, errp)) {
+        /* Remember to release the fd before passing it over to dump state */
+        close(fd);
+        return;
+    }
+
     s = &dump_state_global;
     dump_state_prepare(s);

From patchwork Mon Nov 1 22:09:03 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597399
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 11/20] memory: Introduce replay_discarded callback for RamDiscardManager
Date: Mon, 1 Nov 2021 23:09:03 +0100
Message-Id: <20211101220912.10039-12-quintela@redhat.com>

From: David Hildenbrand

Introduce a replay_discarded callback similar to our existing
replay_populated callback, to be used by migration code to never
migrate discarded memory.
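A sketch of one possible consumer of the new API: a ReplayRamDiscard callback that merely totals the discarded bytes within a section. The two helper functions are hypothetical; the types and ram_discard_manager_replay_discarded() are the ones introduced by this patch:

#include "qemu/osdep.h"
#include "exec/memory.h"

static void add_discarded_bytes(MemoryRegionSection *section, void *opaque)
{
    uint64_t *total = opaque;

    *total += int128_get64(section->size);
}

static uint64_t count_discarded_bytes(MemoryRegionSection *section)
{
    RamDiscardManager *rdm =
        memory_region_get_ram_discard_manager(section->mr);
    uint64_t total = 0;

    if (rdm) {
        ram_discard_manager_replay_discarded(rdm, section,
                                             add_discarded_bytes, &total);
    }
    return total;
}

The migration patch later in this series uses the same entry point with a callback that clears dirty bitmap bits instead of counting bytes.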
Acked-by: Peter Xu
Signed-off-by: David Hildenbrand
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 include/exec/memory.h | 21 +++++++++++++++++++++
 softmmu/memory.c      | 11 +++++++++++
 2 files changed, 32 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 04280450c9..20f1b27377 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -550,6 +550,7 @@ static inline void ram_discard_listener_init(RamDiscardListener *rdl,
 }
 
 typedef int (*ReplayRamPopulate)(MemoryRegionSection *section, void *opaque);
+typedef void (*ReplayRamDiscard)(MemoryRegionSection *section, void *opaque);
 
 /*
  * RamDiscardManagerClass:
@@ -638,6 +639,21 @@ struct RamDiscardManagerClass {
                              MemoryRegionSection *section,
                              ReplayRamPopulate replay_fn, void *opaque);
 
+    /**
+     * @replay_discarded:
+     *
+     * Call the #ReplayRamDiscard callback for all discarded parts within the
+     * #MemoryRegionSection via the #RamDiscardManager.
+     *
+     * @rdm: the #RamDiscardManager
+     * @section: the #MemoryRegionSection
+     * @replay_fn: the #ReplayRamDiscard callback
+     * @opaque: pointer to forward to the callback
+     */
+    void (*replay_discarded)(const RamDiscardManager *rdm,
+                             MemoryRegionSection *section,
+                             ReplayRamDiscard replay_fn, void *opaque);
+
     /**
      * @register_listener:
      *
@@ -682,6 +698,11 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
                                          ReplayRamPopulate replay_fn,
                                          void *opaque);
 
+void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscard replay_fn,
+                                          void *opaque);
+
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
                                            MemoryRegionSection *section);
diff --git a/softmmu/memory.c b/softmmu/memory.c
index f2ac0d2e89..7340e19ff5 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -2081,6 +2081,17 @@ int ram_discard_manager_replay_populated(const RamDiscardManager *rdm,
     return rdmc->replay_populated(rdm, section, replay_fn, opaque);
 }
 
+void ram_discard_manager_replay_discarded(const RamDiscardManager *rdm,
+                                          MemoryRegionSection *section,
+                                          ReplayRamDiscard replay_fn,
+                                          void *opaque)
+{
+    RamDiscardManagerClass *rdmc = RAM_DISCARD_MANAGER_GET_CLASS(rdm);
+
+    g_assert(rdmc->replay_discarded);
+    rdmc->replay_discarded(rdm, section, replay_fn, opaque);
+}
+
 void ram_discard_manager_register_listener(RamDiscardManager *rdm,
                                            RamDiscardListener *rdl,
                                            MemoryRegionSection *section)
From patchwork Mon Nov 1 22:09:04 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597417
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 12/20] virtio-mem: Implement replay_discarded RamDiscardManager callback
Date: Mon, 1 Nov 2021 23:09:04 +0100
Message-Id: <20211101220912.10039-13-quintela@redhat.com>

From: David Hildenbrand

Implement it similarly to the replay_populated callback.
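The core of the implementation below is a walk over maximal runs of clear bits (unplugged blocks): find the next clear bit, then the next set bit after it, and the run in between is one unplugged range; the search resumes two bits past the run's end, because the bit right after the run is known to be set. A standalone sketch of the same walk over a toy bitmap (the bit helper is a simplified stand-in for find_next_bit()/find_next_zero_bit()):

#include <stdint.h>
#include <stdio.h>

/* Return the first index >= from whose bit equals val, or size if none. */
static unsigned next_bit_with_value(const uint8_t *bm, unsigned size,
                                    unsigned from, int val)
{
    while (from < size && (((bm[from / 8] >> (from % 8)) & 1) != val)) {
        from++;
    }
    return from < size ? from : size;
}

int main(void)
{
    const uint8_t bitmap[] = { 0x0f };  /* blocks 0-3 plugged, 4-7 unplugged */
    unsigned size = 8;
    unsigned first = next_bit_with_value(bitmap, size, 0, 0);

    while (first < size) {
        /* The run of clear bits ends right before the next set bit. */
        unsigned last = next_bit_with_value(bitmap, size, first + 1, 1) - 1;

        printf("unplugged run: blocks %u..%u\n", first, last);

        /* Bit last+1 is set (or past the end), so resume at last+2. */
        first = next_bit_with_value(bitmap, size, last + 2, 0);
    }
    return 0;
}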
Acked-by: Peter Xu
Signed-off-by: David Hildenbrand
Reviewed-by: Juan Quintela
Signed-off-by: Juan Quintela
---
 hw/virtio/virtio-mem.c | 58 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index df91e454b2..284096ec5f 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -228,6 +228,38 @@ static int virtio_mem_for_each_plugged_section(const VirtIOMEM *vmem,
     return ret;
 }
 
+static int virtio_mem_for_each_unplugged_section(const VirtIOMEM *vmem,
+                                                 MemoryRegionSection *s,
+                                                 void *arg,
+                                                 virtio_mem_section_cb cb)
+{
+    unsigned long first_bit, last_bit;
+    uint64_t offset, size;
+    int ret = 0;
+
+    first_bit = s->offset_within_region / vmem->block_size;
+    first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size, first_bit);
+    while (first_bit < vmem->bitmap_size) {
+        MemoryRegionSection tmp = *s;
+
+        offset = first_bit * vmem->block_size;
+        last_bit = find_next_bit(vmem->bitmap, vmem->bitmap_size,
+                                 first_bit + 1) - 1;
+        size = (last_bit - first_bit + 1) * vmem->block_size;
+
+        if (!virito_mem_intersect_memory_section(&tmp, offset, size)) {
+            break;
+        }
+        ret = cb(&tmp, arg);
+        if (ret) {
+            break;
+        }
+        first_bit = find_next_zero_bit(vmem->bitmap, vmem->bitmap_size,
+                                       last_bit + 2);
+    }
+    return ret;
+}
+
 static int virtio_mem_notify_populate_cb(MemoryRegionSection *s, void *arg)
 {
     RamDiscardListener *rdl = arg;
@@ -1170,6 +1202,31 @@ static int virtio_mem_rdm_replay_populated(const RamDiscardManager *rdm,
                                            virtio_mem_rdm_replay_populated_cb);
 }
 
+static int virtio_mem_rdm_replay_discarded_cb(MemoryRegionSection *s,
+                                              void *arg)
+{
+    struct VirtIOMEMReplayData *data = arg;
+
+    ((ReplayRamDiscard)data->fn)(s, data->opaque);
+    return 0;
+}
+
+static void virtio_mem_rdm_replay_discarded(const RamDiscardManager *rdm,
+                                            MemoryRegionSection *s,
+                                            ReplayRamDiscard replay_fn,
+                                            void *opaque)
+{
+    const VirtIOMEM *vmem = VIRTIO_MEM(rdm);
+    struct VirtIOMEMReplayData data = {
+        .fn = replay_fn,
+        .opaque = opaque,
+    };
+
+    g_assert(s->mr == &vmem->memdev->mr);
+    virtio_mem_for_each_unplugged_section(vmem, s, &data,
+                                          virtio_mem_rdm_replay_discarded_cb);
+}
+
 static void virtio_mem_rdm_register_listener(RamDiscardManager *rdm,
                                              RamDiscardListener *rdl,
                                              MemoryRegionSection *s)
@@ -1234,6 +1291,7 @@ static void virtio_mem_class_init(ObjectClass *klass, void *data)
     rdmc->get_min_granularity = virtio_mem_rdm_get_min_granularity;
     rdmc->is_populated = virtio_mem_rdm_is_populated;
     rdmc->replay_populated = virtio_mem_rdm_replay_populated;
+    rdmc->replay_discarded = virtio_mem_rdm_replay_discarded;
    rdmc->register_listener = virtio_mem_rdm_register_listener;
     rdmc->unregister_listener = virtio_mem_rdm_unregister_listener;
 }

From patchwork Mon Nov 1 22:09:05 2021
X-Patchwork-Submitter: Juan Quintela
X-Patchwork-Id: 12597403
From: Juan Quintela
To: qemu-devel@nongnu.org
Subject: [PULL 13/20] migration/ram: Handle RAMBlocks with a RamDiscardManager on the migration source
Date: Mon, 1 Nov 2021 23:09:05 +0100
Message-Id: <20211101220912.10039-14-quintela@redhat.com>

From: David Hildenbrand

We don't want to migrate memory that corresponds to discarded ranges as
managed by a RamDiscardManager responsible for the mapped memory region
of the RAMBlock. The content of these pages is essentially stale and
without any guarantees for the VM ("logically unplugged").

Depending on the underlying memory type, even reading memory might
populate memory on the source, resulting in undesired memory
consumption. Of course, on the destination, even writing a zeropage
consumes memory, which we also want to avoid (similar to free page
hinting).

Currently, virtio-mem tries to achieve that goal (not migrating
"unplugged" memory that was discarded) by going via
qemu_guest_free_page_hint() - but it's hackish and incomplete. For
example, background snapshots still end up reading all memory, as they
don't do bitmap syncs. Postcopy recovery code will re-add previously
cleared bits to the dirty bitmap and migrate them.

Let's consult the RamDiscardManager after setting up our dirty bitmap
initially and whenever postcopy recovery code reinitializes it: clear
the corresponding bits in the dirty bitmaps (e.g., of the RAMBlock and
inside KVM). It's important to fix up the dirty bitmap *after* our
initial bitmap sync, such that the corresponding dirty bits in KVM are
actually cleared.

As COLO is incompatible with discarding of RAM and inhibits it, we
don't have to bother.

Note: if a misbehaving guest used discarded ranges after migration
started, we would still migrate that memory; however, we would then
already have populated that memory on the migration source.
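The bitmap bookkeeping at the heart of the patch (dirty_bitmap_clear_section() in the diff below) amounts to: turn the discarded byte range into a page-aligned run of bits, count the set bits in that run (they are subtracted from migration_dirty_pages), and clear the run so those pages are never sent. A standalone sketch of that bookkeeping, assuming 4 KiB pages and a plain word-array bitmap:

#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_BITS 12     /* assumption: 4 KiB pages */
#define BITS_PER_WORD (8 * sizeof(unsigned long))

/* Clear the bits for a discarded [offset, offset + size) range and
 * return how many of them were set (i.e. wrongly counted as dirty). */
static uint64_t clear_discarded(unsigned long *bmap,
                                uint64_t offset, uint64_t size)
{
    unsigned long start = offset >> TARGET_PAGE_BITS;
    unsigned long npages = size >> TARGET_PAGE_BITS;
    uint64_t cleared = 0;

    for (unsigned long p = start; p < start + npages; p++) {
        unsigned long mask = 1UL << (p % BITS_PER_WORD);
        unsigned long *word = &bmap[p / BITS_PER_WORD];

        if (*word & mask) {
            cleared++;          /* deducted from migration_dirty_pages */
            *word &= ~mask;
        }
    }
    return cleared;
}

int main(void)
{
    unsigned long bmap[1] = { ~0UL };   /* initial "all dirty" bitmap */

    /* A 16 KiB discarded range at offset 8 KiB covers pages 2..5: */
    printf("cleared %llu pages\n",
           (unsigned long long)clear_discarded(bmap, 8 << 10, 16 << 10));
    return 0;                           /* prints "cleared 4 pages" */
}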
+ * + * Discarded pages ("logically unplugged") have undefined content and must + * not get migrated, because even reading these pages for migration might + * result in undesired behavior. + * + * Returns the number of cleared bits in the RAMBlock dirty bitmap. + * + * Note: The result is only stable while migrating (precopy/postcopy). + */ +static uint64_t ramblock_dirty_bitmap_clear_discarded_pages(RAMBlock *rb) +{ + uint64_t cleared_bits = 0; + + if (rb->mr && rb->bmap && memory_region_has_ram_discard_manager(rb->mr)) { + RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr); + MemoryRegionSection section = { + .mr = rb->mr, + .offset_within_region = 0, + .size = int128_make64(qemu_ram_get_used_length(rb)), + }; + + ram_discard_manager_replay_discarded(rdm, §ion, + dirty_bitmap_clear_section, + &cleared_bits); + } + return cleared_bits; +} + /* Called with RCU critical section */ static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb) { @@ -2675,6 +2729,19 @@ static void ram_list_init_bitmaps(void) } } +static void migration_bitmap_clear_discarded_pages(RAMState *rs) +{ + unsigned long pages; + RAMBlock *rb; + + RCU_READ_LOCK_GUARD(); + + RAMBLOCK_FOREACH_NOT_IGNORED(rb) { + pages = ramblock_dirty_bitmap_clear_discarded_pages(rb); + rs->migration_dirty_pages -= pages; + } +} + static void ram_init_bitmaps(RAMState *rs) { /* For memory_global_dirty_log_start below. */ @@ -2691,6 +2758,12 @@ static void ram_init_bitmaps(RAMState *rs) } qemu_mutex_unlock_ramlist(); qemu_mutex_unlock_iothread(); + + /* + * After an eventual first bitmap sync, fixup the initial bitmap + * containing all 1s to exclude any discarded pages from migration. + */ + migration_bitmap_clear_discarded_pages(rs); } static int ram_init_all(RAMState **rsp) @@ -4119,6 +4192,10 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *block) */ bitmap_complement(block->bmap, block->bmap, nbits); + /* Clear dirty bits of discarded ranges that we don't want to migrate. */ + ramblock_dirty_bitmap_clear_discarded_pages(block); + + /* We'll recalculate migration_dirty_pages in ram_state_resume_prepare(). 
*/ trace_ram_dirty_bitmap_reload_complete(block->idstr); /*
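As a quick sanity check of the shift arithmetic in dirty_bitmap_clear_section() above, here is a standalone sketch; the 4 KiB target-page size (TARGET_PAGE_BITS = 12) and the offsets are illustrative assumptions, not values taken from the patch:

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        const unsigned target_page_bits = 12;  /* assumed 4 KiB target pages */
        const uint64_t offset = 4u << 20;      /* section starts 4 MiB into the RAMBlock */
        const uint64_t size = 2u << 20;        /* a 2 MiB discarded section */

        /* Same conversions as dirty_bitmap_clear_section(). */
        unsigned long start = offset >> target_page_bits;
        unsigned long npages = size >> target_page_bits;

        assert(start == 1024);   /* first dirty-bitmap bit to clear */
        assert(npages == 512);   /* 2 MiB worth of 4 KiB pages */
        return 0;
    }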
From patchwork Mon Nov 1 22:09:06 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597401 From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 14/20] virtio-mem: Drop precopy notifier Date: Mon, 1 Nov 2021 23:09:06 +0100 Message-Id: <20211101220912.10039-15-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: David Hildenbrand Migration code now properly handles RAMBlocks which are indirectly managed by a RamDiscardManager. There is no need for manual handling via the free page optimization interface anymore; let's get rid of it. Acked-by: Michael S. Tsirkin Acked-by: Peter Xu Signed-off-by: David Hildenbrand Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- include/hw/virtio/virtio-mem.h | 3 --- hw/virtio/virtio-mem.c | 34 ---------------------------------- 2 files changed, 37 deletions(-) diff --git a/include/hw/virtio/virtio-mem.h b/include/hw/virtio/virtio-mem.h index 9a6e348fa2..a5dd6a493b 100644 --- a/include/hw/virtio/virtio-mem.h +++ b/include/hw/virtio/virtio-mem.h @@ -65,9 +65,6 @@ struct VirtIOMEM { /* notifiers to notify when "size" changes */ NotifierList size_change_notifiers; - /* don't migrate unplugged memory */ - NotifierWithReturn precopy_notifier; - /* listeners to notify on plug/unplug activity. */ QLIST_HEAD(, RamDiscardListener) rdl_list; }; diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c index 284096ec5f..d5a578142b 100644 --- a/hw/virtio/virtio-mem.c +++ b/hw/virtio/virtio-mem.c @@ -776,7 +776,6 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp) host_memory_backend_set_mapped(vmem->memdev, true); vmstate_register_ram(&vmem->memdev->mr, DEVICE(vmem)); qemu_register_reset(virtio_mem_system_reset, vmem); - precopy_add_notifier(&vmem->precopy_notifier); /* * Set ourselves as RamDiscardManager before the plug handler maps the @@ -796,7 +795,6 @@ static void virtio_mem_device_unrealize(DeviceState *dev) * found via an address space anymore. Unset ourselves.
*/ memory_region_set_ram_discard_manager(&vmem->memdev->mr, NULL); - precopy_remove_notifier(&vmem->precopy_notifier); qemu_unregister_reset(virtio_mem_system_reset, vmem); vmstate_unregister_ram(&vmem->memdev->mr, DEVICE(vmem)); host_memory_backend_set_mapped(vmem->memdev, false); @@ -1089,43 +1087,11 @@ static void virtio_mem_set_block_size(Object *obj, Visitor *v, const char *name, vmem->block_size = value; } -static int virtio_mem_precopy_exclude_range_cb(const VirtIOMEM *vmem, void *arg, - uint64_t offset, uint64_t size) -{ - void * const host = qemu_ram_get_host_addr(vmem->memdev->mr.ram_block); - - qemu_guest_free_page_hint(host + offset, size); - return 0; -} - -static void virtio_mem_precopy_exclude_unplugged(VirtIOMEM *vmem) -{ - virtio_mem_for_each_unplugged_range(vmem, NULL, - virtio_mem_precopy_exclude_range_cb); -} - -static int virtio_mem_precopy_notify(NotifierWithReturn *n, void *data) -{ - VirtIOMEM *vmem = container_of(n, VirtIOMEM, precopy_notifier); - PrecopyNotifyData *pnd = data; - - switch (pnd->reason) { - case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC: - virtio_mem_precopy_exclude_unplugged(vmem); - break; - default: - break; - } - - return 0; -} - static void virtio_mem_instance_init(Object *obj) { VirtIOMEM *vmem = VIRTIO_MEM(obj); notifier_list_init(&vmem->size_change_notifiers); - vmem->precopy_notifier.notify = virtio_mem_precopy_notify; QLIST_INIT(&vmem->rdl_list); object_property_add(obj, VIRTIO_MEM_SIZE_PROP, "size", virtio_mem_get_size, From patchwork Mon Nov 1 22:09:07 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597405
From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 15/20] migration/postcopy: Handle RAMBlocks with a RamDiscardManager on the destination Date: Mon, 1 Nov 2021 23:09:07 +0100 Message-Id: <20211101220912.10039-16-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: David Hildenbrand Currently, when someone (i.e., the VM) accesses discarded parts inside a RAMBlock with a RamDiscardManager managing the corresponding mapped memory region, postcopy will request migration of the corresponding page from the source. The source, however, will never answer, because it refuses to migrate such pages with undefined content ("logically unplugged"): the pages are never dirty, and get_queued_page() will consequently skip processing these postcopy requests. In particular, reading discarded ("logically unplugged") ranges is supposed to work in some setups (for example, with current virtio-mem), although it barely ever happens: still, not placing a page would currently stall the VM, as it cannot make forward progress. Let's check the state via the RamDiscardManager (the state, e.g., of virtio-mem, is migrated during precopy) and avoid sending a request that will never get answered. Place a fresh zero page instead to keep the VM working. This is the same behavior that would happen automatically without userfaultfd being active, when accessing virtual memory regions without populated pages -- "populate on demand". For now, there are valid cases (as documented in the virtio-mem spec) where a VM might read discarded memory; in the future, we will disallow that.
Then, we might want to handle that case differently, e.g., warning the user that the VM seems to be mis-behaving. Reviewed-by: Peter Xu Signed-off-by: David Hildenbrand Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/ram.h | 1 + migration/postcopy-ram.c | 31 +++++++++++++++++++++++++++---- migration/ram.c | 21 +++++++++++++++++++++ 3 files changed, 49 insertions(+), 4 deletions(-) diff --git a/migration/ram.h b/migration/ram.h index 4833e9fd5b..dda1988f3d 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -72,6 +72,7 @@ void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr); int64_t ramblock_recv_bitmap_send(QEMUFile *file, const char *block_name); int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb); +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start); /* ram cache */ int colo_init_ram_cache(void); diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 2e9697bdd2..3609ce7e52 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -671,6 +671,29 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd, return ret; } +static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb, + ram_addr_t start, uint64_t haddr) +{ + void *aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, qemu_ram_pagesize(rb)); + + /* + * Discarded pages (via RamDiscardManager) are never migrated. On unlikely + * access, place a zeropage, which will also set the relevant bits in the + * recv_bitmap accordingly, so we won't try placing a zeropage twice. + * + * Checking a single bit is sufficient to handle pagesize > TPS as either + * all relevant bits are set or not. + */ + assert(QEMU_IS_ALIGNED(start, qemu_ram_pagesize(rb))); + if (ramblock_page_is_discarded(rb, start)) { + bool received = ramblock_recv_bitmap_test_byte_offset(rb, start); + + return received ? 0 : postcopy_place_page_zero(mis, aligned, rb); + } + + return migrate_send_rp_req_pages(mis, rb, start, haddr); +} + /* * Callback from shared fault handlers to ask for a page, * the page must be specified by a RAMBlock and an offset in that rb @@ -690,7 +713,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb, qemu_ram_get_idstr(rb), rb_offset); return postcopy_wake_shared(pcfd, client_addr, rb); } - migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr); + postcopy_request_page(mis, rb, aligned_rbo, client_addr); return 0; } @@ -984,8 +1007,8 @@ retry: * Send the request to the source - we want to request one * of our host page sizes (which is >= TPS) */ - ret = migrate_send_rp_req_pages(mis, rb, rb_offset, - msg.arg.pagefault.address); + ret = postcopy_request_page(mis, rb, rb_offset, + msg.arg.pagefault.address); if (ret) { /* May be network failure, try to wait for recovery */ if (ret == -EIO && postcopy_pause_fault_thread(mis)) { @@ -993,7 +1016,7 @@ retry: goto retry; } else { /* This is a unavoidable fault */ - error_report("%s: migrate_send_rp_req_pages() get %d", + error_report("%s: postcopy_request_page() get %d", __func__, ret); break; } diff --git a/migration/ram.c b/migration/ram.c index e8c06f207c..4f629de7d0 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -912,6 +912,27 @@ static uint64_t ramblock_dirty_bitmap_clear_discarded_pages(RAMBlock *rb) return cleared_bits; } +/* + * Check if a host-page aligned page falls into a discarded range as managed by + * a RamDiscardManager responsible for the mapped memory region of the RAMBlock. + * + * Note: The result is only stable while migrating (precopy/postcopy). 
+ */ +bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start) +{ + if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) { + RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr); + MemoryRegionSection section = { + .mr = rb->mr, + .offset_within_region = start, + .size = int128_make64(qemu_ram_pagesize(rb)), + }; + + return !ram_discard_manager_is_populated(rdm, &section); + } + return false; +} + /* Called with RCU critical section */ static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb) {
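Condensed, the destination-side policy this patch adds in postcopy_request_page() is a three-way decision: discarded and already received means nothing to do, discarded and not yet received means place a zeropage locally, and anything else is forwarded to the source. A hedged restatement with illustrative names (not QEMU API):

    /* Illustrative restatement of the postcopy_request_page() policy above. */
    enum page_action { DO_NOTHING, PLACE_ZEROPAGE, REQUEST_FROM_SOURCE };

    static enum page_action decide(int discarded, int already_received)
    {
        if (discarded) {
            /* Discarded pages are never migrated; satisfy the fault locally. */
            return already_received ? DO_NOTHING : PLACE_ZEROPAGE;
        }
        /* Regular page: ask the migration source for its content. */
        return REQUEST_FROM_SOURCE;
    }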
From patchwork Mon Nov 1 22:09:08 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597407 From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 16/20] migration: Simplify alignment and alignment checks Date: Mon, 1 Nov 2021 23:09:08 +0100 Message-Id: <20211101220912.10039-17-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: David Hildenbrand Let's use QEMU_ALIGN_DOWN() and friends to make the code a bit easier to read. Reviewed-by: Peter Xu Signed-off-by: David Hildenbrand Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/migration.c | 6 +++--- migration/postcopy-ram.c | 9 ++++----- migration/ram.c | 2 +- 3 files changed, 8 insertions(+), 9 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index e1c0082530..53b9a8af96 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -391,7 +391,7 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, ram_addr_t start, uint64_t haddr) { - void *aligned = (void *)(uintptr_t)(haddr & (-qemu_ram_pagesize(rb))); + void *aligned = (void *)(uintptr_t)ROUND_DOWN(haddr, qemu_ram_pagesize(rb)); bool received = false; WITH_QEMU_LOCK_GUARD(&mis->page_request_mutex) { @@ -2637,8 +2637,8 @@ static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname, * Since we currently insist on matching page sizes, just sanity check * we're being asked for whole host pages.
*/ - if (start & (our_host_ps - 1) || - (len & (our_host_ps - 1))) { + if (!QEMU_IS_ALIGNED(start, our_host_ps) || + !QEMU_IS_ALIGNED(len, our_host_ps)) { error_report("%s: Misaligned page request, start: " RAM_ADDR_FMT " len: %zd", __func__, start, len); mark_source_rp_bad(ms); diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 3609ce7e52..e721f69d0f 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -402,7 +402,7 @@ bool postcopy_ram_supported_by_host(MigrationIncomingState *mis) strerror(errno)); goto out; } - g_assert(((size_t)testarea & (pagesize - 1)) == 0); + g_assert(QEMU_PTR_IS_ALIGNED(testarea, pagesize)); reg_struct.range.start = (uintptr_t)testarea; reg_struct.range.len = pagesize; @@ -660,7 +660,7 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd, struct uffdio_range range; int ret; trace_postcopy_wake_shared(client_addr, qemu_ram_get_idstr(rb)); - range.start = client_addr & ~(pagesize - 1); + range.start = ROUND_DOWN(client_addr, pagesize); range.len = pagesize; ret = ioctl(pcfd->fd, UFFDIO_WAKE, &range); if (ret) { @@ -702,8 +702,7 @@ static int postcopy_request_page(MigrationIncomingState *mis, RAMBlock *rb, int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb, uint64_t client_addr, uint64_t rb_offset) { - size_t pagesize = qemu_ram_pagesize(rb); - uint64_t aligned_rbo = rb_offset & ~(pagesize - 1); + uint64_t aligned_rbo = ROUND_DOWN(rb_offset, qemu_ram_pagesize(rb)); MigrationIncomingState *mis = migration_incoming_get_current(); trace_postcopy_request_shared_page(pcfd->idstr, qemu_ram_get_idstr(rb), @@ -993,7 +992,7 @@ static void *postcopy_ram_fault_thread(void *opaque) break; } - rb_offset &= ~(qemu_ram_pagesize(rb) - 1); + rb_offset = ROUND_DOWN(rb_offset, qemu_ram_pagesize(rb)); trace_postcopy_ram_fault_thread_request(msg.arg.pagefault.address, qemu_ram_get_idstr(rb), rb_offset, diff --git a/migration/ram.c b/migration/ram.c index 4f629de7d0..54df5dc0fc 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -811,7 +811,7 @@ static void migration_clear_memory_region_dirty_bitmap(RAMBlock *rb, assert(shift >= 6); size = 1ULL << (TARGET_PAGE_BITS + shift); - start = (((ram_addr_t)page) << TARGET_PAGE_BITS) & (-size); + start = QEMU_ALIGN_DOWN((ram_addr_t)page << TARGET_PAGE_BITS, size); trace_migration_bitmap_clear_dirty(rb->idstr, start, size, page); memory_region_clear_dirty_bitmap(rb->mr, start, size); }
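The conversions above are mechanical: for any power-of-two page size, the open-coded masks and the named macros compute the same values. A self-contained sketch with simplified stand-ins for the QEMU macros (the real ones are type-generic and live in QEMU's headers):

    #include <assert.h>
    #include <stdint.h>

    /* Simplified stand-ins; m must be a power of two. */
    #define ALIGN_DOWN_(n, m)  ((n) & ~((uint64_t)(m) - 1))
    #define IS_ALIGNED_(n, m)  (((n) & ((m) - 1)) == 0)

    int main(void)
    {
        const uint64_t ps = 4096;       /* a power-of-two page size */
        const uint64_t haddr = 0x12345;

        /* Old spellings vs. the macro, as swapped in the patch. */
        assert((haddr & (-ps)) == ALIGN_DOWN_(haddr, ps));
        assert((haddr & ~(ps - 1)) == ALIGN_DOWN_(haddr, ps));
        assert(((haddr & (ps - 1)) == 0) == IS_ALIGNED_(haddr, ps));
        return 0;
    }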
From patchwork Mon Nov 1 22:09:09 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597409 From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 17/20] migration/ram: Factor out populating pages readable in ram_block_populate_pages() Date: Mon, 1 Nov 2021 23:09:09 +0100 Message-Id: <20211101220912.10039-18-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: David Hildenbrand Let's factor out prefaulting/populating to make further changes easier to review, and add a comment about what we actually expect to happen. While at it, use the actual page size of the ramblock, which defaults to qemu_real_host_page_size for anonymous memory. Further, rename ram_block_populate_pages() to ram_block_populate_read() as well, to make it clearer what we are doing. In the future, we might want to use MADV_POPULATE_READ to speed up population.
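The factored-out helper boils down to one volatile byte read per page, which forces the kernel to populate page tables (and, for MAP_PRIVATE anonymous memory, to map the shared zeropage). A minimal standalone sketch of the same idea; the helper name here is illustrative, not QEMU's:

    #include <stddef.h>

    /* Touch one byte per page so the kernel populates the mapping. */
    static void populate_by_reading(const char *base, size_t size,
                                    size_t page_size)
    {
        for (size_t off = 0; off < size; off += page_size) {
            char tmp = base[off];

            /* Keep the compiler from optimizing the read away. */
            asm volatile("" : "+r"(tmp));
        }
    }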
Reviewed-by: Peter Xu Signed-off-by: David Hildenbrand Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/ram.c | 35 ++++++++++++++++++++++------------- 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 54df5dc0fc..92c7b788ae 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1639,26 +1639,35 @@ out: return ret; } +static inline void populate_read_range(RAMBlock *block, ram_addr_t offset, + ram_addr_t size) +{ + /* + * We read one byte of each page; this will preallocate page tables if + * required and populate the shared zeropage on MAP_PRIVATE anonymous memory + * where no page was populated yet. This might require adaptation when + * supporting other mappings, like shmem. + */ + for (; offset < size; offset += block->page_size) { + char tmp = *((char *)block->host + offset); + + /* Don't optimize the read out */ + asm volatile("" : "+r" (tmp)); + } +} + /* - * ram_block_populate_pages: populate memory in the RAM block by reading - * an integer from the beginning of each page. + * ram_block_populate_read: preallocate page tables and populate pages in the + * RAM block by reading a byte of each page. * * Since it's solely used for userfault_fd WP feature, here we just * hardcode page size to qemu_real_host_page_size. * * @block: RAM block to populate */ -static void ram_block_populate_pages(RAMBlock *block) +static void ram_block_populate_read(RAMBlock *block) { - char *ptr = (char *) block->host; - - for (ram_addr_t offset = 0; offset < block->used_length; - offset += qemu_real_host_page_size) { - char tmp = *(ptr + offset); - - /* Don't optimize the read out */ - asm volatile("" : "+r" (tmp)); - } + populate_read_range(block, 0, block->used_length); } /* @@ -1684,7 +1693,7 @@ void ram_write_tracking_prepare(void) * UFFDIO_WRITEPROTECT_MODE_WP mode setting would silently skip * pages with pte_none() entries in page table.
*/ - ram_block_populate_pages(block); + ram_block_populate_read(block); } } From patchwork Mon Nov 1 22:09:10 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597411
From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 18/20] migration/ram: Handle RAMBlocks with a RamDiscardManager on background snapshots Date: Mon, 1 Nov 2021 23:09:10 +0100 Message-Id: <20211101220912.10039-19-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: David Hildenbrand We already don't ever migrate memory that corresponds to discarded ranges as managed by a RamDiscardManager responsible for the mapped memory region of the RAMBlock. virtio-mem uses this mechanism to logically unplug parts of a RAMBlock. Right now, we still populate zeropages for the whole usable part of the RAMBlock, which is undesired because: 1. Even populating the shared zeropage will result in memory getting consumed for page tables. 2. Memory backends without a shared zeropage (like hugetlbfs and shmem) will populate an actual, fresh page, resulting in unintended memory consumption. Discarded ("logically unplugged") parts have to remain discarded. As these pages are never part of the migration stream, there is no need to track modifications via userfaultfd WP reliably for these parts. Further, any writes to these ranges by the VM are invalid, and the behavior is undefined. Note that Linux only supports userfaultfd WP on private anonymous memory for now, which usually results in the shared zeropage getting populated. The issue will become more relevant once userfaultfd WP supports shmem and hugetlb. Acked-by: Peter Xu Signed-off-by: David Hildenbrand Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- migration/ram.c | 38 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 36 insertions(+), 2 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 92c7b788ae..680a5158aa 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1656,6 +1656,17 @@ static inline void populate_read_range(RAMBlock *block, ram_addr_t offset, } } +static inline int populate_read_section(MemoryRegionSection *section, + void *opaque) +{ + const hwaddr size = int128_get64(section->size); + hwaddr offset = section->offset_within_region; + RAMBlock *block = section->mr->ram_block; + + populate_read_range(block, offset, size); + return 0; +} + /* * ram_block_populate_read: preallocate page tables and populate pages in the * RAM block by reading a byte of each page. @@ -1665,9 +1676,32 @@ static inline void populate_read_range(RAMBlock *block, ram_addr_t offset, * * @block: RAM block to populate */ -static void ram_block_populate_read(RAMBlock *block) +static void ram_block_populate_read(RAMBlock *rb) { - populate_read_range(block, 0, block->used_length); + /* + * Skip populating all pages that fall into a discarded range as managed by + * a RamDiscardManager responsible for the mapped memory region of the + * RAMBlock.
Such discarded ("logically unplugged") parts of a RAMBlock + * must not get populated automatically. We don't have to track + * modifications via userfaultfd WP reliably, because these pages will + * not be part of the migration stream either way -- see + * ramblock_dirty_bitmap_exclude_discarded_pages(). + * + * Note: The result is only stable while migrating (precopy/postcopy). + */ + if (rb->mr && memory_region_has_ram_discard_manager(rb->mr)) { + RamDiscardManager *rdm = memory_region_get_ram_discard_manager(rb->mr); + MemoryRegionSection section = { + .mr = rb->mr, + .offset_within_region = 0, + .size = rb->mr->size, + }; + + ram_discard_manager_replay_populated(rdm, §ion, + populate_read_section, NULL); + } else { + populate_read_range(rb, 0, rb->used_length); + } } /* From patchwork Mon Nov 1 22:09:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59580C433F5 for ; Mon, 1 Nov 2021 22:09:49 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 42C0760F56 for ; Mon, 1 Nov 2021 22:09:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232361AbhKAWMV (ORCPT ); Mon, 1 Nov 2021 18:12:21 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:48732 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232302AbhKAWMQ (ORCPT ); Mon, 1 Nov 2021 18:12:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1635804582; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2DiI4EfnW7YWT1ASaakXUWvRZyN7+/5k2450FBy0sI4=; b=de7Q1lob8fEuDpGz4btqF8OXeMLKOZqegQhSq7PmRPUpfzweKZ9aneIzihTtvQzjY3IbBh ljVjH/fDBJ8ujskN+m/rgo85/+fRyWCuyyf5jjtE27vaMnF9DFLpg2dXLKCTt1UgIIfjl8 Vy4VYVoYxNdE0Qmziouu8/SeX7RYnqU= Received: from mail-wm1-f72.google.com (mail-wm1-f72.google.com [209.85.128.72]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-459-wDOpzYLsNfSJbWk9WqzPRw-1; Mon, 01 Nov 2021 18:09:41 -0400 X-MC-Unique: wDOpzYLsNfSJbWk9WqzPRw-1 Received: by mail-wm1-f72.google.com with SMTP id k5-20020a7bc3050000b02901e081f69d80so6351545wmj.8 for ; Mon, 01 Nov 2021 15:09:40 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2DiI4EfnW7YWT1ASaakXUWvRZyN7+/5k2450FBy0sI4=; b=o7Gw43iyoy08m6LcAX1m6cT684CD1UY1zQ2kYivuvMk3Jm/KqRmNWrwdaUgUtWBtot UhDnV991pA9xy3pLO/DfKTzTh55dKeo04MB500pi1tPrKKmNd3xdZQ1yrt5GV48xMa0a nJmah33Xwyb7EwngB0BSC1FTORjd4u4KPnoQAE0Ulprln/TWqhRQ0JKCaXWXP21CY1zA iKgAiuqdC+vaSsUPGgyUSZSatOW9adESzY2iaL1JdKmfzvDiBns3VL0GWu13mGzF9rUz uaT7eBwoCfdOzm6U41xX2mdVgWZg9JD3m3swpiHujYqS0elEyyPXVsQxn+Vo8a8GBFkp o7rQ== X-Gm-Message-State: AOAM531zlRCmL9ag/MW0bY1IrFi8wx6NYDTixTjFJFzEw4NbB46BnUtF Zaaf5sDD4aOKoVOylVcqtG7PjkIqPAO+vekiIlXT/yOTl9AKz0ozaO15WQZ/19cbIfYP9eQAhUv FB1yhOuNdaYpv X-Received: by 2002:adf:e292:: with SMTP id 
From: Juan Quintela To: qemu-devel@nongnu.org Subject: [PULL 19/20] memory: introduce total_dirty_pages to stat dirty pages Date: Mon, 1 Nov 2021 23:09:11 +0100 Message-Id: <20211101220912.10039-20-quintela@redhat.com> In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> From: Hyman Huang(黄勇) Introduce the global variable total_dirty_pages to stat dirty pages; it is updated along with memory_global_dirty_log_sync. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- include/exec/ram_addr.h | 9 +++++++++ migration/dirtyrate.c | 7 +++++++ 2 files changed, 16 insertions(+) diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index 45c913264a..64fb936c7c 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -26,6 +26,8 @@ #include "exec/ramlist.h" #include "exec/ramblock.h" +extern uint64_t total_dirty_pages; + /** * clear_bmap_size: calculate clear bitmap size * @@ -373,6 +375,10 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap, qatomic_or( &blocks[DIRTY_MEMORY_MIGRATION][idx][offset], temp); + if (unlikely( + global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE)) { + total_dirty_pages += ctpopl(temp); + } } if (tcg_enabled()) { @@ -403,6 +409,9 @@ static inline void cpu_physical_memory_set_dirty_lebitmap(unsigned long *bitmap, for (i = 0; i < len; i++) { if (bitmap[i] != 0) { c = leul_to_cpu(bitmap[i]); + if (unlikely(global_dirty_tracking & GLOBAL_DIRTY_DIRTY_RATE)) { + total_dirty_pages += ctpopl(c); + } do { j = ctzl(c); c &= ~(1ul << j); diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c index f92c4b498e..17b3d2cbb5 100644 --- a/migration/dirtyrate.c +++ b/migration/dirtyrate.c @@ -28,6 +28,13 @@ #include "sysemu/runstate.h" #include "exec/memory.h" +/* + * total_dirty_pages is protected by BQL and is used + * to stat dirty pages during the period between two + * memory_global_dirty_log_sync calls. + */ +uint64_t total_dirty_pages; + typedef struct DirtyPageRecord { uint64_t start_pages; uint64_t end_pages; From patchwork Mon Nov 1 22:09:12 2021 X-Patchwork-Submitter: Juan Quintela X-Patchwork-Id: 12597415
From: Juan Quintela To: qemu-devel@nongnu.org
Tsirkin" , Anthony Perard , =?utf-8?b?SHltYW4gSHVhbmco6buE?= =?utf-8?b?5YuHKQ==?= Subject: [PULL 20/20] migration/dirtyrate: implement dirty-bitmap dirtyrate calculation Date: Mon, 1 Nov 2021 23:09:12 +0100 Message-Id: <20211101220912.10039-21-quintela@redhat.com> X-Mailer: git-send-email 2.33.1 In-Reply-To: <20211101220912.10039-1-quintela@redhat.com> References: <20211101220912.10039-1-quintela@redhat.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org From: Hyman Huang(黄勇) introduce dirty-bitmap mode as the third method of calc-dirty-rate. implement dirty-bitmap dirtyrate calculation, which can be used to measuring dirtyrate in the absence of dirty-ring. introduce "dirty_bitmap:-b" option in hmp calc_dirty_rate to indicate dirty bitmap method should be used for calculation. Signed-off-by: Hyman Huang(黄勇) Reviewed-by: Peter Xu Reviewed-by: Juan Quintela Signed-off-by: Juan Quintela --- qapi/migration.json | 6 ++- migration/dirtyrate.c | 112 ++++++++++++++++++++++++++++++++++++++---- hmp-commands.hx | 9 ++-- 3 files changed, 112 insertions(+), 15 deletions(-) diff --git a/qapi/migration.json b/qapi/migration.json index fae4bc608c..87146ceea2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1770,13 +1770,15 @@ # # @page-sampling: calculate dirtyrate by sampling pages. # -# @dirty-ring: calculate dirtyrate by via dirty ring. +# @dirty-ring: calculate dirtyrate by dirty ring. +# +# @dirty-bitmap: calculate dirtyrate by dirty bitmap. # # Since: 6.1 # ## { 'enum': 'DirtyRateMeasureMode', - 'data': ['page-sampling', 'dirty-ring'] } + 'data': ['page-sampling', 'dirty-ring', 'dirty-bitmap'] } ## # @DirtyRateInfo: diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c index 17b3d2cbb5..d65e744af9 100644 --- a/migration/dirtyrate.c +++ b/migration/dirtyrate.c @@ -15,6 +15,7 @@ #include "qapi/error.h" #include "cpu.h" #include "exec/ramblock.h" +#include "exec/ram_addr.h" #include "qemu/rcu_queue.h" #include "qemu/main-loop.h" #include "qapi/qapi-commands-migration.h" @@ -118,6 +119,10 @@ static struct DirtyRateInfo *query_dirty_rate_info(void) } info->vcpu_dirty_rate = head; } + + if (dirtyrate_mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) { + info->sample_pages = 0; + } } trace_query_dirty_rate_info(DirtyRateStatus_str(CalculatingState)); @@ -429,6 +434,79 @@ static int64_t do_calculate_dirtyrate_vcpu(DirtyPageRecord dirty_pages) return memory_size_MB / time_s; } +static inline void record_dirtypages_bitmap(DirtyPageRecord *dirty_pages, + bool start) +{ + if (start) { + dirty_pages->start_pages = total_dirty_pages; + } else { + dirty_pages->end_pages = total_dirty_pages; + } +} + +static void do_calculate_dirtyrate_bitmap(DirtyPageRecord dirty_pages) +{ + DirtyStat.dirty_rate = do_calculate_dirtyrate_vcpu(dirty_pages); +} + +static inline void dirtyrate_manual_reset_protect(void) +{ + RAMBlock *block = NULL; + + WITH_RCU_READ_LOCK_GUARD() { + RAMBLOCK_FOREACH_MIGRATABLE(block) { + memory_region_clear_dirty_bitmap(block->mr, 0, + block->used_length); + } + } +} + +static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config) +{ + int64_t msec = 0; + int64_t start_time; + DirtyPageRecord dirty_pages; + + qemu_mutex_lock_iothread(); + memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE); + + /* + * 1'round of log sync may return all 1 bits with + * KVM_DIRTY_LOG_INITIALLY_SET enable + * skip it unconditionally and start dirty tracking + * from 2'round of log sync + */ + memory_global_dirty_log_sync(); + + /* + * reset page protect 
manually and unconditionally. + * This makes sure the KVM dirty log is cleared if the + * KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE cap is enabled. + */ + dirtyrate_manual_reset_protect(); + qemu_mutex_unlock_iothread(); + + record_dirtypages_bitmap(&dirty_pages, true); + + start_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME); + DirtyStat.start_time = start_time / 1000; + + msec = config.sample_period_seconds * 1000; + msec = set_sample_page_period(msec, start_time); + DirtyStat.calc_time = msec / 1000; + + /* + * dirtyrate_global_dirty_log_stop does two things: + * 1. fetch the dirty bitmap from KVM + * 2. stop dirty tracking + */ + dirtyrate_global_dirty_log_stop(); + + record_dirtypages_bitmap(&dirty_pages, false); + + do_calculate_dirtyrate_bitmap(dirty_pages); +} + static void calculate_dirtyrate_dirty_ring(struct DirtyRateConfig config) { CPUState *cpu; @@ -514,7 +592,9 @@ out: static void calculate_dirtyrate(struct DirtyRateConfig config) { - if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { + if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) { + calculate_dirtyrate_dirty_bitmap(config); + } else if (config.mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) { calculate_dirtyrate_dirty_ring(config); } else { calculate_dirtyrate_sample_vm(config); @@ -597,12 +677,15 @@ void qmp_calc_dirty_rate(int64_t calc_time, /* * dirty ring mode only works when kvm dirty ring is enabled. + * on the contrary, dirty bitmap mode is not. */ - if ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) && - !kvm_dirty_ring_enabled()) { - error_setg(errp, "dirty ring is disabled, use sample-pages method " - "or remeasure later."); - return; + if (((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_RING) && + !kvm_dirty_ring_enabled()) || + ((mode == DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP) && + kvm_dirty_ring_enabled())) { + error_setg(errp, "mode %s is not enabled, use other method instead.", + DirtyRateMeasureMode_str(mode)); + return; } /* @@ -678,9 +761,8 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict) int64_t sample_pages = qdict_get_try_int(qdict, "sample_pages_per_GB", -1); bool has_sample_pages = (sample_pages != -1); bool dirty_ring = qdict_get_try_bool(qdict, "dirty_ring", false); - DirtyRateMeasureMode mode = - (dirty_ring ?
DIRTY_RATE_MEASURE_MODE_DIRTY_RING : - DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING); + bool dirty_bitmap = qdict_get_try_bool(qdict, "dirty_bitmap", false); + DirtyRateMeasureMode mode = DIRTY_RATE_MEASURE_MODE_PAGE_SAMPLING; Error *err = NULL; if (!sec) { @@ -688,6 +770,18 @@ void hmp_calc_dirty_rate(Monitor *mon, const QDict *qdict) return; } + if (dirty_ring && dirty_bitmap) { + monitor_printf(mon, "Either dirty ring or dirty bitmap " + "can be specified!\n"); + return; + } + + if (dirty_bitmap) { + mode = DIRTY_RATE_MEASURE_MODE_DIRTY_BITMAP; + } else if (dirty_ring) { + mode = DIRTY_RATE_MEASURE_MODE_DIRTY_RING; + } + qmp_calc_dirty_rate(sec, has_sample_pages, sample_pages, true, mode, &err); if (err) { diff --git a/hmp-commands.hx b/hmp-commands.hx index b6d47bd03f..3a5aeba3fe 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1737,9 +1737,10 @@ ERST { .name = "calc_dirty_rate", - .args_type = "dirty_ring:-r,second:l,sample_pages_per_GB:l?", - .params = "[-r] second [sample_pages_per_GB]", - .help = "start a round of guest dirty rate measurement (using -d to" - "\n\t\t\t specify dirty ring as the method of calculation)", + .args_type = "dirty_ring:-r,dirty_bitmap:-b,second:l,sample_pages_per_GB:l?", + .params = "[-r] [-b] second [sample_pages_per_GB]", + .help = "start a round of guest dirty rate measurement (using -r to" + "\n\t\t\t specify dirty ring as the method of calculation and" + "\n\t\t\t -b to specify dirty bitmap as method of calculation)", .cmd = hmp_calc_dirty_rate, },
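With the series applied, a dirty-bitmap measurement can be started from the monitor. An illustrative session in the transcript style used earlier, for a 10-second window; the -b flag comes from the hmp-commands.hx hunk above, and querying the result via the pre-existing info dirty_rate / calc-dirty-rate interfaces is assumed rather than shown by this patch:

(qemu) calc_dirty_rate -b 10
(qemu) info dirty_rate

or, over QMP:

{ "execute": "calc-dirty-rate", "arguments": { "calc-time": 10, "mode": "dirty-bitmap" } }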