From: Daniel P. Berrangé
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Vladimir Sementsov-Ogievskiy, Daniel P. Berrangé,
    qemu-block@nongnu.org, Juan Quintela, John Snow, Markus Armbruster,
    "Dr. David Alan Gilbert", Pavel Dovgalyuk, Paolo Bonzini, Max Reitz
Subject: [PATCH v11 00/12] migration: bring improved savevm/loadvm/delvm to QMP
Date: Thu, 4 Feb 2021 12:48:22 +0000
Message-Id: <20210204124834.774401-1-berrange@redhat.com>
X-Patchwork-Id: 12067275

 v1: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg00866.html
 v2: https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg07523.html
 v3: https://lists.gnu.org/archive/html/qemu-devel/2020-08/msg07076.html
 v4: https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg05221.html
 v5: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg00587.html
 v6: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg02158.html
 v7: https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg06205.html
 v8: https://lists.gnu.org/archive/html/qemu-devel/2020-11/msg06464.html
 v9: https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg05016.html
 v10: https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg00620.html

This series aims to provide a better designed replacement for the
savevm/loadvm/delvm HMP commands, which despite their flaws continue
to be actively used in the QMP world via the HMP command passthrough
facility.

The main problems addressed are:

 - The logic to pick which disk to store the vmstate in is not
   satisfactory.

   The first block driver state cannot be assumed to be the root disk
   image; it might be the OVMF varstore, and we don't want to store
   vmstate in there.

 - The logic to decide which disks must be snapshotted is hardwired
   to all disks which are writable.

   Again with OVMF there might be a writable varstore, but this can be
   raw rather than qcow2 format, and thus unable to be snapshotted.
   While users might wish to snapshot their varstore, in
   some/many/most cases it is entirely unnecessary. Because of this
   varstore, though, users are blocked from snapshotting their VM.

 - The commands are synchronous, blocking execution and returning
   errors immediately.

   This is partially addressed by integrating with the job framework.
   This forces the client to use the async commands to determine
   the completion status or error message from the operations.

In the block code I've only dealt with node names for block devices, as
IIUC, this is all that libvirt should need in the -blockdev world it now
lives in. IOW, I've made no attempt to cope with people wanting to use
these QMP commands in combination with -drive args, as libvirt will
never use -drive with a QEMU new enough to have these new commands.

The main limitations of this current impl:

 - The snapshot process runs serialized in the main thread, i.e. QEMU
   guest execution is blocked for the duration. The job framework
   lets us fix this in future without changing the QMP semantics
   exposed to the apps.

 - Most vmstate loading errors just go to stderr, as they are not
   using Error **errp reporting. Thus the job framework just
   reports a fairly generic message

     "Error -22 while loading VM state"

   Again this can be fixed later without changing the QMP semantics
   exposed to apps.

I've done some minimal work in libvirt to start to make use of the new
commands to validate their functionality, but this isn't finished yet.
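
To illustrate the explicit device selection described above, here is a
minimal Python sketch of how a client might build the three commands.
It is only an illustration under stated assumptions: the argument names
(job-id, tag, vmstate, devices) are assumed to match the schema added
to qapi/migration.json and should be checked against it, and the node
names and job IDs are hypothetical.

```python
import json


def snapshot_command(op, job_id, tag, devices, vmstate=None):
    """Build a QMP command dict for snapshot-save, -load or -delete.

    Assumption: argument names mirror the qapi/migration.json schema
    from this series; verify them against the schema before relying
    on this sketch.
    """
    if op not in ("save", "load", "delete"):
        raise ValueError(f"unknown snapshot operation: {op}")
    if not devices:
        # Device lists are mandatory; no built-in heuristics are used.
        raise ValueError("an explicit list of block node names is required")
    args = {"job-id": job_id, "tag": tag, "devices": list(devices)}
    if op in ("save", "load"):
        if vmstate is None:
            raise ValueError(f"snapshot-{op} needs the node holding the vmstate")
        args["vmstate"] = vmstate
    # For delete, no vmstate node is sent; only the listed devices matter.
    return {"execute": f"snapshot-{op}", "arguments": args}


# Hypothetical node names: snapshot only the qcow2 root disk "disk0"
# and leave a raw OVMF varstore node out of the device list.
cmd = snapshot_command("save", "snapsave0", "my-snap", ["disk0"], vmstate="disk0")
print(json.dumps(cmd))
```

Since the commands run as jobs, a successful reply would only indicate
that the job was started; the outcome still has to be collected through
the job framework afterwards.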

My ultimate goal is to make the GNOME Boxes maintainer happy again by
having internal snapshots work with OVMF:

  https://gitlab.gnome.org/GNOME/gnome-boxes/-/commit/c486da262f6566326fbcb5ef45c5f64048f16a6e

Changed in v11:

 - Add missing docs for events for snapshot-delete
 - Fix mistaken operation name in snapshot-delete docs

Changed in v10:

 - Fix some misplaced patch chunks
 - Update qapi version number annotations
 - Move iotests to new naming scheme
 - Fix shell based iotests in tests/qemu-iotests/tests subdir
 - Expand QAPI examples
 - Remove bogus submodule commit update
 - Optimize shell pattern matching code
 - Misc other typo/whitespace fixes

Changed in v9:

 - Rebase to git master to resolve conflicts
 - Fixed accidental regression in error handling in previous v8
 - Fixed formatting of iotest expected output now that we switched
   to preserving whitespace in QMP input

Changed in v8:

 - Rebase to git master to resolve conflicts
 - Updated QAPI since versions to 6.0

Changed in v7:

 - Incorporate changes from:

     https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg03165.html

 - Tweaked error message

Changed in v6:

 - Resolve many conflicts with recent replay changes
 - Misc typos in QAPI

Changed in v5:

 - Fix prevention of tag overwriting
 - Refactor and expand test suite coverage to validate
   more negative scenarios

Changed in v4:

 - Make the device lists mandatory, dropping all support for
   QEMU's built-in heuristics to select devices.
 - Improve some error reporting and I/O test coverage

Changed in v3:

 - Schedule a bottom half to escape from coroutine context in
   the jobs. This is needed because the locking in the snapshot
   code goes horribly wrong when run from a background coroutine
   instead of the main event thread.
 - Re-factor the way we iterate over devices, so that we correctly
   report non-existent devices passed by the user over QMP.
 - Add QAPI docs notes about limitations wrt vmstate error
   reporting (it all goes to stderr, not an Error **errp), so QMP
   only gets a fairly generic error message currently.
 - Add I/O test to validate many usage scenarios / errors
 - Add I/O test helpers to handle QMP events with a deterministic
   ordering
 - Ensure 'delete-snapshot' reports an error if requesting
   delete from devices that don't support snapshot, instead of
   silently succeeding with no error.

Changed in v2:

 - Use new command names "snapshot-{load,save,delete}" to make it
   clear that these are different from the "savevm|loadvm|delvm"
   as they use the Job framework
 - Use an include list for block devs, not an exclude list

Daniel P. Berrangé (11):
  block: push error reporting into bdrv_all_*_snapshot functions
  migration: stop returning errno from load_snapshot()
  block: add ability to specify list of blockdevs during snapshot
  block: allow specifying name of block device for vmstate storage
  block: rename and alter bdrv_all_find_snapshot semantics
  migration: control whether snapshots are ovewritten
  migration: wire up support for snapshot device selection
  migration: introduce a delete_snapshot wrapper
  iotests: add support for capturing and matching QMP events
  iotests: fix loading of common.config from tests/ subdir
  migration: introduce snapshot-{save,load,delete} QMP commands

Philippe Mathieu-Daudé (1):
  migration: Make save_snapshot() return bool, not 0/-1

 block/monitor/block-hmp-cmds.c        |   7 +-
 block/snapshot.c                      | 256 ++++++---
 include/block/snapshot.h              |  23 +-
 include/migration/snapshot.h          |  47 +-
 migration/savevm.c                    | 296 ++++++++--
 monitor/hmp-cmds.c                    |  12 +-
 qapi/job.json                         |   9 +-
 qapi/migration.json                   | 173 ++++++
 replay/replay-debugging.c             |  12 +-
 replay/replay-snapshot.c              |   5 +-
 softmmu/vl.c                          |   2 +-
 tests/qemu-iotests/267.out            |  12 +-
 tests/qemu-iotests/common.qemu        | 106 +++-
 tests/qemu-iotests/common.rc          |  10 +-
 .../tests/internal-snapshots-qapi     | 386 +++++++++++++
 .../tests/internal-snapshots-qapi.out | 520 ++++++++++++++++++
 16 files changed, 1721 insertions(+), 155 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/internal-snapshots-qapi
 create mode 100644 tests/qemu-iotests/tests/internal-snapshots-qapi.out

-- 
2.29.2