From patchwork Thu Feb 4 17:18:34 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Daniel P. Berrangé
X-Patchwork-Id: 12068261
From: Daniel P. Berrangé
To: qemu-devel@nongnu.org
Cc: Juan Quintela, Daniel P. Berrangé, "Dr. David Alan Gilbert",
 Hailiang Zhang
Subject: [PATCH 00/33] migration: capture error reports into Error object
Date: Thu, 4 Feb 2021 17:18:34 +0000
Message-Id: <20210204171907.901471-1-berrange@redhat.com>

Due to its long-term heritage most of the migration code just invokes
'error_report' when problems hit. This was fine for HMP, since the
messages get redirected from stderr into the HMP console. It is not OK
for QMP because the errors will not be fed back to the QMP client.

This wasn't a terrible real-world problem with QMP so far because live
migration happens in the background, so at least on the target side
there is not a QMP command that needs to capture the incoming
migration. It is a problem on the source side but it doesn't hit
frequently as the source side has fewer failure scenarios. Nonetheless,
on both sides it would be desirable if 'query-migrate' could report
errors correctly. With the introduction of the load-snapshot QMP
commands, the need for error reporting becomes more pressing.

Wiring up good error reporting is a large and difficult job, which
this series does NOT complete. The focus here has been on converting
all methods in savevm.c which have an 'int' return value capable of
reporting errors. This covers most of the infrastructure for
controlling the migration state serialization / protocol.

The remaining part that is missing error reporting is the set of
callbacks in the VMStateDescription struct which can return failure
codes, but have no "Error **errp" parameter. Thinking about how this
might be dealt with in future, a big bang conversion is likely
non-viable.
We'll probably want to introduce a duplicate set of callbacks
with the "Error **errp" parameter and convert impls in batches,
eventually removing the original callbacks. I don't intend to do
that myself in the immediate future.

IOW, this patch series probably solves 50% of the problem, but we
still do need the rest to get ideal error reporting.

In doing this savevm conversion I noticed a bunch of places which
see and then ignore errors. I only fixed one or two of them which
were clearly dubious. Other places in savevm.c where it seemed it
was probably ok to ignore errors, I've left using error_report()
on the basis that those are really warnings. Perhaps they could be
changed to warn_report() instead.

There are a lot of patches here, but I felt it was easier to review
for correctness if I converted 1 function at a time. The series
does not necessarily have to be reviewed/applied in 1 go.

Daniel P. Berrangé (33):
  migration: push Error **errp into qemu_loadvm_state()
  migration: push Error **errp into qemu_loadvm_state_header()
  migration: push Error **errp into qemu_loadvm_state_setup()
  migration: push Error **errp into qemu_load_device_state()
  migration: push Error **errp into qemu_loadvm_state_main()
  migration: push Error **errp into qemu_loadvm_section_start_full()
  migration: push Error **errp into qemu_loadvm_section_part_end()
  migration: push Error **errp into loadvm_process_command()
  migration: push Error **errp into loadvm_handle_cmd_packaged()
  migration: push Error **errp into loadvm_postcopy_handle_advise()
  migration: push Error **errp into ram_postcopy_incoming_init()
  migration: push Error **errp into loadvm_postcopy_handle_listen()
  migration: push Error **errp into loadvm_postcopy_handle_run()
  migration: push Error **errp into loadvm_postcopy_ram_handle_discard()
  migration: make loadvm_postcopy_handle_resume() void
  migration: push Error **errp into loadvm_handle_recv_bitmap()
  migration: push Error **errp into loadvm_process_enable_colo()
  migration: push Error **errp into colo_init_ram_cache()
  migration: push Error **errp into check_section_footer()
  migration: push Error **errp into global_state_store()
  migration: remove error reporting from qemu_fopen_bdrv() callers
  migration: push Error **errp into qemu_savevm_state_iterate()
  migration: simplify some error reporting in save_snapshot()
  migration: push Error **errp into qemu_savevm_state_setup()
  migration: push Error **errp into qemu_savevm_state_complete_precopy()
  migration: push Error **errp into
    qemu_savevm_state_complete_precopy_non_iterable()
  migration: push Error **errp into qemu_savevm_state_complete_precopy()
  migration: push Error **errp into qemu_savevm_send_packaged()
  migration: push Error **errp into qemu_savevm_live_state()
  migration: push Error **errp into qemu_save_device_state()
  migration: push Error **errp into qemu_savevm_state_resume_prepare()
  migration: push Error **errp into postcopy_resume_handshake()
  migration: push Error **errp into postcopy_do_resume()

 include/migration/colo.h              |   2 +-
 include/migration/global_state.h      |   2 +-
 migration/colo.c                      |  12 +-
 migration/global_state.c              |   6 +-
 migration/migration.c                 |  80 ++-
 migration/postcopy-ram.c              |   8 +-
 migration/postcopy-ram.h              |   2 +-
 migration/ram.c                       |  17 +-
 migration/ram.h                       |   4 +-
 migration/savevm.c                    | 594 ++++++++++--------
 migration/savevm.h                    |  23 +-
 .../tests/internal-snapshots-qapi.out |   3 +-
 12 files changed, 427 insertions(+), 326 deletions(-)
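
For readers unfamiliar with the pattern, the two ideas in this cover
letter (pushing "Error **errp" into a function that used to print to
stderr, and running old and new callbacks side by side during a batched
conversion) can be sketched roughly as below. This is a standalone
illustration, not code from the series: the Error struct and
sketch_error_set() are simplified stand-ins that only loosely mirror
QEMU's error_setg(), and load_header(), DeviceOps, post_load_errp and
run_post_load() are hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for QEMU's Error object (assumption, not the real API). */
typedef struct Error {
    char msg[256];
} Error;

/* Hypothetical analogue of error_setg(): hand a message back to the caller. */
void sketch_error_set(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err;

    if (!errp) {
        return;                 /* caller does not want the details */
    }
    assert(*errp == NULL);      /* never overwrite an earlier error */
    err = calloc(1, sizeof(*err));
    assert(err != NULL);
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

void sketch_error_free(Error *err)
{
    free(err);
}

/* Before: the reason goes to stderr only; the caller just sees -1. */
int load_header_legacy(const unsigned char *buf, size_t len)
{
    if (len < 4 || memcmp(buf, "QEVM", 4) != 0) {
        fprintf(stderr, "Not a migration stream\n");
        return -1;
    }
    return 0;
}

/* After: the reason is captured in *errp so the caller can report it. */
int load_header(const unsigned char *buf, size_t len, Error **errp)
{
    if (len < 4 || memcmp(buf, "QEVM", 4) != 0) {
        sketch_error_set(errp, "Not a migration stream");
        return -1;
    }
    return 0;
}

/* Hypothetical device description carrying both callback flavours. */
typedef struct DeviceOps {
    int (*post_load)(void *opaque);                     /* legacy */
    int (*post_load_errp)(void *opaque, Error **errp);  /* converted */
} DeviceOps;

/* Prefer the converted callback; otherwise synthesize a generic error. */
int run_post_load(const DeviceOps *ops, void *opaque, Error **errp)
{
    int ret;

    if (ops->post_load_errp) {
        return ops->post_load_errp(opaque, errp);
    }
    if (ops->post_load) {
        ret = ops->post_load(opaque);
        if (ret < 0) {
            sketch_error_set(errp, "post_load failed with code %d", ret);
        }
        return ret;
    }
    return 0;
}

/* Two sample devices: one still legacy, one already converted. */
int old_dev_post_load(void *opaque)
{
    (void)opaque;
    return -22;
}

int new_dev_post_load(void *opaque, Error **errp)
{
    (void)opaque;
    sketch_error_set(errp, "new-dev: unsupported register layout");
    return -1;
}
```

With a dispatcher of this shape, unconverted devices keep working and
still yield some error for the caller, while converted devices can
supply a specific message; once every implementation has moved over,
the legacy member can be dropped.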