From patchwork Fri Mar 11 16:37:44 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= X-Patchwork-Id: 8567611 Return-Path: X-Original-To: patchwork-qemu-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 0C43DC0553 for ; Fri, 11 Mar 2016 16:43:08 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 0C8F6201F5 for ; Fri, 11 Mar 2016 16:43:07 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CED7C201B4 for ; Fri, 11 Mar 2016 16:43:05 +0000 (UTC) Received: from localhost ([::1]:56410 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aeQ9Z-0007WU-4j for patchwork-qemu-devel@patchwork.kernel.org; Fri, 11 Mar 2016 11:43:05 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:53928) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aeQ52-0007tR-C8 for qemu-devel@nongnu.org; Fri, 11 Mar 2016 11:38:25 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aeQ50-0005NU-FO for qemu-devel@nongnu.org; Fri, 11 Mar 2016 11:38:24 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41795) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aeQ50-0005ND-5i for qemu-devel@nongnu.org; Fri, 11 Mar 2016 11:38:22 -0500 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (Postfix) with ESMTPS id CBA2375755 for ; Fri, 11 Mar 2016 16:38:21 +0000 (UTC) Received: from t530wlan.home.berrange.com.com (vpn1-6-25.ams2.redhat.com [10.36.6.25]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id u2BGc5Ok002899; Fri, 11 Mar 2016 11:38:20 -0500 From: "Daniel P. Berrange" To: qemu-devel@nongnu.org Date: Fri, 11 Mar 2016 16:37:44 +0000 Message-Id: <1457714282-6981-11-git-send-email-berrange@redhat.com> In-Reply-To: <1457714282-6981-1-git-send-email-berrange@redhat.com> References: <1457714282-6981-1-git-send-email-berrange@redhat.com> X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: Amit Shah , "Dr. David Alan Gilbert" , Juan Quintela Subject: [Qemu-devel] [PATCH v4 10/28] migration: add reporting of errors for outgoing migration X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently if an application initiates an outgoing migration, it may or may not, get an error reported back on failure. If the error occurs synchronously to the 'migrate' command execution, the client app will see the error message. This is the case for DNS lookup failures. If the error occurs asynchronously to the monitor command though, the error will be thrown away and the client left guessing about what went wrong. This is the case for failure to connect to the TCP server (eg due to wrong port, or firewall rules, or other similar errors). In the future we'll be adding more scope for errors to happen asynchronously with the TLS protocol handshake. TLS errors are hard to diagnose even when they are well reported, so discarding errors entirely will make it impossible to debug TLS connection problems. Management apps which do migration are already using 'query-migrate' / 'info migrate' to check up on progress of background migration operations and to see their end status. This is a fine place to also include the error message when things go wrong. This patch thus adds an 'error-desc' field to the MigrationInfo struct, which will be populated when the 'status' is set to 'failed': (qemu) migrate -d tcp:localhost:9001 (qemu) info migrate capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off Migration status: failed (Error connecting to socket: Connection refused) total time: 0 milliseconds In the HMP, when doing non-detached migration, it is also possible to display this error message directly to the app. (qemu) migrate tcp:localhost:9001 Error connecting to socket: Connection refused Or with QMP { "execute": "query-migrate", "arguments": {} } { "return": { "status": "failed", "error-desc": "address resolution failed for myhost:9000: No address associated with hostname" } } Signed-off-by: Daniel P. Berrange Reviewed-by: Dr. David Alan Gilbert --- hmp.c | 13 ++++++++++++- include/migration/migration.h | 5 ++++- include/qapi/error.h | 2 +- migration/migration.c | 15 ++++++++++++--- migration/rdma.c | 10 +++------- migration/tcp.c | 2 +- migration/unix.c | 2 +- qapi-schema.json | 7 ++++++- trace-events | 2 +- util/error.c | 2 +- 10 files changed, 42 insertions(+), 18 deletions(-) diff --git a/hmp.c b/hmp.c index 5b6084a..7126f17 100644 --- a/hmp.c +++ b/hmp.c @@ -34,6 +34,7 @@ #include "ui/console.h" #include "block/qapi.h" #include "qemu-io.h" +#include "qemu/error-report.h" #ifdef CONFIG_SPICE #include @@ -167,8 +168,15 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict) } if (info->has_status) { - monitor_printf(mon, "Migration status: %s\n", + monitor_printf(mon, "Migration status: %s", MigrationStatus_lookup[info->status]); + if (info->status == MIGRATION_STATUS_FAILED && + info->has_error_desc) { + monitor_printf(mon, " (%s)\n", info->error_desc); + } else { + monitor_printf(mon, "\n"); + } + monitor_printf(mon, "total time: %" PRIu64 " milliseconds\n", info->total_time); if (info->has_expected_downtime) { @@ -1532,6 +1540,9 @@ static void hmp_migrate_status_cb(void *opaque) if (status->is_block_migration) { monitor_printf(status->mon, "\n"); } + if (info->has_error_desc) { + error_report("%s", info->error_desc); + } monitor_resume(status->mon); timer_del(status->timer); g_free(status); diff --git a/include/migration/migration.h b/include/migration/migration.h index e335380..46c1bbe 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -171,6 +171,9 @@ struct MigrationState QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest) src_page_requests; /* The RAMBlock used in the last src_page_request */ RAMBlock *last_req_rb; + + /* The last error that occurred */ + Error *error; }; void migrate_set_state(int *state, int old_state, int new_state); @@ -207,7 +210,7 @@ void rdma_start_outgoing_migration(void *opaque, const char *host_port, Error ** void rdma_start_incoming_migration(const char *host_port, Error **errp); -void migrate_fd_error(MigrationState *s); +void migrate_fd_error(MigrationState *s, const Error *error); void migrate_fd_connect(MigrationState *s); diff --git a/include/qapi/error.h b/include/qapi/error.h index 02e9dd2..c7e2869 100644 --- a/include/qapi/error.h +++ b/include/qapi/error.h @@ -139,7 +139,7 @@ typedef enum ErrorClass { /* * Get @err's human-readable error message. */ -const char *error_get_pretty(Error *err); +const char *error_get_pretty(const Error *err); /* * Get @err's error class. diff --git a/migration/migration.c b/migration/migration.c index a4edbe5..6b2e128 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -691,6 +691,10 @@ MigrationInfo *qmp_query_migrate(Error **errp) break; case MIGRATION_STATUS_FAILED: info->has_status = true; + if (s->error) { + info->has_error_desc = true; + info->error_desc = g_strdup(error_get_pretty(s->error)); + } break; case MIGRATION_STATUS_CANCELLED: info->has_status = true; @@ -863,12 +867,15 @@ static void migrate_fd_cleanup(void *opaque) notifier_list_notify(&migration_state_notifiers, s); } -void migrate_fd_error(MigrationState *s) +void migrate_fd_error(MigrationState *s, const Error *error) { - trace_migrate_fd_error(); + trace_migrate_fd_error(error ? error_get_pretty(error) : ""); assert(s->to_dst_file == NULL); migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, MIGRATION_STATUS_FAILED); + if (!s->error) { + s->error = error_copy(error); + } notifier_list_notify(&migration_state_notifiers, s); } @@ -967,6 +974,8 @@ MigrationState *migrate_init(const MigrationParams *params) s->postcopy_after_devices = false; s->migration_thread_running = false; s->last_req_rb = NULL; + error_free(s->error); + s->error = NULL; migrate_set_state(&s->state, MIGRATION_STATUS_NONE, MIGRATION_STATUS_SETUP); @@ -1067,7 +1076,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, } if (local_err) { - migrate_fd_error(s); + migrate_fd_error(s, local_err); error_propagate(errp, local_err); return; } diff --git a/migration/rdma.c b/migration/rdma.c index 187fc1c..cd33d90 100644 --- a/migration/rdma.c +++ b/migration/rdma.c @@ -3487,16 +3487,14 @@ void rdma_start_outgoing_migration(void *opaque, const char *host_port, Error **errp) { MigrationState *s = opaque; - Error *local_err = NULL, **temp = &local_err; - RDMAContext *rdma = qemu_rdma_data_init(host_port, &local_err); + RDMAContext *rdma = qemu_rdma_data_init(host_port, errp); int ret = 0; if (rdma == NULL) { - ERROR(temp, "Failed to initialize RDMA data structures! %d", ret); goto err; } - ret = qemu_rdma_source_init(rdma, &local_err, + ret = qemu_rdma_source_init(rdma, errp, s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL]); if (ret) { @@ -3504,7 +3502,7 @@ void rdma_start_outgoing_migration(void *opaque, } trace_rdma_start_outgoing_migration_after_rdma_source_init(); - ret = qemu_rdma_connect(rdma, &local_err); + ret = qemu_rdma_connect(rdma, errp); if (ret) { goto err; @@ -3516,7 +3514,5 @@ void rdma_start_outgoing_migration(void *opaque, migrate_fd_connect(s); return; err: - error_propagate(errp, local_err); g_free(rdma); - migrate_fd_error(s); } diff --git a/migration/tcp.c b/migration/tcp.c index e888a4e..48904e0 100644 --- a/migration/tcp.c +++ b/migration/tcp.c @@ -40,7 +40,7 @@ static void tcp_wait_for_connect(int fd, Error *err, void *opaque) if (fd < 0) { DPRINTF("migrate connect error: %s\n", error_get_pretty(err)); s->to_dst_file = NULL; - migrate_fd_error(s); + migrate_fd_error(s, err); } else { DPRINTF("migrate connect success\n"); s->to_dst_file = qemu_fopen_socket(fd, "wb"); diff --git a/migration/unix.c b/migration/unix.c index d9aac36..b3537fd 100644 --- a/migration/unix.c +++ b/migration/unix.c @@ -40,7 +40,7 @@ static void unix_wait_for_connect(int fd, Error *err, void *opaque) if (fd < 0) { DPRINTF("migrate connect error: %s\n", error_get_pretty(err)); s->to_dst_file = NULL; - migrate_fd_error(s); + migrate_fd_error(s, err); } else { DPRINTF("migrate connect success\n"); s->to_dst_file = qemu_fopen_socket(fd, "wb"); diff --git a/qapi-schema.json b/qapi-schema.json index 362c9d8..2fd6166 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -484,6 +484,10 @@ # throttled during auto-converge. This is only present when auto-converge # has started throttling guest cpus. (Since 2.5) # +# @error-desc: #optional the human readable error description string, when +# @status is 'failed'. Clients should not attempt to parse the +# error strings. (Since 2.6) +# # Since: 0.14.0 ## { 'struct': 'MigrationInfo', @@ -494,7 +498,8 @@ '*expected-downtime': 'int', '*downtime': 'int', '*setup-time': 'int', - '*x-cpu-throttle-percentage': 'int'} } + '*x-cpu-throttle-percentage': 'int', + '*error-desc': 'str'} } ## # @query-migrate diff --git a/trace-events b/trace-events index 6fba6cc..8e626ab 100644 --- a/trace-events +++ b/trace-events @@ -1483,7 +1483,7 @@ await_return_path_close_on_source_close(void) "" await_return_path_close_on_source_joining(void) "" migrate_set_state(int new_state) "new state %d" migrate_fd_cleanup(void) "" -migrate_fd_error(void) "" +migrate_fd_error(const char *error_desc) "error=%s" migrate_fd_cancel(void) "" migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) "in %s at %zx len %zx" migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t nonpost) "pending size %" PRIu64 " max %" PRIu64 " (post=%" PRIu64 " nonpost=%" PRIu64 ")" diff --git a/util/error.c b/util/error.c index 471b8b3..f134a6d 100644 --- a/util/error.c +++ b/util/error.c @@ -218,7 +218,7 @@ ErrorClass error_get_class(const Error *err) return err->err_class; } -const char *error_get_pretty(Error *err) +const char *error_get_pretty(const Error *err) { return err->msg; }