From patchwork Tue May 31 19:11:06 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Md Haris Iqbal
X-Patchwork-Id: 9145525
From: Md Haris Iqbal
To: qemu-devel@nongnu.org
Date: Wed, 1 Jun 2016 00:41:06 +0530
Message-Id: <1464721866-28275-1-git-send-email-haris.phnx@gmail.com>
X-Mailer: git-send-email 2.7.4
Subject: [Qemu-devel] [Qemu-devel [RFC] [WIP] v1] Keeping the Source side alive incase of network failure (Migration recover from network failure)
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
Cc: Md Haris Iqbal, dgilbert@redhat.com
Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel"
X-Virus-Scanned: ClamAV using ClamSMTP

---
 include/migration/migration.h |  1 +
 migration/migration.c         | 41 ++++++++++++++++++++++++++++++++++++-----
 vl.c                          |  4 ++++
 3 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/include/migration/migration.h b/include/migration/migration.h
index ac2c12c..33da695 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -325,6 +325,7 @@ void global_state_store_running(void);
 void flush_page_queue(MigrationState *ms);
 int ram_save_queue_pages(MigrationState *ms, const char *rbname,
                          ram_addr_t start, ram_addr_t len);
+int qemu_migrate_postcopy_outgoing_recovery(MigrationState *ms);
 
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
diff --git a/migration/migration.c b/migration/migration.c
index 991313a..ee0c2a8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -539,6 +539,7 @@ static bool migration_is_setup_or_active(int state)
     case MIGRATION_STATUS_ACTIVE:
     case MIGRATION_STATUS_POSTCOPY_ACTIVE:
     case MIGRATION_STATUS_SETUP:
+    case MIGRATION_STATUS_POSTCOPY_RECOVERY:
         return true;
 
     default:
@@ -1634,6 +1635,8 @@ static void *migration_thread(void *opaque)
     /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
     enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
 
+    int32_t ret;
+
     rcu_register_thread();
 
     qemu_savevm_state_header(s->to_dst_file);
@@ -1700,11 +1703,26 @@ static void *migration_thread(void *opaque)
             }
         }
 
-        if (qemu_file_get_error(s->to_dst_file)) {
-            migrate_set_state(&s->state, current_active_state,
-                              MIGRATION_STATUS_FAILED);
-            trace_migration_thread_file_err();
-            break;
+        if ((ret = qemu_file_get_error(s->to_dst_file))) {
+            fprintf(stderr, "1 : Error %s %d\n", strerror(-ret), -ret);
+            if(ret != -EIO && s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+                /* Network Failure during postcopy */
+
+                current_active_state = MIGRATION_STATUS_POSTCOPY_RECOVERY;
+                runstate_set(RUN_STATE_POSTMIGRATE_RECOVERY);
+                fprintf(stderr, "1.1 : Error %s %d\n", strerror(-ret), -ret);
+                ret = qemu_migrate_postcopy_outgoing_recovery(s);
+                if(ret < 0) {
+                    break;
+                }
+
+            } else {
+                migrate_set_state(&s->state, current_active_state,
+                                  MIGRATION_STATUS_FAILED);
+                fprintf(stderr, "1.2 : Error %s %d\n", strerror(-ret), -ret);
+                trace_migration_thread_file_err();
+                break;
+            }
         }
         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         if (current_time >= initial_time + BUFFER_DELAY) {
@@ -1797,6 +1815,19 @@ void migrate_fd_connect(MigrationState *s)
     s->migration_thread_running = true;
 }
 
+int qemu_migrate_postcopy_outgoing_recovery(MigrationState* ms)
+{
+    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+                      MIGRATION_STATUS_POSTCOPY_RECOVERY);
+
+    /* Code for network recovery to be added here */
+    while(1) {
+        fprintf(stderr, "Not letting it fail\n");
+        sleep(2);
+    }
+
+}
+
 PostcopyState postcopy_state_get(void)
 {
     return atomic_mb_read(&incoming_postcopy_state);
diff --git a/vl.c b/vl.c
index 5fd22cb..c237140 100644
--- a/vl.c
+++ b/vl.c
@@ -618,6 +618,10 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE_RECOVERY },
+
+    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_SHUTDOWN },
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
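
Note on the vl.c hunk: the entries added to runstate_transitions_def only matter through the from/to lookup that a transition table of this shape implies. The following is a minimal standalone sketch of such a lookup, under the assumption of a simplified model. The State enum, the Transition struct, the transitions table and transition_allowed() are all hypothetical names made up for illustration; they are not QEMU's definitions, and the table holds only the pairs visible in the hunk above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of run states. The names echo the patch, but this
 * enum is illustrative and is not QEMU's RunState type. */
typedef enum {
    STATE_RUNNING,
    STATE_PRELAUNCH,
    STATE_FINISH_MIGRATE,
    STATE_POSTMIGRATE,
    STATE_POSTMIGRATE_RECOVERY,
    STATE_SHUTDOWN,
    STATE_MAX
} State;

/* One permitted from -> to pair (illustrative type). */
typedef struct {
    State from;
    State to;
} Transition;

/* Only the pairs that appear in the vl.c hunk are listed here. */
static const Transition transitions[] = {
    { STATE_FINISH_MIGRATE, STATE_RUNNING },
    { STATE_FINISH_MIGRATE, STATE_POSTMIGRATE },
    { STATE_FINISH_MIGRATE, STATE_PRELAUNCH },
    /* Pairs corresponding to the ones the patch adds. */
    { STATE_FINISH_MIGRATE, STATE_POSTMIGRATE_RECOVERY },
    { STATE_POSTMIGRATE_RECOVERY, STATE_FINISH_MIGRATE },
    { STATE_POSTMIGRATE_RECOVERY, STATE_SHUTDOWN },
};

/* Return true if the table lists a from -> to transition. */
static bool transition_allowed(State from, State to)
{
    size_t n = sizeof(transitions) / sizeof(transitions[0]);

    for (size_t i = 0; i < n; i++) {
        if (transitions[i].from == from && transitions[i].to == to) {
            return true;
        }
    }
    return false;
}
```

In this model, transition_allowed(STATE_FINISH_MIGRATE, STATE_POSTMIGRATE_RECOVERY) is true, while any pair the table does not list, for example STATE_POSTMIGRATE to STATE_POSTMIGRATE_RECOVERY, is rejected. A linear scan keeps the sketch short; a real implementation could precompute a lookup matrix instead.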