From patchwork Fri Jun 3 07:52:30 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zhanghailiang
X-Patchwork-Id: 9152083
From: zhanghailiang
Date: Fri, 3 Jun 2016 15:52:30 +0800
Message-ID: <1464940366-9880-19-git-send-email-zhang.zhanghailiang@huawei.com>
In-Reply-To: <1464940366-9880-1-git-send-email-zhang.zhanghailiang@huawei.com>
References: <1464940366-9880-1-git-send-email-zhang.zhanghailiang@huawei.com>
Subject: [Qemu-devel] [PATCH COLO-Frame v17 18/34] COLO: Implement failover work for Primary VM
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com, yunhong.jiang@intel.com, eddie.dong@intel.com, peter.huangpeng@huawei.com, zhanghailiang, arei.gonglei@huawei.com, stefanha@redhat.com, zhangchen.fnst@cn.fujitsu.com, hongyang.yang@easystack.cn

For the PVM, if there is a failover request from the user, the COLO
thread exits its loop while the failover BH does the cleanup work and
resumes the VM.

Signed-off-by: zhanghailiang
Signed-off-by: Li Zhijian
Reviewed-by: Dr. David Alan Gilbert
---
v13:
- Add Reviewed-by tag
v12:
- Fix error report and remove unnecessary check in
  primary_vm_do_failover() (Dave's suggestion)
v11:
- Don't call migration_end() in primary_vm_do_failover(); the cleanup
  work will be done in migration_thread().
- Remove vm_start() from primary_vm_do_failover(); it is also done in
  migration_thread()
v10:
- Call migration_end() in primary_vm_do_failover()
---
 include/migration/colo.h     |  3 +++
 include/migration/failover.h |  1 +
 migration/colo-failover.c    |  7 +++++-
 migration/colo.c             | 54 ++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/include/migration/colo.h b/include/migration/colo.h
index e9ac2c3..e32eef4 100644
--- a/include/migration/colo.h
+++ b/include/migration/colo.h
@@ -32,4 +32,7 @@ void *colo_process_incoming_thread(void *opaque);
 bool migration_incoming_in_colo_state(void);
 
 COLOMode get_colo_mode(void);
+
+/* failover */
+void colo_do_failover(MigrationState *s);
 #endif
diff --git a/include/migration/failover.h b/include/migration/failover.h
index fe71bb4..c4bd81e 100644
--- a/include/migration/failover.h
+++ b/include/migration/failover.h
@@ -26,5 +26,6 @@ void failover_init_state(void);
 int failover_set_state(int old_state, int new_state);
 int failover_get_state(void);
 void failover_request_active(Error **errp);
+bool failover_request_is_active(void);
 
 #endif
diff --git a/migration/colo-failover.c b/migration/colo-failover.c
index 69aac55..fa84172 100644
--- a/migration/colo-failover.c
+++ b/migration/colo-failover.c
@@ -33,7 +33,7 @@ static void colo_failover_bh(void *opaque)
         error_report("Unkown error for failover, old_state=%d", old_state);
         return;
     }
-    /*TODO: Do failover work */
+    colo_do_failover(NULL);
 }
 
 void failover_request_active(Error **errp)
@@ -68,6 +68,11 @@ int failover_get_state(void)
     return atomic_read(&failover_state);
 }
 
+bool failover_request_is_active(void)
+{
+    return failover_get_state() != FAILOVER_STATUS_NONE;
+}
+
 void qmp_x_colo_lost_heartbeat(Error **errp)
 {
     if (get_colo_mode() == COLO_MODE_UNKNOWN) {
diff --git a/migration/colo.c b/migration/colo.c
index f98d7fb..52ab82b 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -41,6 +41,40 @@ bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static bool colo_runstate_is_stopped(void)
+{
+    return runstate_check(RUN_STATE_COLO) || !runstate_is_running();
+}
+
+static void primary_vm_do_failover(void)
+{
+    MigrationState *s = migrate_get_current();
+    int old_state;
+
+    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
+                      MIGRATION_STATUS_COMPLETED);
+
+    old_state = failover_set_state(FAILOVER_STATUS_HANDLING,
+                                   FAILOVER_STATUS_COMPLETED);
+    if (old_state != FAILOVER_STATUS_HANDLING) {
+        error_report("Incorrect state (%d) while doing failover for Primary VM",
+                     old_state);
+        return;
+    }
+}
+
+void colo_do_failover(MigrationState *s)
+{
+    /* Make sure vm stopped while failover happened. */
+    if (!colo_runstate_is_stopped()) {
+        vm_stop_force_state(RUN_STATE_COLO);
+    }
+
+    if (get_colo_mode() == COLO_MODE_PRIMARY) {
+        primary_vm_do_failover();
+    }
+}
+
 static void colo_send_message(QEMUFile *f, COLOMessage msg,
                               Error **errp)
 {
@@ -162,9 +196,20 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
     bioc->usage = 0;
 
     qemu_mutex_lock_iothread();
+    if (failover_request_is_active()) {
+        qemu_mutex_unlock_iothread();
+        goto out;
+    }
     vm_stop_force_state(RUN_STATE_COLO);
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("run", "stop");
+    /*
+     * Failover request bh could be called after vm_stop_force_state(),
+     * So we need check failover_request_is_active() again.
+     */
+    if (failover_request_is_active()) {
+        goto out;
+    }
 
     /* Disable block migration */
     s->params.blk = 0;
@@ -259,6 +304,11 @@ static void colo_process_checkpoint(MigrationState *s)
     trace_colo_vm_state_change("stop", "run");
 
     while (s->state == MIGRATION_STATUS_COLO) {
+        if (failover_request_is_active()) {
+            error_report("failover request");
+            goto out;
+        }
+
         current_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
         if (current_time - checkpoint_time <
             s->parameters.x_checkpoint_delay) {
@@ -280,9 +330,9 @@ out:
     if (local_err) {
         error_report_err(local_err);
     }
-    migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
-                      MIGRATION_STATUS_COMPLETED);
+
     qemu_fclose(fb);
+
     if (s->rp_state.from_dst_file) {
         qemu_fclose(s->rp_state.from_dst_file);
     }
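
For readers following the state handling above, here is a minimal
standalone sketch of the handshake this patch relies on. The checkpoint
loop polls a request flag and leaves the loop, and the failover side
moves the state forward with a compare-and-swap. It is not QEMU code.
Every model_* name and MODEL_FAILOVER_* value is a hypothetical stand-in,
and it assumes failover_set_state() has compare-and-swap semantics that
return the previous state, as the check in primary_vm_do_failover()
suggests. The request is simulated inside the loop rather than arriving
from a BH.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the FAILOVER_STATUS_* values; not QEMU's enum. */
enum model_failover_state {
    MODEL_FAILOVER_NONE,
    MODEL_FAILOVER_REQUESTED,
    MODEL_FAILOVER_HANDLING,
    MODEL_FAILOVER_COMPLETED,
};

atomic_int model_state = MODEL_FAILOVER_NONE;

void model_failover_init_state(void)
{
    atomic_store(&model_state, MODEL_FAILOVER_NONE);
}

/*
 * Switch to new_state only if the current state equals old_state.
 * Returns the state observed before the attempt, so the caller can
 * compare it with old_state to learn whether the switch happened.
 */
int model_failover_set_state(int old_state, int new_state)
{
    int expected = old_state;

    atomic_compare_exchange_strong(&model_state, &expected, new_state);
    return expected;
}

bool model_failover_request_is_active(void)
{
    return atomic_load(&model_state) != MODEL_FAILOVER_NONE;
}

/*
 * Model of the checkpoint loop: a failover request is simulated at
 * round request_at_round (pass -1 for "never"). Returns the number of
 * checkpoints finished before the request was noticed.
 */
int model_checkpoint_loop(int max_rounds, int request_at_round)
{
    int done = 0;

    assert(max_rounds >= 0);
    for (int round = 0; round < max_rounds; round++) {
        if (round == request_at_round) {
            model_failover_set_state(MODEL_FAILOVER_NONE,
                                     MODEL_FAILOVER_REQUESTED);
        }
        if (model_failover_request_is_active()) {
            break;              /* leave the loop, like "goto out" */
        }
        done++;
    }
    return done;
}

/* Model of the failover side: REQUESTED -> HANDLING -> COMPLETED. */
bool model_primary_do_failover(void)
{
    if (model_failover_set_state(MODEL_FAILOVER_REQUESTED,
                                 MODEL_FAILOVER_HANDLING)
        != MODEL_FAILOVER_REQUESTED) {
        return false;
    }
    return model_failover_set_state(MODEL_FAILOVER_HANDLING,
                                    MODEL_FAILOVER_COMPLETED)
           == MODEL_FAILOVER_HANDLING;
}
```

Calling model_checkpoint_loop(5, 2) and then model_primary_do_failover()
walks the model through a request and its completion. A second call to
model_primary_do_failover() fails because the state is no longer
REQUESTED, which is the same kind of mismatch the error_report() in
primary_vm_do_failover() is there to catch.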