From patchwork Mon Aug 8 21:03:41 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Roth X-Patchwork-Id: 9269725 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4DD706075A for ; Mon, 8 Aug 2016 21:06:12 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3CEDE28111 for ; Mon, 8 Aug 2016 21:06:12 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 31C6428113; Mon, 8 Aug 2016 21:06:12 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 9C97128156 for ; Mon, 8 Aug 2016 21:06:10 +0000 (UTC) Received: from localhost ([::1]:59759 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bWrkM-0003cQ-TM for patchwork-qemu-devel@patchwork.kernel.org; Mon, 08 Aug 2016 17:06:06 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41337) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bWrji-0003b6-Lh for qemu-devel@nongnu.org; Mon, 08 Aug 2016 17:05:27 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bWrjf-0005lu-EU for qemu-devel@nongnu.org; Mon, 08 Aug 2016 17:05:26 -0400 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:56758) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bWrjf-0005lH-5M for qemu-devel@nongnu.org; Mon, 08 Aug 2016 17:05:23 -0400 Received: from pps.filterd (m0098393.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.11/8.16.0.11) with SMTP id u78KxksG120803 for ; Mon, 8 Aug 2016 17:05:22 -0400 Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.150]) by mx0a-001b2d01.pphosted.com with ESMTP id 24n8sjdyas-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Mon, 08 Aug 2016 17:05:22 -0400 Received: from localhost by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 8 Aug 2016 15:05:21 -0600 Received: from d03dlp01.boulder.ibm.com (9.17.202.177) by e32.co.us.ibm.com (192.168.1.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 8 Aug 2016 15:05:18 -0600 X-IBM-Helo: d03dlp01.boulder.ibm.com X-IBM-MailFrom: mdroth@linux.vnet.ibm.com Received: from b03cxnp08028.gho.boulder.ibm.com (b03cxnp08028.gho.boulder.ibm.com [9.17.130.20]) by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id AD7581FF004C; Mon, 8 Aug 2016 15:05:00 -0600 (MDT) Received: from b03ledav001.gho.boulder.ibm.com (b03ledav001.gho.boulder.ibm.com [9.17.130.232]) by b03cxnp08028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id u78L5I9B11338114; Mon, 8 Aug 2016 14:05:18 -0700 Received: from b03ledav001.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 155C87A001; Mon, 8 Aug 2016 15:05:18 -0600 (MDT) Received: from localhost (unknown [9.80.86.168]) by b03ledav001.gho.boulder.ibm.com (Postfix) with ESMTP id BEF576E045; Mon, 8 Aug 2016 15:05:17 -0600 (MDT) From: Michael Roth To: qemu-devel@nongnu.org Date: Mon, 8 Aug 2016 16:03:41 -0500 X-Mailer: git-send-email 1.9.1 In-Reply-To: <1470690267-31454-1-git-send-email-mdroth@linux.vnet.ibm.com> References: <1470690267-31454-1-git-send-email-mdroth@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 16080821-0004-0000-0000-00001017B0EF X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00005568; HX=3.00000240; KW=3.00000007; PH=3.00000004; SC=3.00000178; SDB=6.00741898; UDB=6.00349152; IPR=6.00514456; BA=6.00004651; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00012252; XFM=3.00000011; UTC=2016-08-08 21:05:20 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 16080821-0005-0000-0000-000077CE34D3 Message-Id: <1470690267-31454-11-git-send-email-mdroth@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2016-08-08_15:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1604210000 definitions=main-1608080226 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] X-Received-From: 148.163.156.1 Subject: [Qemu-devel] [PATCH 10/56] migration: regain control of images when migration fails to complete X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Amit Shah , qemu-stable@nongnu.org, Greg Kurz Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP From: Greg Kurz We currently have an error path during migration that can cause the source QEMU to abort: migration_thread() migration_completion() runstate_is_running() ----------------> true if guest is running bdrv_inactivate_all() ----------------> inactivate images qemu_savevm_state_complete_precopy() ... qemu_fflush() socket_writev_buffer() --------> error because destination fails qemu_fflush() -------------------> set error on migration stream migration_completion() -----------------> set migrate state to FAILED migration_thread() -----------------------> break migration loop vm_start() -----------------------------> restart guest with inactive images and you get: qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/18446744073709551615) qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed. Aborted (core dumped) If we try postcopy with a similar scenario, we also get the writev error message but QEMU leaves the guest paused because entered_postcopy is true. We could possibly do the same with precopy and leave the guest paused. But since the historical default for migration errors is to restart the source, this patch adds a call to bdrv_invalidate_cache_all() instead. Signed-off-by: Greg Kurz Message-Id: <146357896785.6003.11983081732454362715.stgit@bahia.huguette.org> Signed-off-by: Amit Shah (cherry picked from commit fe904ea8242cbae2d7e69c052c754b8f5f1ba1d6) Signed-off-by: Michael Roth --- migration/migration.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 991313a..0563b4c 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1597,19 +1597,32 @@ static void migration_completion(MigrationState *s, int current_active_state, rp_error = await_return_path_close_on_source(s); trace_migration_completion_postcopy_end_after_rp(rp_error); if (rp_error) { - goto fail; + goto fail_invalidate; } } if (qemu_file_get_error(s->to_dst_file)) { trace_migration_completion_file_err(); - goto fail; + goto fail_invalidate; } migrate_set_state(&s->state, current_active_state, MIGRATION_STATUS_COMPLETED); return; +fail_invalidate: + /* If not doing postcopy, vm_start() will be called: let's regain + * control on images. + */ + if (s->state == MIGRATION_STATUS_ACTIVE) { + Error *local_err = NULL; + + bdrv_invalidate_cache_all(&local_err); + if (local_err) { + error_report_err(local_err); + } + } + fail: migrate_set_state(&s->state, current_active_state, MIGRATION_STATUS_FAILED);