From patchwork Wed Jun 22 20:49:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18E9CC43334 for ; Wed, 22 Jun 2022 20:51:28 +0000 (UTC) Received: from localhost ([::1]:46602 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47Jv-00050U-5c for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:51:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49114) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47I8-0002HZ-6o for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:45434) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47I4-0004EU-PB for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930972; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=z1zRIDGvAyc8CqU7i7BK9gW1oBEe5YfrD+IIcJEkTr0=; b=GwGPam5GqAvIME7M8NxVa2hEyTCuUv5KU2rldvtv7mSRXnYcwB2pO9xwV7/azZ9waVH8t9 KJoGscdS+dfKIYQEG9WWd2OkV0EGOrVe9AQHhkfOKUkFVo6o6zSDYYZJH+gamAiur5bxPq NveavIK30DgSqH+fEw4FkXhd5L+GrKY= Received: from mail-io1-f69.google.com (mail-io1-f69.google.com [209.85.166.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-641-7tVHv8vaP7iVHz2LNGI3PA-1; Wed, 22 Jun 2022 16:49:30 -0400 X-MC-Unique: 7tVHv8vaP7iVHz2LNGI3PA-1 Received: by mail-io1-f69.google.com with SMTP id y22-20020a056602215600b00673b11a9cd5so57769ioy.7 for ; Wed, 22 Jun 2022 13:49:30 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=z1zRIDGvAyc8CqU7i7BK9gW1oBEe5YfrD+IIcJEkTr0=; b=TWvEcQLzqJkQLloWfKkGDxRLluNJJEKI68aivYFY/uqjhcTf0iEy3psAC8R8j3FkDs DxIynrE2zh+OgIxrQglQeRpH8nL4+qA6Cb7BzaUmhvo0LUusoGsdsPmBm9cJDa1sgfJi +9dAI62L7KhcLEooUW6SIrdzwls7FAEzU9QOP474Pv9GXGiB9iYfDqGqMHCqnLt6uxmQ gaVzHXAbMUt/vShfit59JFHteiSDQyULFx7s4XlA3laAqOlCBiLnQXS6kOsLly1/E4zK Kq9KlsBZ6aUDIzbIDn7dpt9LQudtgz64fxCzAgjyHH3rtCWZEC8DExaNob4W4sWorWCS +fcg== X-Gm-Message-State: AJIora8LDXi8iw3eic7cXa33+OeTur7iN/alDWPHzN1CbzZM+M23pbdK V2v7kqj+27Id/U9MmzEtyhO5GSodn2B4gXq/HpgJu//CnLQdC40YziMpoRQrWQjP8UXka+vP/TL vYeeBpY+ZK1Fy9BQG2bRBwCGQdPYQuzT/2r799pCbxWgiPpnAppZx4/aSm0VMOzQa X-Received: by 2002:a05:6e02:1523:b0:2d3:cb16:2d03 with SMTP id i3-20020a056e02152300b002d3cb162d03mr3167991ilu.198.1655930969989; Wed, 22 Jun 2022 13:49:29 -0700 (PDT) X-Google-Smtp-Source: AGRyM1uHn+eZgEHucwQDvqyQwnAwsJyrriecr0c4It5qhD7X2kgWKxvAv3jvrpAt8s2hp9JTiAxzuQ== X-Received: by 2002:a05:6e02:1523:b0:2d3:cb16:2d03 with SMTP id i3-20020a056e02152300b002d3cb162d03mr3167966ilu.198.1655930969624; Wed, 22 Jun 2022 13:49:29 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.26 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:28 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . 
Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 01/15] fixup! migration: remove the QEMUFileOps 'get_buffer' callback Date: Wed, 22 Jun 2022 16:49:06 -0400 Message-Id: <20220622204920.79061-2-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This fixes a bug with the cleanup patch. Should be squashed into the patch in subject. Cc: Daniel P. 
Berrange Signed-off-by: Peter Xu --- migration/qemu-file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 3a380a6072..1e80d496b7 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -375,7 +375,7 @@ static ssize_t qemu_fill_buffer(QEMUFile *f) qio_channel_wait(f->ioc, G_IO_IN); } } else if (len < 0) { - len = EIO; + len = -EIO; } } while (len == QIO_CHANNEL_ERR_BLOCK); From patchwork Wed Jun 22 20:49:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891482 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 77345C43334 for ; Wed, 22 Jun 2022 20:51:34 +0000 (UTC) Received: from localhost ([::1]:46678 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47K1-00054E-HD for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:51:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49128) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47I8-0002Hj-4T for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:36 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:49103) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47I6-0004FT-Hw for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:35 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930974; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YxCFGQpsPF0Md1s0YwkgD2Yx24BquXhMGvMjqKwffmE=; b=KJKJxLyLnvgJwJVvb08Ef4XzsaPVM9+XTjKrNeey32fxAmLdPZO4YGeoCW63WucJAamxKK Sqq6hF+qSnlzTo48AHu2FpkLkDLFB2GklRztIxtHlKiblDUO7tnQbrhMPYzQyzPV2+xP8I FxHfMRuE/4n+qSMW3mRtxQxWVJP1MgM= Received: from mail-io1-f71.google.com (mail-io1-f71.google.com [209.85.166.71]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-505-jehxjqV4P_mWHnfMSCY4hA-1; Wed, 22 Jun 2022 16:49:32 -0400 X-MC-Unique: jehxjqV4P_mWHnfMSCY4hA-1 Received: by mail-io1-f71.google.com with SMTP id m3-20020a6bbc03000000b0067277968473so809161iof.19 for ; Wed, 22 Jun 2022 13:49:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YxCFGQpsPF0Md1s0YwkgD2Yx24BquXhMGvMjqKwffmE=; b=hYvGgYo3l7Y+8dCLFn8HX+v20s/5SNfNChbEzvkRwQPUrEX+jEL6jcco3ytvU0AYdA vOx8TQAPlgmisL76bBW38E1GP5quk24dqfM99B9397evp5T8WgTinHbxeU+Yx0cTgtUe 1oOl/afqJbh6rRvWfxhZj+REOuwxzymqn8cu6edvq3yNGPQsoU0OdJGhgPWIlibGHlad VInWSY1OpChPp3FI5z1+lxYWPKKThk2wydtYyNSlZoUEMNiwvGF7QwMfttRyeOu8xvNZ 1hobgOFPwqq9uIFgRN+5Gq80h9U7SQe0OGUVVYucyE2MNuOZb8wcQvM02G+MtpgNSsf0 gnOg== X-Gm-Message-State: AJIora+S/FJKjDaXcPNshD/YmL6CY9ST9Yavr2Ye2Mb/2myfA48f7bL+ 68WyNAJOKNFPt5uzLeB6Nj1ZvAAV3VXj0M8kraypebpH8jVqGN9mvqKPsYzIotwAwRtRXwrqEpF u2vPHz8CGJ2/TBpuqNsAB5ZJzuU46eW/6B0F6BLjfmQGyg7Otn3pZ54+7Ubvp41Ls X-Received: by 2002:a92:a041:0:b0:2d7:7935:effa with SMTP id b1-20020a92a041000000b002d77935effamr3136362ilm.222.1655930971640; Wed, 22 Jun 2022 13:49:31 -0700 (PDT) X-Google-Smtp-Source: AGRyM1vXXB6/QfUk/8IHfO1u3MFzDBCv1SKZDMqr3HT4gqhQg1NgRucwOJP/IRQ2SNcu3aBKP4PsUg== X-Received: by 2002:a92:a041:0:b0:2d7:7935:effa with SMTP id b1-20020a92a041000000b002d77935effamr3136350ilm.222.1655930971386; 
Wed, 22 Jun 2022 13:49:31 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.29 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:30 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 02/15] migration: Add postcopy-preempt capability Date: Wed, 22 Jun 2022 16:49:07 -0400 Message-Id: <20220622204920.79061-3-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Postcopy already preempts precopy, because we call unqueue_page() before looking at the dirty bits. However, that is not enough: for example, when host huge pages are enabled and a precopy huge page is being sent, a postcopy request has to wait until the whole huge page has been sent. 
That can introduce considerable delay, and the bigger the huge page, the longer the delay. This patch adds a new capability that allows postcopy requests to preempt a precopy page while a huge page is being sent, so that postcopy requests can be serviced even faster. To send them faster still, bypass the precopy stream by providing a standalone postcopy socket for sending requested pages. Since the new behavior is not compatible with the old behavior, it is not the default; it is enabled only when the new capability is set on both src/dst QEMUs. This patch only adds the capability itself; the logic will be added in follow-up patches. Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Juan Quintela Signed-off-by: Peter Xu --- migration/migration.c | 18 ++++++++++++++++++ migration/migration.h | 1 + qapi/migration.json | 7 ++++++- 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/migration/migration.c b/migration/migration.c index 78f5057373..ce7bb68cdc 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -1297,6 +1297,13 @@ static bool migrate_caps_check(bool *cap_list, return false; } + if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) { + if (!cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) { + error_setg(errp, "Postcopy preempt requires postcopy-ram"); + return false; + } + } + return true; } @@ -2663,6 +2670,15 @@ bool migrate_background_snapshot(void) return s->enabled_capabilities[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]; } +bool migrate_postcopy_preempt(void) +{ + MigrationState *s; + + s = migrate_get_current(); + + return s->enabled_capabilities[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]; +} + /* migration thread support */ /* * Something bad happened to the RP stream, mark an error @@ -4274,6 +4290,8 @@ static Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-compress", MIGRATION_CAPABILITY_COMPRESS), DEFINE_PROP_MIG_CAP("x-events", MIGRATION_CAPABILITY_EVENTS), DEFINE_PROP_MIG_CAP("x-postcopy-ram", 
MIGRATION_CAPABILITY_POSTCOPY_RAM), + DEFINE_PROP_MIG_CAP("x-postcopy-preempt", + MIGRATION_CAPABILITY_POSTCOPY_PREEMPT), DEFINE_PROP_MIG_CAP("x-colo", MIGRATION_CAPABILITY_X_COLO), DEFINE_PROP_MIG_CAP("x-release-ram", MIGRATION_CAPABILITY_RELEASE_RAM), DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK), diff --git a/migration/migration.h b/migration/migration.h index 485d58b95f..d2269c826c 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -400,6 +400,7 @@ int migrate_decompress_threads(void); bool migrate_use_events(void); bool migrate_postcopy_blocktime(void); bool migrate_background_snapshot(void); +bool migrate_postcopy_preempt(void); /* Sending on the return path - generic and then for each message type */ void migrate_send_rp_shut(MigrationIncomingState *mis, diff --git a/qapi/migration.json b/qapi/migration.json index e552ee4f43..7586df3dea 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -467,6 +467,11 @@ # Requires that QEMU be permitted to use locked memory # for guest RAM pages. # (since 7.1) +# @postcopy-preempt: If enabled, the migration process will allow postcopy +# requests to preempt precopy stream, so postcopy requests +# will be handled faster. This is a performance feature and +# should not affect the correctness of postcopy migration. +# (since 7.1) # # Features: # @unstable: Members @x-colo and @x-ignore-shared are experimental. 
@@ -482,7 +487,7 @@ 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate', { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', - 'zero-copy-send'] } + 'zero-copy-send', 'postcopy-preempt'] } ## # @MigrationCapabilityStatus: From patchwork Wed Jun 22 20:49:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B421DC43334 for ; Wed, 22 Jun 2022 20:51:38 +0000 (UTC) Received: from localhost ([::1]:46898 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47K5-0005DD-N6 for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:51:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49150) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IA-0002Lw-SA for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:38 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:38331) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47I8-0004G5-GY for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930976; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=qDErhTbzBZRbdnE8iOTjAuoFVFIuE1tIg+P+2wkZzWc=; 
b=ZTsyTWqcdY2zYet3kOERL8kimHp67ZkDiV+1lTtAFz3hO9jXnR0/x7grXnUNc7TUk0v3lj aAPjeO426sd7S7wokyuuN6Uoo01wOV8Z7+7SpjmQZMrUHy14NknvQURgCVp9DOz2p95yIc k/JtjvIarAe6fQgGDvZ6/bZgLLRqaoI= Received: from mail-il1-f200.google.com (mail-il1-f200.google.com [209.85.166.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-151-teFfZ-iKOkGzcaQtQA2TCA-1; Wed, 22 Jun 2022 16:49:35 -0400 X-MC-Unique: teFfZ-iKOkGzcaQtQA2TCA-1 Received: by mail-il1-f200.google.com with SMTP id n12-20020a92260c000000b002d3c9fc68d6so11713160ile.19 for ; Wed, 22 Jun 2022 13:49:34 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=qDErhTbzBZRbdnE8iOTjAuoFVFIuE1tIg+P+2wkZzWc=; b=uRIP029H0RiwqgNw/9WG/UbF+3QxFCYFMZRec7DJDEgztR0zwWEf/7k8/jocUwxxiX gZKqZXbxqOH+BwS7B2KLEg8vbzmgse9sRmpwFaspSwYeBIVrwwpOFtEJ0s0k82WoD58N ghwoDrdHTghOcW1qQE1Ja3qTlNW7BI0nK+eGEVSDVoYJUqeZJZ0da4BgJigwbmJ8DDr+ cQg8lXdqqxxyr1Gn0rj6Khce3UBprhUVu4SMbO+CCeLLm9MgyZGYBQnnRmB7ifqsHrYu TKtEX6Ud7Heqd0alHSFfvCEXxhPGeWjRdMd03Z1YWeU5zG8M6MAZB8/xYEt/9IbTq0/h P//w== X-Gm-Message-State: AJIora+U8iJ9h6iCed5EaW/5bclPJDtanrx9marVFC78rgdUc2ndaYW6 H1EmV7Xf02VLZNCKzKDzlzp6y1uafIZ4p9Ty/TBT3eDPtw1oJIIpL6q7xdNybcvkK1vXnEpXxWc aebR+T9OLtJ06YSELE3nzWvDFkCZ51lt/5WKyOQSA9jn7DED41M492uEwQCoOvg4c X-Received: by 2002:a05:6e02:12e3:b0:2d1:583e:32bb with SMTP id l3-20020a056e0212e300b002d1583e32bbmr3132246iln.14.1655930973707; Wed, 22 Jun 2022 13:49:33 -0700 (PDT) X-Google-Smtp-Source: AGRyM1vtC8L5LxeQC85Latq0amt3ilthkfgCs0uzXDgEZwS162qk4EOcq1qQZECiwloznaaacIjnGA== X-Received: by 2002:a05:6e02:12e3:b0:2d1:583e:32bb with SMTP id l3-20020a056e0212e300b002d1583e32bbmr3132227iln.14.1655930973218; Wed, 22 Jun 2022 13:49:33 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. 
[99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.31 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:32 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 03/15] migration: Postcopy preemption preparation on channel creation Date: Wed, 22 Jun 2022 16:49:08 -0400 Message-Id: <20220622204920.79061-4-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -12 X-Spam_score: -1.3 X-Spam_bar: - X-Spam_report: (-1.3 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, PP_MIME_FAKE_ASCII_TEXT=0.999, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01, URG_BIZ=0.573 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Create a new socket for postcopy, so that postcopy-requested pages can be sent via this dedicated channel and do not get blocked by precopy pages. A new thread is also created on the destination QEMU to receive data from this new channel, based on the ram_load_postcopy() routine. The ram_load_postcopy(POSTCOPY) branch and the thread do not function yet; that will be done in follow-up patches. 
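As a rough illustration of the channel bookkeeping described above, the following standalone C sketch models how a destination might decide that all expected channels have arrived when preemption is enabled. It is a simplified model under stated assumptions, not QEMU code: `IncomingModel`, `channels_expected()`, `has_all_channels()` and `accept_channel()` are hypothetical names, multifd is ignored, and only the `RAM_CHANNEL_*` enum values are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Channel indexes as defined by the patch in postcopy-ram.h. */
enum PostcopyChannels {
    RAM_CHANNEL_PRECOPY = 0,
    RAM_CHANNEL_POSTCOPY = 1,
    RAM_CHANNEL_MAX,
};

/* Hypothetical stand-in for the relevant bits of MigrationIncomingState. */
struct IncomingModel {
    bool preempt_enabled;                 /* postcopy-preempt capability */
    const void *channel[RAM_CHANNEL_MAX]; /* NULL until the channel connects */
};

/* One channel normally; one extra for urgent pages when preempt is enabled. */
unsigned channels_expected(const struct IncomingModel *m)
{
    return m->preempt_enabled ? RAM_CHANNEL_MAX : 1;
}

/* True once every channel the configuration expects has connected. */
bool has_all_channels(const struct IncomingModel *m)
{
    for (unsigned i = 0; i < channels_expected(m); i++) {
        if (m->channel[i] == NULL) {
            return false;
        }
    }
    return true;
}

/*
 * Record a newly accepted connection. The first one becomes the main
 * (precopy) channel, the second the dedicated postcopy channel.
 * Returns true when the incoming side has every channel it expects.
 */
bool accept_channel(struct IncomingModel *m, const void *conn)
{
    assert(conn != NULL);
    if (m->channel[RAM_CHANNEL_PRECOPY] == NULL) {
        m->channel[RAM_CHANNEL_PRECOPY] = conn;
    } else {
        /* A second connection only makes sense with preemption enabled. */
        assert(m->preempt_enabled);
        assert(m->channel[RAM_CHANNEL_POSTCOPY] == NULL);
        m->channel[RAM_CHANNEL_POSTCOPY] = conn;
    }
    return has_all_channels(m);
}
```

In this model, with preemption enabled the first connection alone is not enough to start, and the second connection completes the set; without preemption the first connection suffices. This only loosely mirrors the intent of migration_has_all_channels() in the diff below.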
Clean up the new sockets on both src/dst QEMUs, and look after the new thread too, to make sure it is recycled properly. Reviewed-by: Daniel P. Berrangé Reviewed-by: Juan Quintela Signed-off-by: Peter Xu --- migration/migration.c | 62 +++++++++++++++++++++++---- migration/migration.h | 8 ++++ migration/postcopy-ram.c | 92 ++++++++++++++++++++++++++++++++++++++-- migration/postcopy-ram.h | 10 +++++ migration/ram.c | 25 ++++++++--- migration/ram.h | 4 +- migration/savevm.c | 20 ++++----- migration/socket.c | 22 +++++++++- migration/socket.h | 1 + migration/trace-events | 5 ++- 10 files changed, 218 insertions(+), 31 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index ce7bb68cdc..9484fec0b2 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -321,6 +321,12 @@ void migration_incoming_state_destroy(void) mis->page_requested = NULL; } + if (mis->postcopy_qemufile_dst) { + migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst); + qemu_fclose(mis->postcopy_qemufile_dst); + mis->postcopy_qemufile_dst = NULL; + } + yank_unregister_instance(MIGRATION_YANK_INSTANCE); } @@ -714,15 +720,21 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp) migration_incoming_process(); } +static bool migration_needs_multiple_sockets(void) +{ + return migrate_use_multifd() || migrate_postcopy_preempt(); +} + void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) { MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; bool start_migration; + QEMUFile *f; if (!mis->from_src_file) { /* The first connection (multifd may have multiple) */ - QEMUFile *f = qemu_file_new_input(ioc); + f = qemu_file_new_input(ioc); if (!migration_incoming_setup(f, errp)) { return; @@ -730,13 +742,18 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) /* * Common migration only needs one channel, so we can start - * right now. Multifd needs more than one channel, we wait. 
+ * right now. Some features need more than one channel, we wait. */ - start_migration = !migrate_use_multifd(); + start_migration = !migration_needs_multiple_sockets(); } else { /* Multiple connections */ - assert(migrate_use_multifd()); - start_migration = multifd_recv_new_channel(ioc, &local_err); + assert(migration_needs_multiple_sockets()); + if (migrate_use_multifd()) { + start_migration = multifd_recv_new_channel(ioc, &local_err); + } else if (migrate_postcopy_preempt()) { + f = qemu_file_new_input(ioc); + start_migration = postcopy_preempt_new_channel(mis, f); + } if (local_err) { error_propagate(errp, local_err); return; @@ -761,11 +778,20 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) bool migration_has_all_channels(void) { MigrationIncomingState *mis = migration_incoming_get_current(); - bool all_channels; - all_channels = multifd_recv_all_channels_created(); + if (!mis->from_src_file) { + return false; + } + + if (migrate_use_multifd()) { + return multifd_recv_all_channels_created(); + } + + if (migrate_postcopy_preempt()) { + return mis->postcopy_qemufile_dst != NULL; + } - return all_channels && mis->from_src_file != NULL; + return true; } /* @@ -1874,6 +1900,12 @@ static void migrate_fd_cleanup(MigrationState *s) qemu_fclose(tmp); } + if (s->postcopy_qemufile_src) { + migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src); + qemu_fclose(s->postcopy_qemufile_src); + s->postcopy_qemufile_src = NULL; + } + assert(!migration_is_active(s)); if (s->state == MIGRATION_STATUS_CANCELLING) { @@ -3269,6 +3301,11 @@ static void migration_completion(MigrationState *s) qemu_savevm_state_complete_postcopy(s->to_dst_file); qemu_mutex_unlock_iothread(); + /* Shutdown the postcopy fast path thread */ + if (migrate_postcopy_preempt()) { + postcopy_preempt_shutdown_file(s); + } + trace_migration_completion_postcopy_end_after_complete(); } else { goto fail; @@ -4157,6 +4194,15 @@ void migrate_fd_connect(MigrationState *s, Error 
*error_in) } } + /* This needs to be done before resuming a postcopy */ + if (postcopy_preempt_setup(s, &local_err)) { + error_report_err(local_err); + migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, + MIGRATION_STATUS_FAILED); + migrate_fd_cleanup(s); + return; + } + if (resume) { /* Wakeup the main migration thread to do the recovery */ migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED, diff --git a/migration/migration.h b/migration/migration.h index d2269c826c..941c61e543 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -23,6 +23,7 @@ #include "io/channel-buffer.h" #include "net/announce.h" #include "qom/object.h" +#include "postcopy-ram.h" struct PostcopyBlocktimeContext; @@ -112,6 +113,11 @@ struct MigrationIncomingState { * enabled. */ unsigned int postcopy_channels; + /* QEMUFile for postcopy only; it'll be handled by a separate thread */ + QEMUFile *postcopy_qemufile_dst; + /* Postcopy priority thread is used to receive postcopy requested pages */ + QemuThread postcopy_prio_thread; + bool postcopy_prio_thread_created; /* * An array of temp host huge pages to be used, one for each postcopy * channel. @@ -192,6 +198,8 @@ struct MigrationState { QEMUBH *cleanup_bh; /* Protected by qemu_file_lock */ QEMUFile *to_dst_file; + /* Postcopy specific transfer channel */ + QEMUFile *postcopy_qemufile_src; QIOChannelBuffer *bioc; /* * Protects to_dst_file/from_dst_file pointers. 
We need to make sure we diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index a66dd536d9..a3561410fe 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -33,6 +33,9 @@ #include "trace.h" #include "hw/boards.h" #include "exec/ramblock.h" +#include "socket.h" +#include "qemu-file.h" +#include "yank_functions.h" /* Arbitrary limit on size of each discard command, * keeps them around ~200 bytes @@ -567,6 +570,11 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis) { trace_postcopy_ram_incoming_cleanup_entry(); + if (mis->postcopy_prio_thread_created) { + qemu_thread_join(&mis->postcopy_prio_thread); + mis->postcopy_prio_thread_created = false; + } + if (mis->have_fault_thread) { Error *local_err = NULL; @@ -1102,8 +1110,13 @@ static int postcopy_temp_pages_setup(MigrationIncomingState *mis) int err, i, channels; void *temp_page; - /* TODO: will be boosted when enable postcopy preemption */ - mis->postcopy_channels = 1; + if (migrate_postcopy_preempt()) { + /* If preemption enabled, need extra channel for urgent requests */ + mis->postcopy_channels = RAM_CHANNEL_MAX; + } else { + /* Both precopy/postcopy on the same channel */ + mis->postcopy_channels = 1; + } channels = mis->postcopy_channels; mis->postcopy_tmp_pages = g_malloc0_n(sizeof(PostcopyTmpPage), channels); @@ -1170,7 +1183,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis) return -1; } - postcopy_thread_create(mis, &mis->fault_thread, "postcopy/fault", + postcopy_thread_create(mis, &mis->fault_thread, "fault-default", postcopy_ram_fault_thread, QEMU_THREAD_JOINABLE); mis->have_fault_thread = true; @@ -1185,6 +1198,16 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis) return -1; } + if (migrate_postcopy_preempt()) { + /* + * This thread needs to be created after the temp pages because + * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately. 
+ */ + postcopy_thread_create(mis, &mis->postcopy_prio_thread, "fault-fast", + postcopy_preempt_thread, QEMU_THREAD_JOINABLE); + mis->postcopy_prio_thread_created = true; + } + trace_postcopy_ram_enable_notify(); return 0; @@ -1514,3 +1537,66 @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd) } } } + +bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file) +{ + /* + * The new loading channel has its own threads, so it needs to be + * blocked too. It's by default true, just be explicit. + */ + qemu_file_set_blocking(file, true); + mis->postcopy_qemufile_dst = file; + trace_postcopy_preempt_new_channel(); + + /* Start the migration immediately */ + return true; +} + +int postcopy_preempt_setup(MigrationState *s, Error **errp) +{ + QIOChannel *ioc; + + if (!migrate_postcopy_preempt()) { + return 0; + } + + if (!migrate_multi_channels_is_allowed()) { + error_setg(errp, "Postcopy preempt is not supported as current " + "migration stream does not support multi-channels."); + return -1; + } + + ioc = socket_send_channel_create_sync(errp); + + if (ioc == NULL) { + return -1; + } + + migration_ioc_register_yank(ioc); + s->postcopy_qemufile_src = qemu_file_new_output(ioc); + + trace_postcopy_preempt_new_channel(); + + return 0; +} + +void *postcopy_preempt_thread(void *opaque) +{ + MigrationIncomingState *mis = opaque; + int ret; + + trace_postcopy_preempt_thread_entry(); + + rcu_register_thread(); + + qemu_sem_post(&mis->thread_sync_sem); + + /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */ + ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY); + + rcu_unregister_thread(); + + trace_postcopy_preempt_thread_exit(); + + return ret == 0 ? 
NULL : (void *)-1; +} diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index 07684c0e1d..34b1080cde 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -183,4 +183,14 @@ int postcopy_wake_shared(struct PostCopyFD *pcfd, uint64_t client_addr, int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb, uint64_t client_addr, uint64_t offset); +/* Hard-code channels for now for postcopy preemption */ +enum PostcopyChannels { + RAM_CHANNEL_PRECOPY = 0, + RAM_CHANNEL_POSTCOPY = 1, + RAM_CHANNEL_MAX, +}; + +bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file); +int postcopy_preempt_setup(MigrationState *s, Error **errp); + #endif diff --git a/migration/ram.c b/migration/ram.c index 01f9cc1d72..e4364c0bff 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3659,15 +3659,15 @@ int ram_postcopy_incoming_init(MigrationIncomingState *mis) * rcu_read_lock is taken prior to this being called. * * @f: QEMUFile where to send the data + * @channel: the channel to use for loading */ -int ram_load_postcopy(QEMUFile *f) +int ram_load_postcopy(QEMUFile *f, int channel) { int flags = 0, ret = 0; bool place_needed = false; bool matches_target_page_size = false; MigrationIncomingState *mis = migration_incoming_get_current(); - /* Currently we only use channel 0. 
TODO: use all the channels */ - PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[0]; + PostcopyTmpPage *tmp_page = &mis->postcopy_tmp_pages[channel]; while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) { ram_addr_t addr; @@ -3691,7 +3691,7 @@ int ram_load_postcopy(QEMUFile *f) flags = addr & ~TARGET_PAGE_MASK; addr &= TARGET_PAGE_MASK; - trace_ram_load_postcopy_loop((uint64_t)addr, flags); + trace_ram_load_postcopy_loop(channel, (uint64_t)addr, flags); if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE)) { block = ram_block_from_stream(mis, f, flags); @@ -3732,10 +3732,10 @@ int ram_load_postcopy(QEMUFile *f) } else if (tmp_page->host_addr != host_page_from_ram_block_offset(block, addr)) { /* not the 1st TP within the HP */ - error_report("Non-same host page detected. " + error_report("Non-same host page detected on channel %d: " "Target host page %p, received host page %p " "(rb %s offset 0x"RAM_ADDR_FMT" target_pages %d)", - tmp_page->host_addr, + channel, tmp_page->host_addr, host_page_from_ram_block_offset(block, addr), block->idstr, addr, tmp_page->target_pages); ret = -EINVAL; @@ -4122,7 +4122,12 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) */ WITH_RCU_READ_LOCK_GUARD() { if (postcopy_running) { - ret = ram_load_postcopy(f); + /* + * Note! Here RAM_CHANNEL_PRECOPY is the precopy channel of + * postcopy migration, we have another RAM_CHANNEL_POSTCOPY to + * service fast page faults. 
+ */ + ret = ram_load_postcopy(f, RAM_CHANNEL_PRECOPY); } else { ret = ram_load_precopy(f); } @@ -4284,6 +4289,12 @@ static int ram_resume_prepare(MigrationState *s, void *opaque) return 0; } +void postcopy_preempt_shutdown_file(MigrationState *s) +{ + qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS); + qemu_fflush(s->postcopy_qemufile_src); +} + static SaveVMHandlers savevm_ram_handlers = { .save_setup = ram_save_setup, .save_live_iterate = ram_save_iterate, diff --git a/migration/ram.h b/migration/ram.h index ded0a3a086..5d90945a6e 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -61,7 +61,7 @@ void ram_postcopy_send_discard_bitmap(MigrationState *ms); /* For incoming postcopy discard */ int ram_discard_range(const char *block_name, uint64_t start, size_t length); int ram_postcopy_incoming_init(MigrationIncomingState *mis); -int ram_load_postcopy(QEMUFile *f); +int ram_load_postcopy(QEMUFile *f, int channel); void ram_handle_compressed(void *host, uint8_t ch, uint64_t size); @@ -73,6 +73,8 @@ int64_t ramblock_recv_bitmap_send(QEMUFile *file, const char *block_name); int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb); bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start); +void postcopy_preempt_shutdown_file(MigrationState *s); +void *postcopy_preempt_thread(void *opaque); /* ram cache */ int colo_init_ram_cache(void); diff --git a/migration/savevm.c b/migration/savevm.c index e8a1b96fcd..e3af03cb9b 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2540,16 +2540,6 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis) { int i; - /* - * If network is interrupted, any temp page we received will be useless - * because we didn't mark them as "received" in receivedmap. After a - * proper recovery later (which will sync src dirty bitmap with receivedmap - * on dest) these cached small pages will be resent again. 
- */ - for (i = 0; i < mis->postcopy_channels; i++) { - postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]); - } - trace_postcopy_pause_incoming(); assert(migrate_postcopy_ram()); @@ -2578,6 +2568,16 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis) /* Notify the fault thread for the invalidated file handle */ postcopy_fault_thread_notify(mis); + /* + * If network is interrupted, any temp page we received will be useless + * because we didn't mark them as "received" in receivedmap. After a + * proper recovery later (which will sync src dirty bitmap with receivedmap + * on dest) these cached small pages will be resent again. + */ + for (i = 0; i < mis->postcopy_channels; i++) { + postcopy_temp_page_reset(&mis->postcopy_tmp_pages[i]); + } + error_report("Detected IO failure for postcopy. " "Migration paused."); diff --git a/migration/socket.c b/migration/socket.c index 4fd5e85f50..e6fdf3c5e1 100644 --- a/migration/socket.c +++ b/migration/socket.c @@ -26,7 +26,7 @@ #include "io/channel-socket.h" #include "io/net-listener.h" #include "trace.h" - +#include "postcopy-ram.h" struct SocketOutgoingArgs { SocketAddress *saddr; @@ -39,6 +39,24 @@ void socket_send_channel_create(QIOTaskFunc f, void *data) f, data, NULL, NULL); } +QIOChannel *socket_send_channel_create_sync(Error **errp) +{ + QIOChannelSocket *sioc = qio_channel_socket_new(); + + if (!outgoing_args.saddr) { + object_unref(OBJECT(sioc)); + error_setg(errp, "Initial sock address not set!"); + return NULL; + } + + if (qio_channel_socket_connect_sync(sioc, outgoing_args.saddr, errp) < 0) { + object_unref(OBJECT(sioc)); + return NULL; + } + + return QIO_CHANNEL(sioc); +} + int socket_send_channel_destroy(QIOChannel *send) { /* Remove channel */ @@ -166,6 +184,8 @@ socket_start_incoming_migration_internal(SocketAddress *saddr, if (migrate_use_multifd()) { num = migrate_multifd_channels(); + } else if (migrate_postcopy_preempt()) { + num = RAM_CHANNEL_MAX; } if 
(qio_net_listener_open_sync(listener, saddr, num, errp) < 0) { diff --git a/migration/socket.h b/migration/socket.h index 891dbccceb..dc54df4e6c 100644 --- a/migration/socket.h +++ b/migration/socket.h @@ -21,6 +21,7 @@ #include "io/task.h" void socket_send_channel_create(QIOTaskFunc f, void *data); +QIOChannel *socket_send_channel_create_sync(Error **errp); int socket_send_channel_destroy(QIOChannel *send); void socket_start_incoming_migration(const char *str, Error **errp); diff --git a/migration/trace-events b/migration/trace-events index 1aec580e92..4bc787cf0c 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -91,7 +91,7 @@ migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned migration_throttle(void) "" ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx" ram_load_loop(const char *rbname, uint64_t addr, int flags, void *host) "%s: addr: 0x%" PRIx64 " flags: 0x%x host: %p" -ram_load_postcopy_loop(uint64_t addr, int flags) "@%" PRIx64 " %x" +ram_load_postcopy_loop(int channel, uint64_t addr, int flags) "chan=%d addr=0x%" PRIx64 " flags=0x%x" ram_postcopy_send_discard_bitmap(void) "" ram_save_page(const char *rbname, uint64_t offset, void *host) "%s: offset: 0x%" PRIx64 " host: %p" ram_save_queue_pages(const char *rbname, size_t start, size_t len) "%s: start: 0x%zx len: 0x%zx" @@ -278,6 +278,9 @@ postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_off postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s" postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d" +postcopy_preempt_new_channel(void) "" +postcopy_preempt_thread_entry(void) "" +postcopy_preempt_thread_exit(void) "" get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u" From patchwork Wed Jun 22 20:49:09 2022 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4E58CC43334 for ; Wed, 22 Jun 2022 20:53:56 +0000 (UTC) Received: from localhost ([::1]:55280 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47MJ-0002Rm-D2 for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:53:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49168) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47ID-0002QB-1c for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:41 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:20744) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IA-0004Gb-HL for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930978; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=b4/qTm2SCK1BnbfPBTGBw+6SL12ZkJrdh6H0UeTw2hc=; b=G+AXS5wCocSOLLoj+B68bRjnapjW/vimhdoFvKvsfPiZ/K0vqrOvSFe6lM/8MuBhKZxnzZ WXqU+H4BmMZ00QuCF9mdgLeXeh8KhTxmknr4AQ5KbJFcV8dRdfScjk8aGUsxNJQ0zLgMuY TPX7TBvR5ycfQjJYBLA3u9hk1x4Q+Ow= Received: from mail-il1-f198.google.com (mail-il1-f198.google.com [209.85.166.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-397-aXyS0JBHNQ6OQj73WlPDFQ-1; Wed, 22 Jun 2022 16:49:37 -0400 X-MC-Unique: aXyS0JBHNQ6OQj73WlPDFQ-1 Received: by mail-il1-f198.google.com with SMTP id s6-20020a056e021a0600b002d8fcba296aso7973544ild.20 for ; Wed, 22 Jun 2022 13:49:37 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b4/qTm2SCK1BnbfPBTGBw+6SL12ZkJrdh6H0UeTw2hc=; b=WH8b7G4Ew4iY1gevVrRPgJxD5p2o/vjPfTYL+D4AS+HV1KnqBPD2Q2GCsA31lFjIa4 WrQFaZckdHFLGeT0pv1xmkXHoUNwH87hLyYmAFh8gHqb0dCtNFhoH39dzQlSh7QB//uw Hd+4UYZyLuKozewbNe0XyzFT+4niKc+hR3+RDhEmnDMT6ioeEyej4JqL9kx+qMyQnrYF V44QVdDgA1fXCQ9dmDEPurxh98n7beFOvYywWvzBIVoBgvtT6Couxdi8gFTmOn8mZ7by MW0396Rg/FS11T1uIOl4UqhE4c38U6SdH0dFAm+iSfWzXGVrOjs0biRDzbMg8mzKx1vK HHSw== X-Gm-Message-State: AJIora8u3uz7r7WFYMaJ+LVjpy07z3eF3OHVdFtB4QmYXsXBD9TaL2sL JF5v+szjHsJApviHE/evj2Nz56cnt0/7pMqNeGu4MJ0732WOYdseDVbHqmYZuYrmDkfR4JyelxJ IZv5gqMZcpoml/V/9rUX6S8boaGmZY5rfXBRMPetKFUJmxKoUDiDEtqkpEmd1h7QE X-Received: by 2002:a05:6602:13c3:b0:672:6e5b:f91d with SMTP id o3-20020a05660213c300b006726e5bf91dmr2281104iov.68.1655930975881; Wed, 22 Jun 2022 13:49:35 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sAbrbI4DIUxGM20WHOIegv/fKzQUzuK902PuSUUEgzLu6mLKyUv+rpBVwe3cO2DxCK5HCajw== X-Received: by 2002:a05:6602:13c3:b0:672:6e5b:f91d with SMTP id o3-20020a05660213c300b006726e5bf91dmr2281083iov.68.1655930975275; Wed, 22 Jun 2022 13:49:35 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.33 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:34 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . 
David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 04/15] migration: Postcopy preemption enablement Date: Wed, 22 Jun 2022 16:49:09 -0400 Message-Id: <20220622204920.79061-5-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This patch enables the postcopy-preempt feature. It contains two major changes to the migration logic: (1) Pages requested by postcopy are now sent via a different socket from the precopy background migration stream, so as to be isolated from very high page request delays. (2) For huge page enabled hosts: when there are postcopy requests, they can now interrupt the partial sending of a huge host page on src QEMU. After this patch, we'll live migrate a VM with two channels for postcopy: (1) PRECOPY channel, which is the default channel that transfers background pages; and (2) POSTCOPY channel, which only transfers requested pages. There's no strict rule on which channel to use, e.g., if a requested page is already being transferred on the precopy channel, then we will keep using the same precopy channel to transfer the page even if it's explicitly requested. 
In 99% of the cases we'll prioritize the channels so that we send requested pages via the postcopy channel whenever possible. On the source QEMU, when we find a postcopy request, we'll interrupt the PRECOPY channel sending process and quickly switch to the POSTCOPY channel. After we have serviced all the high priority postcopy pages, we'll switch back to the PRECOPY channel and continue sending the interrupted huge page. There's no new thread introduced on src QEMU. On the destination QEMU, one new thread is introduced to receive page data from the postcopy specific socket (done in the preparation patch). This patch has a side effect: previously, after sending postcopy pages, we assumed the guest would access the follow-up pages, so we kept sending from there. Now that's changed. Instead of going on from a postcopy requested page, we'll go back and continue sending the precopy huge page (which may have been interrupted by a postcopy request, so that the huge page was only partially sent before). Whether that's a problem is debatable, because "assuming the guest will continue to access the next page" may not really suit the case when huge pages are used, especially if the huge page is large (e.g. 1GB pages). So that locality hint is much less meaningful if huge pages are used. Reviewed-by: Dr. 
David Alan Gilbert Signed-off-by: Peter Xu --- migration/migration.c | 2 + migration/migration.h | 2 +- migration/ram.c | 251 +++++++++++++++++++++++++++++++++++++++-- migration/trace-events | 7 ++ 4 files changed, 253 insertions(+), 9 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 9484fec0b2..5e20d1c941 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3189,6 +3189,8 @@ static int postcopy_start(MigrationState *ms) MIGRATION_STATUS_FAILED); } + trace_postcopy_preempt_enabled(migrate_postcopy_preempt()); + return ret; fail_closefb: diff --git a/migration/migration.h b/migration/migration.h index 941c61e543..ff714c235f 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -68,7 +68,7 @@ typedef struct { struct MigrationIncomingState { QEMUFile *from_src_file; /* Previously received RAM's RAMBlock pointer */ - RAMBlock *last_recv_block; + RAMBlock *last_recv_block[RAM_CHANNEL_MAX]; /* A hook to allow cleanup at the end of incoming migration */ void *transport_data; void (*transport_cleanup)(void *data); diff --git a/migration/ram.c b/migration/ram.c index e4364c0bff..65b08c4edb 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -296,6 +296,20 @@ struct RAMSrcPageRequest { QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req; }; +typedef struct { + /* + * Cached ramblock/offset values if preempted. They're only meaningful if + * preempted==true below. + */ + RAMBlock *ram_block; + unsigned long ram_page; + /* + * Whether a postcopy preemption just happened. Will be reset after + * precopy recovered to background migration. 
+ */ + bool preempted; +} PostcopyPreemptState; + /* State of RAM for migration */ struct RAMState { /* QEMUFile used for this migration */ @@ -350,6 +364,14 @@ struct RAMState { /* Queue of outstanding page requests from the destination */ QemuMutex src_page_req_mutex; QSIMPLEQ_HEAD(, RAMSrcPageRequest) src_page_requests; + + /* Postcopy preemption informations */ + PostcopyPreemptState postcopy_preempt_state; + /* + * Current channel we're using on src VM. Only valid if postcopy-preempt + * is enabled. + */ + unsigned int postcopy_channel; }; typedef struct RAMState RAMState; @@ -357,6 +379,11 @@ static RAMState *ram_state; static NotifierWithReturnList precopy_notifier_list; +static void postcopy_preempt_reset(RAMState *rs) +{ + memset(&rs->postcopy_preempt_state, 0, sizeof(PostcopyPreemptState)); +} + /* Whether postcopy has queued requests? */ static bool postcopy_has_request(RAMState *rs) { @@ -1947,6 +1974,55 @@ void ram_write_tracking_stop(void) } #endif /* defined(__linux__) */ +/* + * Check whether two addr/offset of the ramblock falls onto the same host huge + * page. Returns true if so, false otherwise. + */ +static bool offset_on_same_huge_page(RAMBlock *rb, uint64_t addr1, + uint64_t addr2) +{ + size_t page_size = qemu_ram_pagesize(rb); + + addr1 = ROUND_DOWN(addr1, page_size); + addr2 = ROUND_DOWN(addr2, page_size); + + return addr1 == addr2; +} + +/* + * Whether a previous preempted precopy huge page contains current requested + * page? Returns true if so, false otherwise. + * + * This should really happen very rarely, because it means when we were sending + * during background migration for postcopy we're sending exactly the page that + * some vcpu got faulted on on dest node. When it happens, we probably don't + * need to do much but drop the request, because we know right after we restore + * the precopy stream it'll be serviced. It'll slightly affect the order of + * postcopy requests to be serviced (e.g. 
it'll be the same as we move current + * request to the end of the queue) but it shouldn't be a big deal. The most + * imporant thing is we can _never_ try to send a partial-sent huge page on the + * POSTCOPY channel again, otherwise that huge page will got "split brain" on + * two channels (PRECOPY, POSTCOPY). + */ +static bool postcopy_preempted_contains(RAMState *rs, RAMBlock *block, + ram_addr_t offset) +{ + PostcopyPreemptState *state = &rs->postcopy_preempt_state; + + /* No preemption at all? */ + if (!state->preempted) { + return false; + } + + /* Not even the same ramblock? */ + if (state->ram_block != block) { + return false; + } + + return offset_on_same_huge_page(block, offset, + state->ram_page << TARGET_PAGE_BITS); +} + /** * get_queued_page: unqueue a page from the postcopy requests * @@ -1962,9 +2038,17 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss) RAMBlock *block; ram_addr_t offset; +again: block = unqueue_page(rs, &offset); - if (!block) { + if (block) { + /* See comment above postcopy_preempted_contains() */ + if (postcopy_preempted_contains(rs, block, offset)) { + trace_postcopy_preempt_hit(block->idstr, offset); + /* This request is dropped */ + goto again; + } + } else { /* * Poll write faults too if background snapshot is enabled; that's * when we have vcpus got blocked by the write protected pages. @@ -2180,6 +2264,117 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) return ram_save_page(rs, pss); } +static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss) +{ + /* Not enabled eager preempt? Then never do that. */ + if (!migrate_postcopy_preempt()) { + return false; + } + + /* If the ramblock we're sending is a small page? Never bother. */ + if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) { + return false; + } + + /* Not in postcopy at all? 
*/ + if (!migration_in_postcopy()) { + return false; + } + + /* + * If we're already handling a postcopy request, don't preempt as this page + * has got the same high priority. + */ + if (pss->postcopy_requested) { + return false; + } + + /* If there's postcopy requests, then check it up! */ + return postcopy_has_request(rs); +} + +/* Returns true if we preempted precopy, false otherwise */ +static void postcopy_do_preempt(RAMState *rs, PageSearchStatus *pss) +{ + PostcopyPreemptState *p_state = &rs->postcopy_preempt_state; + + trace_postcopy_preempt_triggered(pss->block->idstr, pss->page); + + /* + * Time to preempt precopy. Cache current PSS into preempt state, so that + * after handling the postcopy pages we can recover to it. We need to do + * so because the dest VM will have partial of the precopy huge page kept + * over in its tmp huge page caches; better move on with it when we can. + */ + p_state->ram_block = pss->block; + p_state->ram_page = pss->page; + p_state->preempted = true; +} + +/* Whether we're preempted by a postcopy request during sending a huge page */ +static bool postcopy_preempt_triggered(RAMState *rs) +{ + return rs->postcopy_preempt_state.preempted; +} + +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss) +{ + PostcopyPreemptState *state = &rs->postcopy_preempt_state; + + assert(state->preempted); + + pss->block = state->ram_block; + pss->page = state->ram_page; + /* This is not a postcopy request but restoring previous precopy */ + pss->postcopy_requested = false; + + trace_postcopy_preempt_restored(pss->block->idstr, pss->page); + + /* Reset preempt state, most importantly, set preempted==false */ + postcopy_preempt_reset(rs); +} + +static void postcopy_preempt_choose_channel(RAMState *rs, PageSearchStatus *pss) +{ + MigrationState *s = migrate_get_current(); + unsigned int channel; + QEMUFile *next; + + channel = pss->postcopy_requested ? 
+ RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY; + + if (channel != rs->postcopy_channel) { + if (channel == RAM_CHANNEL_PRECOPY) { + next = s->to_dst_file; + } else { + next = s->postcopy_qemufile_src; + } + /* Update and cache the current channel */ + rs->f = next; + rs->postcopy_channel = channel; + + /* + * If channel switched, reset last_sent_block since the old sent block + * may not be on the same channel. + */ + rs->last_sent_block = NULL; + + trace_postcopy_preempt_switch_channel(channel); + } + + trace_postcopy_preempt_send_host_page(pss->block->idstr, pss->page); +} + +/* We need to make sure rs->f always points to the default channel elsewhere */ +static void postcopy_preempt_reset_channel(RAMState *rs) +{ + if (migrate_postcopy_preempt() && migration_in_postcopy()) { + rs->postcopy_channel = RAM_CHANNEL_PRECOPY; + rs->f = migrate_get_current()->to_dst_file; + trace_postcopy_preempt_reset_channel(); + } +} + /** * ram_save_host_page: save a whole host page * @@ -2211,7 +2406,16 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss) return 0; } + if (migrate_postcopy_preempt() && migration_in_postcopy()) { + postcopy_preempt_choose_channel(rs, pss); + } + do { + if (postcopy_needs_preempt(rs, pss)) { + postcopy_do_preempt(rs, pss); + break; + } + /* Check the pages is dirty and if it is send it */ if (migration_bitmap_clear_dirty(rs, pss->block, pss->page)) { tmppages = ram_save_target_page(rs, pss); @@ -2235,6 +2439,19 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss) /* The offset we leave with is the min boundary of host page and block */ pss->page = MIN(pss->page, hostpage_boundary); + /* + * When with postcopy preempt mode, flush the data as soon as possible for + * postcopy requests, because we've already sent a whole huge page, so the + * dst node should already have enough resource to atomically filling in + * the current missing page. 
+ * + * More importantly, when using separate postcopy channel, we must do + * explicit flush or it won't flush until the buffer is full. + */ + if (migrate_postcopy_preempt() && pss->postcopy_requested) { + qemu_fflush(rs->f); + } + res = ram_save_release_protection(rs, pss, start_page); return (res < 0 ? res : pages); } @@ -2276,8 +2493,17 @@ static int ram_find_and_save_block(RAMState *rs) found = get_queued_page(rs, &pss); if (!found) { - /* priority queue empty, so just search for something dirty */ - found = find_dirty_block(rs, &pss, &again); + /* + * Recover previous precopy ramblock/offset if postcopy has + * preempted precopy. Otherwise find the next dirty bit. + */ + if (postcopy_preempt_triggered(rs)) { + postcopy_preempt_restore(rs, &pss); + found = true; + } else { + /* priority queue empty, so just search for something dirty */ + found = find_dirty_block(rs, &pss, &again); + } } if (found) { @@ -2405,6 +2631,8 @@ static void ram_state_reset(RAMState *rs) rs->last_page = 0; rs->last_version = ram_list.version; rs->xbzrle_enabled = false; + postcopy_preempt_reset(rs); + rs->postcopy_channel = RAM_CHANNEL_PRECOPY; } #define MAX_WAIT 50 /* ms, half buffered_file limit */ @@ -3048,6 +3276,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque) } qemu_mutex_unlock(&rs->bitmap_mutex); + postcopy_preempt_reset_channel(rs); + /* * Must occur before EOS (or any QEMUFile operation) * because of RDMA protocol. 
@@ -3125,6 +3355,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque) return ret; } + postcopy_preempt_reset_channel(rs); + ret = multifd_send_sync_main(rs->f); if (ret < 0) { return ret; @@ -3209,11 +3441,13 @@ static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host) * @mis: the migration incoming state pointer * @f: QEMUFile where to read the data from * @flags: Page flags (mostly to see if it's a continuation of previous block) + * @channel: the channel we're using */ static inline RAMBlock *ram_block_from_stream(MigrationIncomingState *mis, - QEMUFile *f, int flags) + QEMUFile *f, int flags, + int channel) { - RAMBlock *block = mis->last_recv_block; + RAMBlock *block = mis->last_recv_block[channel]; char id[256]; uint8_t len; @@ -3240,7 +3474,7 @@ static inline RAMBlock *ram_block_from_stream(MigrationIncomingState *mis, return NULL; } - mis->last_recv_block = block; + mis->last_recv_block[channel] = block; return block; } @@ -3694,7 +3928,7 @@ int ram_load_postcopy(QEMUFile *f, int channel) trace_ram_load_postcopy_loop(channel, (uint64_t)addr, flags); if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE)) { - block = ram_block_from_stream(mis, f, flags); + block = ram_block_from_stream(mis, f, flags, channel); if (!block) { ret = -EINVAL; break; @@ -3945,7 +4179,8 @@ static int ram_load_precopy(QEMUFile *f) if (flags & (RAM_SAVE_FLAG_ZERO | RAM_SAVE_FLAG_PAGE | RAM_SAVE_FLAG_COMPRESS_PAGE | RAM_SAVE_FLAG_XBZRLE)) { - RAMBlock *block = ram_block_from_stream(mis, f, flags); + RAMBlock *block = ram_block_from_stream(mis, f, flags, + RAM_CHANNEL_PRECOPY); host = host_from_ram_block_offset(block, addr); /* diff --git a/migration/trace-events b/migration/trace-events index 4bc787cf0c..69f311169a 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -111,6 +111,12 @@ ram_load_complete(int ret, uint64_t seq_iter) "exit_code %d seq iteration %" PRI ram_write_tracking_ramblock_start(const char *block_id, size_t 
page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu" ram_write_tracking_ramblock_stop(const char *block_id, size_t page_size, void *addr, size_t length) "%s: page_size: %zu addr: %p length: %zu" unqueue_page(char *block, uint64_t offset, bool dirty) "ramblock '%s' offset 0x%"PRIx64" dirty %d" +postcopy_preempt_triggered(char *str, unsigned long page) "during sending ramblock %s offset 0x%lx" +postcopy_preempt_restored(char *str, unsigned long page) "ramblock %s offset 0x%lx" +postcopy_preempt_hit(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64 +postcopy_preempt_send_host_page(char *str, uint64_t offset) "ramblock %s offset 0x%"PRIx64 +postcopy_preempt_switch_channel(int channel) "%d" +postcopy_preempt_reset_channel(void) "" # multifd.c multifd_new_send_channel_async(uint8_t id) "channel %u" @@ -176,6 +182,7 @@ migration_thread_low_pending(uint64_t pending) "%" PRIu64 migrate_transferred(uint64_t tranferred, uint64_t time_spent, uint64_t bandwidth, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " max_size %" PRId64 process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d" process_incoming_migration_co_postcopy_end_main(void) "" +postcopy_preempt_enabled(bool value) "%d" # channel.c migration_set_incoming_channel(void *ioc, const char *ioctype) "ioc=%p ioctype=%s" From patchwork Wed Jun 22 20:49:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1F1DFC43334 for ; Wed, 22 Jun 2022 20:53:52 +0000 (UTC) Received: from localhost ([::1]:54972 
helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47MF-0002FC-3K for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:53:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49198) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IF-0002Tt-4t for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:43 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:52674) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IC-0004Gp-VC for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:42 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930980; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A6BkIOPUMBsSgvV5skh0LFNxgktFfwQlggDtOwB0A54=; b=fFgCbgGnE8E9JSRPKWtGLGWZYML0vSD8oDbtkbor/nawi8pP+TlKeXs8Vsw3zjM6Tr1oGD ptp6WPfSnw0J3TesRUl2WNZDB9b9QcVLOa3uMyJvENmZEU3EvRqdxDXQ1oDFhoSbuBfzGh qG1uLU/c9FCm4jS3a7kU2vGJtdyZ07s= Received: from mail-il1-f198.google.com (mail-il1-f198.google.com [209.85.166.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-172-tHMxF0mfPuuKP0qVw8STcw-1; Wed, 22 Jun 2022 16:49:39 -0400 X-MC-Unique: tHMxF0mfPuuKP0qVw8STcw-1 Received: by mail-il1-f198.google.com with SMTP id g8-20020a92cda8000000b002d15f63967eso11690205ild.21 for ; Wed, 22 Jun 2022 13:49:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=A6BkIOPUMBsSgvV5skh0LFNxgktFfwQlggDtOwB0A54=; 
b=SQRqn0Qc45j4d1DBdu1qcEntVQqLacjGiQVIXtvwYkrXTxc9bspiAU1N05OndC8ekB vyhsqT7ULfpPxPVLX2ODDvE0RGf0tl/u9bx87G3aspeKq3RWt7fr5N0QPIgKDquc2ZpR hOIhBGuyjGhBWDfpdCEathaRyj8/9bkhHDtXqA/waVM0cMpWjPebg2Odbrd1orpFtunf h28LaNwycymfaVbhdS71Vh3uLgMhY+ixeCDNdcIiTfPRITLlorTgEjibxxXKuqr1ey0j IgCd3A3BJC5wBRs6+6ozJMpvmxRwkIg0Q6BkYmwFZ7UDXLGvYz0ja/YblfSSKOei19MI sr+w== X-Gm-Message-State: AJIora/5CAk24gZNrfTlvlfUJKgF90oUovMYVSqQhQ8tnYQEZn2oAoRj iUzwUfo1i/gNr3TslQB93yg4Yl0dm9hLZQd6Rmbd/j7WC9wWE5AkWAR5F4VsLRMCOQhJ0iwAcY0 w2dHfoOQyb6eY7e/0o8zA8qMWMFGO/nYGUU4+MTc8MFi2D5Q7BD8/B8deI7yifyCU X-Received: by 2002:a05:6602:1687:b0:66a:44c6:63f6 with SMTP id s7-20020a056602168700b0066a44c663f6mr2850889iow.83.1655930977853; Wed, 22 Jun 2022 13:49:37 -0700 (PDT) X-Google-Smtp-Source: AGRyM1tM/1rqkVSNF1t+nw5eo8OxOhlP/mSqIID8z+AMZsn/O55wGA623pCuuWHuXAgXp/w7ugTieg== X-Received: by 2002:a05:6602:1687:b0:66a:44c6:63f6 with SMTP id s7-20020a056602168700b0066a44c663f6mr2850870iow.83.1655930977460; Wed, 22 Jun 2022 13:49:37 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.35 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:36 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . 
David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 05/15] migration: Postcopy recover with preempt enabled Date: Wed, 22 Jun 2022 16:49:10 -0400 Message-Id: <20220622204920.79061-6-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.133.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" To allow postcopy recovery, the ram fast load (preempt-only) thread on the dest QEMU needs similar fault-tolerance handling. When ram_load_postcopy() fails, instead of stopping, the thread halts on a semaphore, ready to be kicked again when recovery is detected. A mutex is introduced to make sure there's no concurrent operation upon the socket. To keep it simple, the fast ram load thread will take the mutex during its whole procedure, and only release it when it's paused. With that mutex held, the main loading thread can then safely release the fast-path socket when there are network failures during postcopy. Reviewed-by: Dr. 
David Alan Gilbert Signed-off-by: Peter Xu --- migration/migration.c | 27 +++++++++++++++++++++++---- migration/migration.h | 19 +++++++++++++++++++ migration/postcopy-ram.c | 25 +++++++++++++++++++++++-- migration/qemu-file.c | 27 +++++++++++++++++++++++++++ migration/qemu-file.h | 1 + migration/savevm.c | 26 ++++++++++++++++++++++++-- migration/trace-events | 2 ++ 7 files changed, 119 insertions(+), 8 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 5e20d1c941..db82ecbdcd 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -215,9 +215,11 @@ void migration_object_init(void) current_incoming->postcopy_remote_fds = g_array_new(FALSE, TRUE, sizeof(struct PostCopyFD)); qemu_mutex_init(&current_incoming->rp_mutex); + qemu_mutex_init(&current_incoming->postcopy_prio_thread_mutex); qemu_event_init(&current_incoming->main_thread_load_event, false); qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0); qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0); + qemu_sem_init(&current_incoming->postcopy_pause_sem_fast_load, 0); qemu_mutex_init(&current_incoming->page_request_mutex); current_incoming->page_requested = g_tree_new(page_request_addr_cmp); @@ -697,9 +699,9 @@ static bool postcopy_try_recover(void) /* * Here, we only wake up the main loading thread (while the - * fault thread will still be waiting), so that we can receive + * rest threads will still be waiting), so that we can receive * commands from source now, and answer it if needed. The - * fault thread will be woken up afterwards until we are sure + * rest threads will be woken up afterwards until we are sure * that source is ready to reply to page requests. */ qemu_sem_post(&mis->postcopy_pause_sem_dst); @@ -3502,6 +3504,18 @@ static MigThrError postcopy_pause(MigrationState *s) qemu_file_shutdown(file); qemu_fclose(file); + /* + * Do the same to postcopy fast path socket too if there is. No + * locking needed because no racer as long as we do this before setting + * status to paused. 
+ */ + if (s->postcopy_qemufile_src) { + migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src); + qemu_file_shutdown(s->postcopy_qemufile_src); + qemu_fclose(s->postcopy_qemufile_src); + s->postcopy_qemufile_src = NULL; + } + migrate_set_state(&s->state, s->state, MIGRATION_STATUS_POSTCOPY_PAUSED); @@ -3557,8 +3571,13 @@ static MigThrError migration_detect_error(MigrationState *s) return MIG_THR_ERR_FATAL; } - /* Try to detect any file errors */ - ret = qemu_file_get_error_obj(s->to_dst_file, &local_error); + /* + * Try to detect any file errors. Note that postcopy_qemufile_src will + * be NULL when postcopy preempt is not enabled. + */ + ret = qemu_file_get_error_obj_any(s->to_dst_file, + s->postcopy_qemufile_src, + &local_error); if (!ret) { /* Everything is fine */ assert(!local_error); diff --git a/migration/migration.h b/migration/migration.h index ff714c235f..9220cec6bd 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -118,6 +118,18 @@ struct MigrationIncomingState { /* Postcopy priority thread is used to receive postcopy requested pages */ QemuThread postcopy_prio_thread; bool postcopy_prio_thread_created; + /* + * Used to sync between the ram load main thread and the fast ram load + * thread. It protects postcopy_qemufile_dst, which is the postcopy + * fast channel. + * + * The ram fast load thread will take it mostly for the whole lifecycle + * because it needs to continuously read data from the channel, and + * it'll only release this mutex if postcopy is interrupted, so that + * the ram load main thread will take this mutex over and properly + * release the broken channel. + */ + QemuMutex postcopy_prio_thread_mutex; /* * An array of temp host huge pages to be used, one for each postcopy * channel. 
@@ -147,6 +159,13 @@ struct MigrationIncomingState { /* notify PAUSED postcopy incoming migrations to try to continue */ QemuSemaphore postcopy_pause_sem_dst; QemuSemaphore postcopy_pause_sem_fault; + /* + * This semaphore is used to allow the ram fast load thread (only when + * postcopy preempt is enabled) fall into sleep when there's network + * interruption detected. When the recovery is done, the main load + * thread will kick the fast ram load thread using this semaphore. + */ + QemuSemaphore postcopy_pause_sem_fast_load; /* List of listening socket addresses */ SocketAddressList *socket_address_list; diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index a3561410fe..84f7b1526e 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1580,6 +1580,15 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp) return 0; } +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis) +{ + trace_postcopy_pause_fast_load(); + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); + qemu_sem_wait(&mis->postcopy_pause_sem_fast_load); + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + trace_postcopy_pause_fast_load_continued(); +} + void *postcopy_preempt_thread(void *opaque) { MigrationIncomingState *mis = opaque; @@ -1592,11 +1601,23 @@ void *postcopy_preempt_thread(void *opaque) qemu_sem_post(&mis->thread_sync_sem); /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */ - ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY); + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + while (1) { + ret = ram_load_postcopy(mis->postcopy_qemufile_dst, + RAM_CHANNEL_POSTCOPY); + /* If error happened, go into recovery routine */ + if (ret) { + postcopy_pause_ram_fast_load(mis); + } else { + /* We're done */ + break; + } + } + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); rcu_unregister_thread(); trace_postcopy_preempt_thread_exit(); - return ret == 0 ? 
NULL : (void *)-1; + return NULL; } diff --git a/migration/qemu-file.c b/migration/qemu-file.c index 1e80d496b7..2f266b25cd 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -160,6 +160,33 @@ int qemu_file_get_error_obj(QEMUFile *f, Error **errp) return f->last_error; } +/* + * Get last error for either stream f1 or f2 with optional Error*. + * The error returned (non-zero) can be either from f1 or f2. + * + * If any of the qemufile* is NULL, then skip the check on that file. + * + * When there is no error on both qemufile, zero is returned. + */ +int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp) +{ + int ret = 0; + + if (f1) { + ret = qemu_file_get_error_obj(f1, errp); + /* If there's already error detected, return */ + if (ret) { + return ret; + } + } + + if (f2) { + ret = qemu_file_get_error_obj(f2, errp); + } + + return ret; +} + /* * Set the last error for stream f with optional Error* */ diff --git a/migration/qemu-file.h b/migration/qemu-file.h index 96e72d8bd8..fa13d04d78 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -141,6 +141,7 @@ void qemu_file_acct_rate_limit(QEMUFile *f, int64_t len); void qemu_file_set_rate_limit(QEMUFile *f, int64_t new_rate); int64_t qemu_file_get_rate_limit(QEMUFile *f); int qemu_file_get_error_obj(QEMUFile *f, Error **errp); +int qemu_file_get_error_obj_any(QEMUFile *f1, QEMUFile *f2, Error **errp); void qemu_file_set_error_obj(QEMUFile *f, int ret, Error *err); void qemu_file_set_error(QEMUFile *f, int ret); int qemu_file_shutdown(QEMUFile *f); diff --git a/migration/savevm.c b/migration/savevm.c index e3af03cb9b..48e85c052c 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2117,6 +2117,13 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) */ qemu_sem_post(&mis->postcopy_pause_sem_fault); + if (migrate_postcopy_preempt()) { + /* The channel should already be setup again; make sure of it */ + assert(mis->postcopy_qemufile_dst); + /* Kick 
the fast ram load thread too */ + qemu_sem_post(&mis->postcopy_pause_sem_fast_load); + } + return 0; } @@ -2562,6 +2569,21 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis) mis->to_src_file = NULL; qemu_mutex_unlock(&mis->rp_mutex); + /* + * NOTE: this must happen before reset the PostcopyTmpPages below, + * otherwise it's racy to reset those fields when the fast load thread + * can be accessing it in parallel. + */ + if (mis->postcopy_qemufile_dst) { + qemu_file_shutdown(mis->postcopy_qemufile_dst); + /* Take the mutex to make sure the fast ram load thread halted */ + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst); + qemu_fclose(mis->postcopy_qemufile_dst); + mis->postcopy_qemufile_dst = NULL; + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); + } + migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE, MIGRATION_STATUS_POSTCOPY_PAUSED); @@ -2599,8 +2621,8 @@ retry: while (true) { section_type = qemu_get_byte(f); - if (qemu_file_get_error(f)) { - ret = qemu_file_get_error(f); + ret = qemu_file_get_error_obj_any(f, mis->postcopy_qemufile_dst, NULL); + if (ret) { break; } diff --git a/migration/trace-events b/migration/trace-events index 69f311169a..0e385c3a07 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -270,6 +270,8 @@ mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, i mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d" postcopy_pause_fault_thread(void) "" postcopy_pause_fault_thread_continued(void) "" +postcopy_pause_fast_load(void) "" +postcopy_pause_fast_load_continued(void) "" postcopy_ram_fault_thread_entry(void) "" postcopy_ram_fault_thread_exit(void) "" postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitfd: %d" From patchwork Wed Jun 22 20:49:11 2022 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 08204C433EF for ; Wed, 22 Jun 2022 20:56:19 +0000 (UTC) Received: from localhost ([::1]:35640 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47Oc-000883-59 for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:56:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49208) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IG-0002ZG-QW for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:44 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:30728) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IE-0004H5-Td for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930982; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xvhsLJiKfouFIpkmKsyLkMu1rJlWCcBPJQZZsJh3KVI=; b=CGbCRTt4n1whPZtZvr+lpBwBnQll08uuL3uAMOreT6coR81UVDSeChNDTtZsDdwfchatNJ qvJ/ETN+9Q+iS4133nPKQ5mgo1cYl3dCeZNnkumI8qH8QfhUtTVcypRr4o+I7dpFkPcSvN EuKjjizsTArzgKYO5jB4iT1BH9SPxMY= Received: from mail-il1-f197.google.com (mail-il1-f197.google.com [209.85.166.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
us-mta-213-wT0hhUzrPb25ES2hSXAutg-1; Wed, 22 Jun 2022 16:49:41 -0400 X-MC-Unique: wT0hhUzrPb25ES2hSXAutg-1 Received: by mail-il1-f197.google.com with SMTP id w15-20020a056e021a6f00b002d8eef284f0so9158005ilv.6 for ; Wed, 22 Jun 2022 13:49:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=xvhsLJiKfouFIpkmKsyLkMu1rJlWCcBPJQZZsJh3KVI=; b=m9bPwASCX4t+lbnedtJWKkLqLnFPpXwN7QoJSGxbnkpPd71/IJ2ljyOJRo8SjM+Nlz 5ajsLXFXCFOLIdLt3WQHWsDw+GL2EOD643NEstdsr46x7thbas6/eiy9URi2UskRvxyl 6Q5fsPhaAwYgXe0syBsHG9Ikd3f3z9sew86RBYJ8qJy8qzTBikDnAwkxAYMMNcg0mStH /RgiSGsN664k17+BZWoLuYbDyU8NPAspXCUsFc70Q8zCXKgr2YJ3I1f3YUtP67Hk0FZY KSDsp8p6hOb1u68F5uKOoZqVdNUSAdT7449ZESDQDL6k8WytWKoOKUir+XGDnzz4H2v6 GGGg== X-Gm-Message-State: AJIora9OdaYyeWNpM0jTILfdef1YgB+QQdN69KgJgb258fIClK1wBm3X d1BOt0kNguyXLNU054clvVzPImjgBtl+tnpVQssXkh/Jsqau6r65ZAIrNdbOkEJlKesi65+L3j6 IoJ/iv/elk9DXnH0dW4vCDnUcUJAOFk2FEvih1yeJx1sUXE2W8pUJRg9KaHARvUDt X-Received: by 2002:a05:6602:2d44:b0:669:ef11:523a with SMTP id d4-20020a0566022d4400b00669ef11523amr2792316iow.44.1655930980109; Wed, 22 Jun 2022 13:49:40 -0700 (PDT) X-Google-Smtp-Source: AGRyM1v+XEAJfyzPxgXTvBKxPXJpSF/C9TWfdlBtrRfOwx2zQyv5JXwgxBltGIpMb5LGztOdykmNaA== X-Received: by 2002:a05:6602:2d44:b0:669:ef11:523a with SMTP id d4-20020a0566022d4400b00669ef11523amr2792305iow.44.1655930979800; Wed, 22 Jun 2022 13:49:39 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.37 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:38 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . 
David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 06/15] migration: Create the postcopy preempt channel asynchronously Date: Wed, 22 Jun 2022 16:49:11 -0400 Message-Id: <20220622204920.79061-7-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" This patch allows the postcopy preempt channel to be created asynchronously. The benefit is that when the connection is slow, we won't hold the BQL (and potentially block things like QMP) for a long time without releasing it. A function postcopy_preempt_wait_channel() is introduced, allowing the migration thread to wait on the channel creation. The channel is always created by the main thread, in which we'll kick a new semaphore to tell the migration thread that the channel has been created. We'll need to wait for the new channel in two places: (1) when there's a new postcopy migration that is starting, or (2) when there's a postcopy migration to resume. For the start of migration, we don't need to wait for this channel until we want to start postcopy, i.e. in postcopy_start(). 
We'll fail the migration if we found that the channel creation failed (which should probably not happen at all in 99% of the cases, because the main channel is using the same network topology). For a postcopy recovery, we'll need to wait in postcopy_pause(). In that case if the channel creation failed, we can't fail the migration or we'll crash the VM, instead we keep in PAUSED state, waiting for yet another recovery. Reviewed-by: Dr. David Alan Gilbert Reviewed-by: Manish Mishra Signed-off-by: Peter Xu --- migration/migration.c | 16 ++++++++++++ migration/migration.h | 7 +++++ migration/postcopy-ram.c | 56 +++++++++++++++++++++++++++++++--------- migration/postcopy-ram.h | 1 + 4 files changed, 68 insertions(+), 12 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index db82ecbdcd..5d113bd5cc 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3052,6 +3052,12 @@ static int postcopy_start(MigrationState *ms) int64_t bandwidth = migrate_max_postcopy_bandwidth(); bool restart_block = false; int cur_state = MIGRATION_STATUS_ACTIVE; + + if (postcopy_preempt_wait_channel(ms)) { + migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED); + return -1; + } + if (!migrate_pause_before_switchover()) { migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE, MIGRATION_STATUS_POSTCOPY_ACTIVE); @@ -3533,6 +3539,14 @@ static MigThrError postcopy_pause(MigrationState *s) if (s->state == MIGRATION_STATUS_POSTCOPY_RECOVER) { /* Woken up by a recover procedure. Give it a shot */ + if (postcopy_preempt_wait_channel(s)) { + /* + * Preempt enabled, and new channel create failed; loop + * back to wait for another recovery. + */ + continue; + } + /* * Firstly, let's wake up the return path now, with a new * return path channel. 
@@ -4397,6 +4411,7 @@ static void migration_instance_finalize(Object *obj) qemu_sem_destroy(&ms->postcopy_pause_sem); qemu_sem_destroy(&ms->postcopy_pause_rp_sem); qemu_sem_destroy(&ms->rp_state.rp_sem); + qemu_sem_destroy(&ms->postcopy_qemufile_src_sem); error_free(ms->error); } @@ -4443,6 +4458,7 @@ static void migration_instance_init(Object *obj) qemu_sem_init(&ms->rp_state.rp_sem, 0); qemu_sem_init(&ms->rate_limit_sem, 0); qemu_sem_init(&ms->wait_unplug_sem, 0); + qemu_sem_init(&ms->postcopy_qemufile_src_sem, 0); qemu_mutex_init(&ms->qemu_file_lock); } diff --git a/migration/migration.h b/migration/migration.h index 9220cec6bd..ae4ffd3454 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -219,6 +219,13 @@ struct MigrationState { QEMUFile *to_dst_file; /* Postcopy specific transfer channel */ QEMUFile *postcopy_qemufile_src; + /* + * It is posted when the preempt channel is established. Note: this is + * used for both the start or recover of a postcopy migration. We'll + * post to this sem every time a new preempt channel is created in the + * main thread, and we keep post() and wait() in pair. + */ + QemuSemaphore postcopy_qemufile_src_sem; QIOChannelBuffer *bioc; /* * Protects to_dst_file/from_dst_file pointers. We need to make sure we diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 84f7b1526e..70b21e9d51 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1552,10 +1552,50 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file) return true; } -int postcopy_preempt_setup(MigrationState *s, Error **errp) +static void +postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque) { - QIOChannel *ioc; + MigrationState *s = opaque; + QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); + Error *local_err = NULL; + + if (qio_task_propagate_error(task, &local_err)) { + /* Something wrong happened.. 
*/ + migrate_set_error(s, local_err); + error_free(local_err); + } else { + migration_ioc_register_yank(ioc); + s->postcopy_qemufile_src = qemu_file_new_output(ioc); + trace_postcopy_preempt_new_channel(); + } + + /* + * Kick the waiter in all cases. The waiter should check upon + * postcopy_qemufile_src to know whether it failed or not. + */ + qemu_sem_post(&s->postcopy_qemufile_src_sem); + object_unref(OBJECT(ioc)); +} +/* Returns 0 if channel established, -1 for error. */ +int postcopy_preempt_wait_channel(MigrationState *s) +{ + /* If preempt not enabled, no need to wait */ + if (!migrate_postcopy_preempt()) { + return 0; + } + + /* + * We need the postcopy preempt channel to be established before + * starting doing anything. + */ + qemu_sem_wait(&s->postcopy_qemufile_src_sem); + + return s->postcopy_qemufile_src ? 0 : -1; +} + +int postcopy_preempt_setup(MigrationState *s, Error **errp) +{ if (!migrate_postcopy_preempt()) { return 0; } @@ -1566,16 +1606,8 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp) return -1; } - ioc = socket_send_channel_create_sync(errp); - - if (ioc == NULL) { - return -1; - } - - migration_ioc_register_yank(ioc); - s->postcopy_qemufile_src = qemu_file_new_output(ioc); - - trace_postcopy_preempt_new_channel(); + /* Kick an async task to connect */ + socket_send_channel_create(postcopy_preempt_send_channel_new, s); return 0; } diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index 34b1080cde..6147bf7d1d 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -192,5 +192,6 @@ enum PostcopyChannels { bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file); int postcopy_preempt_setup(MigrationState *s, Error **errp); +int postcopy_preempt_wait_channel(MigrationState *s); #endif From patchwork Wed Jun 22 20:49:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891485 
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B9719C433EF for ; Wed, 22 Jun 2022 20:53:51 +0000 (UTC) Received: from localhost ([::1]:54962 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47ME-0002Er-Pk for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:53:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49226) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IJ-0002iq-3r for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:47 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:49929) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IH-0004Hg-Ht for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930985; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3btWgwqdjZi9Z7WYTcj4UpiN3YkzVWNlwPq6/1+kKWg=; b=A3IxJ4lVpEnLDDaNsmAPAJ4dUEi5hgiDkRdSqUpWapVQMfIf69fx9P55JKvH+Ck3ZWFntw fG5yBy2of9W4ZDNxOEF29tQQB973hM8kvIn5OmTgu2F9khgZT+H61VPOpmlE812yniSh7R R7as3afWgFG5U5FYY6FIdVhlMEmi++M= Received: from mail-io1-f70.google.com (mail-io1-f70.google.com [209.85.166.70]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-327-BHpkWfhXOkeDw2zYupgAAA-1; Wed, 22 Jun 2022 16:49:43 -0400 X-MC-Unique: BHpkWfhXOkeDw2zYupgAAA-1 Received: by mail-io1-f70.google.com with SMTP 
id n20-20020a6b7214000000b00669cae33d00so9773789ioc.17 for ; Wed, 22 Jun 2022 13:49:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3btWgwqdjZi9Z7WYTcj4UpiN3YkzVWNlwPq6/1+kKWg=; b=GiFrAr8g5pYMddG+YQH7Kqt1MoRqb32uyvTCXlBsUKDmZpwwmyLQcCN5Hg45MhpsHB V3L3V3KXv/aqSOc9hM9x5d4dCZ+PLI1oyhPO88KpZzEH/7YQTIOineYllo7DESaqiJXr DUnf3Iku24+yuRKSaXF7w4bvVmuThFUBDZdVWnay8/qZzMsjYmHX542eNzhfAykV2B4w /5oVViMtS7LErzgimp9Avx5aBWfF0qJDX6QHJaM/OszW3GBfPnVwDuIWx0S1yewoByHJ djHVnRH6fAbfdoUblGIXi0k9AtrfbX9IBhqmRBpUxyG+auKtQhrBQZAUfukOubW2KsjA zzlw== X-Gm-Message-State: AJIora/xcbriK8AmaIhvEWvAZzcX+l919xCsca9DRc5+k2PqUA0zZcHd XGdi38kDosiDXlO4sOeL/yYyBtkdBramoJHetxF7lCR6jfN99K1FOAdIW+DCbz89I4InITG0JF4 PDcHz4M2eP5ibLESynmB2B7JlbUPP2hVJVsVtt6otU5tl8dX/pAlR1CUonrNzQc7m X-Received: by 2002:a05:6e02:1aa7:b0:2d3:dce1:400c with SMTP id l7-20020a056e021aa700b002d3dce1400cmr3128375ilv.94.1655930982641; Wed, 22 Jun 2022 13:49:42 -0700 (PDT) X-Google-Smtp-Source: AGRyM1s5lfyQUw92SD1KbTgfl+niAM+ywsToUfUNmgCthH/Cda1zGAqo22ILsqlXSMbyRbQUSf7xYg== X-Received: by 2002:a05:6e02:1aa7:b0:2d3:dce1:400c with SMTP id l7-20020a056e021aa700b002d3dce1400cmr3128358ilv.94.1655930982352; Wed, 22 Jun 2022 13:49:42 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. [99.241.198.116]) by smtp.gmail.com with ESMTPSA id b44-20020a0295af000000b0032b3a7817a7sm8920323jai.107.2022.06.22.13.49.39 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Wed, 22 Jun 2022 13:49:41 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Cc: "Daniel P . Berrange" , peterx@redhat.com, Leonardo Bras Soares Passos , Manish Mishra , "Dr . 
David Alan Gilbert" , Juan Quintela Subject: [PATCH v8 07/15] migration: Add property x-postcopy-preempt-break-huge Date: Wed, 22 Jun 2022 16:49:12 -0400 Message-Id: <20220622204920.79061-8-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220622204920.79061-1-peterx@redhat.com> References: <20220622204920.79061-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.082, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Add a property field that can conditionally disable the "break sending huge page" behavior in postcopy preemption. By default it's enabled. It should only be used for debugging purposes, and we should never remove the "x-" prefix. Reviewed-by: Dr. 
David Alan Gilbert Reviewed-by: Manish Mishra Signed-off-by: Peter Xu --- migration/migration.c | 2 ++ migration/migration.h | 7 +++++++ migration/ram.c | 7 +++++++ 3 files changed, 16 insertions(+) diff --git a/migration/migration.c b/migration/migration.c index 5d113bd5cc..e10f0400ef 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -4362,6 +4362,8 @@ static Property migration_properties[] = { DEFINE_PROP_SIZE("announce-step", MigrationState, parameters.announce_step, DEFAULT_MIGRATE_ANNOUNCE_STEP), + DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState, + postcopy_preempt_break_huge, true), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), diff --git a/migration/migration.h b/migration/migration.h index ae4ffd3454..cdad8aceaa 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -340,6 +340,13 @@ struct MigrationState { bool send_configuration; /* Whether we send section footer during migration */ bool send_section_footer; + /* + * Whether we allow break sending huge pages when postcopy preempt is + * enabled. When disabled, we won't interrupt precopy within sending a + * host huge page, which is the old behavior of vanilla postcopy. + * NOTE: this parameter is ignored if postcopy preempt is not enabled. + */ + bool postcopy_preempt_break_huge; /* Needed by postcopy-pause state */ QemuSemaphore postcopy_pause_sem; diff --git a/migration/ram.c b/migration/ram.c index 65b08c4edb..7cbe9c310d 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2266,11 +2266,18 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) static bool postcopy_needs_preempt(RAMState *rs, PageSearchStatus *pss) { + MigrationState *ms = migrate_get_current(); + /* Not enabled eager preempt? Then never do that. 
*/ if (!migrate_postcopy_preempt()) { return false; } + /* If the user explicitly disabled breaking of huge page, skip */ + if (!ms->postcopy_preempt_break_huge) { + return false; + } + /* If the ramblock we're sending is a small page? Never bother. */ if (qemu_ram_pagesize(pss->block) == TARGET_PAGE_SIZE) { return false; From patchwork Wed Jun 22 20:49:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D3AB3C43334 for ; Wed, 22 Jun 2022 20:56:12 +0000 (UTC) Received: from localhost ([::1]:35212 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1o47OV-0007qr-Rl for qemu-devel@archiver.kernel.org; Wed, 22 Jun 2022 16:56:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:49254) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IL-0002pe-Kn for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:49 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:30483) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1o47IJ-0004Iz-TZ for qemu-devel@nongnu.org; Wed, 22 Jun 2022 16:49:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1655930987; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=540YHl4vbNEoceJeIl9C/p7RZrCYqlHgo1ki0gch8bM=; 
b=adnb3KoFcscYiiBJCzgrLrmfVWIxL28TfIekHHkYeLtsWzi84McKkLPvsx4mU2p7Dia9Dc RRBXiCE2WNcnuLrIIFyb5F6UKgzG0lDjHqmJGKWUx3JikZryCtSg++J49u4RddqYpUreX1 N4zdnmf9dzZEQUtxI27ZrxlogIZGxaA= Received: from mail-il1-f199.google.com (mail-il1-f199.google.com [209.85.166.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-541-aRju-fNDOu-zmISBvJeuBA-1; Wed, 22 Jun 2022 16:49:46 -0400 X-MC-Unique: aRju-fNDOu-zmISBvJeuBA-1 Received: by mail-il1-f199.google.com with SMTP id p12-20020a056e02144c00b002d196a4d73eso11703204ilo.18 for ; Wed, 22 Jun 2022 13:49:46 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=540YHl4vbNEoceJeIl9C/p7RZrCYqlHgo1ki0gch8bM=; b=UT5eMCaX9uaWeLyu1DoZeSwuHx+TEiBZ4A1Osx7WRgHzfaJZMHqoyFWt56f/WQf6Pi /jOYj+qD9W1/5NSY2Jp56EB/iQurVRQ/MvekLB6S/etCOFmtGEsJCwF/1k+ybxoncSO5 6EdNOMIpEz/mievhKo8CqlokEy3nDgQWjwz/Ab7jV7a+ggKxuCqMmBU0B+AuYW/PAqfq wRzn2/JWELe5izqsBNSy9jxteFT1YoQL/ZQGHmHzrPCreOkE+QVbhz6MP0cb1olpVTDJ rr1K002E+SegmEewyHuCHV8K/CjnrpC02JsMXlwRA19KIPnRxkPTc28tllsGJ2xDtz9h uaZA== X-Gm-Message-State: AJIora8ExDh56HEal0qDR4bYT0YDLHwXVfbj8Q6FAAJ5wcVhYhZLmnM0 7uE2Y3lruRwZOCXaAx1EblnFO3y6odiZT4WGmFkJ5eEdxpdEbx0Kl1B7fMj4KhIrleW6D7VZBJ5 uyW5LgdbmS11s374HUbGNyKfZ0otyWHjoncG0mrqzTbJPOoA6++BT1Gz2W4h8WsWs X-Received: by 2002:a02:cc0c:0:b0:339:c46a:e5dd with SMTP id n12-20020a02cc0c000000b00339c46ae5ddmr3177137jap.104.1655930985184; Wed, 22 Jun 2022 13:49:45 -0700 (PDT) X-Google-Smtp-Source: AGRyM1sj0ZMrVhz7VWXmN54uCbtmy68yIMFGvS40alcP5ZDtNmY3A/3ZVSOJ3/WbLw45Q7Od+oqX0A== X-Received: by 2002:a02:cc0c:0:b0:339:c46a:e5dd with SMTP id n12-20020a02cc0c000000b00339c46ae5ddmr3177121jap.104.1655930984933; Wed, 22 Jun 2022 13:49:44 -0700 (PDT) Received: from localhost.localdomain (cpec09435e3e0ee-cmc09435e3e0ec.cpe.net.cable.rogers.com. 
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 08/15] migration: Add helpers to detect TLS capability
Date: Wed, 22 Jun 2022 16:49:13 -0400
Message-Id: <20220622204920.79061-9-peterx@redhat.com>
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

Add migrate_channel_requires_tls_upgrade() to detect whether the specific channel requires TLS, leveraging the recently introduced migrate_use_tls(). No functional change intended.

Reviewed-by: Dr.
David Alan Gilbert Signed-off-by: Peter Xu --- migration/channel.c | 9 ++------- migration/migration.c | 1 + migration/multifd.c | 4 +--- migration/tls.c | 9 +++++++++ migration/tls.h | 4 ++++ 5 files changed, 17 insertions(+), 10 deletions(-) diff --git a/migration/channel.c b/migration/channel.c index 90087d8986..1b0815039f 100644 --- a/migration/channel.c +++ b/migration/channel.c @@ -38,9 +38,7 @@ void migration_channel_process_incoming(QIOChannel *ioc) trace_migration_set_incoming_channel( ioc, object_get_typename(OBJECT(ioc))); - if (migrate_use_tls() && - !object_dynamic_cast(OBJECT(ioc), - TYPE_QIO_CHANNEL_TLS)) { + if (migrate_channel_requires_tls_upgrade(ioc)) { migration_tls_channel_process_incoming(s, ioc, &local_err); } else { migration_ioc_register_yank(ioc); @@ -70,10 +68,7 @@ void migration_channel_connect(MigrationState *s, ioc, object_get_typename(OBJECT(ioc)), hostname, error); if (!error) { - if (s->parameters.tls_creds && - *s->parameters.tls_creds && - !object_dynamic_cast(OBJECT(ioc), - TYPE_QIO_CHANNEL_TLS)) { + if (migrate_channel_requires_tls_upgrade(ioc)) { migration_tls_channel_connect(s, ioc, hostname, &error); if (!error) { diff --git a/migration/migration.c b/migration/migration.c index e10f0400ef..fe77c7d0ef 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -48,6 +48,7 @@ #include "trace.h" #include "exec/target_page.h" #include "io/channel-buffer.h" +#include "io/channel-tls.h" #include "migration/colo.h" #include "hw/boards.h" #include "hw/qdev-properties.h" diff --git a/migration/multifd.c b/migration/multifd.c index 684c014c86..1e49594b02 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -831,9 +831,7 @@ static bool multifd_channel_connect(MultiFDSendParams *p, migrate_get_current()->hostname, error); if (!error) { - if (migrate_use_tls() && - !object_dynamic_cast(OBJECT(ioc), - TYPE_QIO_CHANNEL_TLS)) { + if (migrate_channel_requires_tls_upgrade(ioc)) { multifd_tls_channel_connect(p, ioc, &error); if 
(!error) { /* diff --git a/migration/tls.c b/migration/tls.c index 32c384a8b6..73e8c9d3c2 100644 --- a/migration/tls.c +++ b/migration/tls.c @@ -166,3 +166,12 @@ void migration_tls_channel_connect(MigrationState *s, NULL, NULL); } + +bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc) +{ + if (!migrate_use_tls()) { + return false; + } + + return !object_dynamic_cast(OBJECT(ioc), TYPE_QIO_CHANNEL_TLS); +} diff --git a/migration/tls.h b/migration/tls.h index de4fe2cafd..98e23c9b0e 100644 --- a/migration/tls.h +++ b/migration/tls.h @@ -37,4 +37,8 @@ void migration_tls_channel_connect(MigrationState *s, QIOChannel *ioc, const char *hostname, Error **errp); + +/* Whether the QIO channel requires further TLS handshake? */ +bool migrate_channel_requires_tls_upgrade(QIOChannel *ioc); + #endif
From patchwork Wed Jun 22 20:49:14 2022 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891488
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 09/15] migration: Export tls-[creds|hostname|authz] params to cmdline too
Date: Wed, 22 Jun 2022 16:49:14 -0400
Message-Id: <20220622204920.79061-10-peterx@redhat.com>
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

It's useful for specifying
TLS credentials entirely on the command line (along with -object tls-creds-*), especially for debugging.

The trick here is that we must not free these fields again in the finalize() function of the migration object, otherwise we get a double free. When an object is destroyed, the properties bound to it are destroyed first, then the object itself. To be explicit, object_finalize() performs this sequence of operations:

    object_property_del_all(obj);
    object_deinit(obj, ti);

So after this change the two fields previously freed in finalize() (tls_creds and tls_hostname) are already released in object_property_del_all(), before finalize() is even reached; freeing them again in finalize() would be a double free.

This also fixes a trivial memory leak for tls-authz, which we forgot to free before this patch.

Reviewed-by: Daniel P. Berrange
Signed-off-by: Peter Xu
--- migration/migration.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index fe77c7d0ef..76cf2a72c0 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -4365,6 +4365,9 @@ static Property migration_properties[] = { DEFAULT_MIGRATE_ANNOUNCE_STEP), DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState, postcopy_preempt_break_huge, true), + DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds), + DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname), + DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -4402,12 +4405,9 @@ static void migration_class_init(ObjectClass *klass, void *data) static void migration_instance_finalize(Object *obj) { MigrationState *ms = MIGRATION_OBJ(obj); - MigrationParameters *params = &ms->parameters; qemu_mutex_destroy(&ms->error_mutex); qemu_mutex_destroy(&ms->qemu_file_lock); - g_free(params->tls_hostname);
- g_free(params->tls_creds); qemu_sem_destroy(&ms->wait_unplug_sem); qemu_sem_destroy(&ms->rate_limit_sem); qemu_sem_destroy(&ms->pause_sem);
From patchwork Wed Jun 22 20:49:15 2022 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891493
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 10/15] migration: Enable TLS for preempt channel
Date: Wed, 22 Jun 2022 16:49:15 -0400
Message-Id: <20220622204920.79061-11-peterx@redhat.com>
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

This patch is based on the async preempt channel creation. It continues wiring up the new channel by adding the TLS handshake to the destination when TLS is enabled.

Note that only the src QEMU needs this operation; the dest QEMU needs no change for TLS support because all channels are established synchronously there, so all the TLS magic is already properly handled by migration_tls_channel_process_incoming().

Reviewed-by: Daniel P.
Berrange Signed-off-by: Peter Xu --- migration/postcopy-ram.c | 57 ++++++++++++++++++++++++++++++++++------ migration/trace-events | 1 + 2 files changed, 50 insertions(+), 8 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 70b21e9d51..b9a37ef255 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -36,6 +36,7 @@ #include "socket.h" #include "qemu-file.h" #include "yank_functions.h" +#include "tls.h" /* Arbitrary limit on size of each discard command, * keeps them around ~200 bytes @@ -1552,15 +1553,15 @@ bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file) return true; } +/* + * Setup the postcopy preempt channel with the IOC. If ERROR is specified, + * setup the error instead. This helper will free the ERROR if specified. + */ static void -postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque) +postcopy_preempt_send_channel_done(MigrationState *s, + QIOChannel *ioc, Error *local_err) { - MigrationState *s = opaque; - QIOChannel *ioc = QIO_CHANNEL(qio_task_get_source(task)); - Error *local_err = NULL; - - if (qio_task_propagate_error(task, &local_err)) { - /* Something wrong happened.. */ + if (local_err) { migrate_set_error(s, local_err); error_free(local_err); } else { @@ -1574,7 +1575,47 @@ postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque) * postcopy_qemufile_src to know whether it failed or not. 
*/ qemu_sem_post(&s->postcopy_qemufile_src_sem); - object_unref(OBJECT(ioc)); +} + +static void +postcopy_preempt_tls_handshake(QIOTask *task, gpointer opaque) +{ + g_autoptr(QIOChannel) ioc = QIO_CHANNEL(qio_task_get_source(task)); + MigrationState *s = opaque; + Error *local_err = NULL; + + qio_task_propagate_error(task, &local_err); + postcopy_preempt_send_channel_done(s, ioc, local_err); +} + +static void +postcopy_preempt_send_channel_new(QIOTask *task, gpointer opaque) +{ + g_autoptr(QIOChannel) ioc = QIO_CHANNEL(qio_task_get_source(task)); + MigrationState *s = opaque; + QIOChannelTLS *tioc; + Error *local_err = NULL; + + if (qio_task_propagate_error(task, &local_err)) { + goto out; + } + + if (migrate_channel_requires_tls_upgrade(ioc)) { + tioc = migration_tls_client_create(s, ioc, s->hostname, &local_err); + if (!tioc) { + goto out; + } + trace_postcopy_preempt_tls_handshake(); + qio_channel_set_name(QIO_CHANNEL(tioc), "migration-tls-preempt"); + qio_channel_tls_handshake(tioc, postcopy_preempt_tls_handshake, + s, NULL, NULL); + /* Setup the channel until TLS handshake finished */ + return; + } + +out: + /* This handles both good and error cases */ + postcopy_preempt_send_channel_done(s, ioc, local_err); } /* Returns 0 if channel established, -1 for error. 
*/ diff --git a/migration/trace-events b/migration/trace-events index 0e385c3a07..a34afe7b85 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -287,6 +287,7 @@ postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_off postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s" postcopy_page_req_del(void *addr, int count) "resolved page req %p total %d" +postcopy_preempt_tls_handshake(void) "" postcopy_preempt_new_channel(void) "" postcopy_preempt_thread_entry(void) "" postcopy_preempt_thread_exit(void) ""
From patchwork Wed Jun 22 20:49:16 2022 X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12891492
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 11/15] migration: Respect postcopy request order in preemption mode
Date: Wed, 22 Jun 2022 16:49:16 -0400
Message-Id: <20220622204920.79061-12-peterx@redhat.com>
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

With preemption mode on, when we see a postcopy request for exactly the page that we have preempted
before (so we've partially sent the page already via the PRECOPY channel and it got preempted by another postcopy request), currently we drop the request, so that after all the other postcopy requests are serviced we go back to the precopy stream and handle it there.

We drop the request because we can't send it via the postcopy channel: the precopy channel already contains part of the data, and a huge page can only be sent as a whole via one channel. We can't split a huge page across two channels.

That's very much a corner case and it works, but it changes the order in which we handle postcopy requests, since this (unlucky) postcopy request is postponed until after the other queued postcopy requests. The problem is that when the guest is very busy the postcopy queue may never be empty, which means this dropped request will not be handled until the end of postcopy migration. So there's a chance that one dest QEMU vcpu thread waits on a page fault for an extremely long time just because it unluckily accessed the specific page that was preempted before. The worst-case wait can be as long as the whole postcopy migration procedure. It's extremely unlikely to happen, but when it happens it's not good.

The root cause of this problem is that we treat the pss->postcopy_requested variable as having two meanings bound together:

  1. Whether this page request is urgent, and,
  2. Which channel we should use for this page request.

With the old code, setting postcopy_requested means either both (1) and (2) are true, or both (1) and (2) are false. (1) and (2) can never have different values. However, it doesn't need to be like that. It's perfectly legal for a request to have (1) very high urgency while (2) we'd like to use the precopy channel, just like the corner case discussed above.
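A minimal standalone sketch of the coupling described above, for illustration only. It is not code from this series: struct page_req, old_choose_channel(), wanted_channel() and corner_case_example() are hypothetical names, and the numeric values of the RAM_CHANNEL_* stand-ins are assumed. Only the ternary in old_choose_channel() mirrors how the old code derived the channel from postcopy_requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the RAM_CHANNEL_* constants; the numeric values are assumed. */
enum { RAM_CHANNEL_PRECOPY = 0, RAM_CHANNEL_POSTCOPY = 1 };

/* Hypothetical, trimmed-down request state (not the real PageSearchStatus). */
struct page_req {
    bool urgent;          /* meaning (1): a dest vcpu thread is waiting */
    bool page_preempted;  /* precopy channel already holds part of this page */
};

/* Old behaviour: the channel is derived from urgency, so (1) implies (2). */
static unsigned int old_choose_channel(const struct page_req *req)
{
    return req->urgent ? RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY;
}

/* What the corner case needs: urgency and channel decided independently. */
static unsigned int wanted_channel(const struct page_req *req)
{
    if (req->page_preempted) {
        /* A huge page must be finished on the channel where it started. */
        return RAM_CHANNEL_PRECOPY;
    }
    return req->urgent ? RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY;
}

/* Usage example: an urgent request for a page that was preempted earlier. */
static void corner_case_example(void)
{
    struct page_req r = { .urgent = true, .page_preempted = true };

    /* Coupled: urgency forces the postcopy channel, which we cannot use. */
    assert(old_choose_channel(&r) == RAM_CHANNEL_POSTCOPY);
    /* Decoupled: still urgent, but sent via the precopy channel. */
    assert(r.urgent && wanted_channel(&r) == RAM_CHANNEL_PRECOPY);
}
```

With a single flag the urgent-but-precopy combination cannot be expressed, which is why the old code could only drop such a request.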
To differentiate the two meanings better, introduce a new field called postcopy_target_channel that says which channel we should use for this page request, covering only the old meaning (2). The postcopy_requested variable is then left to stand only for meaning (1), the urgency of this page request.

With this change, we can easily boost the priority of a preempted precopy page as long as we know that page is also requested as a postcopy page. So with the new approach, get_queued_page() no longer drops that request; it sends the page right away via the precopy channel, which restores the ordering of the page faults as they were requested on dest.

Reported-by: Manish Mishra
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Manish Mishra
Signed-off-by: Peter Xu
--- migration/ram.c | 65 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 52 insertions(+), 13 deletions(-) diff --git a/migration/ram.c b/migration/ram.c index 7cbe9c310d..4fbad74c6c 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -442,8 +442,28 @@ struct PageSearchStatus { unsigned long page; /* Set once we wrap around */ bool complete_round; - /* Whether current page is explicitly requested by postcopy */ + /* + * [POSTCOPY-ONLY] Whether current page is explicitly requested by + * postcopy. When set, the request is "urgent" because the dest QEMU + * threads are waiting for us. + */ bool postcopy_requested; + /* + * [POSTCOPY-ONLY] The target channel to use to send current page. + * + * Note: This may _not_ match with the value in postcopy_requested + * above. Let's imagine the case where the postcopy request is exactly + * the page that we're sending in progress during precopy. In this case + * we'll have postcopy_requested set to true but the target channel + * will be the precopy channel (so that we don't split brain on that + * specific page since the precopy channel already contains partial of + * that page data).
+ * + * Besides that specific use case, postcopy_target_channel should + * always be equal to postcopy_requested, because by default we send + * postcopy pages via postcopy preempt channel. + */ + bool postcopy_target_channel; }; typedef struct PageSearchStatus PageSearchStatus; @@ -495,6 +515,9 @@ static QemuCond decomp_done_cond; static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block, ram_addr_t offset, uint8_t *source_buf); +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss, + bool postcopy_requested); + static void *do_data_compress(void *opaque) { CompressParam *param = opaque; @@ -1516,8 +1539,12 @@ retry: */ static bool find_dirty_block(RAMState *rs, PageSearchStatus *pss, bool *again) { - /* This is not a postcopy requested page */ + /* + * This is not a postcopy requested page, mark it "not urgent", and use + * precopy channel to send it. + */ pss->postcopy_requested = false; + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY; pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page); if (pss->complete_round && pss->block == rs->last_seen_block && @@ -2038,15 +2065,20 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss) RAMBlock *block; ram_addr_t offset; -again: block = unqueue_page(rs, &offset); if (block) { /* See comment above postcopy_preempted_contains() */ if (postcopy_preempted_contains(rs, block, offset)) { trace_postcopy_preempt_hit(block->idstr, offset); - /* This request is dropped */ - goto again; + /* + * If what we preempted previously was exactly what we're + * requesting right now, restore the preempted precopy + * immediately, boosting its priority as it's requested by + * postcopy. + */ + postcopy_preempt_restore(rs, pss, true); + return true; } } else { /* @@ -2070,7 +2102,9 @@ again: * really rare. 
*/ pss->complete_round = false; + /* Mark it an urgent request, meanwhile using POSTCOPY channel */ pss->postcopy_requested = true; + pss->postcopy_target_channel = RAM_CHANNEL_POSTCOPY; } return !!block; @@ -2324,7 +2358,8 @@ static bool postcopy_preempt_triggered(RAMState *rs) return rs->postcopy_preempt_state.preempted; } -static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss) +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss, + bool postcopy_requested) { PostcopyPreemptState *state = &rs->postcopy_preempt_state; @@ -2332,8 +2367,15 @@ static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss) pss->block = state->ram_block; pss->page = state->ram_page; - /* This is not a postcopy request but restoring previous precopy */ - pss->postcopy_requested = false; + + /* Whether this is a postcopy request? */ + pss->postcopy_requested = postcopy_requested; + /* + * When restoring a preempted page, the old data resides in PRECOPY + * slow channel, even if postcopy_requested is set. So always use + * PRECOPY channel here. + */ + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY; trace_postcopy_preempt_restored(pss->block->idstr, pss->page); @@ -2344,12 +2386,9 @@ static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss) static void postcopy_preempt_choose_channel(RAMState *rs, PageSearchStatus *pss) { MigrationState *s = migrate_get_current(); - unsigned int channel; + unsigned int channel = pss->postcopy_target_channel; QEMUFile *next; - channel = pss->postcopy_requested ? - RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY; - if (channel != rs->postcopy_channel) { if (channel == RAM_CHANNEL_PRECOPY) { next = s->to_dst_file; @@ -2505,7 +2544,7 @@ static int ram_find_and_save_block(RAMState *rs) * preempted precopy. Otherwise find the next dirty bit. 
*/ if (postcopy_preempt_triggered(rs)) { - postcopy_preempt_restore(rs, &pss); + postcopy_preempt_restore(rs, &pss, false); found = true; } else { /* priority queue empty, so just search for something dirty */

From patchwork Wed Jun 22 20:49:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12891490
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 12/15] tests: Move MigrateCommon upper
Date: Wed, 22 Jun 2022 16:49:17 -0400
Message-Id: <20220622204920.79061-13-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

So that it can soon be used in the postcopy tests too.
Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 144 +++++++++++++++++------------------ 1 file changed, 72 insertions(+), 72 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index f59d31b2ef..977f820540 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -491,6 +491,78 @@ typedef struct { const char *opts_target; } MigrateStart; +/* + * A hook that runs after the src and dst QEMUs have been + * created, but before the migration is started. This can + * be used to set migration parameters and capabilities. + * + * Returns: NULL, or a pointer to opaque state to be + * later passed to the TestMigrateFinishHook + */ +typedef void * (*TestMigrateStartHook)(QTestState *from, + QTestState *to); + +/* + * A hook that runs after the migration has finished, + * regardless of whether it succeeded or failed, but + * before QEMU has terminated (unless it self-terminated + * due to migration error) + * + * @opaque is a pointer to state previously returned + * by the TestMigrateStartHook if any, or NULL. + */ +typedef void (*TestMigrateFinishHook)(QTestState *from, + QTestState *to, + void *opaque); + +typedef struct { + /* Optional: fine tune start parameters */ + MigrateStart start; + + /* Required: the URI for the dst QEMU to listen on */ + const char *listen_uri; + + /* + * Optional: the URI for the src QEMU to connect to + * If NULL, then it will query the dst QEMU for its actual + * listening address and use that as the connect address. + * This allows for dynamically picking a free TCP port. + */ + const char *connect_uri; + + /* Optional: callback to run at start to set migration parameters */ + TestMigrateStartHook start_hook; + /* Optional: callback to run at finish to cleanup */ + TestMigrateFinishHook finish_hook; + + /* + * Optional: normally we expect the migration process to complete. + * + * There can be a variety of reasons and stages in which failure + * can happen during tests. 
+ * + * If a failure is expected to happen at time of establishing + * the connection, then MIG_TEST_FAIL will indicate that the dst + * QEMU is expected to stay running and accept future migration + * connections. + * + * If a failure is expected to happen while processing the + * migration stream, then MIG_TEST_FAIL_DEST_QUIT_ERR will indicate + * that the dst QEMU is expected to quit with non-zero exit status + */ + enum { + /* This test should succeed, the default */ + MIG_TEST_SUCCEED = 0, + /* This test should fail, dest qemu should keep alive */ + MIG_TEST_FAIL, + /* This test should fail, dest qemu should fail with abnormal status */ + MIG_TEST_FAIL_DEST_QUIT_ERR, + } result; + + /* Optional: set number of migration passes to wait for */ + unsigned int iterations; +} MigrateCommon; + static int test_migrate_start(QTestState **from, QTestState **to, const char *uri, MigrateStart *args) { @@ -1113,78 +1185,6 @@ static void test_baddest(void) test_migrate_end(from, to, false); } -/* - * A hook that runs after the src and dst QEMUs have been - * created, but before the migration is started. This can - * be used to set migration parameters and capabilities. - * - * Returns: NULL, or a pointer to opaque state to be - * later passed to the TestMigrateFinishHook - */ -typedef void * (*TestMigrateStartHook)(QTestState *from, - QTestState *to); - -/* - * A hook that runs after the migration has finished, - * regardless of whether it succeeded or failed, but - * before QEMU has terminated (unless it self-terminated - * due to migration error) - * - * @opaque is a pointer to state previously returned - * by the TestMigrateStartHook if any, or NULL. 
- */ -typedef void (*TestMigrateFinishHook)(QTestState *from, - QTestState *to, - void *opaque); - -typedef struct { - /* Optional: fine tune start parameters */ - MigrateStart start; - - /* Required: the URI for the dst QEMU to listen on */ - const char *listen_uri; - - /* - * Optional: the URI for the src QEMU to connect to - * If NULL, then it will query the dst QEMU for its actual - * listening address and use that as the connect address. - * This allows for dynamically picking a free TCP port. - */ - const char *connect_uri; - - /* Optional: callback to run at start to set migration parameters */ - TestMigrateStartHook start_hook; - /* Optional: callback to run at finish to cleanup */ - TestMigrateFinishHook finish_hook; - - /* - * Optional: normally we expect the migration process to complete. - * - * There can be a variety of reasons and stages in which failure - * can happen during tests. - * - * If a failure is expected to happen at time of establishing - * the connection, then MIG_TEST_FAIL will indicate that the dst - * QEMU is expected to stay running and accept future migration - * connections. 
- * - * If a failure is expected to happen while processing the - * migration stream, then MIG_TEST_FAIL_DEST_QUIT_ERR will indicate - * that the dst QEMU is expected to quit with non-zero exit status - */ - enum { - /* This test should succeed, the default */ - MIG_TEST_SUCCEED = 0, - /* This test should fail, dest qemu should keep alive */ - MIG_TEST_FAIL, - /* This test should fail, dest qemu should fail with abnormal status */ - MIG_TEST_FAIL_DEST_QUIT_ERR, - } result; - - /* Optional: set number of migration passes to wait for */ - unsigned int iterations; -} MigrateCommon; - static void test_precopy_common(MigrateCommon *args) { QTestState *from, *to;

From patchwork Wed Jun 22 20:49:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12891494
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 13/15] tests: Add postcopy tls migration test
Date: Wed, 22 Jun 2022 16:49:18 -0400
Message-Id: <20220622204920.79061-14-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

We just added TLS tests for precopy but not postcopy. Add the corresponding test for vanilla postcopy.
Rename the vanilla postcopy to "postcopy/plain" because all postcopy tests will only use unix sockets as channel. Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 61 +++++++++++++++++++++++++++++------- 1 file changed, 50 insertions(+), 11 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 977f820540..5ca43ba6a0 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -561,6 +561,9 @@ typedef struct { /* Optional: set number of migration passes to wait for */ unsigned int iterations; + + /* Postcopy specific fields */ + void *postcopy_data; } MigrateCommon; static int test_migrate_start(QTestState **from, QTestState **to, @@ -1049,15 +1052,19 @@ test_migrate_tls_x509_finish(QTestState *from, static int migrate_postcopy_prepare(QTestState **from_ptr, QTestState **to_ptr, - MigrateStart *args) + MigrateCommon *args) { g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); QTestState *from, *to; - if (test_migrate_start(&from, &to, uri, args)) { + if (test_migrate_start(&from, &to, uri, &args->start)) { return -1; } + if (args->start_hook) { + args->postcopy_data = args->start_hook(from, to); + } + migrate_set_capability(from, "postcopy-ram", true); migrate_set_capability(to, "postcopy-ram", true); migrate_set_capability(to, "postcopy-blocktime", true); @@ -1082,7 +1089,8 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, return 0; } -static void migrate_postcopy_complete(QTestState *from, QTestState *to) +static void migrate_postcopy_complete(QTestState *from, QTestState *to, + MigrateCommon *args) { wait_for_migration_complete(from); @@ -1093,25 +1101,48 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to) read_blocktime(to); } + if (args->finish_hook) { + args->finish_hook(from, to, args->postcopy_data); + args->postcopy_data = NULL; + } + test_migrate_end(from, to, true); } -static void test_postcopy(void) +static void 
test_postcopy_common(MigrateCommon *args) { - MigrateStart args = {}; QTestState *from, *to; - if (migrate_postcopy_prepare(&from, &to, &args)) { + if (migrate_postcopy_prepare(&from, &to, args)) { return; } migrate_postcopy_start(from, to); - migrate_postcopy_complete(from, to); + migrate_postcopy_complete(from, to, args); +} + +static void test_postcopy(void) +{ + MigrateCommon args = { }; + + test_postcopy_common(&args); +} + +static void test_postcopy_tls_psk(void) +{ + MigrateCommon args = { + .start_hook = test_migrate_tls_psk_start_match, + .finish_hook = test_migrate_tls_psk_finish, + }; + + test_postcopy_common(&args); } static void test_postcopy_recovery(void) { - MigrateStart args = { - .hide_stderr = true, + MigrateCommon args = { + .start = { + .hide_stderr = true, + }, }; QTestState *from, *to; g_autofree char *uri = NULL; @@ -1167,7 +1198,7 @@ static void test_postcopy_recovery(void) /* Restore the postcopy bandwidth to unlimited */ migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0); - migrate_postcopy_complete(from, to); + migrate_postcopy_complete(from, to, &args); } static void test_baddest(void) @@ -2386,7 +2417,15 @@ int main(int argc, char **argv) module_call_init(MODULE_INIT_QOM); - qtest_add_func("/migration/postcopy/unix", test_postcopy); + qtest_add_func("/migration/postcopy/plain", test_postcopy); +#ifdef CONFIG_GNUTLS + /* + * NOTE: psk test is enough for postcopy, as other types of TLS + * channels are tested under precopy. Here what we want to test is the + * general postcopy path that has TLS channel enabled. 
+ */ + qtest_add_func("/migration/postcopy/tls/psk", test_postcopy_tls_psk); +#endif /* CONFIG_GNUTLS */ qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery); qtest_add_func("/migration/bad_dest", test_baddest); qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain);

From patchwork Wed Jun 22 20:49:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12891496
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 14/15] tests: Add postcopy tls recovery migration test
Date: Wed, 22 Jun 2022 16:49:19 -0400
Message-Id: <20220622204920.79061-15-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

It's easy to build this upon the postcopy tls test. Rename the old postcopy recovery test to postcopy/recovery/plain.
Signed-off-by: Peter Xu --- tests/qtest/migration-test.c | 38 +++++++++++++++++++++++++++--------- 1 file changed, 29 insertions(+), 9 deletions(-) diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c index 5ca43ba6a0..00b7b7072c 100644 --- a/tests/qtest/migration-test.c +++ b/tests/qtest/migration-test.c @@ -1137,17 +1137,15 @@ static void test_postcopy_tls_psk(void) test_postcopy_common(&args); } -static void test_postcopy_recovery(void) +static void test_postcopy_recovery_common(MigrateCommon *args) { - MigrateCommon args = { - .start = { - .hide_stderr = true, - }, - }; QTestState *from, *to; g_autofree char *uri = NULL; - if (migrate_postcopy_prepare(&from, &to, &args)) { + /* Always hide errors for postcopy recover tests since they're expected */ + args->start.hide_stderr = true; + + if (migrate_postcopy_prepare(&from, &to, args)) { return; } @@ -1198,7 +1196,24 @@ static void test_postcopy_recovery(void) /* Restore the postcopy bandwidth to unlimited */ migrate_set_parameter_int(from, "max-postcopy-bandwidth", 0); - migrate_postcopy_complete(from, to, &args); + migrate_postcopy_complete(from, to, args); +} + +static void test_postcopy_recovery(void) +{ + MigrateCommon args = { }; + + test_postcopy_recovery_common(&args); +} + +static void test_postcopy_recovery_tls_psk(void) +{ + MigrateCommon args = { + .start_hook = test_migrate_tls_psk_start_match, + .finish_hook = test_migrate_tls_psk_finish, + }; + + test_postcopy_recovery_common(&args); } static void test_baddest(void) @@ -2426,7 +2441,12 @@ int main(int argc, char **argv) */ qtest_add_func("/migration/postcopy/tls/psk", test_postcopy_tls_psk); #endif /* CONFIG_GNUTLS */ - qtest_add_func("/migration/postcopy/recovery", test_postcopy_recovery); + qtest_add_func("/migration/postcopy/recovery/plain", + test_postcopy_recovery); +#ifdef CONFIG_GNUTLS + qtest_add_func("/migration/postcopy/recovery/tls/psk", + test_postcopy_recovery_tls_psk); +#endif /* CONFIG_GNUTLS */ 
qtest_add_func("/migration/bad_dest", test_baddest); qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain); qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);

From patchwork Wed Jun 22 20:49:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 12891495
From: Peter Xu
To: qemu-devel@nongnu.org
Cc: "Daniel P . Berrange", peterx@redhat.com, Leonardo Bras Soares Passos, Manish Mishra, "Dr . David Alan Gilbert", Juan Quintela
Subject: [PATCH v8 15/15] tests: Add postcopy preempt tests
Date: Wed, 22 Jun 2022 16:49:20 -0400
Message-Id: <20220622204920.79061-16-peterx@redhat.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220622204920.79061-1-peterx@redhat.com>
References: <20220622204920.79061-1-peterx@redhat.com>

Four tests are added for preempt mode:

  - Postcopy plain
  - Postcopy recovery
  - Postcopy tls
  - Postcopy tls+recovery

Signed-off-by: Peter Xu
---
 tests/qtest/migration-test.c | 58 ++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 00b7b7072c..1f2dc57d8f 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -564,6 +564,7 @@ typedef struct { /* Postcopy specific fields */ void *postcopy_data; + bool
postcopy_preempt; } MigrateCommon; static int test_migrate_start(QTestState **from, QTestState **to, @@ -1069,6 +1070,11 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, migrate_set_capability(to, "postcopy-ram", true); migrate_set_capability(to, "postcopy-blocktime", true); + if (args->postcopy_preempt) { + migrate_set_capability(from, "postcopy-preempt", true); + migrate_set_capability(to, "postcopy-preempt", true); + } + /* We want to pick a speed slow enough that the test completes * quickly, but that it doesn't complete precopy even on a slow * machine, so also set the downtime. @@ -1137,6 +1143,26 @@ static void test_postcopy_tls_psk(void) test_postcopy_common(&args); } +static void test_postcopy_preempt(void) +{ + MigrateCommon args = { + .postcopy_preempt = true, + }; + + test_postcopy_common(&args); +} + +static void test_postcopy_preempt_tls_psk(void) +{ + MigrateCommon args = { + .postcopy_preempt = true, + .start_hook = test_migrate_tls_psk_start_match, + .finish_hook = test_migrate_tls_psk_finish, + }; + + test_postcopy_common(&args); +} + static void test_postcopy_recovery_common(MigrateCommon *args) { QTestState *from, *to; @@ -1216,6 +1242,27 @@ static void test_postcopy_recovery_tls_psk(void) test_postcopy_recovery_common(&args); } +static void test_postcopy_preempt_recovery(void) +{ + MigrateCommon args = { + .postcopy_preempt = true, + }; + + test_postcopy_recovery_common(&args); +} + +/* This contains preempt+recovery+tls test altogether */ +static void test_postcopy_preempt_all(void) +{ + MigrateCommon args = { + .postcopy_preempt = true, + .start_hook = test_migrate_tls_psk_start_match, + .finish_hook = test_migrate_tls_psk_finish, + }; + + test_postcopy_recovery_common(&args); +} + static void test_baddest(void) { MigrateStart args = { @@ -2447,6 +2494,17 @@ int main(int argc, char **argv) qtest_add_func("/migration/postcopy/recovery/tls/psk", test_postcopy_recovery_tls_psk); #endif /* CONFIG_GNUTLS */ + + 
qtest_add_func("/migration/postcopy/preempt/plain", test_postcopy_preempt); + qtest_add_func("/migration/postcopy/preempt/recovery/plain", + test_postcopy_preempt_recovery); +#ifdef CONFIG_GNUTLS + qtest_add_func("/migration/postcopy/preempt/tls/psk", + test_postcopy_preempt_tls_psk); + qtest_add_func("/migration/postcopy/preempt/recovery/tls/psk", + test_postcopy_preempt_all); +#endif /* CONFIG_GNUTLS */ + qtest_add_func("/migration/bad_dest", test_baddest); qtest_add_func("/migration/precopy/unix/plain", test_precopy_unix_plain); qtest_add_func("/migration/precopy/unix/xbzrle", test_precopy_unix_xbzrle);