From patchwork Wed Feb 16 06:28:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 12748192 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CAFD6C433EF for ; Wed, 16 Feb 2022 08:40:32 +0000 (UTC) Received: from localhost ([::1]:58996 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nKFrT-0004oA-QS for qemu-devel@archiver.kernel.org; Wed, 16 Feb 2022 03:40:31 -0500 Received: from eggs.gnu.org ([209.51.188.92]:48748) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nKDpu-0006mh-Ld for qemu-devel@nongnu.org; Wed, 16 Feb 2022 01:30:46 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:37492) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nKDps-0006i0-Bf for qemu-devel@nongnu.org; Wed, 16 Feb 2022 01:30:46 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644993043; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MatQhmiuDazPhBd8ggzkvaXZ3W7gO+frvQyUVeWVm/o=; b=BEvKF7jAY7jGtOIBS+KuBYl+PDt8db5Kg8jWBf+prRr2QUNGugr7YpOh3cxeljuaGnSA/j PU1cKayQ8hfEzO4Tgd17BacWwzaez+al0Oi/cB0/7WMiiPySxZew5z9fxHkzELDL3eCMR+ 7mZJKR8VyUkGBPE9qaH/txqir+M63cg= Received: from mail-pf1-f198.google.com (mail-pf1-f198.google.com [209.85.210.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-81-44jpwnaEOPiMK4d4Pdlh9A-1; Wed, 16 Feb 2022 01:30:42 -0500 X-MC-Unique: 44jpwnaEOPiMK4d4Pdlh9A-1 Received: by mail-pf1-f198.google.com with SMTP id j204-20020a6280d5000000b004e107ad3488so919193pfd.15 for ; Tue, 15 Feb 2022 22:30:42 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=MatQhmiuDazPhBd8ggzkvaXZ3W7gO+frvQyUVeWVm/o=; b=amtlm5OTmi4FNg26ubAmNJUg3vvjQ3ZHTVtPkNDcaoLmDvnYbqiKa8AbtPEKkm7Wit Y7PJPHj6UThkCbMOpxo2JmPXAwY2kWpALSSlgMQK64PXoeOjsD8QL3PlxVMW4m5LlY+R WTnA62s0xJRhFpSf5+aRUkHwQVwUtOzddDnJqDHjW/YhuRTwz+j0ehfzjdoCYsz6TsZ/ koo80GTD63UNm06yP0vfLUtNpv0irNPxHitvALT4Hq5Yu5NQ799ixzjn0yuk5gHy6wQP 5GLGkOt8HpfmowNgBOOkGSjKnG6OSiT/TF3XPzkVlWdGgi9DPXCkrVU/d4UIfZGRhgey 4kJA== X-Gm-Message-State: AOAM531Dc6VcDwhM8Y2n7atG3Q2MbtsNdDwW5n7AV5lEi3M+3ZR36wlZ l1/U1EJgDkKyR/LNkt8Ib8YjpIzk4CevlJ2doqvdsi6Nh5xMyvB8G+AZLNizFNOB3uSNSbMAvdR uvY8/c2yEFcuWyHD9CAS9b1swkup3eWuebU5dJoRtKqmu7L3TWH6asdMtxyXjFOLi X-Received: by 2002:a63:d618:0:b0:36c:8f9a:f80e with SMTP id q24-20020a63d618000000b0036c8f9af80emr1131248pgg.248.1644993041179; Tue, 15 Feb 2022 22:30:41 -0800 (PST) X-Google-Smtp-Source: ABdhPJxZusfyL4hUegvdFEXyiEr+x6CfjtbEftJWTdIboFFfuhyJGwsv3gdY49QvqkWbuG/59tZQTA== X-Received: by 2002:a63:d618:0:b0:36c:8f9a:f80e with SMTP id q24-20020a63d618000000b0036c8f9af80emr1131226pgg.248.1644993040703; Tue, 15 Feb 2022 22:30:40 -0800 (PST) Received: from localhost.localdomain ([64.64.123.81]) by smtp.gmail.com with ESMTPSA id 17sm42104657pfl.175.2022.02.15.22.30.34 (version=TLS1_3 cipher=TLS_CHACHA20_POLY1305_SHA256 bits=256/256); Tue, 15 Feb 2022 22:30:40 -0800 (PST) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH 19/20] migration: Postcopy recover with preempt enabled Date: Wed, 16 Feb 2022 14:28:08 +0800 Message-Id: <20220216062809.57179-20-peterx@redhat.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20220216062809.57179-1-peterx@redhat.com> References: <20220216062809.57179-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.129.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -28 X-Spam_score: -2.9 X-Spam_bar: -- X-Spam_report: (-2.9 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.083, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Leonardo Bras Soares Passos , "Dr . David Alan Gilbert" , peterx@redhat.com, Juan Quintela Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" To allow postcopy recovery, the ram fast load (preempt-only) dest QEMU thread needs similar handling on fault tolerance. When ram_load_postcopy() fails, instead of stopping the thread it halts with a semaphore, preparing to be kicked again when recovery is detected. A mutex is introduced to make sure there's no concurrent operation upon the socket. To make it simple, the fast ram load thread will take the mutex during its whole procedure, and only release it if it's paused. The fast-path socket will be properly released by the main loading thread safely when there's network failures during postcopy with that mutex held. Signed-off-by: Peter Xu Reviewed-by: Dr. David Alan Gilbert Acked-by: Dr. David Alan Gilbert --- migration/migration.c | 17 +++++++++++++++-- migration/migration.h | 3 +++ migration/postcopy-ram.c | 24 ++++++++++++++++++++++-- migration/savevm.c | 17 +++++++++++++++++ migration/trace-events | 2 ++ 5 files changed, 59 insertions(+), 4 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index d20db04097..c68a281406 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -215,9 +215,11 @@ void migration_object_init(void) current_incoming->postcopy_remote_fds = g_array_new(FALSE, TRUE, sizeof(struct PostCopyFD)); qemu_mutex_init(¤t_incoming->rp_mutex); + qemu_mutex_init(¤t_incoming->postcopy_prio_thread_mutex); qemu_event_init(¤t_incoming->main_thread_load_event, false); qemu_sem_init(¤t_incoming->postcopy_pause_sem_dst, 0); qemu_sem_init(¤t_incoming->postcopy_pause_sem_fault, 0); + qemu_sem_init(¤t_incoming->postcopy_pause_sem_fast_load, 0); qemu_mutex_init(¤t_incoming->page_request_mutex); current_incoming->page_requested = g_tree_new(page_request_addr_cmp); @@ -697,9 +699,9 @@ static bool postcopy_try_recover(void) /* * Here, we only wake up the main loading thread (while the - * fault thread will still be waiting), so that we can receive + * rest threads will still be waiting), so that we can receive * commands from source now, and answer it if needed. The - * fault thread will be woken up afterwards until we are sure + * rest threads will be woken up afterwards until we are sure * that source is ready to reply to page requests. */ qemu_sem_post(&mis->postcopy_pause_sem_dst); @@ -3466,6 +3468,17 @@ static MigThrError postcopy_pause(MigrationState *s) qemu_file_shutdown(file); qemu_fclose(file); + /* + * Do the same to postcopy fast path socket too if there is. No + * locking needed because no racer as long as we do this before setting + * status to paused. + */ + if (s->postcopy_qemufile_src) { + migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src); + qemu_fclose(s->postcopy_qemufile_src); + s->postcopy_qemufile_src = NULL; + } + migrate_set_state(&s->state, s->state, MIGRATION_STATUS_POSTCOPY_PAUSED); diff --git a/migration/migration.h b/migration/migration.h index b8aacfe3af..945088064a 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -118,6 +118,8 @@ struct MigrationIncomingState { /* Postcopy priority thread is used to receive postcopy requested pages */ QemuThread postcopy_prio_thread; bool postcopy_prio_thread_created; + /* Used to sync with the prio thread */ + QemuMutex postcopy_prio_thread_mutex; /* * An array of temp host huge pages to be used, one for each postcopy * channel. @@ -147,6 +149,7 @@ struct MigrationIncomingState { /* notify PAUSED postcopy incoming migrations to try to continue */ QemuSemaphore postcopy_pause_sem_dst; QemuSemaphore postcopy_pause_sem_fault; + QemuSemaphore postcopy_pause_sem_fast_load; /* List of listening socket addresses */ SocketAddressList *socket_address_list; diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 30eddaacd1..b3d23804bc 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1575,6 +1575,15 @@ int postcopy_preempt_setup(MigrationState *s, Error **errp) return 0; } +static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis) +{ + trace_postcopy_pause_fast_load(); + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); + qemu_sem_wait(&mis->postcopy_pause_sem_fast_load); + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + trace_postcopy_pause_fast_load_continued(); +} + void *postcopy_preempt_thread(void *opaque) { MigrationIncomingState *mis = opaque; @@ -1587,11 +1596,22 @@ void *postcopy_preempt_thread(void *opaque) qemu_sem_post(&mis->thread_sync_sem); /* Sending RAM_SAVE_FLAG_EOS to terminate this thread */ - ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY); + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + while (1) { + ret = ram_load_postcopy(mis->postcopy_qemufile_dst, RAM_CHANNEL_POSTCOPY); + /* If error happened, go into recovery routine */ + if (ret) { + postcopy_pause_ram_fast_load(mis); + } else { + /* We're done */ + break; + } + } + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); rcu_unregister_thread(); trace_postcopy_preempt_thread_exit(); - return ret == 0 ? NULL : (void *)-1; + return NULL; } diff --git a/migration/savevm.c b/migration/savevm.c index 254aa78234..2d32340d28 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2152,6 +2152,13 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) */ qemu_sem_post(&mis->postcopy_pause_sem_fault); + if (migrate_postcopy_preempt()) { + /* The channel should already be setup again; make sure of it */ + assert(mis->postcopy_qemufile_dst); + /* Kick the fast ram load thread too */ + qemu_sem_post(&mis->postcopy_pause_sem_fast_load); + } + return 0; } @@ -2607,6 +2614,16 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis) mis->to_src_file = NULL; qemu_mutex_unlock(&mis->rp_mutex); + if (mis->postcopy_qemufile_dst) { + qemu_file_shutdown(mis->postcopy_qemufile_dst); + /* Take the mutex to make sure the fast ram load thread halted */ + qemu_mutex_lock(&mis->postcopy_prio_thread_mutex); + migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst); + qemu_fclose(mis->postcopy_qemufile_dst); + mis->postcopy_qemufile_dst = NULL; + qemu_mutex_unlock(&mis->postcopy_prio_thread_mutex); + } + migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE, MIGRATION_STATUS_POSTCOPY_PAUSED); diff --git a/migration/trace-events b/migration/trace-events index b1155d9da0..dfe36a3340 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -270,6 +270,8 @@ mark_postcopy_blocktime_begin(uint64_t addr, void *dd, uint32_t time, int cpu, i mark_postcopy_blocktime_end(uint64_t addr, void *dd, uint32_t time, int affected_cpu) "addr: 0x%" PRIx64 ", dd: %p, time: %u, affected_cpu: %d" postcopy_pause_fault_thread(void) "" postcopy_pause_fault_thread_continued(void) "" +postcopy_pause_fast_load(void) "" +postcopy_pause_fast_load_continued(void) "" postcopy_ram_fault_thread_entry(void) "" postcopy_ram_fault_thread_exit(void) "" postcopy_ram_fault_thread_fds_core(int baseufd, int quitfd) "ufd: %d quitfd: %d"