From patchwork Mon Apr 10 13:52:03 2017
X-Patchwork-Submitter: Stefan Hajnoczi
X-Patchwork-Id: 9672695
Date: Mon, 10 Apr 2017 14:52:03 +0100
From: Stefan Hajnoczi
To: 858585 jemmy
Message-ID: <20170410135203.GA20631@stefanha-x1.localdomain>
References: <1491384478-12325-1-git-send-email-lidongchen@tencent.com>
 <20170406140255.GA31259@stefanha-x1.localdomain>
 <20170407113450.GO13602@stefanha-x1.localdomain>
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH v3] migration/block: limit the time used for block migration
Cc: Fam Zheng, qemu-block@nongnu.org, quintela@redhat.com, qemu-devel@nongnu.org, dgilbert@redhat.com, Stefan Hajnoczi, Lidong Chen

On Sat, Apr 08, 2017 at 09:17:58PM +0800, 858585 jemmy wrote:
> On Fri, Apr 7, 2017 at 7:34 PM, Stefan Hajnoczi wrote:
> > On Fri, Apr 07, 2017 at 09:30:33AM +0800, 858585 jemmy wrote:
> >> On Thu, Apr 6, 2017 at 10:02 PM, Stefan Hajnoczi wrote:
> >> > On Wed, Apr 05, 2017 at 05:27:58PM +0800, jemmy858585@gmail.com wrote:
> >> >
> >> > A proper solution is to refactor the synchronous code to make it
> >> > asynchronous. This might require invoking the system call from a
> >> > thread pool worker.
> >> >
> >>
> >> Yes, I agree with you, but this is a big change.
> >> I will try to find out how to optimize this code; it may take a long time.
> >>
> >> This patch is not a perfect solution, but it can alleviate the problem.
> >
> > Let's try to understand the problem fully first.
> >
> When I migrate the VM at high speed, I sometimes find that VNC responds
> slowly. Not only VNC: the virsh console also responds slowly at times,
> and the guest OS block I/O performance is reduced as well.
>
> The bug can be reproduced with these commands:
> virsh migrate-setspeed 165cf436-312f-47e7-90f2-f8aa63f34893 900
> virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
> --copy-storage-inc qemu+ssh://10.59.163.38/system
>
> With --copy-storage-all there is no problem:
> virsh migrate --live 165cf436-312f-47e7-90f2-f8aa63f34893
> --copy-storage-all qemu+ssh://10.59.163.38/system
>
> Comparing the difference between --copy-storage-inc and
> --copy-storage-all, I found the reason is that mig_save_device_bulk
> invokes bdrv_is_allocated, which is synchronous and may wait for a
> long time.
>
> I wrote this code to measure the time used by bdrv_is_allocated():
>
>     static int max_time = 0;
>     int tmp;
>
>     clock_gettime(CLOCK_MONOTONIC_RAW, &ts1);
>     ret = bdrv_is_allocated(blk_bs(bb), cur_sector,
>                             MAX_IS_ALLOCATED_SEARCH, &nr_sectors);
>     clock_gettime(CLOCK_MONOTONIC_RAW, &ts2);
>
>     tmp = (ts2.tv_sec - ts1.tv_sec) * 1000000000L
>           + (ts2.tv_nsec - ts1.tv_nsec);
>     if (tmp > max_time) {
>         max_time = tmp;
>         fprintf(stderr, "max_time is %d\n", max_time);
>     }
>
> The test results (in nanoseconds) are below:
>
> max_time is 37014
> max_time is 1075534
> max_time is 17180913
> max_time is 28586762
> max_time is 49563584
> max_time is 103085447
> max_time is 110836833
> max_time is 120331438
>
> bdrv_is_allocated is called after qemu_mutex_lock_iothread, and the
> main thread also calls qemu_mutex_lock_iothread, so the main thread
> may wait for a long time.
>
> if (bmds->shared_base) {
>     qemu_mutex_lock_iothread();
>     aio_context_acquire(blk_get_aio_context(bb));
>     /* Skip unallocated sectors; intentionally treats failure as
>      * an allocated sector */
>     while (cur_sector < total_sectors &&
>            !bdrv_is_allocated(blk_bs(bb), cur_sector,
>                               MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
>         cur_sector += nr_sectors;
>     }
>     aio_context_release(blk_get_aio_context(bb));
>     qemu_mutex_unlock_iothread();
> }
>
> #0  0x00007f107322f264 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00007f107322a508 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00007f107322a3d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x0000000000949ecb in qemu_mutex_lock (mutex=0xfc51a0) at util/qemu-thread-posix.c:60
> #4  0x0000000000459e58 in qemu_mutex_lock_iothread () at /root/qemu/cpus.c:1516
> #5  0x0000000000945322 in os_host_main_loop_wait (timeout=28911939) at util/main-loop.c:258
> #6  0x00000000009453f2 in main_loop_wait (nonblocking=0) at util/main-loop.c:517
> #7  0x00000000005c76b4 in main_loop () at vl.c:1898
> #8  0x00000000005ceb77 in main (argc=49, argv=0x7fff921182b8, envp=0x7fff92118448) at vl.c:4709

The following patch moves bdrv_is_allocated() into bb's AioContext.  It
will execute without blocking other I/O activity.  Compile-tested only.
diff --git a/migration/block.c b/migration/block.c
index 7734ff7..a5572a4 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -263,6 +263,29 @@ static void blk_mig_read_cb(void *opaque, int ret)
     blk_mig_unlock();
 }
 
+typedef struct {
+    int64_t *total_sectors;
+    int64_t *cur_sector;
+    BlockBackend *bb;
+    QemuEvent event;
+} MigNextAllocatedClusterData;
+
+static void coroutine_fn mig_next_allocated_cluster(void *opaque)
+{
+    MigNextAllocatedClusterData *data = opaque;
+    int nr_sectors;
+
+    /* Skip unallocated sectors; intentionally treats failure as
+     * an allocated sector */
+    while (*data->cur_sector < *data->total_sectors &&
+           !bdrv_is_allocated(blk_bs(data->bb), *data->cur_sector,
+                              MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
+        *data->cur_sector += nr_sectors;
+    }
+
+    qemu_event_set(&data->event);
+}
+
 /* Called with no lock taken.  */
 
 static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
@@ -274,17 +297,27 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
     int nr_sectors;
 
     if (bmds->shared_base) {
+        /* Searching for the next allocated cluster can block.  Do it in a
+         * coroutine inside bb's AioContext.  That way we don't need to hold
+         * the global mutex while blocked.
+         */
+        AioContext *bb_ctx;
+        Coroutine *co;
+        MigNextAllocatedClusterData data = {
+            .cur_sector = &cur_sector,
+            .total_sectors = &total_sectors,
+            .bb = bb,
+        };
+
+        qemu_event_init(&data.event, false);
+
         qemu_mutex_lock_iothread();
-        aio_context_acquire(blk_get_aio_context(bb));
-        /* Skip unallocated sectors; intentionally treats failure as
-         * an allocated sector */
-        while (cur_sector < total_sectors &&
-               !bdrv_is_allocated(blk_bs(bb), cur_sector,
-                                  MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
-            cur_sector += nr_sectors;
-        }
-        aio_context_release(blk_get_aio_context(bb));
+        bb_ctx = blk_get_aio_context(bb);
         qemu_mutex_unlock_iothread();
+
+        co = qemu_coroutine_create(mig_next_allocated_cluster, &data);
+        aio_co_schedule(bb_ctx, co);
+        qemu_event_wait(&data.event);
     }
 
     if (cur_sector >= total_sectors) {
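For readers unfamiliar with the pattern the patch relies on (run the blocking scan elsewhere, then wait on a one-shot event instead of holding the global mutex), here is a rough standalone sketch in plain pthreads. `Event`, `ScanJob`, `scan_worker`, and `skip_unallocated` are hypothetical names for illustration, not QEMU APIs; QemuEvent and coroutines play the corresponding roles in the patch:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for QemuEvent: a one-shot event built from a mutex and
 * a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool set;
} Event;

static void event_init(Event *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->set = false;
}

static void event_set(Event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->set = true;
    pthread_cond_signal(&ev->cond);
    pthread_mutex_unlock(&ev->lock);
}

static void event_wait(Event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->set) {
        pthread_cond_wait(&ev->cond, &ev->lock);
    }
    pthread_mutex_unlock(&ev->lock);
}

/* allocated[] and the sector counters are stand-ins for bdrv_is_allocated()
 * and the migration state in the patch. */
typedef struct {
    const bool *allocated;
    long total_sectors;
    long cur_sector;
    Event done;
} ScanJob;

/* Worker: skip a run of unallocated sectors, then signal completion. */
static void *scan_worker(void *opaque)
{
    ScanJob *job = opaque;

    while (job->cur_sector < job->total_sectors &&
           !job->allocated[job->cur_sector]) {
        job->cur_sector++;
    }
    event_set(&job->done);
    return NULL;
}

/* Offload the scan and wait for it.  The caller could drop a big lock
 * between pthread_create() and event_wait(), which is the point of the
 * patch's qemu_mutex_unlock_iothread() before qemu_event_wait(). */
long skip_unallocated(const bool *allocated, long total_sectors)
{
    ScanJob job = { .allocated = allocated, .total_sectors = total_sectors };
    pthread_t tid;

    event_init(&job.done);
    pthread_create(&tid, NULL, scan_worker, &job);
    event_wait(&job.done);
    pthread_join(&tid, NULL);
    return job.cur_sector;
}
```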