From patchwork Tue Mar 1 12:25:54 2016
From: "Dr. David Alan Gilbert"
Date: Tue, 1 Mar 2016 12:25:54 +0000
To: Hailiang Zhang
Message-ID: <20160301122554.GA3745@work-vm>
References: <1456108832-24212-1-git-send-email-zhang.zhanghailiang@huawei.com>
 <20160225195232.GB18374@work-vm>
 <20160226163602.GM2161@work-vm>
 <56D15653.90406@huawei.com>
 <20160229094715.GA2125@work-vm>
 <56D43693.5050401@huawei.com>
In-Reply-To: <56D43693.5050401@huawei.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Cc: xiecl.fnst@cn.fujitsu.com, lizhijian@cn.fujitsu.com, quintela@redhat.com,
 armbru@redhat.com, yunhong.jiang@intel.com, eddie.dong@intel.com,
 peter.huangpeng@huawei.com, qemu-devel@nongnu.org, arei.gonglei@huawei.com,
 stefanha@redhat.com, pbonzini@redhat.com, amit.shah@redhat.com,
 zhangchen.fnst@cn.fujitsu.com, hongyang.yang@easystack.cn
Subject: Re: [Qemu-devel] [PATCH COLO-Frame v15 00/38] COarse-grain LOck-stepping(COLO) Virtual Machines for Non-stop Service (FT)

* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> On 2016/2/29 17:47, Dr.
David Alan Gilbert wrote:
> >* Hailiang Zhang (zhang.zhanghailiang@huawei.com) wrote:
> >>On 2016/2/27 0:36, Dr. David Alan Gilbert wrote:
> >>>* Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> >>>>* zhanghailiang (zhang.zhanghailiang@huawei.com) wrote:
> >>>>>From: root
> >>>>>
> >>>>>This is the 15th version of COLO (still only supporting periodic checkpoints).
> >>>>>
> >>>>>Here is only the COLO frame part; you can get the whole code from github:
> >>>>>https://github.com/coloft/qemu/commits/colo-v2.6-periodic-mode
> >>>>>
> >>>>>There are few changes in this series except for the network-related part.
> >>>>
> >>>>I was looking at the time the guest is paused during COLO and was
> >>>>surprised to find that one of the larger chunks was the time taken to
> >>>>reset the guest before loading each checkpoint; I've traced it part way,
> >>>>and the biggest contributors for my test VM seem to be:
> >>>>
> >>>>    3.8ms pcibus_reset: VGA
> >>>>    1.8ms pcibus_reset: virtio-net-pci
> >>>>    1.5ms pcibus_reset: virtio-blk-pci
> >>>>    1.5ms qemu_devices_reset: piix4_reset
> >>>>    1.1ms pcibus_reset: piix3-ide
> >>>>    1.1ms pcibus_reset: virtio-rng-pci
> >>>>
> >>>>I've not looked deeper yet, but some of these are very silly; I'm
> >>>>running with -nographic, so why it takes 3.8ms to reset VGA is going
> >>>>to be interesting.
> >>>>Also, my only block device is the virtio-blk one, so while I understand
> >>>>that the standard PC machine has the IDE controller, it's unclear why
> >>>>it takes over a ms to reset an unused device.
> >>>
> >>>OK, so I've dug a bit deeper, and it appears that it's the changes to
> >>>the PCI BARs that actually take the time; every time we do a reset we
> >>>reset all the BARs, which causes a pci_update_mappings and ends up
> >>>doing a memory_region_del_subregion.
> >>>Then we load the config space of the PCI device as we do the
> >>>vmstate_load, and this recreates all the mappings again.
> >>>
> >>>I'm not sure what the fix is, but it sounds like it would usefully
> >>>speed up the checkpoints if we can avoid the map/remap when they're
> >>>the same.
> >>>
> >>
> >>Interesting, and thanks for your report.
> >>
> >>We already knew qemu_system_reset() is a time-consuming function and we
> >>shouldn't call it here, but if we don't, there will be a bug, which we
> >>reported before in the previous COLO series; below is a copy of the
> >>related patch comment:

Paolo suggested one fix, see the patch below; I'm not sure if it's safe (in
particular, what if the guest changed a BAR and the device code tried to
access the memory while loading the state?) - but it does seem to work and
shaves ~10ms off the reset/load times:

Dave

commit 7570b2984143860005ad9fe79f5394c75f294328
Author: Dr. David Alan Gilbert
Date:   Tue Mar 1 12:08:14 2016 +0000

    COLO: Lock memory map around reset/load

    Changing the memory map appears to be expensive; we see this
    particularly when, on loading a checkpoint, we:
      a) reset the devices
         This causes the PCI BARs to be reset
      b) load the device states
         This causes the PCI BARs to be reloaded.

    Turning this all into a single memory_region_transaction saves
    ~10ms/checkpoint.

    TBD: What happens if the device code accesses the RAM while loading
    the checkpoint?

    Signed-off-by: Dr. David Alan Gilbert
    Suggested-by: Paolo Bonzini

---
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

diff --git a/migration/colo.c b/migration/colo.c
index 45c3432..c44fb2a 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -22,6 +22,7 @@
 #include "net/colo-proxy.h"
 #include "net/net.h"
 #include "block/block_int.h"
+#include "exec/memory.h"
 
 static bool vmstate_loading;
 
@@ -934,6 +935,7 @@ void *colo_process_incoming_thread(void *opaque)
         stage_time_start = qemu_clock_get_us(QEMU_CLOCK_HOST);
 
         qemu_mutex_lock_iothread();
+        memory_region_transaction_begin();
         qemu_system_reset(VMRESET_SILENT);
         stage_time_end = qemu_clock_get_us(QEMU_CLOCK_HOST);
         timed_average_account(&mis->colo_state.time_reset,
@@ -947,6 +949,7 @@ void *colo_process_incoming_thread(void *opaque)
                               stage_time_end - stage_time_start);
         stage_time_start = stage_time_end;
         ret = qemu_load_device_state(fb);
+        memory_region_transaction_commit();
         if (ret < 0) {
             error_report("COLO: load device state failed\n");
             vmstate_loading = false;