From patchwork Tue Feb 11 12:08:18 2020
X-Patchwork-Submitter: bauerchen(陈蒙蒙)
X-Patchwork-Id: 11375515
From: bauerchen(陈蒙蒙)
To: qemu-devel
Cc: pbonzini
Subject: Requesting review about optimizing large guest start up time
Date: Tue, 11 Feb 2020 12:08:18 +0000
From c882b155466313fcd85ac330a45a573e608b0d74 Mon Sep 17 00:00:00 2001
From: bauerchen
Date: Tue, 11 Feb 2020 17:10:35 +0800
Subject: [PATCH] Optimize: large guest start-up in mem-prealloc

[desc]:
    A large-memory VM starts slowly when -mem-prealloc is used, and the
    current preallocation code leaves two things to optimize:

    1. While the page-clearing threads are being created, mmap() is
       called to allocate each new thread's stack, which takes
       mm->mmap_sem for writing; the clearing threads already hold it
       for reading, and this contention makes thread creation very
       slow.

    2. The pages-per-thread calculation is poor: if 64 threads split
       160 hugepages, 63 threads clear 2 pages each while one thread
       clears 34, so that one thread dominates the total time.

    To solve the first problem, the thread function now waits on a
    mutex/condition-variable pair, and page touching only starts once
    every clearing thread has been created.

    For the second problem, the remainder is spread across the threads:
    with 160 hugepages and 64 threads, 32 threads clear 3 pages each
    and 32 threads clear 2 pages each.

[test]:
    320G 84c VM start time can be reduced to 10s
    680G 84c VM start time can be reduced to 18s

Signed-off-by: bauerchen
Reviewed-by: Pan Rui
Reviewed-by: Ivan Ren
---
 util/oslib-posix.c | 44 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 36 insertions(+), 8 deletions(-)

diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 5a291cc..e97369b 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -76,6 +76,10 @@ static MemsetThread *memset_thread;
 static int memset_num_threads;
 static bool memset_thread_failed;
 
+static QemuMutex page_mutex;
+static QemuCond page_cond;
+static volatile bool thread_create_flag;
+
 int qemu_get_thread_id(void)
 {
 #if defined(__linux__)
@@ -403,6 +407,14 @@ static void *do_touch_pages(void *arg)
     MemsetThread *memset_args = (MemsetThread *)arg;
     sigset_t set, oldset;
 
+    /* wait until all page-touch threads have been created */
+    qemu_mutex_lock(&page_mutex);
+    while (!thread_create_flag) {
+        qemu_cond_wait(&page_cond, &page_mutex);
+    }
+    qemu_mutex_unlock(&page_mutex);
+
+
     /* unblock SIGBUS */
     sigemptyset(&set);
     sigaddset(&set, SIGBUS);
@@ -448,30 +460,46 @@ static inline int get_memset_num_threads(int smp_cpus)
     return ret;
 }
 
+static void calc_page_per_thread(size_t numpages, int memset_threads,
+                                 size_t *pages_per_thread)
+{
+    /* spread the remainder so thread loads differ by at most one page */
+    size_t avg = numpages / memset_threads + 1;
+    size_t last = avg * memset_threads - numpages;
+    int i;
+
+    for (i = 0; i < memset_threads; i++) {
+        pages_per_thread[i] = (memset_threads - i <= last) ? avg - 1 : avg;
+    }
+}
+
 static bool touch_all_pages(char *area, size_t hpagesize, size_t numpages,
                             int smp_cpus)
 {
-    size_t numpages_per_thread;
-    size_t size_per_thread;
+    size_t *numpages_per_thread;
     char *addr = area;
     int i = 0;
 
     memset_thread_failed = false;
+    thread_create_flag = false;
     memset_num_threads = get_memset_num_threads(smp_cpus);
+    numpages_per_thread = g_new0(size_t, memset_num_threads);
     memset_thread = g_new0(MemsetThread, memset_num_threads);
-    numpages_per_thread = (numpages / memset_num_threads);
-    size_per_thread = (hpagesize * numpages_per_thread);
+    calc_page_per_thread(numpages, memset_num_threads, numpages_per_thread);
+
     for (i = 0; i < memset_num_threads; i++) {
         memset_thread[i].addr = addr;
-        memset_thread[i].numpages = (i == (memset_num_threads - 1)) ?
-                                    numpages : numpages_per_thread;
+        memset_thread[i].numpages = numpages_per_thread[i];
         memset_thread[i].hpagesize = hpagesize;
         qemu_thread_create(&memset_thread[i].pgthread, "touch_pages",
                            do_touch_pages, &memset_thread[i],
                            QEMU_THREAD_JOINABLE);
-        addr += size_per_thread;
-        numpages -= numpages_per_thread;
+        addr += numpages_per_thread[i] * hpagesize;
+        numpages -= numpages_per_thread[i];
     }
+    thread_create_flag = true;
+    qemu_cond_broadcast(&page_cond);
+
     for (i = 0; i < memset_num_threads; i++) {
         qemu_thread_join(&memset_thread[i].pgthread);
    }
-- 
1.8.3.1
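
As a sanity check on the split arithmetic, here is a standalone sketch of
the remainder-spreading scheme, reproducing the 160-hugepage/64-thread
example from the description. The names split_pages/out are hypothetical;
the patch's calc_page_per_thread() above is the authoritative version:

#include <stddef.h>
#include <stdio.h>

/* Each thread receives either avg or avg - 1 pages, so per-thread loads
 * differ by at most one page instead of one thread absorbing the whole
 * remainder. */
static void split_pages(size_t numpages, int nthreads, size_t *out)
{
    size_t avg = numpages / nthreads + 1;
    size_t last = avg * nthreads - numpages;  /* threads that get avg - 1 */
    int i;

    for (i = 0; i < nthreads; i++) {
        out[i] = ((size_t)(nthreads - i) <= last) ? avg - 1 : avg;
    }
}

int main(void)
{
    size_t pages[64];
    size_t total = 0;
    int i;

    split_pages(160, 64, pages);
    for (i = 0; i < 64; i++) {
        total += pages[i];
    }
    /* prints "first 3, last 2, total 160": threads 0-31 touch 3 pages
     * each, threads 32-63 touch 2 pages each, as the description says */
    printf("first %zu, last %zu, total %zu\n", pages[0], pages[63], total);
    return 0;
}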
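
The start gate may deserve a closer look in review: the hunk above sets
thread_create_flag and broadcasts page_cond without holding page_mutex,
and the posted hunks do not show init calls for the new mutex/condvar.
Below is a minimal sketch of the gate pattern with the flag update taken
under the lock, which avoids a worker missing the wakeup between its flag
check and its cond_wait(). Plain pthreads stands in for QEMU's
qemu_mutex/qemu_cond wrappers here, and all names are hypothetical:

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cond = PTHREAD_COND_INITIALIZER;
static bool gate_open;  /* always read and written under gate_lock */

static void *worker(void *arg)
{
    /* Park until every pthread_create() has returned, so page touching
     * does not hold mm->mmap_sem for read while stacks are mmap'ed. */
    pthread_mutex_lock(&gate_lock);
    while (!gate_open) {
        pthread_cond_wait(&gate_cond, &gate_lock);
    }
    pthread_mutex_unlock(&gate_lock);

    /* ... touch this worker's share of the pages here ... */
    return NULL;
}

int main(void)
{
    pthread_t tid[4];
    int i;

    for (i = 0; i < 4; i++) {
        pthread_create(&tid[i], NULL, worker, NULL);
    }

    /* Flag update and broadcast happen under the lock; setting the flag
     * outside it (as the hunk above appears to do) can lose the wakeup
     * if a worker checks the flag just before blocking in cond_wait(). */
    pthread_mutex_lock(&gate_lock);
    gate_open = true;
    pthread_cond_broadcast(&gate_cond);
    pthread_mutex_unlock(&gate_lock);

    for (i = 0; i < 4; i++) {
        pthread_join(tid[i], NULL);
    }
    return 0;
}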