From patchwork Thu Mar 16 02:47:37 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Zhong X-Patchwork-Id: 9627015 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 04F4A60244 for ; Thu, 16 Mar 2017 02:48:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ECC8D28610 for ; Thu, 16 Mar 2017 02:48:00 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id DFEA62861A; Thu, 16 Mar 2017 02:48:00 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 0D73128610 for ; Thu, 16 Mar 2017 02:47:59 +0000 (UTC) Received: from localhost ([::1]:40536 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1coLSH-00054Z-It for patchwork-qemu-devel@patchwork.kernel.org; Wed, 15 Mar 2017 22:47:57 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:52297) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1coLS6-00054O-9j for qemu-devel@nongnu.org; Wed, 15 Mar 2017 22:47:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1coLS3-0000CW-5d for qemu-devel@nongnu.org; Wed, 15 Mar 2017 22:47:46 -0400 Received: from mga09.intel.com ([134.134.136.24]:2044) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1coLS2-0000C3-Oo for qemu-devel@nongnu.org; Wed, 15 Mar 2017 22:47:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=intel.com; i=@intel.com; q=dns/txt; s=intel; t=1489632462; x=1521168462; h=from:to:cc:subject:date:message-id:references: in-reply-to:content-transfer-encoding:mime-version; bh=R3X1zNeCmgmHr3LYl5cXY8F9tUW+iyxQabCr/aTcEdM=; b=lGhN9tSyN0eqvEsBI+Fj2U1Am8YGIiUdXmRNpl8D1wvF+Id3m5Axq6DW fk3TumVvuTj5s6o9jBA9gyF9NLOjUg==; Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga102.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 15 Mar 2017 19:47:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.36,170,1486454400"; d="scan'208";a="60981846" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by orsmga002.jf.intel.com with ESMTP; 15 Mar 2017 19:47:40 -0700 Received: from fmsmsx102.amr.corp.intel.com (10.18.124.200) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.248.2; Wed, 15 Mar 2017 19:47:40 -0700 Received: from shsmsx152.ccr.corp.intel.com (10.239.6.52) by FMSMSX102.amr.corp.intel.com (10.18.124.200) with Microsoft SMTP Server (TLS) id 14.3.248.2; Wed, 15 Mar 2017 19:47:40 -0700 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.88]) by SHSMSX152.ccr.corp.intel.com ([169.254.6.132]) with mapi id 14.03.0248.002; Thu, 16 Mar 2017 10:47:38 +0800 From: "Zhong, Yang" To: Paolo Bonzini , "Xu, Anthony" Thread-Topic: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to 2752KB Thread-Index: AQHSmW5shUOz3c6wV0avdbfYdjxFbqGNRFcAgACNkYCAAYglgIAD8MIAgABT2oCAALrQgIAAu/kAgACv5ACAABUwAIAA79LA Date: Thu, 16 Mar 2017 02:47:37 +0000 Message-ID: <7A85DF989CAE8F42902CF7B31A7D94A139557C1D@shsmsx102.ccr.corp.intel.com> References: <1489158897-9206-1-git-send-email-yang.zhong@intel.com> <4712D8F4B26E034E80552F30A67BE0B1A0F91A@ORSMSX112.amr.corp.intel.com> <920352018.1797331.1489251855464.JavaMail.zimbra@redhat.com> <4712D8F4B26E034E80552F30A67BE0B1A13045@ORSMSX112.amr.corp.intel.com> <991a0f3d-5e1f-d17c-a8f3-a42c02ac2f17@redhat.com> <4712D8F4B26E034E80552F30A67BE0B1A1561C@ORSMSX112.amr.corp.intel.com> <4712D8F4B26E034E80552F30A67BE0B1A16F14@ORSMSX112.amr.corp.intel.com> <1466597512.4602267.1489609298091.JavaMail.zimbra@redhat.com> In-Reply-To: <1466597512.4602267.1489609298091.JavaMail.zimbra@redhat.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.24 Subject: Re: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to 2752KB X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Peng, Chao P" , "qemu-devel@nongnu.org" Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Hello Paolo, As for your said tcmalloc or jemalloc for optimization memory allocation, we also tried the malloc_trim() in glibc, which can get the same memory optimization. The man info in malloc_trim is only release free memory from the top of the heap, in fact, we found this function also release hole memory below top of heap. Thanks! ZhongYang -----Original Message----- From: Paolo Bonzini [mailto:pbonzini@redhat.com] Sent: Thursday, March 16, 2017 4:22 AM To: Xu, Anthony Cc: Zhong, Yang ; Peng, Chao P ; qemu-devel@nongnu.org Subject: Re: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to 2752KB ----- Original Message ----- > From: "Anthony Xu" > To: "Paolo Bonzini" > Cc: "Yang Zhong" , "Chao P Peng" > , qemu-devel@nongnu.org > Sent: Wednesday, March 15, 2017 8:05:48 PM > Subject: Re: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size > from 12252kB to 2752KB > > > The first unref is done after as->current_map is overwritten. > > as->current_map is accessed under RCU, so it needs call_rcu. It > > balances the initial reference that is present since flatview_init. > > Got it, thanks for explanation. > > > > but it is not clear to me, is this a bug or by design? Is > > > flatview_destroy called > > in current thread > > > or RCU thread? > > > > You cannot know, because there are also other callers of > > address_space_get_flatview. Usually it's from the RCU thread. > > If it is from current thread, do we need to check if the global lock > is acquired? > As you mentioned flatview_unref may call memory_region_unref. No, you don't need to check it. Most functions (in this case, memory_region_transaction_commit) can only be called under the global lock. In fact, only address_space_read/write/ld*/st*, plus memory_region_find can be called without the global lock. > In the comments of memory_region_finalize > " /* We know the region is not visible in any address space (it > * does not have a container and cannot be a root either because > * it has no references, > " > > We know the memory region is not a root region, and the memory region > has already been removed from address space even it has sub regions. > memory_region_transaction_commit should be called when the memory > region is removed from address space. > memory_region_transaction_commit seems not be needed in > memory_region_finalize. > Let me know if you think otherwise. Yes, you can replace memory_region_del_subregion in memory_region_finalize with special code that does assert(!mr->enabled); assert(subregion->container == mr); subregion->container = NULL; QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link); memory_region_unref(subregion); The last four lines are shared with memory_region_del_subregion, so please factor them in a new function, for example memory_region_del_subregion_internal. > > > How about fall back to synchronize_rcu? > > > > I'm afraid it would be too slow, but you can test. > > It is not slow, the contention is not high. Yes we can test. It's not a matter of contention. It's a matter of how long an RCU critical section can be---not just at startup, but at any point during QEMU execution. Have you tried tcmalloc or jemalloc? They use the brk region less and sometimes are more aggressive in releasing mmap-ed memory areas. > Please review below patch, MemoryRegion is protected by RCU. Removing transaction begin/commit in memory_region_finalize makes little sense because memory_region_del_subregion calls those two functions again. See above for an alternative. Apart from this, the patch is at least correct, though I'm not sure it's a good idea (synchronize_rcu needs testing). I'm not sure I understand the address_space_write change). Paolo > Thanks, > Anthony > > > > diff --git a/exec.c b/exec.c > index a22f5a0..6631668 100644 > --- a/exec.c > +++ b/exec.c > @@ -2521,7 +2521,8 @@ static void mem_commit(MemoryListener *listener) > > atomic_rcu_set(&as->dispatch, next); > if (cur) { > - call_rcu(cur, address_space_dispatch_free, rcu); > + synchronize_rcu(); > + address_space_dispatch_free(cur); > } > } > > @@ -2567,7 +2568,8 @@ void address_space_destroy_dispatch(AddressSpace > *as) > > atomic_rcu_set(&as->dispatch, NULL); > if (d) { > - call_rcu(d, address_space_dispatch_free, rcu); > + synchronize_rcu(); > + address_space_dispatch_free(d); > } > } > > @@ -2712,19 +2714,22 @@ static bool prepare_mmio_access(MemoryRegion *mr) > return release_lock; > } > > -/* Called within RCU critical section. */ -static MemTxResult > address_space_write_continue(AddressSpace *as, hwaddr addr, > - MemTxAttrs attrs, > - const uint8_t *buf, > - int len, hwaddr addr1, > - hwaddr l, MemoryRegion *mr) > +MemTxResult address_space_write(AddressSpace *as, hwaddr addr, > + MemTxAttrs attrs, const uint8_t *buf, int len) > { > uint8_t *ptr; > uint64_t val; > + hwaddr l=len; > + hwaddr addr1; > + MemoryRegion *mr; > MemTxResult result = MEMTX_OK; > bool release_lock = false; > > for (;;) { > + rcu_read_lock(); > + mr = address_space_translate(as, addr, &addr1, &l, true); > + memory_region_ref(mr); > + rcu_read_unlock(); > if (!memory_access_is_direct(mr, true)) { > release_lock |= prepare_mmio_access(mr); > l = memory_access_size(mr, l, addr1); @@ -2764,7 +2769,7 > @@ static MemTxResult address_space_write_continue(AddressSpace *as, > hwaddr addr, > memcpy(ptr, buf, l); > invalidate_and_set_dirty(mr, addr1, l); > } > - > + memory_region_unref(mr); > if (release_lock) { > qemu_mutex_unlock_iothread(); > release_lock = false; > @@ -2779,27 +2784,6 @@ static MemTxResult > address_space_write_continue(AddressSpace *as, hwaddr addr, > } > > l = len; > - mr = address_space_translate(as, addr, &addr1, &l, true); > - } > - > - return result; > -} > - > -MemTxResult address_space_write(AddressSpace *as, hwaddr addr, > MemTxAttrs attrs, > - const uint8_t *buf, int len) > -{ > - hwaddr l; > - hwaddr addr1; > - MemoryRegion *mr; > - MemTxResult result = MEMTX_OK; > - > - if (len > 0) { > - rcu_read_lock(); > - l = len; > - mr = address_space_translate(as, addr, &addr1, &l, true); > - result = address_space_write_continue(as, addr, attrs, buf, len, > - addr1, l, mr); > - rcu_read_unlock(); > } > > return result; > diff --git a/memory.c b/memory.c > index 64b0a60..d12437c 100644 > --- a/memory.c > +++ b/memory.c > @@ -1505,13 +1505,10 @@ static void memory_region_finalize(Object *obj) > * and cause an infinite loop. > */ > mr->enabled = false; > - memory_region_transaction_begin(); > while (!QTAILQ_EMPTY(&mr->subregions)) { > MemoryRegion *subregion = QTAILQ_FIRST(&mr->subregions); > memory_region_del_subregion(mr, subregion); > } > - memory_region_transaction_commit(); > - > mr->destructor(mr); > memory_region_clear_coalescing(mr); > g_free((char *)mr->name); > > > diff --git a/util/rcu.c b/util/rcu.c index 9adc5e4..3643e43 100644 --- a/util/rcu.c +++ b/util/rcu.c @@ -32,7 +32,7 @@ #include "qemu/atomic.h" #include "qemu/thread.h" #include "qemu/main-loop.h" - +#include "malloc.h" /* * Global grace period counter. Bit 0 is always one in rcu_gp_ctr. * Bits 1 and above are defined in synchronize_rcu. @@ -271,6 +271,7 @@ static void *call_rcu_thread(void *opaque) n--; node->func(node); } + malloc_trim(0); qemu_mutex_unlock_iothread(); } abort();