From patchwork Thu Mar 16 20:02:28 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Xu, Anthony" X-Patchwork-Id: 9629299 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id ED01760249 for ; Thu, 16 Mar 2017 20:06:34 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DD29B2864B for ; Thu, 16 Mar 2017 20:06:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id CF7FD2868B; Thu, 16 Mar 2017 20:06:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 2372A2864B for ; Thu, 16 Mar 2017 20:06:34 +0000 (UTC) Received: from localhost ([::1]:45558 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cobfN-0005W1-5H for patchwork-qemu-devel@patchwork.kernel.org; Thu, 16 Mar 2017 16:06:33 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:45027) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cobbi-0002bM-Oe for qemu-devel@nongnu.org; Thu, 16 Mar 2017 16:02:47 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cobbd-0003WM-R6 for qemu-devel@nongnu.org; Thu, 16 Mar 2017 16:02:46 -0400 Received: from mga11.intel.com ([192.55.52.93]:5143) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cobbd-0003Vq-Gp for qemu-devel@nongnu.org; Thu, 16 Mar 2017 16:02:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=intel.com; i=@intel.com; q=dns/txt; s=intel; t=1489694561; x=1521230561; h=from:to:cc:subject:date:message-id:references: in-reply-to:content-transfer-encoding:mime-version; bh=4nzqyuXQcVho0OuPMP97h9CI9jsllz8jgkgIu5Yoajk=; b=H2CwhNKOWtle24CakvmcmspB2LA6jx+s/sRCvPL3eAUeNJ+zpVz8UVyA YRjS19hoHzNQTN79xS39x5FD2/mfIA==; Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 16 Mar 2017 13:02:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.36,173,1486454400"; d="scan'208";a="835604434" Received: from orsmsx105.amr.corp.intel.com ([10.22.225.132]) by FMSMGA003.fm.intel.com with ESMTP; 16 Mar 2017 13:02:30 -0700 Received: from orsmsx156.amr.corp.intel.com (10.22.240.22) by ORSMSX105.amr.corp.intel.com (10.22.225.132) with Microsoft SMTP Server (TLS) id 14.3.248.2; Thu, 16 Mar 2017 13:02:30 -0700 Received: from orsmsx112.amr.corp.intel.com ([169.254.3.204]) by ORSMSX156.amr.corp.intel.com ([169.254.8.38]) with mapi id 14.03.0248.002; Thu, 16 Mar 2017 13:02:30 -0700 From: "Xu, Anthony" To: Paolo Bonzini Thread-Topic: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to 2752KB Thread-Index: AQHSmW5s7VgxEgkfD0KNa0Uxoqhl0qGOUJAAgAAAp8AZksd/6v842HEggADaOoD///owIIABfJkAgAAwyEB6iksMHPwtPFMg Date: Thu, 16 Mar 2017 20:02:28 +0000 Message-ID: <4712D8F4B26E034E80552F30A67BE0B1A19A11@ORSMSX112.amr.corp.intel.com> References: <1489158897-9206-1-git-send-email-yang.zhong@intel.com> <4712D8F4B26E034E80552F30A67BE0B1A0F91A@ORSMSX112.amr.corp.intel.com> <920352018.1797331.1489251855464.JavaMail.zimbra@redhat.com> <4712D8F4B26E034E80552F30A67BE0B1A13045@ORSMSX112.amr.corp.intel.com> <991a0f3d-5e1f-d17c-a8f3-a42c02ac2f17@redhat.com> <4712D8F4B26E034E80552F30A67BE0B1A1561C@ORSMSX112.amr.corp.intel.com> <4712D8F4B26E034E80552F30A67BE0B1A16F14@ORSMSX112.amr.corp.intel.com> <1466597512.4602267.1489609298091.JavaMail.zimbra@redhat.com> In-Reply-To: <1466597512.4602267.1489609298091.JavaMail.zimbra@redhat.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ctpclassification: CTP_IC x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiNGU3NjhhNDktNjdmNy00MmE3LTlkOTctNzFlZjIxMWFlYmU0IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6Im5mTldjQWRSelRCUVgrcWdWZVlKcmFcL1JscktabXB2WklpZnBOdmxsb1NzPSJ9 x-originating-ip: [10.22.254.138] MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 192.55.52.93 Subject: Re: [Qemu-devel] [PATCH v1 2/2] reduce qemu's heap Rss size from 12252kB to 2752KB X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "Zhong, Yang" , "Peng, Chao P" , "qemu-devel@nongnu.org" Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP > > memory_region_finalize. > > Let me know if you think otherwise. > > Yes, you can replace memory_region_del_subregion in > memory_region_finalize > with special code that does > > assert(!mr->enabled); > assert(subregion->container == mr); > subregion->container = NULL; > QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link); > memory_region_unref(subregion); > > The last four lines are shared with memory_region_del_subregion, so please > factor them in a new function, for example > memory_region_del_subregion_internal. After adding synchronize_rcu, I saw an infinite recursive call, mem_commit-> memory_region_finalize-> mem_commit-> memory_region_finalize-> ...... it caused a segment fault, because 8M stack space is used up, and found when memory_region_finalize is called, memory_region_transaction_depth is 0 and memory_region_update_pending is true. That's not normal! As you mentioned in your previous email, that should not happen. >" But if memory_region_transaction_depth is == 0, there should be no >update pending because the loop has never run" The root cause is, MEMORY_LISTENER_CALL_GLOBAL(commit, Forward) is called before memory_region_clear_pending() in memory_region_transaction_commit. so when MEMORY_LISTENER_CALL_GLOBAL(commit, Forward) is called, memory_region_transaction_depth is 0 and memory_region_update_pending is true. mem_commit may call memory_region_finalize, it causes infinite recursive call. my previous fix for this is not complete. We may clear memory_region_update_pending before calling MEMORY_LISTENER_CALL_GLOBAL(commit, Forward) Please review below patch > > It is not slow, the contention is not high. Yes we can test. > > It's not a matter of contention. It's a matter of how long an RCU > critical section can be---not just at startup, but at any point > during QEMU execution. The thing is, seems both address_space_translate and address_space_dispatch_free are called under the global lock. When synchronize_rcu is called, no other threads are in RCU critical section. Seems RCU is not that useful for address space. > > Have you tried tcmalloc or jemalloc? They use the brk region > less and sometimes are more aggressive in releasing mmap-ed memory > areas. They may be aggressive. But if memory are freed afterward, it may not help in some cases, for example, starting a lot of VMs at the same time. > > > Please review below patch, MemoryRegion is protected by RCU. > > Removing transaction begin/commit in memory_region_finalize makes > little sense because memory_region_del_subregion calls those two > functions again. See above for an alternative. Apart from this, > the patch is at least correct, though I'm not sure it's a good > idea (synchronize_rcu needs testing). > I'm not sure I understand the > address_space_write change). After adding synchronize_rcu, we noticed a RCU dead loop. synchronize_rcu is called inside RCU critical section. It happened when guest OS programmed the PCI bar. The call trace is like, address_space_write-> pci_host_config_write_common -> memory_region_transaction_commit ->mem_commit-> synchronize_rcu pci_host_config_write_common is called inside RCU critical section. The address_space_write change fixed this issue. Thanks, Anthony diff --git a/memory.c b/memory.c index 64b0a60..4c95aaf 100644 --- a/memory.c +++ b/memory.c @@ -906,12 +906,6 @@ void memory_region_transaction_begin(void) ++memory_region_transaction_depth; } -static void memory_region_clear_pending(void) -{ - memory_region_update_pending = false; - ioeventfd_update_pending = false; -} - void memory_region_transaction_commit(void) { AddressSpace *as; @@ -927,14 +921,14 @@ void memory_region_transaction_commit(void) QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) { address_space_update_topology(as); } - + memory_region_update_pending = false; MEMORY_LISTENER_CALL_GLOBAL(commit, Forward); } else if (ioeventfd_update_pending) { QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) { address_space_update_ioeventfds(as); } + ioeventfd_update_pending = false; } - memory_region_clear_pending(); } }