From patchwork Thu Jun 9 08:30:38 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Julien Grall
X-Patchwork-Id: 12874997
From: Julien Grall
To: xen-devel@lists.xenproject.org
Cc: bertrand.marquis@arm.com, Julien Grall, Andrew Cooper, George Dunlap,
 Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu
Subject: [PATCH 1/2] xen/heap: Split init_heap_pages() in two
Date: Thu, 9 Jun 2022 09:30:38 +0100
Message-Id: <20220609083039.76667-2-julien@xen.org>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220609083039.76667-1-julien@xen.org>
References: <20220609083039.76667-1-julien@xen.org>

From: Julien Grall

At the moment, init_heap_pages() will call free_heap_pages() page by
page. To reduce the time to initialize the heap, we will want to
provide multiple pages at the same time.

init_heap_pages() is now split in two parts:
    - init_heap_pages(): breaks down the range into multiple sets of
      contiguous pages. For now, the criterion is that the pages
      should belong to the same NUMA node.
    - init_contig_heap_pages(): initializes a set of contiguous pages.
      For now the pages are still passed one by one to
      free_heap_pages().

Note that the comment before init_heap_pages() is heavily outdated
and does not reflect the current code, so update it as well.

This patch is a merge/rework of patches from David Woodhouse and
Hongyan Xia.

Signed-off-by: Julien Grall

---

Interestingly, I was expecting this patch to perform worse. However,
testing shows a small increase in performance. That said, I kept the
patch split so that the refactoring stays separate from the
optimization.
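To illustrate the split with made-up numbers (a hypothetical NUMA node
boundary at MFN 0x100000, not taken from a real system): a call covering
0x2000 pages that straddles that boundary would now be broken down into
two runs, one per node:

    init_contig_heap_pages(mfn_to_page(_mfn(0x0ff000)), 0x1000, need_scrub); /* node 0 */
    init_contig_heap_pages(mfn_to_page(_mfn(0x100000)), 0x1000, need_scrub); /* node 1 */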
---
 xen/common/page_alloc.c | 82 +++++++++++++++++++++++++++--------------
 1 file changed, 55 insertions(+), 27 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 3e6504283f1e..a1938df1406c 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1778,16 +1778,55 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
 }
 
 /*
- * Hand the specified arbitrary page range to the specified heap zone
- * checking the node_id of the previous page. If they differ and the
- * latter is not on a MAX_ORDER boundary, then we reserve the page by
- * not freeing it to the buddy allocator.
+ * init_contig_heap_pages() is intended to only take pages from the same
+ * NUMA node.
  */
+static bool is_contig_page(struct page_info *pg, unsigned int nid)
+{
+    return (nid == (phys_to_nid(page_to_maddr(pg))));
+}
+
+/*
+ * This function should only be called with valid pages from the same NUMA
+ * node.
+ *
+ * Callers should use is_contig_page() first to check if all the pages
+ * in a range are contiguous.
+ */
+static void init_contig_heap_pages(struct page_info *pg, unsigned long nr_pages,
+                                   bool need_scrub)
+{
+    unsigned long s, e;
+    unsigned int nid = phys_to_nid(page_to_maddr(pg));
+
+    s = mfn_x(page_to_mfn(pg));
+    e = mfn_x(mfn_add(page_to_mfn(pg + nr_pages - 1), 1));
+    if ( unlikely(!avail[nid]) )
+    {
+        bool use_tail = !(s & ((1UL << MAX_ORDER) - 1)) &&
+                        (find_first_set_bit(e) <= find_first_set_bit(s));
+        unsigned long n;
+
+        n = init_node_heap(nid, s, nr_pages, &use_tail);
+        BUG_ON(n > nr_pages);
+        if ( use_tail )
+            e -= n;
+        else
+            s += n;
+    }
+
+    while ( s < e )
+    {
+        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
+        s += 1UL;
+    }
+}
+
 static void init_heap_pages(
     struct page_info *pg, unsigned long nr_pages)
 {
     unsigned long i;
-    bool idle_scrub = false;
+    bool need_scrub = scrub_debug;
 
     /*
      * Keep MFN 0 away from the buddy allocator to avoid crossing zone
@@ -1812,35 +1851,24 @@ static void init_heap_pages(
     spin_unlock(&heap_lock);
 
     if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE )
-        idle_scrub = true;
+        need_scrub = true;
 
-    for ( i = 0; i < nr_pages; i++ )
+    for ( i = 0; i < nr_pages; )
     {
-        unsigned int nid = phys_to_nid(page_to_maddr(pg+i));
+        unsigned int nid = phys_to_nid(page_to_maddr(pg));
+        unsigned long left = nr_pages - i;
+        unsigned long contig_pages;
 
-        if ( unlikely(!avail[nid]) )
+        for ( contig_pages = 1; contig_pages < left; contig_pages++ )
         {
-            unsigned long s = mfn_x(page_to_mfn(pg + i));
-            unsigned long e = mfn_x(mfn_add(page_to_mfn(pg + nr_pages - 1), 1));
-            bool use_tail = (nid == phys_to_nid(pfn_to_paddr(e - 1))) &&
-                            !(s & ((1UL << MAX_ORDER) - 1)) &&
-                            (find_first_set_bit(e) <= find_first_set_bit(s));
-            unsigned long n;
-
-            n = init_node_heap(nid, mfn_x(page_to_mfn(pg + i)), nr_pages - i,
-                               &use_tail);
-            BUG_ON(i + n > nr_pages);
-            if ( n && !use_tail )
-            {
-                i += n - 1;
-                continue;
-            }
-            if ( i + n == nr_pages )
+            if ( !is_contig_page(pg + contig_pages, nid) )
                 break;
-            nr_pages -= n;
         }
 
-        free_heap_pages(pg + i, 0, scrub_debug || idle_scrub);
+        init_contig_heap_pages(pg, contig_pages, need_scrub);
+
+        pg += contig_pages;
+        i += contig_pages;
     }
 }

From patchwork Thu Jun 9 08:30:39 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Julien Grall
X-Patchwork-Id: 12874996
From: Julien Grall
To: xen-devel@lists.xenproject.org
Cc: bertrand.marquis@arm.com, Hongyan Xia, Andrew Cooper, George Dunlap,
 Jan Beulich, Julien Grall, Stefano Stabellini, Wei Liu, Julien Grall
Subject: [PATCH 2/2] xen/heap: pass order to free_heap_pages() in heap init
Date: Thu, 9 Jun 2022 09:30:39 +0100
Message-Id: <20220609083039.76667-3-julien@xen.org>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220609083039.76667-1-julien@xen.org>
References: <20220609083039.76667-1-julien@xen.org>

From: Hongyan Xia

The idea is to split the range into multiple aligned power-of-2
regions, each of which only needs one call to free_heap_pages(). We
check the least significant set bit of the start address and use its
bit index as the order of this increment. This makes sure that each
increment is both power-of-2 and properly aligned, so it can be safely
passed to free_heap_pages(). Of course, the order also needs to be
sanity-checked against the upper bound of the range and MAX_ORDER.
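As an illustration of the chunking this produces, here is a standalone
sketch (GCC builtins stand in for Xen's flsl()/ffsl(), a 64-bit
unsigned long is assumed, and the MAX_ORDER value is a made-up
placeholder; this is not part of the patch):

    #include <stdio.h>

    #define MAX_ORDER 20 /* placeholder; Xen defines its own value */

    /* Print the order used for each free_heap_pages() call over [s, e). */
    static void decompose(unsigned long s, unsigned long e)
    {
        while ( s < e )
        {
            /* flsl(e - s) - 1: largest order fitting the remaining range. */
            int inc_order = 63 - __builtin_clzl(e - s);

            if ( inc_order > MAX_ORDER )
                inc_order = MAX_ORDER;
            /* ffsl(s) - 1: largest order the alignment of s allows. */
            if ( s && __builtin_ctzl(s) < inc_order )
                inc_order = __builtin_ctzl(s);
            printf("free_heap_pages(mfn %#lx, order %d)\n", s, inc_order);
            s += 1UL << inc_order;
        }
    }

    int main(void)
    {
        decompose(0x5, 0x20); /* orders 0, 1, 3, 4 at MFNs 5, 6, 8, 16 */
        return 0;
    }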
Testing on a nested environment on c5.metal with various amounts of
RAM. Time for end_boot_allocator() to complete:

            Before    After
    - 90GB: 1426 ms   166 ms
    -  8GB:  124 ms    12 ms
    -  4GB:   60 ms     6 ms

Signed-off-by: Hongyan Xia
Signed-off-by: Julien Grall
---
 xen/common/page_alloc.c | 39 +++++++++++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 6 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index a1938df1406c..bf852cfc11ea 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1779,16 +1779,28 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
 
 /*
  * init_contig_heap_pages() is intended to only take pages from the same
- * NUMA node.
+ * NUMA node and zone.
+ *
+ * For the latter, it is always true for !CONFIG_SEPARATE_XENHEAP since
+ * free_heap_pages() can only take power-of-two ranges which never cross
+ * zone boundaries. But for separate xenheap which is manually defined,
+ * it is possible for a power-of-two range to cross zones, so we need to
+ * check that as well.
  */
-static bool is_contig_page(struct page_info *pg, unsigned int nid)
+static bool is_contig_page(struct page_info *pg, unsigned int nid,
+                           unsigned int zone)
 {
+#ifdef CONFIG_SEPARATE_XENHEAP
+    if ( zone != page_to_zone(pg) )
+        return false;
+#endif
+
     return (nid == (phys_to_nid(page_to_maddr(pg))));
 }
 
 /*
  * This function should only be called with valid pages from the same NUMA
- * node.
+ * node and the same zone.
  *
  * Callers should use is_contig_page() first to check if all the pages
  * in a range are contiguous.
@@ -1817,8 +1829,22 @@ static void init_contig_heap_pages(struct page_info *pg, unsigned long nr_pages,
 
     while ( s < e )
     {
-        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
-        s += 1UL;
+        /*
+         * For s == 0, we simply use the largest increment by checking the
+         * index of the MSB set. For s != 0, we also need to ensure that the
+         * chunk is properly sized to end at power-of-two alignment. We do this
+         * by checking the LSB set and use its index as the increment. Both
+         * cases need to be guarded by MAX_ORDER.
+         *
+         * Note that the value of ffsl() and flsl() starts from 1 so we need
+         * to decrement it by 1.
+         */
+        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
+
+        if ( s )
+            inc_order = min(inc_order, ffsl(s) - 1);
+        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);
+        s += (1UL << inc_order);
     }
 }
 
@@ -1856,12 +1882,13 @@ static void init_heap_pages(
     for ( i = 0; i < nr_pages; )
     {
         unsigned int nid = phys_to_nid(page_to_maddr(pg));
+        unsigned int zone = page_to_zone(pg);
         unsigned long left = nr_pages - i;
         unsigned long contig_pages;
 
         for ( contig_pages = 1; contig_pages < left; contig_pages++ )
         {
-            if ( !is_contig_page(pg + contig_pages, nid) )
+            if ( !is_contig_page(pg + contig_pages, nid, zone) )
                 break;
         }