From patchwork Thu Aug 23 02:29:02 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10573343
Subject: [RFC PATCH] mm: Streamline deferred_init_pages and deferred_free_pages
From: Alexander Duyck <alexander.duyck@gmail.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, mgorman@techsingularity.net,
 pasha.tatashin@oracle.com, vbabka@suse.cz, mhocko@kernel.org
Date: Wed, 22 Aug 2018 19:29:02 -0700
Message-ID: <20180823022116.4536.6104.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck <alexander.duyck@gmail.com>

From what I could tell, deferred_init_pages and deferred_free_pages were
running less than optimally: a number of checks that only need to be made
once per page block were being made for every page. For example, the
pfn_valid check has to be run for every page on architectures that can have
holes within a pageblock, but only once per page block on architectures
that cannot.

We can avoid the per-page cost of these checks by using pfn_valid_within
inside the loop and running pfn_valid only on the first access to any given
page block.

Also, in the case of either a node ID mismatch or the pfn_valid check
failing on an architecture that doesn't support holes, it doesn't make
sense to initialize pages for an invalid page block. To skip over those
pages I have opted to OR in the page block mask, which lets us jump
straight to the end of the given page block.

With this patch I am seeing a modest improvement in boot time, as shown
below on a system with 64GB on each node:

-- before --
[ 2.945572] node 0 initialised, 15432905 pages in 636ms
[ 2.968575] node 1 initialised, 15957078 pages in 659ms

-- after --
[ 2.770127] node 0 initialised, 15432905 pages in 457ms
[ 2.785129] node 1 initialised, 15957078 pages in 472ms

Signed-off-by: Alexander Duyck <alexander.duyck@gmail.com>
---
I'm putting this out as an RFC in order to determine if the assumptions I
have made are valid. I wasn't certain they were correct, but the definition
and usage of deferred_pfn_valid just didn't seem right to me, since it
could result in us skipping a head page and then still initializing and
freeing all of the tail pages.
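
To illustrate the page block mask trick described above, here is a small
standalone sketch (illustration only, not part of the patch; the pageblock
size of 512 pages is just an example value). ORing the mask into a pfn
moves it to the last pfn of its pageblock, so the pfn++ at the top of the
next iteration lands on the first pfn of the following block:

/*
 * Illustration only: EXAMPLE_PAGEBLOCK_NR_PAGES is an assumed example
 * value, not the kernel's pageblock_nr_pages configuration.
 */
#include <stdio.h>

#define EXAMPLE_PAGEBLOCK_NR_PAGES 512UL
#define EXAMPLE_PGMASK (EXAMPLE_PAGEBLOCK_NR_PAGES - 1)

int main(void)
{
	unsigned long pfn = 1000;	/* somewhere inside a pageblock */

	/* OR in the pageblock mask: pfn becomes the last pfn of its block */
	pfn |= EXAMPLE_PGMASK;		/* 1000 -> 1023 */

	/* the loop's pfn++ then starts the next block */
	pfn++;				/* 1023 -> 1024 */

	printf("next pageblock starts at pfn %lu\n", pfn);
	return 0;
}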
 mm/page_alloc.c |  152 +++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 101 insertions(+), 51 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 15ea511fb41c..0aca48b377fa 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1410,27 +1410,17 @@ void clear_zone_contiguous(struct zone *zone)
 static void __init deferred_free_range(unsigned long pfn,
 				       unsigned long nr_pages)
 {
-	struct page *page;
-	unsigned long i;
+	struct page *page, *last_page;
 
 	if (!nr_pages)
 		return;
 
 	page = pfn_to_page(pfn);
+	last_page = page + nr_pages;
 
-	/* Free a large naturally-aligned chunk if possible */
-	if (nr_pages == pageblock_nr_pages &&
-	    (pfn & (pageblock_nr_pages - 1)) == 0) {
-		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-		__free_pages_boot_core(page, pageblock_order);
-		return;
-	}
-
-	for (i = 0; i < nr_pages; i++, page++, pfn++) {
-		if ((pfn & (pageblock_nr_pages - 1)) == 0)
-			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+	do {
 		__free_pages_boot_core(page, 0);
-	}
+	} while (++page < last_page);
 }
 
 /* Completion tracking for deferred_init_memmap() threads */
@@ -1446,14 +1436,10 @@ static inline void __init pgdat_init_report_one_done(void)
 /*
  * Returns true if page needs to be initialized or freed to buddy allocator.
  *
- * First we check if pfn is valid on architectures where it is possible to have
- * holes within pageblock_nr_pages. On systems where it is not possible, this
- * function is optimized out.
+ * First we check if a current large page is valid by only checking the validity
+ * of the first pfn we have access to in the page.
  *
- * Then, we check if a current large page is valid by only checking the validity
- * of the head pfn.
- *
- * Finally, meminit_pfn_in_nid is checked on systems where pfns can interleave
+ * Then, meminit_pfn_in_nid is checked on systems where pfns can interleave
  * within a node: a pfn is between start and end of a node, but does not belong
  * to this memory node.
  */
@@ -1461,9 +1447,7 @@ static inline void __init pgdat_init_report_one_done(void)
 deferred_pfn_valid(int nid, unsigned long pfn,
 		   struct mminit_pfnnid_cache *nid_init_state)
 {
-	if (!pfn_valid_within(pfn))
-		return false;
-	if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
+	if (!pfn_valid(pfn))
 		return false;
 	if (!meminit_pfn_in_nid(pfn, nid, nid_init_state))
 		return false;
@@ -1477,24 +1461,53 @@ static inline void __init pgdat_init_report_one_done(void)
 static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
 				       unsigned long end_pfn)
 {
-	struct mminit_pfnnid_cache nid_init_state = { };
 	unsigned long nr_pgmask = pageblock_nr_pages - 1;
-	unsigned long nr_free = 0;
+	struct mminit_pfnnid_cache nid_init_state = { };
 
-	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
-			deferred_free_range(pfn - nr_free, nr_free);
-			nr_free = 0;
-		} else if (!(pfn & nr_pgmask)) {
-			deferred_free_range(pfn - nr_free, nr_free);
-			nr_free = 1;
-			touch_nmi_watchdog();
-		} else {
-			nr_free++;
+	while (pfn < end_pfn) {
+		unsigned long aligned_pfn, nr_free;
+
+		/*
+		 * Determine if our first pfn is valid, use this as a
+		 * representative value for the page block. Store the value
+		 * as either 0 or 1 to nr_free.
+		 *
+		 * If the pfn itself is valid, but this page block isn't
+		 * then we can assume the issue is the entire page block
+		 * and it can be skipped.
+		 */
+		nr_free = !!deferred_pfn_valid(nid, pfn, &nid_init_state);
+		if (!nr_free && pfn_valid_within(pfn))
+			pfn |= nr_pgmask;
+
+		/*
+		 * Move to next pfn and align the end of our next section
+		 * to process with the end of the block. If we were given
+		 * the end of a block to process we will do nothing in the
+		 * loop below.
+		 */
+		pfn++;
+		aligned_pfn = min(__ALIGN_MASK(pfn, nr_pgmask), end_pfn);
+
+		for (; pfn < aligned_pfn; pfn++) {
+			if (!pfn_valid_within(pfn)) {
+				deferred_free_range(pfn - nr_free, nr_free);
+				nr_free = 0;
+			} else {
+				nr_free++;
+			}
 		}
+
+		/* Free a large naturally-aligned chunk if possible */
+		if (nr_free == pageblock_nr_pages)
+			__free_pages_boot_core(pfn_to_page(pfn - nr_free),
+					       pageblock_order);
+		else
+			deferred_free_range(pfn - nr_free, nr_free);
+
+		/* Let the watchdog know we are still alive */
+		touch_nmi_watchdog();
 	}
-	/* Free the last block of pages to allocator */
-	deferred_free_range(pfn - nr_free, nr_free);
 }
 
 /*
@@ -1506,25 +1519,62 @@ static unsigned long __init deferred_init_pages(int nid, int zid,
 						 unsigned long pfn,
 						 unsigned long end_pfn)
 {
-	struct mminit_pfnnid_cache nid_init_state = { };
 	unsigned long nr_pgmask = pageblock_nr_pages - 1;
+	struct mminit_pfnnid_cache nid_init_state = { };
 	unsigned long nr_pages = 0;
-	struct page *page = NULL;
 
-	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
-			page = NULL;
-			continue;
-		} else if (!page || !(pfn & nr_pgmask)) {
+	while (pfn < end_pfn) {
+		unsigned long aligned_pfn;
+		struct page *page = NULL;
+
+		/*
+		 * Determine if our first pfn is valid, use this as a
+		 * representative value for the page block.
+		 */
+		if (deferred_pfn_valid(nid, pfn, &nid_init_state)) {
 			page = pfn_to_page(pfn);
-			touch_nmi_watchdog();
-		} else {
-			page++;
+			__init_single_page(page++, pfn, zid, nid);
+			nr_pages++;
+
+			if (!(pfn & nr_pgmask))
+				set_pageblock_migratetype(page,
+							  MIGRATE_MOVABLE);
+		} else if (pfn_valid_within(pfn)) {
+			/*
+			 * If the issue is the node or is not specific to
+			 * this individual pfn we can just jump to the end
+			 * of the page block and skip the entire block.
+			 */
+			pfn |= nr_pgmask;
 		}
-		__init_single_page(page, pfn, zid, nid);
-		nr_pages++;
+
+		/*
+		 * Move to next pfn and align the end of our next section
+		 * to process with the end of the block. If we were given
+		 * the end of a block to process we will do nothing in the
+		 * loop below.
+		 */
+		pfn++;
+		aligned_pfn = min(__ALIGN_MASK(pfn, nr_pgmask), end_pfn);
+
+		for (; pfn < aligned_pfn; pfn++) {
+			if (!pfn_valid_within(pfn)) {
+				page = NULL;
+				continue;
+			}
+
+			if (!page)
+				page = pfn_to_page(pfn);
+
+			__init_single_page(page++, pfn, zid, nid);
+			nr_pages++;
+		}
+
+		/* Let the watchdog know we are still alive */
+		touch_nmi_watchdog();
 	}
-	return (nr_pages);
+
+	return nr_pages;
 }
 
 /* Initialise remaining memory on a node */
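
As a further illustration of how the rewritten loops carve a pfn range into
pageblock-sized sections, a minimal userspace model follows (illustration
only, not kernel code; the pageblock size, min() and the __ALIGN_MASK()
rounding are stood in by local example definitions):

/*
 * Illustrative model of the per-section walk: the first pfn of a section
 * stands in for the whole block, and the inner per-pfn loop of the patch
 * would run between pfn and aligned_pfn.
 */
#include <stdio.h>

#define EXAMPLE_PAGEBLOCK_NR_PAGES 512UL
#define EXAMPLE_PGMASK (EXAMPLE_PAGEBLOCK_NR_PAGES - 1)

/* round x up to the next multiple of (mask + 1), like __ALIGN_MASK() */
static unsigned long align_mask(unsigned long x, unsigned long mask)
{
	return (x + mask) & ~mask;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

int main(void)
{
	unsigned long pfn = 100;	/* start part-way into a block */
	unsigned long end_pfn = 1100;	/* range end, not block aligned */

	while (pfn < end_pfn) {
		unsigned long aligned_pfn;

		/* advance past the representative pfn, then cap the section
		 * at either the end of its pageblock or the range end */
		pfn++;
		aligned_pfn = min_ul(align_mask(pfn, EXAMPLE_PGMASK), end_pfn);

		printf("section: %lu..%lu\n", pfn - 1, aligned_pfn - 1);

		/* skip the per-pfn work and jump to the next section */
		pfn = aligned_pfn;
	}
	return 0;
}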