From patchwork Fri Apr 5 22:12:13 2019
X-Patchwork-Submitter: Alexander Duyck <alexander.duyck@gmail.com>
X-Patchwork-Id: 10887889
Subject: [mm PATCH v7 1/4] mm: Use mm_zero_struct_page from SPARC on all 64b architectures
From: Alexander Duyck <alexander.duyck@gmail.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com,
 linux-nvdimm@lists.01.org, alexander.h.duyck@linux.intel.com,
 linux-kernel@vger.kernel.org, willy@infradead.org, mingo@kernel.org,
 yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com,
 vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com,
 ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, davem@davemloft.net,
 kirill.shutemov@linux.intel.com
Date: Fri, 05 Apr 2019 15:12:13 -0700
Message-ID: <20190405221213.12227.9392.stgit@localhost.localdomain>
In-Reply-To: <20190405221043.12227.19679.stgit@localhost.localdomain>
References: <20190405221043.12227.19679.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Use the same approach that was already in use on SPARC on all the
architectures that support a 64b long. This is mostly motivated by the
fact that 7 to 10 store/move instructions are likely always going to be
faster than having to call into a function that is not specialized for
handling page init.

An added advantage to doing it this way is that the compiler can get away
with combining writes in the __init_single_page call. As a result the
zeroing is reduced to only about 4 write operations, or at least that is
what I am seeing with GCC 6.2, as the flags, LRU pointers, and
count/mapcount writes seem to be cancelling out at least 4 of the 8
assignments on my system.

One change I had to make to the function was to reduce the minimum
supported size of struct page to 56 bytes in order to support some
powerpc64 configurations. This patch should introduce no functional
change on SPARC since it already had this code.

In the case of x86_64 I saw a reduction from 3.75s to 2.80s when
initializing 384GB of RAM per node.

Pavel Tatashin tested on a system with Broadcom's Stingray CPU and 48GB
of RAM and found that __init_single_page() takes 19.30ns per 64-byte
struct page before this patch and 17.33ns per 64-byte struct page with
it.

Mike Rapoport ran a similar test on an OpenPower (S812LC 8348-21C) with
a Power8 processor and 128GB of RAM. His results per 64-byte struct page
were 4.68ns before and 4.59ns after this patch.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 arch/sparc/include/asm/pgtable_64.h |   30 --------------------------
 include/linux/mm.h                  |   41 ++++++++++++++++++++++++++++++++---
 2 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 1393a8ac596b..22500c3be7a9 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -231,36 +231,6 @@ extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)        (mem_map_zero)
 
-/* This macro must be updated when the size of struct page grows above 80
- * or reduces below 64.
- * The idea that compiler optimizes out switch() statement, and only
- * leaves clrx instructions
- */
-#define mm_zero_struct_page(pp) do {                            \
-        unsigned long *_pp = (void *)(pp);                      \
-                                                                \
-        /* Check that struct page is either 64, 72, or 80 bytes */ \
-        BUILD_BUG_ON(sizeof(struct page) & 7);                  \
-        BUILD_BUG_ON(sizeof(struct page) < 64);                 \
-        BUILD_BUG_ON(sizeof(struct page) > 80);                 \
-                                                                \
-        switch (sizeof(struct page)) {                          \
-        case 80:                                                \
-                _pp[9] = 0;     /* fallthrough */               \
-        case 72:                                                \
-                _pp[8] = 0;     /* fallthrough */               \
-        default:                                                \
-                _pp[7] = 0;                                     \
-                _pp[6] = 0;                                     \
-                _pp[5] = 0;                                     \
-                _pp[4] = 0;                                     \
-                _pp[3] = 0;                                     \
-                _pp[2] = 0;                                     \
-                _pp[1] = 0;                                     \
-                _pp[0] = 0;                                     \
-        }                                                       \
-} while (0)
-
 /* PFNs are real physical page numbers.  However, mem_map only begins to record
  * per-page information starting at pfn_base.  This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index fe52e266016e..f391c2d7c180 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -124,10 +124,45 @@ static inline void totalram_pages_set(long val)
 
 /*
  * On some architectures it is expensive to call memset() for small sizes.
- * Those architectures should provide their own implementation of "struct page"
- * zeroing by defining this macro in <asm/pgtable.h>.
+ * If an architecture decides to implement their own version of
+ * mm_zero_struct_page they should wrap the defines below in a #ifndef and
+ * define their own version of this macro in <asm/pgtable.h>
 */
-#ifndef mm_zero_struct_page
+#if BITS_PER_LONG == 64
+/* This function must be updated when the size of struct page grows above 80
+ * or reduces below 56. The idea is that the compiler optimizes out the
+ * switch() statement and only leaves move/store instructions. Also the
+ * compiler can combine write statements if they are both assignments and
+ * can be reordered; this can result in several of the writes here being
+ * dropped.
+ */
+#define mm_zero_struct_page(pp) __mm_zero_struct_page(pp)
+static inline void __mm_zero_struct_page(struct page *page)
+{
+        unsigned long *_pp = (void *)page;
+
+        /* Check that struct page is either 56, 64, 72, or 80 bytes */
+        BUILD_BUG_ON(sizeof(struct page) & 7);
+        BUILD_BUG_ON(sizeof(struct page) < 56);
+        BUILD_BUG_ON(sizeof(struct page) > 80);
+
+        switch (sizeof(struct page)) {
+        case 80:
+                _pp[9] = 0;     /* fallthrough */
+        case 72:
+                _pp[8] = 0;     /* fallthrough */
+        case 64:
+                _pp[7] = 0;     /* fallthrough */
+        case 56:
+                _pp[6] = 0;
+                _pp[5] = 0;
+                _pp[4] = 0;
+                _pp[3] = 0;
+                _pp[2] = 0;
+                _pp[1] = 0;
+                _pp[0] = 0;
+        }
+}
+#else
 #define mm_zero_struct_page(pp)  ((void)memset((pp), 0, sizeof(struct page)))
 #endif
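The store-based zeroing trick above is easy to study outside the kernel.
Below is a minimal userspace sketch of the same pattern; struct dummy_page,
zero_dummy_page() and the rest are illustrative stand-ins rather than kernel
code, with _Static_assert playing the role of BUILD_BUG_ON. Compiling with
"gcc -O2 -S" should show the switch collapse into a handful of 8-byte stores
with no memset call:

/*
 * Userspace sketch of the mm_zero_struct_page() pattern. Not kernel code:
 * dummy_page stands in for struct page. The cases for sizes other than 64
 * are dead code here and get eliminated at compile time.
 */
#include <stdio.h>
#include <string.h>

struct dummy_page {                     /* 64-byte stand-in for struct page */
        unsigned long words[8];
};

static inline void zero_dummy_page(struct dummy_page *page)
{
        unsigned long *_pp = (void *)page;

        _Static_assert(!(sizeof(struct dummy_page) & 7), "multiple of 8");
        _Static_assert(sizeof(struct dummy_page) >= 56, "at least 56 bytes");
        _Static_assert(sizeof(struct dummy_page) <= 80, "at most 80 bytes");

        switch (sizeof(struct dummy_page)) {
        case 80:
                _pp[9] = 0;     /* fallthrough */
        case 72:
                _pp[8] = 0;     /* fallthrough */
        case 64:
                _pp[7] = 0;     /* fallthrough */
        case 56:
                _pp[6] = 0;
                _pp[5] = 0;
                _pp[4] = 0;
                _pp[3] = 0;
                _pp[2] = 0;
                _pp[1] = 0;
                _pp[0] = 0;
        }
}

int main(void)
{
        struct dummy_page p;

        memset(&p, 0xff, sizeof(p));    /* dirty the struct first */
        zero_dummy_page(&p);
        printf("words[0] after zeroing: %lu\n", p.words[0]);
        return 0;
}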
From patchwork Fri Apr 5 22:12:19 2019
X-Patchwork-Submitter: Alexander Duyck <alexander.duyck@gmail.com>
X-Patchwork-Id: 10887893
Subject: [mm PATCH v7 2/4] mm: Drop meminit_pfn_in_nid as it is redundant
From: Alexander Duyck <alexander.duyck@gmail.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com,
 linux-nvdimm@lists.01.org, alexander.h.duyck@linux.intel.com,
 linux-kernel@vger.kernel.org, willy@infradead.org, mingo@kernel.org,
 yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com,
 vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com,
 ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, davem@davemloft.net,
 kirill.shutemov@linux.intel.com
Date: Fri, 05 Apr 2019 15:12:19 -0700
Message-ID: <20190405221219.12227.93957.stgit@localhost.localdomain>
In-Reply-To: <20190405221043.12227.19679.stgit@localhost.localdomain>
References: <20190405221043.12227.19679.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

As best as I can tell, the meminit_pfn_in_nid call is completely
redundant. The deferred memory initialization is already making use of
for_each_free_mem_range, which in turn calls into __next_mem_range,
which will only return a memory range if it matches the node ID
provided, assuming it is not NUMA_NO_NODE.

I am operating on the assumption that there are no zones or pg_data_t
structures that have a NUMA node of NUMA_NO_NODE associated with them.
If that is the case then __next_mem_range will never return a memory
range that doesn't match the zone's node ID, and as such the check is
redundant.

One piece I would still like to verify is whether this works for ia64.
Technically it was using a different approach to get the node ID, but it
seems to have the node ID also encoded into the memblock. So I am
assuming this is okay, but would like to get confirmation on that.

On my x86_64 test system with 384GB of memory per node I saw a reduction
in initialization time from 2.80s to 1.85s as a result of this patch.
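To see the argument in isolation, here is a userspace sketch (not kernel
code; struct region, next_range() and the sample data are made-up stand-ins
for the memblock regions and __next_mem_range()). Because the iterator
itself skips every region whose node ID does not match, a per-pfn node check
inside the loop body, which is the role meminit_pfn_in_nid played, can never
fail:

/*
 * Userspace sketch, not kernel code: regions[] stands in for the memblock
 * array and next_range() for __next_mem_range(). The iterator only yields
 * ranges on the requested node, so re-checking the node per pfn is dead
 * logic.
 */
#include <stdbool.h>
#include <stdio.h>

struct region {
        unsigned long start_pfn, end_pfn;
        int nid;
};

static const struct region regions[] = {
        {   0, 100, 0 },
        { 100, 300, 1 },
        { 300, 500, 0 },
};

#define NR_REGIONS (int)(sizeof(regions) / sizeof(regions[0]))

/* Yield the next range whose node matches @nid, like __next_mem_range() */
static bool next_range(int *idx, int nid,
                       unsigned long *spfn, unsigned long *epfn)
{
        for (; *idx < NR_REGIONS; (*idx)++) {
                if (regions[*idx].nid != nid)
                        continue;
                *spfn = regions[*idx].start_pfn;
                *epfn = regions[*idx].end_pfn;
                (*idx)++;
                return true;
        }
        return false;
}

int main(void)
{
        unsigned long spfn, epfn;
        int idx = 0, nid = 0;

        /* every pfn in [spfn, epfn) is already known to be on @nid */
        while (next_range(&idx, nid, &spfn, &epfn))
                printf("node %d range: [%lu, %lu)\n", nid, spfn, epfn);
        return 0;
}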
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 mm/page_alloc.c |   51 ++++++++++++++------------------------------------
 1 file changed, 14 insertions(+), 37 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c53807a2943..2d2bca9803d2 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1398,36 +1398,22 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 #endif
 
 #ifdef CONFIG_NODES_SPAN_OTHER_NODES
-static inline bool __meminit __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-                   struct mminit_pfnnid_cache *state)
+/* Only safe to use early in boot when initialisation is single-threaded */
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
         int nid;
 
-        nid = __early_pfn_to_nid(pfn, state);
+        nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
         if (nid >= 0 && nid != node)
                 return false;
         return true;
 }
 
-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-        return meminit_pfn_in_nid(pfn, node, &early_pfnnid_cache);
-}
-
 #else
-
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
         return true;
 }
-static inline bool __meminit __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-                   struct mminit_pfnnid_cache *state)
-{
-        return true;
-}
 #endif
 
 
@@ -1556,21 +1542,13 @@ static inline void __init pgdat_init_report_one_done(void)
  *
  * Then, we check if a current large page is valid by only checking the validity
  * of the head pfn.
- *
- * Finally, meminit_pfn_in_nid is checked on systems where pfns can interleave
- * within a node: a pfn is between start and end of a node, but does not belong
- * to this memory node.
  */
-static inline bool __init
-deferred_pfn_valid(int nid, unsigned long pfn,
-                   struct mminit_pfnnid_cache *nid_init_state)
+static inline bool __init deferred_pfn_valid(unsigned long pfn)
 {
         if (!pfn_valid_within(pfn))
                 return false;
         if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
                 return false;
-        if (!meminit_pfn_in_nid(pfn, nid, nid_init_state))
-                return false;
         return true;
 }
 
@@ -1578,15 +1556,14 @@ static inline void __init pgdat_init_report_one_done(void)
  * Free pages to buddy allocator. Try to free aligned pages in
  * pageblock_nr_pages sizes.
  */
-static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
+static void __init deferred_free_pages(unsigned long pfn,
                                        unsigned long end_pfn)
 {
-        struct mminit_pfnnid_cache nid_init_state = { };
         unsigned long nr_pgmask = pageblock_nr_pages - 1;
         unsigned long nr_free = 0;
 
         for (; pfn < end_pfn; pfn++) {
-                if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+                if (!deferred_pfn_valid(pfn)) {
                         deferred_free_range(pfn - nr_free, nr_free);
                         nr_free = 0;
                 } else if (!(pfn & nr_pgmask)) {
@@ -1606,17 +1583,18 @@ static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
  * by performing it only once every pageblock_nr_pages.
  * Return number of pages initialized.
  */
-static unsigned long __init deferred_init_pages(int nid, int zid,
+static unsigned long __init deferred_init_pages(struct zone *zone,
                                                 unsigned long pfn,
                                                 unsigned long end_pfn)
 {
-        struct mminit_pfnnid_cache nid_init_state = { };
         unsigned long nr_pgmask = pageblock_nr_pages - 1;
+        int nid = zone_to_nid(zone);
         unsigned long nr_pages = 0;
+        int zid = zone_idx(zone);
         struct page *page = NULL;
 
         for (; pfn < end_pfn; pfn++) {
-                if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+                if (!deferred_pfn_valid(pfn)) {
                         page = NULL;
                         continue;
                 } else if (!page || !(pfn & nr_pgmask)) {
@@ -1679,12 +1657,12 @@ static int __init deferred_init_memmap(void *data)
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
+                nr_pages += deferred_init_pages(zone, spfn, epfn);
         }
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                deferred_free_pages(nid, zid, spfn, epfn);
+                deferred_free_pages(spfn, epfn);
         }
         pgdat_resize_unlock(pgdat, &flags);
@@ -1716,7 +1694,6 @@ static int __init deferred_init_memmap(void *data)
 static noinline bool __init
 deferred_grow_zone(struct zone *zone, unsigned int order)
 {
-        int zid = zone_idx(zone);
         int nid = zone_to_nid(zone);
         pg_data_t *pgdat = NODE_DATA(nid);
         unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
@@ -1766,7 +1743,7 @@ static int __init deferred_init_memmap(void *data)
         while (spfn < epfn && nr_pages < nr_pages_needed) {
                 t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
                 first_deferred_pfn = min(t, epfn);
-                nr_pages += deferred_init_pages(nid, zid, spfn,
+                nr_pages += deferred_init_pages(zone, spfn,
                                                 first_deferred_pfn);
                 spfn = first_deferred_pfn;
         }
@@ -1778,7 +1755,7 @@ static int __init deferred_init_memmap(void *data)
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
-                deferred_free_pages(nid, zid, spfn, epfn);
+                deferred_free_pages(spfn, epfn);
 
                 if (first_deferred_pfn == epfn)
                         break;
From patchwork Fri Apr 5 22:12:25 2019
X-Patchwork-Submitter: Alexander Duyck <alexander.duyck@gmail.com>
X-Patchwork-Id: 10887897
Subject: [mm PATCH v7 3/4] mm: Implement new zone specific memblock iterator
From: Alexander Duyck <alexander.duyck@gmail.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com,
 linux-nvdimm@lists.01.org, alexander.h.duyck@linux.intel.com,
 linux-kernel@vger.kernel.org, willy@infradead.org, mingo@kernel.org,
 yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com,
 vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com,
 ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, davem@davemloft.net,
 kirill.shutemov@linux.intel.com
Date: Fri, 05 Apr 2019 15:12:25 -0700
Message-ID: <20190405221225.12227.22573.stgit@localhost.localdomain>
In-Reply-To: <20190405221043.12227.19679.stgit@localhost.localdomain>
References: <20190405221043.12227.19679.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Introduce a new iterator, for_each_free_mem_pfn_range_in_zone.

This iterator will take care of making sure a given memory range provided
is in fact contained within a zone. It takes care of all the bounds
checking we were doing in deferred_grow_zone and deferred_init_memmap. In
addition it should help to speed up the search a bit by iterating until
the end of a range is greater than the start of the zone pfn range, and
will exit completely if the start is beyond the end of the zone.
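The bounds handling described above can be sketched in plain C. The
toy_zone structure and clamp_to_zone() helper below are illustrative
stand-ins for the logic in __next_mem_pfn_range_in_zone(), not the kernel
implementation, with the same three outcomes: skip a range that does not
overlap the zone, clamp and return a range that does, and stop outright
once a range starts at or past the end of the zone (ranges are assumed
sorted by start pfn, as in memblock):

/*
 * Userspace sketch of the range-clamping rule, not the kernel code:
 * toy_zone and clamp_to_zone() are illustrative names.
 */
#include <stdio.h>

struct toy_zone {
        unsigned long start_pfn, end_pfn;
};

enum clamp_result { RANGE_SKIP, RANGE_KEEP, RANGE_STOP };

static enum clamp_result clamp_to_zone(const struct toy_zone *zone,
                                       unsigned long *spfn,
                                       unsigned long *epfn)
{
        /* range starts at or past the zone end: no later range can match */
        if (zone->end_pfn <= *spfn)
                return RANGE_STOP;

        /* empty range or no overlap with the zone: try the next one */
        if (*spfn >= *epfn || *epfn <= zone->start_pfn)
                return RANGE_SKIP;

        /* clamp the overlapping part to the zone boundaries */
        if (*spfn < zone->start_pfn)
                *spfn = zone->start_pfn;
        if (*epfn > zone->end_pfn)
                *epfn = zone->end_pfn;
        return RANGE_KEEP;
}

int main(void)
{
        const struct toy_zone zone = { .start_pfn = 100, .end_pfn = 400 };
        const unsigned long ranges[][2] = {
                { 0, 50 }, { 80, 150 }, { 200, 300 }, { 350, 500 }, { 450, 600 },
        };
        const int nr = (int)(sizeof(ranges) / sizeof(ranges[0]));

        for (int i = 0; i < nr; i++) {
                unsigned long s = ranges[i][0], e = ranges[i][1];

                switch (clamp_to_zone(&zone, &s, &e)) {
                case RANGE_KEEP:
                        printf("init [%lu, %lu)\n", s, e);
                        break;
                case RANGE_SKIP:
                        break;
                case RANGE_STOP:
                        return 0;
                }
        }
        return 0;
}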
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Reviewed-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
---
 include/linux/memblock.h |   25 ++++++++++++++++++
 mm/memblock.c            |   64 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c          |   31 +++++++++-------------------
 3 files changed, 101 insertions(+), 19 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 294d5d80e150..f8b78892b977 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -240,6 +240,31 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
              i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
+                                  unsigned long *out_spfn,
+                                  unsigned long *out_epfn);
+/**
+ * for_each_free_mem_range_in_zone - iterate through zone specific free
+ * memblock areas
+ * @i: u64 used as loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over free (memory && !reserved) areas of memblock in a specific
+ * zone. Available once memblock and an empty zone is initialized. The main
+ * assumption is that the zone start, end, and pgdat have been associated.
+ * This way we can use the zone to determine NUMA node, and if a given part
+ * of the memblock is valid for the zone.
+ */
+#define for_each_free_mem_pfn_range_in_zone(i, zone, p_start, p_end)   \
+        for (i = 0,                                                     \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end);   \
+             i != U64_MAX;                                              \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
+#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
+
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index e7665cf914b1..28fa8926d9f8 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1255,6 +1255,70 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
         return 0;
 }
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+/**
+ * __next_mem_pfn_range_in_zone - iterator for for_each_*_range_in_zone()
+ *
+ * @idx: pointer to u64 loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @out_spfn: ptr to ulong for start pfn of the range, can be %NULL
+ * @out_epfn: ptr to ulong for end pfn of the range, can be %NULL
+ *
+ * This function is meant to be a zone/pfn specific wrapper for the
+ * for_each_mem_range type iterators. Specifically they are used in the
+ * deferred memory init routines and as such we were duplicating much of
+ * this logic throughout the code. So instead of having it in multiple
+ * locations it seemed like it would make more sense to centralize this to
+ * one new iterator that does everything they need.
+ */
+void __init_memblock
+__next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
+                             unsigned long *out_spfn, unsigned long *out_epfn)
+{
+        int zone_nid = zone_to_nid(zone);
+        phys_addr_t spa, epa;
+        int nid;
+
+        __next_mem_range(idx, zone_nid, MEMBLOCK_NONE,
+                         &memblock.memory, &memblock.reserved,
+                         &spa, &epa, &nid);
+
+        while (*idx != U64_MAX) {
+                unsigned long epfn = PFN_DOWN(epa);
+                unsigned long spfn = PFN_UP(spa);
+
+                /*
+                 * Verify the end is at least past the start of the zone and
+                 * that we have at least one PFN to initialize.
+                 */
+                if (zone->zone_start_pfn < epfn && spfn < epfn) {
+                        /* if we went too far just stop searching */
+                        if (zone_end_pfn(zone) <= spfn) {
+                                *idx = U64_MAX;
+                                break;
+                        }
+
+                        if (out_spfn)
+                                *out_spfn = max(zone->zone_start_pfn, spfn);
+                        if (out_epfn)
+                                *out_epfn = min(zone_end_pfn(zone), epfn);
+
+                        return;
+                }
+
+                __next_mem_range(idx, zone_nid, MEMBLOCK_NONE,
+                                 &memblock.memory, &memblock.reserved,
+                                 &spa, &epa, &nid);
+        }
+
+        /* signal end of iteration */
+        if (out_spfn)
+                *out_spfn = ULONG_MAX;
+        if (out_epfn)
+                *out_epfn = 0;
+}
+
+#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
 /**
  * memblock_alloc_range_nid - allocate boot memory block
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2d2bca9803d2..61467e28c966 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1613,11 +1613,9 @@ static unsigned long __init deferred_init_pages(struct zone *zone,
 static int __init deferred_init_memmap(void *data)
 {
         pg_data_t *pgdat = data;
-        int nid = pgdat->node_id;
         unsigned long start = jiffies;
         unsigned long nr_pages = 0;
         unsigned long spfn, epfn, first_init_pfn, flags;
-        phys_addr_t spa, epa;
         int zid;
         struct zone *zone;
         const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
@@ -1654,14 +1652,12 @@ static int __init deferred_init_memmap(void *data)
          * freeing pages we can access pages that are ahead (computing buddy
          * page in __free_one_page()).
          */
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
+        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
+                spfn = max_t(unsigned long, first_init_pfn, spfn);
                 nr_pages += deferred_init_pages(zone, spfn, epfn);
         }
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
+        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
+                spfn = max_t(unsigned long, first_init_pfn, spfn);
                 deferred_free_pages(spfn, epfn);
         }
         pgdat_resize_unlock(pgdat, &flags);
@@ -1669,8 +1665,8 @@ static int __init deferred_init_memmap(void *data)
         /* Sanity check that the next zone really is unpopulated */
         WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));
 
-        pr_info("node %d initialised, %lu pages in %ums\n", nid, nr_pages,
-                jiffies_to_msecs(jiffies - start));
+        pr_info("node %d initialised, %lu pages in %ums\n",
+                pgdat->node_id, nr_pages, jiffies_to_msecs(jiffies - start));
 
         pgdat_init_report_one_done();
         return 0;
@@ -1694,13 +1690,11 @@ static int __init deferred_init_memmap(void *data)
 static noinline bool __init
 deferred_grow_zone(struct zone *zone, unsigned int order)
 {
-        int nid = zone_to_nid(zone);
-        pg_data_t *pgdat = NODE_DATA(nid);
         unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
+        pg_data_t *pgdat = zone->zone_pgdat;
         unsigned long nr_pages = 0;
         unsigned long first_init_pfn, spfn, epfn, t, flags;
         unsigned long first_deferred_pfn = pgdat->first_deferred_pfn;
-        phys_addr_t spa, epa;
         u64 i;
 
         /* Only the last zone may have deferred pages */
@@ -1736,9 +1730,8 @@ static int __init deferred_init_memmap(void *data)
                 return false;
         }
 
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
+        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
+                spfn = max_t(unsigned long, first_init_pfn, spfn);
 
                 while (spfn < epfn && nr_pages < nr_pages_needed) {
                         t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
@@ -1752,9 +1745,9 @@ static int __init deferred_init_memmap(void *data)
                 break;
         }
 
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
+        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
+                spfn = max_t(unsigned long, first_init_pfn, spfn);
+                epfn = min_t(unsigned long, first_deferred_pfn, epfn);
                 deferred_free_pages(spfn, epfn);
 
                 if (first_deferred_pfn == epfn)
                         break;
From patchwork Fri Apr 5 22:12:32 2019
X-Patchwork-Submitter: Alexander Duyck <alexander.duyck@gmail.com>
X-Patchwork-Id: 10887901
Subject: [mm PATCH v7 4/4] mm: Initialize MAX_ORDER_NR_PAGES at a time instead of doing larger sections
From: Alexander Duyck <alexander.duyck@gmail.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com,
 linux-nvdimm@lists.01.org, alexander.h.duyck@linux.intel.com,
 linux-kernel@vger.kernel.org, willy@infradead.org, mingo@kernel.org,
 yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com,
 vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com,
 ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, davem@davemloft.net,
 kirill.shutemov@linux.intel.com
Date: Fri, 05 Apr 2019 15:12:32 -0700
Message-ID: <20190405221231.12227.85836.stgit@localhost.localdomain>
In-Reply-To: <20190405221043.12227.19679.stgit@localhost.localdomain>
References: <20190405221043.12227.19679.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Add yet another iterator, for_each_free_mem_pfn_range_in_zone_from, and
then use it to support initializing and freeing pages in groups no larger
than MAX_ORDER_NR_PAGES. By doing this we can greatly improve the cache
locality of the pages while we do several loops over them in the init and
freeing process.

We are able to tighten the loops further as a result of the "from"
iterator, since we can perform the initial checks for first_init_pfn in
our first call to the iterator and then continue without the need for
those checks. I have added this functionality in a function called
deferred_init_mem_pfn_range_in_zone that primes the iterator and causes
us to exit if we encounter any failure.

On my x86_64 test system with 384GB of memory per node I saw a reduction
in initialization time from 1.85s to 1.38s as a result of this patch.

Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 include/linux/memblock.h |   16 +++++
 mm/page_alloc.c          |  162 ++++++++++++++++++++++++++++++++++------------
 2 files changed, 137 insertions(+), 41 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index f8b78892b977..47e3c0612592 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -263,6 +263,22 @@ void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
              __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end);   \
              i != U64_MAX;                                              \
              __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
+
+/**
+ * for_each_free_mem_range_in_zone_from - iterate through zone specific
+ * free memblock areas from a given point
+ * @i: u64 used as loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over free (memory && !reserved) areas of memblock in a specific
+ * zone, continuing from current position. Available as soon as memblock is
+ * initialized.
+ */
+#define for_each_free_mem_pfn_range_in_zone_from(i, zone, p_start, p_end) \
+        for (; i != U64_MAX;                                      \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
 #endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
 /**
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 61467e28c966..06fbec9edf84 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1609,16 +1609,100 @@ static unsigned long __init deferred_init_pages(struct zone *zone,
         return (nr_pages);
 }
 
+/*
+ * This function is meant to pre-load the iterator for the zone init.
+ * Specifically it walks through the ranges until we are caught up to the
+ * first_init_pfn value and exits there. If we never encounter the value we
+ * return false indicating there are no valid ranges left.
+ */
+static bool __init
+deferred_init_mem_pfn_range_in_zone(u64 *i, struct zone *zone,
+                                    unsigned long *spfn, unsigned long *epfn,
+                                    unsigned long first_init_pfn)
+{
+        u64 j;
+
+        /*
+         * Start out by walking through the ranges in this zone that have
+         * already been initialized. We don't need to do anything with them
+         * so we just need to flush them out of the system.
+         */
+        for_each_free_mem_pfn_range_in_zone(j, zone, spfn, epfn) {
+                if (*epfn <= first_init_pfn)
+                        continue;
+                if (*spfn < first_init_pfn)
+                        *spfn = first_init_pfn;
+                *i = j;
+                return true;
+        }
+
+        return false;
+}
+
+/*
+ * Initialize and free pages. We do it in two loops: first we initialize
+ * struct page, then free to buddy allocator, because while we are
+ * freeing pages we can access pages that are ahead (computing buddy
+ * page in __free_one_page()).
+ *
+ * In order to try and keep some memory in the cache we have the loop
+ * broken along max page order boundaries. This way we will not cause
+ * any issues with the buddy page computation.
+ */
+static unsigned long __init
+deferred_init_maxorder(u64 *i, struct zone *zone, unsigned long *start_pfn,
+                       unsigned long *end_pfn)
+{
+        unsigned long mo_pfn = ALIGN(*start_pfn + 1, MAX_ORDER_NR_PAGES);
+        unsigned long spfn = *start_pfn, epfn = *end_pfn;
+        unsigned long nr_pages = 0;
+        u64 j = *i;
+
+        /* First we loop through and initialize the page values */
+        for_each_free_mem_pfn_range_in_zone_from(j, zone, start_pfn, end_pfn) {
+                unsigned long t;
+
+                if (mo_pfn <= *start_pfn)
+                        break;
+
+                t = min(mo_pfn, *end_pfn);
+                nr_pages += deferred_init_pages(zone, *start_pfn, t);
+
+                if (mo_pfn < *end_pfn) {
+                        *start_pfn = mo_pfn;
+                        break;
+                }
+        }
+
+        /* Reset values and now loop through freeing pages as needed */
+        swap(j, *i);
+
+        for_each_free_mem_pfn_range_in_zone_from(j, zone, &spfn, &epfn) {
+                unsigned long t;
+
+                if (mo_pfn <= spfn)
+                        break;
+
+                t = min(mo_pfn, epfn);
+                deferred_free_pages(spfn, t);
+
+                if (mo_pfn <= epfn)
+                        break;
+        }
+
+        return nr_pages;
+}
+
 /* Initialise remaining memory on a node */
 static int __init deferred_init_memmap(void *data)
 {
         pg_data_t *pgdat = data;
+        const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
+        unsigned long spfn = 0, epfn = 0, nr_pages = 0;
+        unsigned long first_init_pfn, flags;
         unsigned long start = jiffies;
-        unsigned long nr_pages = 0;
-        unsigned long spfn, epfn, first_init_pfn, flags;
-        int zid;
         struct zone *zone;
-        const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
+        int zid;
         u64 i;
 
         /* Bind memory initialisation thread to a local node if possible */
@@ -1644,22 +1728,20 @@ static int __init deferred_init_memmap(void *data)
                 if (first_init_pfn < zone_end_pfn(zone))
                         break;
         }
-        first_init_pfn = max(zone->zone_start_pfn, first_init_pfn);
+
+        /* If the zone is empty somebody else may have cleared out the zone */
+        if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
+                                                 first_init_pfn))
+                goto zone_empty;
 
         /*
-         * Initialize and free pages. We do it in two loops: first we initialize
-         * struct page, than free to buddy allocator, because while we are
-         * freeing pages we can access pages that are ahead (computing buddy
-         * page in __free_one_page()).
+         * Initialize and free pages in MAX_ORDER sized increments so
+         * that we can avoid introducing any issues with the buddy
+         * allocator.
          */
-        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
-                spfn = max_t(unsigned long, first_init_pfn, spfn);
-                nr_pages += deferred_init_pages(zone, spfn, epfn);
-        }
-        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
-                spfn = max_t(unsigned long, first_init_pfn, spfn);
-                deferred_free_pages(spfn, epfn);
-        }
+        while (spfn < epfn)
+                nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+zone_empty:
         pgdat_resize_unlock(pgdat, &flags);
 
         /* Sanity check that the next zone really is unpopulated */
@@ -1692,9 +1774,9 @@ static int __init deferred_init_memmap(void *data)
 {
         unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
         pg_data_t *pgdat = zone->zone_pgdat;
-        unsigned long nr_pages = 0;
-        unsigned long first_init_pfn, spfn, epfn, t, flags;
         unsigned long first_deferred_pfn = pgdat->first_deferred_pfn;
+        unsigned long spfn, epfn, flags;
+        unsigned long nr_pages = 0;
         u64 i;
 
         /* Only the last zone may have deferred pages */
@@ -1723,37 +1805,35 @@ static int __init deferred_init_memmap(void *data)
                 return true;
         }
 
-        first_init_pfn = max(zone->zone_start_pfn, first_deferred_pfn);
-
-        if (first_init_pfn >= pgdat_end_pfn(pgdat)) {
+        /* If the zone is empty somebody else may have cleared out the zone */
+        if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
+                                                 first_deferred_pfn)) {
+                pgdat->first_deferred_pfn = ULONG_MAX;
                 pgdat_resize_unlock(pgdat, &flags);
-                return false;
+                return true;
         }
 
-        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
-                spfn = max_t(unsigned long, first_init_pfn, spfn);
+        /*
+         * Initialize and free pages in MAX_ORDER sized increments so
+         * that we can avoid introducing any issues with the buddy
+         * allocator.
+         */
+        while (spfn < epfn) {
+                /* update our first deferred PFN for this section */
+                first_deferred_pfn = spfn;
+
+                nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
 
-                while (spfn < epfn && nr_pages < nr_pages_needed) {
-                        t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
-                        first_deferred_pfn = min(t, epfn);
-                        nr_pages += deferred_init_pages(zone, spfn,
-                                                        first_deferred_pfn);
-                        spfn = first_deferred_pfn;
-                }
+                /* We should only stop along section boundaries */
+                if ((first_deferred_pfn ^ spfn) < PAGES_PER_SECTION)
+                        continue;
 
+                /* If our quota has been met we can stop here */
                 if (nr_pages >= nr_pages_needed)
                         break;
         }
 
-        for_each_free_mem_pfn_range_in_zone(i, zone, &spfn, &epfn) {
-                spfn = max_t(unsigned long, first_init_pfn, spfn);
-                epfn = min_t(unsigned long, first_deferred_pfn, epfn);
-                deferred_free_pages(spfn, epfn);
-
-                if (first_deferred_pfn == epfn)
-                        break;
-        }
-        pgdat->first_deferred_pfn = first_deferred_pfn;
+        pgdat->first_deferred_pfn = spfn;
         pgdat_resize_unlock(pgdat, &flags);
 
         return nr_pages > 0;
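As an illustration of the chunking scheme this final patch arrives at, here
is a userspace sketch; CHUNK stands in for MAX_ORDER_NR_PAGES and the
printf() calls stand in for deferred_init_pages() and
deferred_free_pages(). Each block is initialized and then freed while it is
still cache-hot, and the "spfn + 1" rounding mirrors the patch's ALIGN()
call so a start pfn already sitting on a boundary still advances by a full
chunk:

/*
 * Userspace sketch of MAX_ORDER-sized chunking, not kernel code.
 */
#include <stdio.h>

#define CHUNK           1024UL  /* stand-in for MAX_ORDER_NR_PAGES */
#define ALIGN_UP(x, a)  ((((x) + (a) - 1) / (a)) * (a))

static unsigned long min_ul(unsigned long a, unsigned long b)
{
        return a < b ? a : b;
}

int main(void)
{
        unsigned long spfn = 1000, epfn = 4500;

        while (spfn < epfn) {
                /*
                 * Never straddle a CHUNK boundary; the "+ 1" ensures a
                 * start already on a boundary still gets a full chunk.
                 */
                unsigned long mo_pfn = min_ul(ALIGN_UP(spfn + 1, CHUNK), epfn);

                printf("init pages [%lu, %lu)\n", spfn, mo_pfn); /* pass 1 */
                printf("free pages [%lu, %lu)\n", spfn, mo_pfn); /* pass 2 */

                spfn = mo_pfn;
        }
        return 0;
}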