From patchwork Thu Oct 11 22:13:34 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10637523
Subject: [mm PATCH v2 1/6] mm: Use mm_zero_struct_page from SPARC on all 64b architectures
From: Alexander Duyck
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, alexander.h.duyck@linux.intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com
Date: Thu, 11 Oct 2018 15:13:34 -0700
Message-ID: <20181011221334.1925.31961.stgit@localhost.localdomain>
In-Reply-To: <20181011221237.1925.85591.stgit@localhost.localdomain>
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

This change makes it so that we use the same approach that was already in
use on SPARC on all the architectures that support a 64b long. This is
mostly motivated by the fact that 8 to 10 store/move instructions are
likely always going to be faster than a call into a function that is not
specialized for handling page init.

An added advantage of doing it this way is that the compiler can combine
writes in the __init_single_page call. As a result the memset call will be
reduced to only about 4 write operations, or at least that is what I am
seeing with GCC 6.2, as the flags, LRU pointers, and count/mapcount seem
to be cancelling out at least 4 of the 8 assignments on my system.

One change I had to make to the function was to reduce the minimum
supported struct page size to 56 bytes in order to support some powerpc64
configurations.
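To illustrate why this wins, below is a standalone userspace sketch of the
same trick (illustration only; struct fake_page and its 64-byte layout are
stand-ins for struct page, not the kernel definition). Because sizeof() is
a compile-time constant, the compiler keeps only the live case of the
switch and emits a straight run of word stores instead of a memset call:

#include <assert.h>
#include <stdio.h>
#include <string.h>

struct fake_page {
        unsigned long words[8];         /* stand-in for a 64-byte struct page */
};

static inline void zero_fake_page(struct fake_page *pp)
{
        unsigned long *_pp = (void *)pp;

        /* sizeof() is known at compile time, so every case but the live
         * one is dead code and the switch costs nothing at runtime */
        switch (sizeof(struct fake_page)) {
        case 80:
                _pp[9] = 0;     /* fallthrough */
        case 72:
                _pp[8] = 0;     /* fallthrough */
        default:
                _pp[7] = 0;     /* fallthrough */
        case 56:
                _pp[6] = 0;
                _pp[5] = 0;
                _pp[4] = 0;
                _pp[3] = 0;
                _pp[2] = 0;
                _pp[1] = 0;
                _pp[0] = 0;
        }
}

int main(void)
{
        struct fake_page p;
        unsigned int i;

        memset(&p, 0xff, sizeof(p));    /* dirty the struct first */
        zero_fake_page(&p);
        for (i = 0; i < 8; i++)
                assert(p.words[i] == 0);
        puts("zeroed with straight-line word stores");
        return 0;
}

Building this with gcc -O2 and inspecting the assembly shows the plain
stores, and nothing prevents the compiler from further combining them with
adjacent field initializations, which is where the reported reduction to
roughly 4 writes comes from.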
Signed-off-by: Alexander Duyck
---
 arch/sparc/include/asm/pgtable_64.h |   30 ------------------------------
 include/linux/mm.h                  |   34 ++++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 30 deletions(-)

diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index 1393a8ac596b..22500c3be7a9 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -231,36 +231,6 @@ extern struct page *mem_map_zero;
 #define ZERO_PAGE(vaddr)        (mem_map_zero)
 
-/* This macro must be updated when the size of struct page grows above 80
- * or reduces below 64.
- * The idea that compiler optimizes out switch() statement, and only
- * leaves clrx instructions
- */
-#define mm_zero_struct_page(pp) do {                                    \
-        unsigned long *_pp = (void *)(pp);                              \
-                                                                        \
-        /* Check that struct page is either 64, 72, or 80 bytes */     \
-        BUILD_BUG_ON(sizeof(struct page) & 7);                          \
-        BUILD_BUG_ON(sizeof(struct page) < 64);                         \
-        BUILD_BUG_ON(sizeof(struct page) > 80);                         \
-                                                                        \
-        switch (sizeof(struct page)) {                                  \
-        case 80:                                                        \
-                _pp[9] = 0;     /* fallthrough */                       \
-        case 72:                                                        \
-                _pp[8] = 0;     /* fallthrough */                       \
-        default:                                                        \
-                _pp[7] = 0;                                             \
-                _pp[6] = 0;                                             \
-                _pp[5] = 0;                                             \
-                _pp[4] = 0;                                             \
-                _pp[3] = 0;                                             \
-                _pp[2] = 0;                                             \
-                _pp[1] = 0;                                             \
-                _pp[0] = 0;                                             \
-        }                                                               \
-} while (0)
-
 /* PFNs are real physical page numbers. However, mem_map only begins to record
  * per-page information starting at pfn_base. This is to handle systems where
  * the first physical page in the machine is at some huge physical address,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 273d4dbd3883..dee407998366 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -102,8 +102,42 @@ static inline void set_max_mapnr(unsigned long limit) { }
  * zeroing by defining this macro in <asm/pgtable.h>.
  */
 #ifndef mm_zero_struct_page
+#if BITS_PER_LONG == 64
+/* This function must be updated when the size of struct page grows above 80
+ * or reduces below 64. The idea that compiler optimizes out switch()
+ * statement, and only leaves move/store instructions
+ */
+#define mm_zero_struct_page(pp) __mm_zero_struct_page(pp)
+static inline void __mm_zero_struct_page(struct page *page)
+{
+        unsigned long *_pp = (void *)page;
+
+        /* Check that struct page is either 56, 64, 72, or 80 bytes */
+        BUILD_BUG_ON(sizeof(struct page) & 7);
+        BUILD_BUG_ON(sizeof(struct page) < 56);
+        BUILD_BUG_ON(sizeof(struct page) > 80);
+
+        switch (sizeof(struct page)) {
+        case 80:
+                _pp[9] = 0;     /* fallthrough */
+        case 72:
+                _pp[8] = 0;     /* fallthrough */
+        default:
+                _pp[7] = 0;     /* fallthrough */
+        case 56:
+                _pp[6] = 0;
+                _pp[5] = 0;
+                _pp[4] = 0;
+                _pp[3] = 0;
+                _pp[2] = 0;
+                _pp[1] = 0;
+                _pp[0] = 0;
+        }
+}
+#else
 #define mm_zero_struct_page(pp)  ((void)memset((pp), 0, sizeof(struct page)))
 #endif
+#endif
 
 /*
  * Default maximum number of active map areas, this limits the number of vmas

From patchwork Thu Oct 11 22:13:40 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10637525
Subject: [mm PATCH v2 2/6] mm: Drop meminit_pfn_in_nid as it is redundant
From: Alexander Duyck
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, alexander.h.duyck@linux.intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com
Date: Thu, 11 Oct 2018 15:13:40 -0700
Message-ID: <20181011221340.1925.26861.stgit@localhost.localdomain>
In-Reply-To: <20181011221237.1925.85591.stgit@localhost.localdomain>
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

As best as I can tell the meminit_pfn_in_nid call is completely redundant.
The deferred memory initialization is already making use of
for_each_free_mem_range, which in turn calls into __next_mem_range, which
will only return a memory range if it matches the node ID provided,
assuming that ID is not NUMA_NO_NODE.

I am operating on the assumption that there are no zones or pg_data_t
structures that have a NUMA node of NUMA_NO_NODE associated with them. If
that is the case then __next_mem_range will never return a memory range
that doesn't match the zone's node ID, and as such the check is redundant.

One piece I would like to verify is whether this works for ia64.
Technically it was using a different approach to get the node ID, but it
seems to have the node ID also encoded into the memblock, so I am assuming
this is okay, but would like to get confirmation on that.
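To make the redundancy argument concrete, here is a toy userspace model
(illustration only; the ranges table and helpers are made up, mirroring the
shape of __next_mem_range rather than its real implementation). Once the
walker itself filters on the node ID up front, a per-pfn node check inside
the loop can never fail:

#include <assert.h>
#include <stdio.h>

struct range { unsigned long start, end; int nid; };

/* fake "memblock" contents: free ranges tagged with their node */
static const struct range ranges[] = {
        { 0x0000, 0x1000, 0 },
        { 0x1000, 0x3000, 1 },
        { 0x3000, 0x4000, 0 },
};

int main(void)
{
        const int node = 0;     /* node this deferred-init thread handles */
        unsigned int i;

        for (i = 0; i < sizeof(ranges) / sizeof(ranges[0]); i++) {
                unsigned long pfn;

                if (ranges[i].nid != node)
                        continue;       /* the walker already filters by nid */

                for (pfn = ranges[i].start; pfn < ranges[i].end; pfn++) {
                        /* stand-in for meminit_pfn_in_nid(): with the
                         * filter above this check is dead code */
                        assert(ranges[i].nid == node);
                }
                printf("walked [%#lx, %#lx) for node %d\n",
                       ranges[i].start, ranges[i].end, node);
        }
        return 0;
}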
Signed-off-by: Alexander Duyck
---
 mm/page_alloc.c |   50 ++++++++++++++------------------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a02ce11c49f2..076ffb6214c3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1305,36 +1305,22 @@ int __meminit early_pfn_to_nid(unsigned long pfn)
 #endif
 
 #ifdef CONFIG_NODES_SPAN_OTHER_NODES
-static inline bool __meminit __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-                   struct mminit_pfnnid_cache *state)
+/* Only safe to use early in boot when initialisation is single-threaded */
+static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
         int nid;
 
-        nid = __early_pfn_to_nid(pfn, state);
+        nid = __early_pfn_to_nid(pfn, &early_pfnnid_cache);
         if (nid >= 0 && nid != node)
                 return false;
         return true;
 }
 
-/* Only safe to use early in boot when initialisation is single-threaded */
-static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
-{
-        return meminit_pfn_in_nid(pfn, node, &early_pfnnid_cache);
-}
-
 #else
-
 static inline bool __meminit early_pfn_in_nid(unsigned long pfn, int node)
 {
         return true;
 }
-static inline bool __meminit __maybe_unused
-meminit_pfn_in_nid(unsigned long pfn, int node,
-                   struct mminit_pfnnid_cache *state)
-{
-        return true;
-}
 #endif
 
 
@@ -1463,21 +1449,13 @@ static inline void __init pgdat_init_report_one_done(void)
  *
  * Then, we check if a current large page is valid by only checking the validity
  * of the head pfn.
- *
- * Finally, meminit_pfn_in_nid is checked on systems where pfns can interleave
- * within a node: a pfn is between start and end of a node, but does not belong
- * to this memory node.
  */
-static inline bool __init
-deferred_pfn_valid(int nid, unsigned long pfn,
-                   struct mminit_pfnnid_cache *nid_init_state)
+static inline bool __init deferred_pfn_valid(unsigned long pfn)
 {
         if (!pfn_valid_within(pfn))
                 return false;
         if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
                 return false;
-        if (!meminit_pfn_in_nid(pfn, nid, nid_init_state))
-                return false;
         return true;
 }
 
@@ -1485,15 +1463,14 @@ static inline void __init pgdat_init_report_one_done(void)
  * Free pages to buddy allocator. Try to free aligned pages in
  * pageblock_nr_pages sizes.
  */
-static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
+static void __init deferred_free_pages(unsigned long pfn,
                                        unsigned long end_pfn)
 {
-        struct mminit_pfnnid_cache nid_init_state = { };
         unsigned long nr_pgmask = pageblock_nr_pages - 1;
         unsigned long nr_free = 0;
 
         for (; pfn < end_pfn; pfn++) {
-                if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+                if (!deferred_pfn_valid(pfn)) {
                         deferred_free_range(pfn - nr_free, nr_free);
                         nr_free = 0;
                 } else if (!(pfn & nr_pgmask)) {
@@ -1513,17 +1490,18 @@ static void __init deferred_free_pages(int nid, int zid, unsigned long pfn,
  * by performing it only once every pageblock_nr_pages.
  * Return number of pages initialized.
  */
-static unsigned long __init deferred_init_pages(int nid, int zid,
+static unsigned long __init deferred_init_pages(struct zone *zone,
                                                 unsigned long pfn,
                                                 unsigned long end_pfn)
 {
-        struct mminit_pfnnid_cache nid_init_state = { };
         unsigned long nr_pgmask = pageblock_nr_pages - 1;
+        int nid = zone_to_nid(zone);
         unsigned long nr_pages = 0;
+        int zid = zone_idx(zone);
         struct page *page = NULL;
 
         for (; pfn < end_pfn; pfn++) {
-                if (!deferred_pfn_valid(nid, pfn, &nid_init_state)) {
+                if (!deferred_pfn_valid(pfn)) {
                         page = NULL;
                         continue;
                 } else if (!page || !(pfn & nr_pgmask)) {
@@ -1586,12 +1564,12 @@ static int __init deferred_init_memmap(void *data)
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                nr_pages += deferred_init_pages(nid, zid, spfn, epfn);
+                nr_pages += deferred_init_pages(zone, spfn, epfn);
         }
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                deferred_free_pages(nid, zid, spfn, epfn);
+                deferred_free_pages(spfn, epfn);
         }
         pgdat_resize_unlock(pgdat, &flags);
 
@@ -1680,7 +1658,7 @@ static int __init deferred_init_memmap(void *data)
         while (spfn < epfn && nr_pages < nr_pages_needed) {
                 t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
                 first_deferred_pfn = min(t, epfn);
-                nr_pages += deferred_init_pages(nid, zid, spfn,
+                nr_pages += deferred_init_pages(zone, spfn,
                                                 first_deferred_pfn);
                 spfn = first_deferred_pfn;
         }
@@ -1692,7 +1670,7 @@ static int __init deferred_init_memmap(void *data)
         for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
                 spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
                 epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
-                deferred_free_pages(nid, zid, spfn, epfn);
+                deferred_free_pages(spfn, epfn);
 
                 if (first_deferred_pfn == epfn)
                         break;

From patchwork Thu Oct 11 22:13:45 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10637533
Subject: [mm PATCH v2 3/6] mm: Use memblock/zone specific iterator for handling deferred page init
From: Alexander Duyck
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, alexander.h.duyck@linux.intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com
Date: Thu, 11 Oct 2018 15:13:45 -0700
Message-ID: <20181011221345.1925.16113.stgit@localhost.localdomain>
In-Reply-To: <20181011221237.1925.85591.stgit@localhost.localdomain>
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

This patch introduces a new iterator, for_each_free_mem_pfn_range_in_zone.
This iterator will take care of making sure a given memory range provided
is in fact contained within a zone. It takes care of all the bounds
checking we were doing in deferred_grow_zone and deferred_init_memmap. In
addition it should help to speed up the search a bit by iterating until
the end of a range is greater than the start of the zone pfn range, and
will exit completely if the start is beyond the end of the zone.

This patch adds another iterator, for_each_free_mem_pfn_range_in_zone_from,
and then uses it to support initializing and freeing pages in groups no
larger than MAX_ORDER_NR_PAGES. By doing this we can greatly improve the
cache locality of the pages while we do several loops over them in the
init and freeing process.

We are able to tighten the loops as a result since we only really need the
checks for first_init_pfn in our first iteration, and after that we can
assume that all future values will be greater than this. So I have added a
function called deferred_init_mem_pfn_range_in_zone that primes the
iterators, and if it fails we can just exit.
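The intended calling pattern looks roughly like the userspace mock below
(illustration only; the mock zone, ranges table, and helper names are
stand-ins for the memblock internals, not the kernel API). The cursor is
primed once, and the _from variant resumes from it, so a second pass over
the same chunk never rescans earlier ranges:

#include <stdio.h>

struct mock_range { unsigned long spfn, epfn; };

static const struct mock_range free_ranges[] = {
        { 10, 20 }, { 32, 40 }, { 50, 64 },
};
#define NR_RANGES (sizeof(free_ranges) / sizeof(free_ranges[0]))

static const unsigned long zone_start = 16, zone_end = 60;

/* analogue of __next_mem_pfn_range_in_zone(): advance *i, clip to zone */
static void next_range_in_zone(unsigned long *i, unsigned long *spfn,
                               unsigned long *epfn)
{
        while (*i < NR_RANGES) {
                unsigned long s = free_ranges[*i].spfn;
                unsigned long e = free_ranges[*i].epfn;

                (*i)++;
                if (zone_end <= s)
                        break;          /* past the zone, stop searching */
                if (e <= zone_start)
                        continue;       /* entirely below the zone */
                *spfn = s > zone_start ? s : zone_start;
                *epfn = e < zone_end ? e : zone_end;
                return;
        }
        *i = (unsigned long)-1;         /* signal end of iteration */
}

#define for_each_range_in_zone_from(i, spfn, epfn)              \
        for (; (i) != (unsigned long)-1;                        \
             next_range_in_zone(&(i), &(spfn), &(epfn)))

int main(void)
{
        unsigned long i = 0, spfn = 0, epfn = 0;

        next_range_in_zone(&i, &spfn, &epfn);   /* prime the cursor once */
        for_each_range_in_zone_from(i, spfn, epfn)
                printf("init+free pfns [%lu, %lu)\n", spfn, epfn);
        return 0;
}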
Signed-off-by: Alexander Duyck
---
 include/linux/memblock.h |   58 +++++++++++++++
 mm/memblock.c            |   63 ++++++++++++++++
 mm/page_alloc.c          |  176 ++++++++++++++++++++++++++++++++--------------
 3 files changed, 242 insertions(+), 55 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index d4d0e0181682..a89580b80a3d 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -178,6 +178,25 @@ void __next_reserved_mem_region(u64 *idx, phys_addr_t *out_start,
                                p_start, p_end, p_nid))
 
 /**
+ * for_each_mem_range - iterate through memblock areas from type_a and not
+ * included in type_b. Or just type_a if type_b is NULL.
+ * @i: u64 used as loop variable
+ * @type_a: ptr to memblock_type to iterate
+ * @type_b: ptr to memblock_type which excludes from the iteration
+ * @nid: node selector, %NUMA_NO_NODE for all nodes
+ * @flags: pick from blocks based on memory attributes
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ * @p_nid: ptr to int for nid of the range, can be %NULL
+ */
+#define for_each_mem_range_from(i, type_a, type_b, nid, flags,         \
+                                p_start, p_end, p_nid)                 \
+        for (i = 0, __next_mem_range(&i, nid, flags, type_a, type_b,   \
+                                     p_start, p_end, p_nid);           \
+             i != (u64)ULLONG_MAX;                                     \
+             __next_mem_range(&i, nid, flags, type_a, type_b,          \
+                              p_start, p_end, p_nid))
+/**
  * for_each_mem_range_rev - reverse iterate through memblock areas from
  * type_a and not included in type_b. Or just type_a if type_b is NULL.
  * @i: u64 used as loop variable
@@ -248,6 +267,45 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
              i >= 0; __next_mem_pfn_range(&i, nid, p_start, p_end, p_nid))
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+void __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
+                                  unsigned long *out_spfn,
+                                  unsigned long *out_epfn);
+/**
+ * for_each_free_mem_range_in_zone - iterate through zone specific free
+ * memblock areas
+ * @i: u64 used as loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over free (memory && !reserved) areas of memblock in a specific
+ * zone. Available as soon as memblock is initialized.
+ */
+#define for_each_free_mem_pfn_range_in_zone(i, zone, p_start, p_end)   \
+        for (i = 0,                                                    \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end);   \
+             i != (u64)ULLONG_MAX;                                     \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
+
+/**
+ * for_each_free_mem_range_in_zone_from - iterate through zone specific
+ * free memblock areas from a given point
+ * @i: u64 used as loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
+ * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
+ *
+ * Walks over free (memory && !reserved) areas of memblock in a specific
+ * zone, continuing from current position. Available as soon as memblock is
+ * initialized.
+ */
+#define for_each_free_mem_pfn_range_in_zone_from(i, zone, p_start, p_end) \
+        for (; i != (u64)ULLONG_MAX;                                      \
+             __next_mem_pfn_range_in_zone(&i, zone, p_start, p_end))
+
+#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
+
 /**
  * for_each_free_mem_range - iterate through free memblock areas
  * @i: u64 used as loop variable
diff --git a/mm/memblock.c b/mm/memblock.c
index b0ebca546ba1..c06f8edd0409 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1239,6 +1239,69 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
         return 0;
 }
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
+#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
+/**
+ * __next_mem_pfn_range_in_zone - iterator for for_each_*_range_in_zone()
+ *
+ * @idx: pointer to u64 loop variable
+ * @zone: zone in which all of the memory blocks reside
+ * @out_start: ptr to ulong for start pfn of the range, can be %NULL
+ * @out_end: ptr to ulong for end pfn of the range, can be %NULL
+ *
+ * This function is meant to be a zone/pfn specific wrapper for the
+ * for_each_mem_range type iterators. Specifically they are used in the
+ * deferred memory init routines and as such we were duplicating much of
+ * this logic throughout the code. So instead of having it in multiple
+ * locations it seemed like it would make more sense to centralize this to
+ * one new iterator that does everything they need.
+ */
+void __init_memblock
+__next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
+                             unsigned long *out_spfn, unsigned long *out_epfn)
+{
+        int zone_nid = zone_to_nid(zone);
+        phys_addr_t spa, epa;
+        int nid;
+
+        __next_mem_range(idx, zone_nid, MEMBLOCK_NONE,
+                         &memblock.memory, &memblock.reserved,
+                         &spa, &epa, &nid);
+
+        while (*idx != ULLONG_MAX) {
+                unsigned long epfn = PFN_DOWN(epa);
+                unsigned long spfn = PFN_UP(spa);
+
+                /*
+                 * Verify the end is at least past the start of the zone and
+                 * that we have at least one PFN to initialize.
+                 */
+                if (zone->zone_start_pfn < epfn && spfn < epfn) {
+                        /* if we went too far just stop searching */
+                        if (zone_end_pfn(zone) <= spfn)
+                                break;
+
+                        if (out_spfn)
+                                *out_spfn = max(zone->zone_start_pfn, spfn);
+                        if (out_epfn)
+                                *out_epfn = min(zone_end_pfn(zone), epfn);
+
+                        return;
+                }
+
+                __next_mem_range(idx, zone_nid, MEMBLOCK_NONE,
+                                 &memblock.memory, &memblock.reserved,
+                                 &spa, &epa, &nid);
+        }
+
+        /* signal end of iteration */
+        *idx = ULLONG_MAX;
+        if (out_spfn)
+                *out_spfn = ULONG_MAX;
+        if (out_epfn)
+                *out_epfn = 0;
+}
+
+#endif /* CONFIG_DEFERRED_STRUCT_PAGE_INIT */
 
 #ifdef CONFIG_HAVE_MEMBLOCK_PFN_VALID
 unsigned long __init_memblock memblock_next_valid_pfn(unsigned long pfn)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 076ffb6214c3..3603d5444865 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1516,19 +1516,103 @@ static unsigned long __init deferred_init_pages(struct zone *zone,
         return (nr_pages);
 }
 
+/*
+ * This function is meant to pre-load the iterator for the zone init.
+ * Specifically it walks through the ranges until we are caught up to the
+ * first_init_pfn value and exits there. If we never encounter the value we
+ * return false indicating there are no valid ranges left.
+ */
+static bool __init
+deferred_init_mem_pfn_range_in_zone(u64 *i, struct zone *zone,
+                                    unsigned long *spfn, unsigned long *epfn,
+                                    unsigned long first_init_pfn)
+{
+        u64 j;
+
+        /*
+         * Start out by walking through the ranges in this zone that have
+         * already been initialized. We don't need to do anything with them
+         * so we just need to flush them out of the system.
+         */
+        for_each_free_mem_pfn_range_in_zone(j, zone, spfn, epfn) {
+                if (*epfn <= first_init_pfn)
+                        continue;
+                if (*spfn < first_init_pfn)
+                        *spfn = first_init_pfn;
+                *i = j;
+                return true;
+        }
+
+        return false;
+}
+
+/*
+ * Initialize and free pages. We do it in two loops: first we initialize
+ * struct page, than free to buddy allocator, because while we are
+ * freeing pages we can access pages that are ahead (computing buddy
+ * page in __free_one_page()).
+ *
+ * In order to try and keep some memory in the cache we have the loop
+ * broken along max page order boundaries. This way we will not cause
+ * any issues with the buddy page computation.
+ */
+static unsigned long __init
+deferred_init_maxorder(u64 *i, struct zone *zone, unsigned long *start_pfn,
+                       unsigned long *end_pfn)
+{
+        unsigned long mo_pfn = ALIGN(*start_pfn + 1, MAX_ORDER_NR_PAGES);
+        unsigned long spfn = *start_pfn, epfn = *end_pfn;
+        unsigned long nr_pages = 0;
+        u64 j = *i;
+
+        /* First we loop through and initialize the page values */
+        for_each_free_mem_pfn_range_in_zone_from(j, zone, &spfn, &epfn) {
+                unsigned long t;
+
+                if (mo_pfn <= spfn)
+                        break;
+
+                t = min(mo_pfn, epfn);
+                nr_pages += deferred_init_pages(zone, spfn, t);
+
+                if (mo_pfn <= epfn)
+                        break;
+        }
+
+        /* Reset values and now loop through freeing pages as needed */
+        j = *i;
+
+        for_each_free_mem_pfn_range_in_zone_from(j, zone, start_pfn, end_pfn) {
+                unsigned long t;
+
+                if (mo_pfn <= *start_pfn)
+                        break;
+
+                t = min(mo_pfn, *end_pfn);
+                deferred_free_pages(*start_pfn, t);
+                *start_pfn = t;
+
+                if (mo_pfn < *end_pfn)
+                        break;
+        }
+
+        /* Store our current values to be reused on the next iteration */
+        *i = j;
+
+        return nr_pages;
+}
+
 /* Initialise remaining memory on a node */
 static int __init deferred_init_memmap(void *data)
 {
         pg_data_t *pgdat = data;
-        int nid = pgdat->node_id;
+        const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
+        unsigned long spfn = 0, epfn = 0, nr_pages = 0;
+        unsigned long first_init_pfn, flags;
         unsigned long start = jiffies;
-        unsigned long nr_pages = 0;
-        unsigned long spfn, epfn, first_init_pfn, flags;
-        phys_addr_t spa, epa;
-        int zid;
         struct zone *zone;
-        const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
         u64 i;
+        int zid;
 
         /* Bind memory initialisation thread to a local node if possible */
         if (!cpumask_empty(cpumask))
@@ -1553,31 +1637,30 @@ static int __init deferred_init_memmap(void *data)
                 if (first_init_pfn < zone_end_pfn(zone))
                         break;
         }
-        first_init_pfn = max(zone->zone_start_pfn, first_init_pfn);
+
+        /* If the zone is empty somebody else may have cleared out the zone */
+        if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
+                                                 first_init_pfn)) {
+                pgdat_resize_unlock(pgdat, &flags);
+                pgdat_init_report_one_done();
+                return 0;
+        }
 
         /*
-         * Initialize and free pages. We do it in two loops: first we initialize
-         * struct page, than free to buddy allocator, because while we are
-         * freeing pages we can access pages that are ahead (computing buddy
-         * page in __free_one_page()).
+         * Initialize and free pages in MAX_ORDER sized increments so
+         * that we can avoid introducing any issues with the buddy
+         * allocator.
          */
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                nr_pages += deferred_init_pages(zone, spfn, epfn);
-        }
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-                deferred_free_pages(spfn, epfn);
-        }
+        while (spfn < epfn)
+                nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+
         pgdat_resize_unlock(pgdat, &flags);
 
         /* Sanity check that the next zone really is unpopulated */
         WARN_ON(++zid < MAX_NR_ZONES && populated_zone(++zone));
 
-        pr_info("node %d initialised, %lu pages in %ums\n", nid, nr_pages,
-                jiffies_to_msecs(jiffies - start));
+        pr_info("node %d initialised, %lu pages in %ums\n",
+                pgdat->node_id, nr_pages, jiffies_to_msecs(jiffies - start));
 
         pgdat_init_report_one_done();
         return 0;
@@ -1608,14 +1691,11 @@ static int __init deferred_init_memmap(void *data)
 static noinline bool __init
 deferred_grow_zone(struct zone *zone, unsigned int order)
 {
-        int zid = zone_idx(zone);
-        int nid = zone_to_nid(zone);
-        pg_data_t *pgdat = NODE_DATA(nid);
         unsigned long nr_pages_needed = ALIGN(1 << order, PAGES_PER_SECTION);
-        unsigned long nr_pages = 0;
-        unsigned long first_init_pfn, spfn, epfn, t, flags;
+        pg_data_t *pgdat = zone->zone_pgdat;
         unsigned long first_deferred_pfn = pgdat->first_deferred_pfn;
-        phys_addr_t spa, epa;
+        unsigned long spfn, epfn, flags;
+        unsigned long nr_pages = 0;
         u64 i;
 
         /* Only the last zone may have deferred pages */
@@ -1644,37 +1724,23 @@ static int __init deferred_init_memmap(void *data)
                 return true;
         }
 
-        first_init_pfn = max(zone->zone_start_pfn, first_deferred_pfn);
-
-        if (first_init_pfn >= pgdat_end_pfn(pgdat)) {
+        /* If the zone is empty somebody else may have cleared out the zone */
+        if (!deferred_init_mem_pfn_range_in_zone(&i, zone, &spfn, &epfn,
+                                                 first_deferred_pfn)) {
                 pgdat_resize_unlock(pgdat, &flags);
-                return false;
+                return true;
         }
 
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, zone_end_pfn(zone), PFN_DOWN(epa));
-
-                while (spfn < epfn && nr_pages < nr_pages_needed) {
-                        t = ALIGN(spfn + PAGES_PER_SECTION, PAGES_PER_SECTION);
-                        first_deferred_pfn = min(t, epfn);
-                        nr_pages += deferred_init_pages(zone, spfn,
-                                                        first_deferred_pfn);
-                        spfn = first_deferred_pfn;
-                }
-
-                if (nr_pages >= nr_pages_needed)
-                        break;
+        /*
+         * Initialize and free pages in MAX_ORDER sized increments so
+         * that we can avoid introducing any issues with the buddy
+         * allocator.
+         */
+        while (spfn < epfn && nr_pages < nr_pages_needed) {
+                nr_pages += deferred_init_maxorder(&i, zone, &spfn, &epfn);
+                first_deferred_pfn = spfn;
         }
 
-        for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &spa, &epa, NULL) {
-                spfn = max_t(unsigned long, first_init_pfn, PFN_UP(spa));
-                epfn = min_t(unsigned long, first_deferred_pfn, PFN_DOWN(epa));
-                deferred_free_pages(spfn, epfn);
-
-                if (first_deferred_pfn == epfn)
-                        break;
-        }
 
         pgdat->first_deferred_pfn = first_deferred_pfn;
         pgdat_resize_unlock(pgdat, &flags);

From patchwork Thu Oct 11 22:13:51 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10637527
Subject: [mm PATCH v2 4/6] mm: Do not set reserved flag for hotplug memory
From: Alexander Duyck
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, alexander.h.duyck@linux.intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com
Date: Thu, 11 Oct 2018 15:13:51 -0700
Message-ID: <20181011221351.1925.67694.stgit@localhost.localdomain>
In-Reply-To: <20181011221237.1925.85591.stgit@localhost.localdomain>
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

The general suspicion at this point is that the setting of the reserved
bit is not really needed for hotplug memory. In addition, setting this bit
results in issues for DAX, in that it is not possible to assign the region
to KVM if the reserved bit is set in each page.

For now we can try just not setting the bit, since we suspect it isn't
adding any value. If at a later time we find that it is needed, we can
come back through and re-add it for the hotplug paths.

Suggested-by: Michal Hocko
Reported-by: Dan Williams
Signed-off-by: Alexander Duyck
---
 mm/page_alloc.c |   11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3603d5444865..e435223e2ddb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5571,8 +5571,6 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 
                 page = pfn_to_page(pfn);
                 __init_single_page(page, pfn, zone, nid);
-                if (context == MEMMAP_HOTPLUG)
-                        __SetPageReserved(page);
 
                 /*
                  * Mark the block movable so that blocks are reserved for
@@ -5626,15 +5624,6 @@ void __ref memmap_init_zone_device(struct zone *zone,
                 __init_single_page(page, pfn, zone_idx, nid);
 
                 /*
-                 * Mark page reserved as it will need to wait for onlining
-                 * phase for it to be fully associated with a zone.
-                 *
-                 * We can use the non-atomic __set_bit operation for setting
-                 * the flag as we are still initializing the pages.
-                 */
-                __SetPageReserved(page);
-
-                /*
                  * ZONE_DEVICE pages union ->lru with a ->pgmap back
                  * pointer and hmm_data. It is a bug if a ZONE_DEVICE
                  * page is ever freed or placed on a driver-private list.

From patchwork Thu Oct 11 22:13:57 2018
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 10637531
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

This patch combines the hotplug-related bits of memmap_init_zone and
memmap_init_zone_device into a single function called __memmap_init_hotplug.
I also took the opportunity to integrate __init_single_page's functionality
into this function; doing so removes some redundancy, such as the LRU
pointers versus the pgmap.

Signed-off-by: Alexander Duyck
---
 mm/page_alloc.c | 199 ++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 137 insertions(+), 62 deletions(-)
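The core of the hotplug path added below is a stride loop that walks the
range a pageblock at a time, from the top of the range down to its base. As a
rough standalone sketch of just that stride logic (a userspace model with a
made-up block size and a stub in place of __init_pageblock(), not the kernel
code itself):

#include <stdio.h>

#define BLOCK_NR_PFNS 512UL		/* stand-in for pageblock_nr_pages */
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

/* Stub for __init_pageblock(); here it only reports the range it was given. */
static void init_block(unsigned long pfn, unsigned long nr)
{
	printf("init pfns [%lu, %lu)\n", pfn, pfn + nr);
}

/* Mirrors the stride loop of __memmap_init_hotplug() in the diff below. */
static void init_range(unsigned long start_pfn, unsigned long size)
{
	unsigned long pfn = start_pfn + size;

	while (pfn != start_pfn) {
		unsigned long stride = pfn;

		/* Step back to the previous block boundary, clamped to start. */
		pfn = ALIGN_DOWN(pfn - 1, BLOCK_NR_PFNS);
		if (pfn < start_pfn)
			pfn = start_pfn;
		stride -= pfn;

		init_block(pfn, stride);
		/* the kernel inserts a cond_resched() here between blocks */
	}
}

int main(void)
{
	/* An unaligned range exercises the partial first and last strides. */
	init_range(100, 1000);
	return 0;
}

Each chunk handed to init_block() stays within a single block, so the partial
strides at the two ends come out first and last, never in the middle.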
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e435223e2ddb..5987c859676b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1192,6 +1192,79 @@ static void __meminit __init_single_page(struct page *page, unsigned long pfn,
 #endif
 }
 
+static void __meminit __init_pageblock(unsigned long start_pfn,
+				       unsigned long nr_pages,
+				       unsigned long zone, int nid,
+				       struct dev_pagemap *pgmap)
+{
+	unsigned long nr_pgmask = pageblock_nr_pages - 1;
+	struct page *start_page = pfn_to_page(start_pfn);
+	unsigned long pfn = start_pfn + nr_pages - 1;
+#ifdef WANT_PAGE_VIRTUAL
+	bool is_highmem = is_highmem_idx(zone);
+#endif
+	struct page *page;
+
+	/*
+	 * Enforce the following requirements:
+	 * size > 0
+	 * size < pageblock_nr_pages
+	 * start_pfn -> pfn does not cross pageblock_nr_pages boundary
+	 */
+	VM_BUG_ON(((start_pfn ^ pfn) | (nr_pages - 1)) > nr_pgmask);
+
+	/*
+	 * Work from highest page to lowest, this way we will still be
+	 * warm in the cache when we call set_pageblock_migratetype
+	 * below.
+	 *
+	 * The loop is based around the page pointer as the main index
+	 * instead of the pfn because pfn is not used inside the loop if
+	 * the section number is not in page flags and WANT_PAGE_VIRTUAL
+	 * is not defined.
+	 */
+	for (page = start_page + nr_pages; page-- != start_page; pfn--) {
+		mm_zero_struct_page(page);
+		set_page_links(page, zone, nid, pfn);
+		init_page_count(page);
+		page_mapcount_reset(page);
+		page_cpupid_reset_last(page);
+
+		/*
+		 * ZONE_DEVICE pages union ->lru with a ->pgmap back
+		 * pointer and hmm_data. It is a bug if a ZONE_DEVICE
+		 * page is ever freed or placed on a driver-private list.
+		 */
+		if (pgmap)
+			page->pgmap = pgmap;
+		else
+			INIT_LIST_HEAD(&page->lru);
+#ifdef WANT_PAGE_VIRTUAL
+		/* The shift won't overflow because ZONE_NORMAL is below 4G. */
+		if (!is_highmem)
+			set_page_address(page, __va(pfn << PAGE_SHIFT));
+#endif
+	}
+
+	/*
+	 * Mark the block movable so that blocks are reserved for
+	 * movable at startup. This will force kernel allocations
+	 * to reserve their blocks rather than leaking throughout
+	 * the address space during boot when many long-lived
+	 * kernel allocations are made.
+	 *
+	 * bitmap is created for zone's valid pfn range. but memmap
+	 * can be created for invalid pages (for alignment)
+	 * check here not to call set_pageblock_migratetype() against
+	 * pfn out of zone.
+	 *
+	 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
+	 * because this is done early in sparse_add_one_section
+	 */
+	if (!(start_pfn & nr_pgmask))
+		set_pageblock_migratetype(start_page, MIGRATE_MOVABLE);
+}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static void __meminit init_reserved_page(unsigned long pfn)
 {
@@ -5518,6 +5591,30 @@ void __ref build_all_zonelists(pg_data_t *pgdat)
 	return false;
 }
 
+static void __meminit __memmap_init_hotplug(unsigned long size, int nid,
+					    unsigned long zone,
+					    unsigned long start_pfn,
+					    struct dev_pagemap *pgmap)
+{
+	unsigned long pfn = start_pfn + size;
+
+	while (pfn != start_pfn) {
+		unsigned long stride = pfn;
+
+		pfn = max(ALIGN_DOWN(pfn - 1, pageblock_nr_pages), start_pfn);
+		stride -= pfn;
+
+		/*
+		 * Mark page reserved as it will need to wait for
+		 * onlining phase for it to be fully associated with
+		 * a zone.
+		 */
+		__init_pageblock(pfn, stride, zone, nid, pgmap);
+
+		cond_resched();
+	}
+}
+
 /*
  * Initially all pages are reserved - free ones are freed
  * up by memblock_free_all() once the early boot process is
@@ -5528,46 +5625,57 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone
 		struct vmem_altmap *altmap)
 {
 	unsigned long pfn, end_pfn = start_pfn + size;
-	struct page *page;
 
 	if (highest_memmap_pfn < end_pfn - 1)
 		highest_memmap_pfn = end_pfn - 1;
 
+	if (context == MEMMAP_HOTPLUG) {
 #ifdef CONFIG_ZONE_DEVICE
-	/*
-	 * Honor reservation requested by the driver for this ZONE_DEVICE
-	 * memory. We limit the total number of pages to initialize to just
-	 * those that might contain the memory mapping. We will defer the
-	 * ZONE_DEVICE page initialization until after we have released
-	 * the hotplug lock.
-	 */
-	if (zone == ZONE_DEVICE) {
-		if (!altmap)
-			return;
+		/*
+		 * Honor reservation requested by the driver for this
+		 * ZONE_DEVICE memory. We limit the total number of pages to
+		 * initialize to just those that might contain the memory
+		 * mapping. We will defer the ZONE_DEVICE page initialization
+		 * until after we have released the hotplug lock.
+		 */
+		if (zone == ZONE_DEVICE) {
+			if (!altmap)
+				return;
+
+			if (start_pfn == altmap->base_pfn)
+				start_pfn += altmap->reserve;
+			end_pfn = altmap->base_pfn +
+				  vmem_altmap_offset(altmap);
+		}
+#endif
+		/*
+		 * For these pages we don't need to record the pgmap as they
+		 * should represent only those pages used to store the memory
+		 * map. The actual ZONE_DEVICE pages will be initialized later.
+		 */
+		__memmap_init_hotplug(end_pfn - start_pfn, nid, zone,
+				      start_pfn, NULL);
 
-		if (start_pfn == altmap->base_pfn)
-			start_pfn += altmap->reserve;
-		end_pfn = altmap->base_pfn + vmem_altmap_offset(altmap);
+		return;
 	}
-#endif
 
 	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
+		struct page *page;
+
 		/*
 		 * There can be holes in boot-time mem_map[]s handed to this
 		 * function.  They do not exist on hotplugged memory.
 		 */
-		if (context == MEMMAP_EARLY) {
-			if (!early_pfn_valid(pfn)) {
-				pfn = next_valid_pfn(pfn) - 1;
-				continue;
-			}
-			if (!early_pfn_in_nid(pfn, nid))
-				continue;
-			if (overlap_memmap_init(zone, &pfn))
-				continue;
-			if (defer_init(nid, pfn, end_pfn))
-				break;
+		if (!early_pfn_valid(pfn)) {
+			pfn = next_valid_pfn(pfn) - 1;
+			continue;
 		}
+		if (!early_pfn_in_nid(pfn, nid))
+			continue;
+		if (overlap_memmap_init(zone, &pfn))
+			continue;
+		if (defer_init(nid, pfn, end_pfn))
+			break;
 
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
@@ -5597,14 +5705,12 @@ void __ref memmap_init_zone_device(struct zone *zone,
 				   unsigned long size,
 				   struct dev_pagemap *pgmap)
 {
-	unsigned long pfn, end_pfn = start_pfn + size;
 	struct pglist_data *pgdat = zone->zone_pgdat;
 	unsigned long zone_idx = zone_idx(zone);
 	unsigned long start = jiffies;
 	int nid = pgdat->node_id;
 
-	if (WARN_ON_ONCE(!pgmap || !is_dev_zone(zone)))
-		return;
+	VM_BUG_ON(!is_dev_zone(zone));
 
 	/*
 	 * The call to memmap_init_zone should have already taken care
@@ -5613,44 +5719,13 @@ void __ref memmap_init_zone_device(struct zone *zone,
 	 */
 	if (pgmap->altmap_valid) {
 		struct vmem_altmap *altmap = &pgmap->altmap;
+		unsigned long end_pfn = start_pfn + size;
 
 		start_pfn = altmap->base_pfn + vmem_altmap_offset(altmap);
 		size = end_pfn - start_pfn;
 	}
 
-	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
-		struct page *page = pfn_to_page(pfn);
-
-		__init_single_page(page, pfn, zone_idx, nid);
-
-		/*
-		 * ZONE_DEVICE pages union ->lru with a ->pgmap back
-		 * pointer and hmm_data. It is a bug if a ZONE_DEVICE
-		 * page is ever freed or placed on a driver-private list.
-		 */
-		page->pgmap = pgmap;
-		page->hmm_data = 0;
-
-		/*
-		 * Mark the block movable so that blocks are reserved for
-		 * movable at startup. This will force kernel allocations
-		 * to reserve their blocks rather than leaking throughout
-		 * the address space during boot when many long-lived
-		 * kernel allocations are made.
-		 *
-		 * bitmap is created for zone's valid pfn range. but memmap
-		 * can be created for invalid pages (for alignment)
-		 * check here not to call set_pageblock_migratetype() against
-		 * pfn out of zone.
-		 *
-		 * Please note that MEMMAP_HOTPLUG path doesn't clear memmap
-		 * because this is done early in sparse_add_one_section
-		 */
-		if (!(pfn & (pageblock_nr_pages - 1))) {
-			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-			cond_resched();
-		}
-	}
+	__memmap_init_hotplug(size, nid, zone_idx, start_pfn, pgmap);
 
 	pr_info("%s initialised, %lu pages in %ums\n", dev_name(pgmap->dev),
 		size, jiffies_to_msecs(jiffies - start));
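One subtlety in __init_pageblock() above is the VM_BUG_ON(): a single XOR/OR
expression rejects an empty range, an oversized range, and a range that
straddles a pageblock boundary all at once. A small self-contained
demonstration of the same predicate (with an arbitrary block size picked for
the example):

#include <stdio.h>

#define BLOCK 512UL			/* stand-in for pageblock_nr_pages */
#define MASK  (BLOCK - 1)		/* stand-in for nr_pgmask */

/* Returns nonzero when [start, start + nr) is not valid for one block. */
static int bad_range(unsigned long start, unsigned long nr)
{
	unsigned long last = start + nr - 1;

	/*
	 * start ^ last has a bit above MASK set iff the range crosses a
	 * block boundary; nr - 1 exceeds MASK iff nr is 0 (it wraps to
	 * ULONG_MAX) or nr is larger than BLOCK.
	 */
	return ((start ^ last) | (nr - 1)) > MASK;
}

int main(void)
{
	printf("%d\n", bad_range(0, 512));	/* 0: exactly one block */
	printf("%d\n", bad_range(100, 412));	/* 0: tail of a block */
	printf("%d\n", bad_range(100, 0));	/* 1: empty range */
	printf("%d\n", bad_range(100, 512));	/* 1: crosses into next block */
	printf("%d\n", bad_range(0, 1024));	/* 1: larger than a block */
	return 0;
}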

From patchwork Thu Oct 11 22:14:03 2018
Subject: [mm PATCH v2 6/6] mm: Use common iterator for deferred_init_pages and deferred_free_pages
From: Alexander Duyck
To: linux-mm@kvack.org, akpm@linux-foundation.org
Cc: pavel.tatashin@microsoft.com, mhocko@suse.com, dave.jiang@intel.com, alexander.h.duyck@linux.intel.com, linux-kernel@vger.kernel.org, willy@infradead.org, davem@davemloft.net, yi.z.zhang@linux.intel.com, khalid.aziz@oracle.com, rppt@linux.vnet.ibm.com, vbabka@suse.cz, sparclinux@vger.kernel.org, dan.j.williams@intel.com, ldufour@linux.vnet.ibm.com, mgorman@techsingularity.net, mingo@kernel.org, kirill.shutemov@linux.intel.com
Date: Thu, 11 Oct 2018 15:14:03 -0700
Message-ID: <20181011221402.1925.94407.stgit@localhost.localdomain>
In-Reply-To: <20181011221237.1925.85591.stgit@localhost.localdomain>
References: <20181011221237.1925.85591.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty

This patch creates a common iterator to be used by both deferred_init_pages
and deferred_free_pages. By doing this we can cut down a bit on code overhead
as the two will likely both be inlined into the same function anyway.

The new approach also allows deferred_init_pages to make use of
__init_pageblock, which lets us cut down on code size by sharing code between
the hotplug and deferred memory init code paths.

An additional benefit of this approach is improved cache locality during
memory init, as we can focus on the memory areas related to identifying
whether a given PFN is valid and keep those warm in the cache until we
transition to a region of a different type. So we stream through a chunk of
valid blocks before we turn to initializing page structs.

Signed-off-by: Alexander Duyck
---
 mm/page_alloc.c | 134 +++++++++++++++++++++++++++----------------------------
 1 file changed, 65 insertions(+), 69 deletions(-)
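The iterator introduced below is built around a helper that returns the
length of the next run of valid PFNs and advances a cursor past it; a
for_each-style macro then derives the start of the run as cursor minus count.
A toy standalone model of the same shape, with an artificial validity test
standing in for pfn_valid() and friends (the real helper and macro are in the
diff that follows; the macro uses a GNU C statement expression, as the kernel
one does):

#include <stdio.h>

/* Toy stand-in for pfn_valid(): pretend pfns 10..19 are a hole. */
static int toy_pfn_valid(unsigned long pfn)
{
	return pfn < 10 || pfn >= 20;
}

/*
 * Returns the length of the next contiguous run of valid pfns below
 * end_pfn and leaves *i pointing just past that run, or 0 when done.
 */
static unsigned long next_valid_run(unsigned long *i, unsigned long end_pfn)
{
	unsigned long pfn = *i;
	unsigned long count = 0;

	while (pfn < end_pfn) {
		if (toy_pfn_valid(pfn)) {
			count++;
			pfn++;
			continue;
		}
		if (count)
			break;
		pfn++;	/* skip invalid pfns preceding the run */
	}

	*i = pfn;
	return count;
}

#define for_each_valid_run(i, start, end, pfn, count)			\
	for (i = (start), count = next_valid_run(&i, (end));		\
	     count && ({ pfn = i - count; 1; });			\
	     count = next_valid_run(&i, (end)))

int main(void)
{
	unsigned long i, pfn, count;

	/* Expected output: one run [5, 10) and one run [20, 25). */
	for_each_valid_run(i, 5, 25, pfn, count)
		printf("run of %lu pfns starting at %lu\n", count, pfn);
	return 0;
}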
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5987c859676b..a018315c8f0c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1477,32 +1477,6 @@ void clear_zone_contiguous(struct zone *zone)
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
-static void __init deferred_free_range(unsigned long pfn,
-				       unsigned long nr_pages)
-{
-	struct page *page;
-	unsigned long i;
-
-	if (!nr_pages)
-		return;
-
-	page = pfn_to_page(pfn);
-
-	/* Free a large naturally-aligned chunk if possible */
-	if (nr_pages == pageblock_nr_pages &&
-	    (pfn & (pageblock_nr_pages - 1)) == 0) {
-		set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-		__free_pages_boot_core(page, pageblock_order);
-		return;
-	}
-
-	for (i = 0; i < nr_pages; i++, page++, pfn++) {
-		if ((pfn & (pageblock_nr_pages - 1)) == 0)
-			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
-		__free_pages_boot_core(page, 0);
-	}
-}
-
 /* Completion tracking for deferred_init_memmap() threads */
 static atomic_t pgdat_init_n_undone __initdata;
 static __initdata DECLARE_COMPLETION(pgdat_init_all_done_comp);
@@ -1514,48 +1488,77 @@ static inline void __init pgdat_init_report_one_done(void)
 }
 
 /*
- * Returns true if page needs to be initialized or freed to buddy allocator.
+ * Returns count if page range needs to be initialized or freed
  *
- * First we check if pfn is valid on architectures where it is possible to have
- * holes within pageblock_nr_pages. On systems where it is not possible, this
- * function is optimized out.
+ * First, we check if a current large page is valid by only checking the
+ * validity of the head pfn.
  *
- * Then, we check if a current large page is valid by only checking the validity
- * of the head pfn.
+ * Then we check if the contiguous pfns are valid on architectures where it
+ * is possible to have holes within pageblock_nr_pages. On systems where it
+ * is not possible, this function is optimized out.
  */
-static inline bool __init deferred_pfn_valid(unsigned long pfn)
+static unsigned long __next_pfn_valid_range(unsigned long *i,
+					    unsigned long end_pfn)
 {
-	if (!pfn_valid_within(pfn))
-		return false;
-	if (!(pfn & (pageblock_nr_pages - 1)) && !pfn_valid(pfn))
-		return false;
-	return true;
+	unsigned long pfn = *i;
+	unsigned long count;
+
+	while (pfn < end_pfn) {
+		unsigned long t = ALIGN(pfn + 1, pageblock_nr_pages);
+		unsigned long pageblock_pfn = min(t, end_pfn);
+
+#ifndef CONFIG_HOLES_IN_ZONE
+		count = pageblock_pfn - pfn;
+		pfn = pageblock_pfn;
+		if (!pfn_valid(pfn))
+			continue;
+#else
+		for (count = 0; pfn < pageblock_pfn; pfn++) {
+			if (pfn_valid_within(pfn)) {
+				count++;
+				continue;
+			}
+
+			if (count)
+				break;
+		}
+
+		if (!count)
+			continue;
+#endif
+		*i = pfn;
+		return count;
+	}
+
+	return 0;
 }
 
+#define for_each_deferred_pfn_valid_range(i, start_pfn, end_pfn, pfn, count) \
+	for (i = (start_pfn),						      \
+	     count = __next_pfn_valid_range(&i, (end_pfn));		      \
+	     count && ({ pfn = i - count; 1; });			      \
+	     count = __next_pfn_valid_range(&i, (end_pfn)))
+
 /*
  * Free pages to buddy allocator. Try to free aligned pages in
  * pageblock_nr_pages sizes.
  */
-static void __init deferred_free_pages(unsigned long pfn,
+static void __init deferred_free_pages(unsigned long start_pfn,
 				       unsigned long end_pfn)
 {
-	unsigned long nr_pgmask = pageblock_nr_pages - 1;
-	unsigned long nr_free = 0;
-
-	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(pfn)) {
-			deferred_free_range(pfn - nr_free, nr_free);
-			nr_free = 0;
-		} else if (!(pfn & nr_pgmask)) {
-			deferred_free_range(pfn - nr_free, nr_free);
-			nr_free = 1;
-			touch_nmi_watchdog();
+	unsigned long i, pfn, count;
+
+	for_each_deferred_pfn_valid_range(i, start_pfn, end_pfn, pfn, count) {
+		struct page *page = pfn_to_page(pfn);
+
+		if (count == pageblock_nr_pages) {
+			__free_pages_boot_core(page, pageblock_order);
 		} else {
-			nr_free++;
+			while (count--)
+				__free_pages_boot_core(page++, 0);
 		}
+
+		touch_nmi_watchdog();
 	}
-	/* Free the last block of pages to allocator */
-	deferred_free_range(pfn - nr_free, nr_free);
 }
 
 /*
@@ -1564,29 +1567,22 @@ static void __init deferred_free_pages(unsigned long pfn,
  * Return number of pages initialized.
  */
 static unsigned long __init deferred_init_pages(struct zone *zone,
-						unsigned long pfn,
+						unsigned long start_pfn,
 						unsigned long end_pfn)
 {
-	unsigned long nr_pgmask = pageblock_nr_pages - 1;
+	unsigned long i, pfn, count;
 	int nid = zone_to_nid(zone);
 	unsigned long nr_pages = 0;
 	int zid = zone_idx(zone);
-	struct page *page = NULL;
 
-	for (; pfn < end_pfn; pfn++) {
-		if (!deferred_pfn_valid(pfn)) {
-			page = NULL;
-			continue;
-		} else if (!page || !(pfn & nr_pgmask)) {
-			page = pfn_to_page(pfn);
-			touch_nmi_watchdog();
-		} else {
-			page++;
-		}
-		__init_single_page(page, pfn, zid, nid);
-		nr_pages++;
+	for_each_deferred_pfn_valid_range(i, start_pfn, end_pfn, pfn, count) {
+		nr_pages += count;
+		__init_pageblock(pfn, count, zid, nid, NULL);
+
+		touch_nmi_watchdog();
 	}
-	return (nr_pages);
+
+	return nr_pages;
 }
 
 /*