From patchwork Fri Oct 30 19:02:37 2020
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 11870681
From: Ben Widawsky
To: linux-mm, Mike Kravetz, Andrew Morton
Cc: Ben Widawsky,
	Dave Hansen, Michal Hocko, linux-kernel@vger.kernel.org
Subject: [PATCH 11/12] mm/mempolicy: huge-page allocation for many preferred
Date: Fri, 30 Oct 2020 12:02:37 -0700
Message-Id: <20201030190238.306764-12-ben.widawsky@intel.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201030190238.306764-1-ben.widawsky@intel.com>
References: <20201030190238.306764-1-ben.widawsky@intel.com>

Implement the missing huge page allocation functionality while obeying the
preferred node semantics. The allocation uses a fallback mechanism: the
preferred nodes are tried first, and only if none of them can satisfy the
request are all other nodes tried. It does not use the helper function
introduced earlier in the series because huge page allocation already has its
own helpers, and consolidating them would have cost more lines of code and
effort than it saved.

The one wrinkle is that MPOL_PREFERRED_MANY cannot be referenced in
mm/hugetlb.c yet because it is part of the UAPI that has not been exposed.
Rather than making that define global now, the hugetlb paths detect it
indirectly (any non-MPOL_BIND policy that supplies a nodemask), and the checks
are simply updated in the UAPI patch.

Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com
Signed-off-by: Ben Widawsky
---
 mm/hugetlb.c   | 20 +++++++++++++++++---
 mm/mempolicy.c |  3 ++-
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fe76f8fd5a73..d9acc25ed3b5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1094,7 +1094,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 				unsigned long address, int avoid_reserve,
 				long chg)
 {
-	struct page *page;
+	struct page *page = NULL;
 	struct mempolicy *mpol;
 	gfp_t gfp_mask;
 	nodemask_t *nodemask;
@@ -1115,7 +1115,14 @@ static struct page *dequeue_huge_page_vma(struct hstate *h,
 
 	gfp_mask = htlb_alloc_mask(h);
 	nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask);
-	page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+	if (mpol->mode != MPOL_BIND && nodemask) { /* AKA MPOL_PREFERRED_MANY */
+		page = dequeue_huge_page_nodemask(h, gfp_mask | __GFP_RETRY_MAYFAIL,
+						  nid, nodemask);
+		if (!page)
+			page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);
+	} else {
+		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask);
+	}
 	if (page && !avoid_reserve && vma_has_reserves(vma, chg)) {
 		SetPagePrivate(page);
 		h->resv_huge_pages--;
@@ -1977,7 +1984,14 @@ struct page *alloc_buddy_huge_page_with_mpol(struct hstate *h,
 	nodemask_t *nodemask;
 
 	nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask);
-	page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask);
+	if (mpol->mode != MPOL_BIND && nodemask) { /* AKA MPOL_PREFERRED_MANY */
+		page = alloc_surplus_huge_page(h, gfp_mask | __GFP_RETRY_MAYFAIL,
+					       nid, nodemask);
+		if (!page)
+			page = alloc_surplus_huge_page(h, gfp_mask, nid, NULL);
+	} else {
+		page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask);
+	}
 	mpol_cond_put(mpol);
 
 	return page;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 343340c87f03..aab9ef698aa8 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2075,7 +2075,8 @@ int huge_node(struct vm_area_struct *vma, unsigned long addr, gfp_t gfp_flags,
 					huge_page_shift(hstate_vma(vma)));
 	} else {
 		nid = policy_node(gfp_flags, *mpol, numa_node_id());
-		if ((*mpol)->mode == MPOL_BIND)
+		if ((*mpol)->mode == MPOL_BIND ||
+		    (*mpol)->mode == MPOL_PREFERRED_MANY)
 			*nodemask = &(*mpol)->nodes;
 	}
 	return nid;
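
For illustration only (not part of the patch): both hugetlb hunks above follow
the same two-pass fallback pattern, condensed below into a single hypothetical
helper. The name dequeue_with_preferred_fallback is made up for this sketch;
it reuses only the calls the patch itself makes.

/*
 * Sketch of the fallback used for "preferred many"-style policies:
 * pass 1 is confined to the preferred nodemask (with __GFP_RETRY_MAYFAIL,
 * mirroring the patch); pass 2 drops the nodemask so any allowed node can
 * satisfy the request.
 */
static struct page *dequeue_with_preferred_fallback(struct hstate *h,
						    gfp_t gfp_mask, int nid,
						    nodemask_t *preferred)
{
	struct page *page;

	/* First pass: only nodes in the preferred mask. */
	page = dequeue_huge_page_nodemask(h, gfp_mask | __GFP_RETRY_MAYFAIL,
					  nid, preferred);
	if (!page)
		/* Second pass: no nodemask, so all other nodes are eligible. */
		page = dequeue_huge_page_nodemask(h, gfp_mask, nid, NULL);

	return page;
}

With such a helper, the call site in dequeue_huge_page_vma() would reduce to
one call guarded by the "mpol->mode != MPOL_BIND && nodemask" test the patch
uses to stand in for MPOL_PREFERRED_MANY until the UAPI patch lands; the
alloc_surplus_huge_page() path in alloc_buddy_huge_page_with_mpol() has the
identical shape with its own allocator call.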