From patchwork Fri Jun 18 03:44:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Feng Tang X-Patchwork-Id: 12330055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7CC3CC49361 for ; Fri, 18 Jun 2021 03:45:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 28EE361351 for ; Fri, 18 Jun 2021 03:45:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 28EE361351 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C40C26B007B; Thu, 17 Jun 2021 23:45:07 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C18146B007D; Thu, 17 Jun 2021 23:45:07 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A94F56B007E; Thu, 17 Jun 2021 23:45:07 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0032.hostedemail.com [216.40.44.32]) by kanga.kvack.org (Postfix) with ESMTP id 7957B6B007B for ; Thu, 17 Jun 2021 23:45:07 -0400 (EDT) Received: from smtpin04.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 1C3FF181AEF09 for ; Fri, 18 Jun 2021 03:45:07 +0000 (UTC) X-FDA: 78265453854.04.7CE5920 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf12.hostedemail.com (Postfix) with ESMTP id 6FACC54C for ; Fri, 18 Jun 2021 03:45:05 +0000 (UTC) IronPort-SDR: /rT2Zzxt7t4+KYgxxX/Rqq6tFBppMaDpO5PPszz6MsldgNYFTwfo/meWCqyuFMtMn2m/NlHHhp tsaqLlfCnecw== X-IronPort-AV: E=McAfee;i="6200,9189,10018"; a="203467031" X-IronPort-AV: E=Sophos;i="5.83,281,1616482800"; d="scan'208";a="203467031" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 17 Jun 2021 20:45:05 -0700 IronPort-SDR: y3IaxVP0Txk7G62sYbw6pSd5e9G2OHejtUlWZ2NFnbKdDL8kCVh7KnKpdb3SATlG2BS+/DNNE4 x+T7eqwH2lrQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.83,281,1616482800"; d="scan'208";a="485539879" Received: from shbuild999.sh.intel.com ([10.239.147.94]) by orsmga001.jf.intel.com with ESMTP; 17 Jun 2021 20:45:02 -0700 From: Feng Tang To: linux-mm@kvack.org, Andrew Morton , Michal Hocko , David Rientjes , Dave Hansen , Ben Widawsky Cc: linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, Andrea Arcangeli , Mel Gorman , Mike Kravetz , Randy Dunlap , Vlastimil Babka , Andi Kleen , Dan Williams , ying.huang@intel.com, Feng Tang Subject: [PATCH v5 -mm 4/6] mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY Date: Fri, 18 Jun 2021 11:44:42 +0800 Message-Id: <1623987884-43576-5-git-send-email-feng.tang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1623987884-43576-1-git-send-email-feng.tang@intel.com> References: <1623987884-43576-1-git-send-email-feng.tang@intel.com> X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 6FACC54C X-Stat-Signature: 6wdcwjasizrxfu9cr7uh9pffsyndmww6 Authentication-Results: imf12.hostedemail.com; dkim=none; spf=none (imf12.hostedemail.com: domain of feng.tang@intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=feng.tang@intel.com; dmarc=fail reason="No valid SPF, No valid DKIM" header.from=intel.com (policy=none) X-HE-Tag: 1623987905-742098 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Ben Widawsky Implement the missing huge page allocation functionality while obeying the preferred node semantics. This is similar to the implementation for general page allocation, as it uses a fallback mechanism to try multiple preferred nodes first, and then all other nodes. [Thanks to 0day bot for caching the missing #ifdef CONFIG_NUMA issue] Link: https://lore.kernel.org/r/20200630212517.308045-12-ben.widawsky@intel.com Suggested-by: Michal Hocko Signed-off-by: Ben Widawsky Co-developed-by: Feng Tang Signed-off-by: Feng Tang --- mm/hugetlb.c | 27 +++++++++++++++++++++++++-- mm/mempolicy.c | 3 ++- 2 files changed, 27 insertions(+), 3 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e4120680e31a..c771debd35a6 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1143,7 +1143,7 @@ static struct page *dequeue_huge_page_vma(struct hstate *h, unsigned long address, int avoid_reserve, long chg) { - struct page *page; + struct page *page = NULL; struct mempolicy *mpol; gfp_t gfp_mask; nodemask_t *nodemask; @@ -1164,7 +1164,18 @@ static struct page *dequeue_huge_page_vma(struct hstate *h, gfp_mask = htlb_alloc_mask(h); nid = huge_node(vma, address, gfp_mask, &mpol, &nodemask); +#ifdef CONFIG_NUMA + if (mpol->mode == MPOL_PREFERRED_MANY) { + page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask); + if (page) + goto check_reserve; + /* Fallback to all nodes */ + nodemask = NULL; + } +#endif page = dequeue_huge_page_nodemask(h, gfp_mask, nid, nodemask); + +check_reserve: if (page && !avoid_reserve && vma_has_reserves(vma, chg)) { SetHPageRestoreReserve(page); h->resv_huge_pages--; @@ -2048,9 +2059,21 @@ struct page *alloc_buddy_huge_page_with_mpol(struct hstate *h, nodemask_t *nodemask; nid = huge_node(vma, addr, gfp_mask, &mpol, &nodemask); +#ifdef CONFIG_NUMA + if (mpol->mode == MPOL_PREFERRED_MANY) { + gfp_t gfp = (gfp_mask | __GFP_NOWARN) & ~__GFP_DIRECT_RECLAIM; + + page = alloc_surplus_huge_page(h, gfp, nid, nodemask); + if (page) + goto exit; + /* Fallback to all nodes */ + nodemask = NULL; + } +#endif page = alloc_surplus_huge_page(h, gfp_mask, nid, nodemask); - mpol_cond_put(mpol); +exit: + mpol_cond_put(mpol); return page; } diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 9dce67fc9bb6..93f8789758a7 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -2054,7 +2054,8 @@ int huge_node(struct vm_area_struct *vma, unsigned long addr, gfp_t gfp_flags, huge_page_shift(hstate_vma(vma))); } else { nid = policy_node(gfp_flags, *mpol, numa_node_id()); - if ((*mpol)->mode == MPOL_BIND) + if ((*mpol)->mode == MPOL_BIND || + (*mpol)->mode == MPOL_PREFERRED_MANY) *nodemask = &(*mpol)->nodes; } return nid;