From patchwork Mon Sep 28 17:53:59 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804447
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org
Subject: [RFC PATCH v2 01/30] mm/pagewalk: use READ_ONCE when reading the PUD entry unlocked
Date: Mon, 28 Sep 2020 13:53:59 -0400
Message-Id: <20200928175428.4110504-2-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Jason Gunthorpe

The pagewalker runs while only holding the mmap_sem for read. The pud can
be set asynchronously, while also holding the mmap_sem for read, e.g. from:

  handle_mm_fault()
    __handle_mm_fault()
      create_huge_pmd()
        dev_dax_huge_fault()
          __dev_dax_pud_fault()
            vmf_insert_pfn_pud()
              insert_pfn_pud()
                pud_lock()
                set_pud_at()

At least x86 sets the PUD using WRITE_ONCE(), so an unlocked read of
unstable data should be paired to use READ_ONCE().

For the pagewalker to work locklessly the PUD must work similarly to the
PMD: once the PUD entry becomes a pointer to a PMD, it must be stable, and
safe to pass to pmd_offset().

Passing the value from READ_ONCE into the callbacks prevents the callers
from seeing inconsistencies after they re-read, such as seeing pud_none().

If a callback does obtain the pud_lock then it should trigger ACTION_AGAIN
if a data race caused the original value to change.

Use the same pattern as gup_pmd_range() and pass in the address of the
local READ_ONCE stack variable to pmd_offset() to avoid reading it again.
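As a minimal sketch of the calling convention this establishes (illustrative
only, not part of the patch; example_pud_entry is a hypothetical callback):

	static int example_pud_entry(pud_t pud, pud_t *pudp, unsigned long addr,
				     unsigned long next, struct mm_walk *walk)
	{
		spinlock_t *ptl = pud_trans_huge_lock(pudp, walk->vma);

		if (!ptl)
			return 0;
		/* The entry may have changed between READ_ONCE and pud_lock. */
		if (memcmp(pudp, &pud, sizeof(pud)) != 0) {
			walk->action = ACTION_AGAIN;
			spin_unlock(ptl);
			return 0;
		}
		/* ... operate on the stable snapshot 'pud' under the lock ... */
		spin_unlock(ptl);
		return 0;
	}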
Signed-off-by: Jason Gunthorpe
---
 include/linux/pagewalk.h   |  2 +-
 mm/hmm.c                   | 16 +++++++---------
 mm/mapping_dirty_helpers.c |  6 ++----
 mm/pagewalk.c              | 28 ++++++++++++++++------------
 mm/ptdump.c                |  3 +--
 5 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
index b1cb6b753abb..6caf28aadafb 100644
--- a/include/linux/pagewalk.h
+++ b/include/linux/pagewalk.h
@@ -39,7 +39,7 @@ struct mm_walk_ops {
 			 unsigned long next, struct mm_walk *walk);
 	int (*p4d_entry)(p4d_t *p4d, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
-	int (*pud_entry)(pud_t *pud, unsigned long addr,
+	int (*pud_entry)(pud_t pud, pud_t *pudp, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pmd_entry)(pmd_t *pmd, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
diff --git a/mm/hmm.c b/mm/hmm.c
index 943cb2ba4442..419e9e50fd51 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -402,28 +402,26 @@ static inline unsigned long pud_to_hmm_pfn_flags(struct hmm_range *range,
 	       hmm_pfn_flags_order(PUD_SHIFT - PAGE_SHIFT);
 }
 
-static int hmm_vma_walk_pud(pud_t *pudp, unsigned long start, unsigned long end,
-		struct mm_walk *walk)
+static int hmm_vma_walk_pud(pud_t pud, pud_t *pudp, unsigned long start,
+		unsigned long end, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
 	unsigned long addr = start;
-	pud_t pud;
 	int ret = 0;
 	spinlock_t *ptl = pud_trans_huge_lock(pudp, walk->vma);
 
 	if (!ptl)
 		return 0;
+	if (memcmp(pudp, &pud, sizeof(pud)) != 0) {
+		walk->action = ACTION_AGAIN;
+		spin_unlock(ptl);
+		return 0;
+	}
 
 	/* Normally we don't want to split the huge page */
 	walk->action = ACTION_CONTINUE;
 
-	pud = READ_ONCE(*pudp);
-	if (pud_none(pud)) {
-		spin_unlock(ptl);
-		return hmm_vma_walk_hole(start, end, -1, walk);
-	}
-
 	if (pud_huge(pud) && pud_devmap(pud)) {
 		unsigned long i, npages, pfn;
 		unsigned int required_fault;
diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c
index 2c7d03675903..9fc46ebef497 100644
--- a/mm/mapping_dirty_helpers.c
+++ b/mm/mapping_dirty_helpers.c
@@ -150,11 +150,9 @@ static int wp_clean_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long end,
  * causes dirty info loss. The pagefault handler should do
  * that if needed.
 */
-static int wp_clean_pud_entry(pud_t *pud, unsigned long addr, unsigned long end,
-			      struct mm_walk *walk)
+static int wp_clean_pud_entry(pud_t pudval, pud_t *pudp, unsigned long addr,
+			      unsigned long end, struct mm_walk *walk)
 {
-	pud_t pudval = READ_ONCE(*pud);
-
 	if (!pud_trans_unstable(&pudval))
 		return 0;
 
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index e81640d9f177..15d1e423b4a3 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -58,7 +58,7 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	return err;
 }
 
-static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
+static int walk_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 			  struct mm_walk *walk)
 {
 	pmd_t *pmd;
@@ -67,7 +67,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 	int err = 0;
 	int depth = real_depth(3);
 
-	pmd = pmd_offset(pud, addr);
+	pmd = pmd_offset(&pud, addr);
 	do {
 again:
 		next = pmd_addr_end(addr, end);
@@ -119,17 +119,19 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 			  struct mm_walk *walk)
 {
-	pud_t *pud;
+	pud_t *pudp;
+	pud_t pud;
 	unsigned long next;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
 	int depth = real_depth(2);
 
-	pud = pud_offset(p4d, addr);
+	pudp = pud_offset(p4d, addr);
 	do {
 again:
+		pud = READ_ONCE(*pudp);
 		next = pud_addr_end(addr, end);
-		if (pud_none(*pud) || (!walk->vma && !walk->no_vma)) {
+		if (pud_none(pud) || (!walk->vma && !walk->no_vma)) {
 			if (ops->pte_hole)
 				err = ops->pte_hole(addr, next, depth, walk);
 			if (err)
@@ -140,27 +142,29 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end,
 		walk->action = ACTION_SUBTREE;
 
 		if (ops->pud_entry)
-			err = ops->pud_entry(pud, addr, next, walk);
+			err = ops->pud_entry(pud, pudp, addr, next, walk);
 		if (err)
 			break;
 
 		if (walk->action == ACTION_AGAIN)
 			goto again;
 
-		if ((!walk->vma && (pud_leaf(*pud) || !pud_present(*pud))) ||
+		if ((!walk->vma && (pud_leaf(pud) || !pud_present(pud))) ||
 		    walk->action == ACTION_CONTINUE ||
 		    !(ops->pmd_entry || ops->pte_entry))
 			continue;
 
-		if (walk->vma)
-			split_huge_pud(walk->vma, pud, addr);
-		if (pud_none(*pud))
-			goto again;
+		if (walk->vma) {
+			split_huge_pud(walk->vma, pudp, addr);
+			pud = READ_ONCE(*pudp);
+			if (pud_none(pud))
+				goto again;
+		}
 
 		err = walk_pmd_range(pud, addr, next, walk);
 		if (err)
 			break;
-	} while (pud++, addr = next, addr != end);
+	} while (pudp++, addr = next, addr != end);
 
 	return err;
 }
diff --git a/mm/ptdump.c b/mm/ptdump.c
index ba88ec43ff21..2055b940408e 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -65,11 +65,10 @@ static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
 	return 0;
 }
 
-static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
+static int ptdump_pud_entry(pud_t val, pud_t *pudp, unsigned long addr,
 			    unsigned long next, struct mm_walk *walk)
 {
 	struct ptdump_state *st = walk->private;
-	pud_t val = READ_ONCE(*pud);
 
 #if CONFIG_PGTABLE_LEVELS > 2 && defined(CONFIG_KASAN)
 	if (pud_page(val) == virt_to_page(lm_alias(kasan_early_shadow_pmd)))
From patchwork Mon Sep 28 17:54:00 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804449
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 02/30] mm: pagewalk: use READ_ONCE when reading the PMD entry unlocked
Date: Mon, 28 Sep 2020 13:54:00 -0400
Message-Id: <20200928175428.4110504-3-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

The pagewalker runs while only holding the mmap_sem for read. The pmd can
be set asynchronously, while also holding the mmap_sem for read.

This follows the same approach as the previous commit, "mm/pagewalk: use
READ_ONCE when reading the PUD entry unlocked".
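A minimal sketch (illustrative only; example_pmd_entry is a hypothetical
callback) of how a pmd_entry callback adapts: locked THP handling
re-validates the snapshot, and the unlocked pte-walk fallback tests the
local copy rather than dereferencing pmdp again:

	static int example_pmd_entry(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
				     unsigned long end, struct mm_walk *walk)
	{
		spinlock_t *ptl = pmd_trans_huge_lock(pmdp, walk->vma);

		if (ptl) {
			if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
				/* Lost a race; ask the walker to retry. */
				walk->action = ACTION_AGAIN;
				spin_unlock(ptl);
				return 0;
			}
			/* ... huge-PMD handling on the snapshot 'pmd' ... */
			spin_unlock(ptl);
			return 0;
		}
		/* Unlocked check uses the stable local value, not *pmdp. */
		if (pmd_trans_unstable(&pmd))
			return 0;
		/* ... pte walk via pte_offset_map_lock(walk->mm, pmdp, ...) ... */
		return 0;
	}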
Signed-off-by: Zi Yan
---
 fs/proc/task_mmu.c       | 69 ++++++++++++++++++++++++++--------------
 include/linux/pagewalk.h |  2 +-
 mm/madvise.c             | 59 ++++++++++++++++++----------------
 mm/memcontrol.c          | 30 +++++++++++------
 mm/mempolicy.c           | 15 ++++++---
 mm/mincore.c             | 10 +++---
 mm/pagewalk.c            | 21 ++++++------
 7 files changed, 124 insertions(+), 82 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 069978777423..a21484b1414d 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -570,28 +570,33 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 }
 #endif
 
-static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
-			   struct mm_walk *walk)
+static int smaps_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+			   unsigned long end, struct mm_walk *walk)
 {
 	struct vm_area_struct *vma = walk->vma;
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
-		smaps_pmd_entry(pmd, addr, walk);
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
+		smaps_pmd_entry(pmdp, addr, walk);
 		spin_unlock(ptl);
 		goto out;
 	}
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		goto out;
 	/*
 	 * The mmap_lock held all the way back in m_start() is what
 	 * keeps khugepaged out of here and from collapsing things
 	 * in here.
 	 */
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = pte_offset_map_lock(vma->vm_mm, pmdp, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE)
 		smaps_pte_entry(pte, addr, walk);
 	pte_unmap_unlock(pte - 1, ptl);
@@ -1091,7 +1096,7 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
 }
 #endif
 
-static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
+static int clear_refs_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 				unsigned long end, struct mm_walk *walk)
 {
 	struct clear_refs_private *cp = walk->private;
@@ -1100,20 +1105,25 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
 	spinlock_t *ptl;
 	struct page *page;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
 		if (cp->type == CLEAR_REFS_SOFT_DIRTY) {
-			clear_soft_dirty_pmd(vma, addr, pmd);
+			clear_soft_dirty_pmd(vma, addr, pmdp);
 			goto out;
 		}
 
-		if (!pmd_present(*pmd))
+		if (!pmd_present(pmd))
 			goto out;
 
-		page = pmd_page(*pmd);
+		page = pmd_page(pmd);
 
 		/* Clear accessed and referenced bits.
 */
-		pmdp_test_and_clear_young(vma, addr, pmd);
+		pmdp_test_and_clear_young(vma, addr, pmdp);
 		test_and_clear_page_young(page);
 		ClearPageReferenced(page);
 out:
@@ -1121,10 +1131,10 @@ static int clear_refs_pte_range(pmd_t *pmd, unsigned long addr,
 		return 0;
 	}
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = pte_offset_map_lock(vma->vm_mm, pmdp, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
 		ptent = *pte;
 
@@ -1388,8 +1398,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
 	return make_pme(frame, flags);
 }
 
-static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
-			     struct mm_walk *walk)
+static int pagemap_pmd_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+			     unsigned long end, struct mm_walk *walk)
 {
 	struct vm_area_struct *vma = walk->vma;
 	struct pagemapread *pm = walk->private;
@@ -1401,9 +1411,14 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
 	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
 		u64 flags = 0, frame = 0;
-		pmd_t pmd = *pmdp;
 		struct page *page = NULL;
 
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
+
 		if (vma->vm_flags & VM_SOFTDIRTY)
 			flags |= PM_SOFT_DIRTY;
 
@@ -1456,7 +1471,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
 		return err;
 	}
 
-	if (pmd_trans_unstable(pmdp))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
@@ -1768,7 +1783,7 @@ static struct page *can_gather_numa_stats_pmd(pmd_t pmd,
 }
 #endif
 
-static int gather_pte_stats(pmd_t *pmd, unsigned long addr,
+static int gather_pte_stats(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 			    unsigned long end, struct mm_walk *walk)
 {
 	struct numa_maps *md = walk->private;
@@ -1778,22 +1793,28 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr,
 	pte_t *pte;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
 		struct page *page;
 
-		page = can_gather_numa_stats_pmd(*pmd, vma, addr);
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
+
+		page = can_gather_numa_stats_pmd(pmd, vma, addr);
 		if (page)
-			gather_stats(page, md, pmd_dirty(*pmd),
+			gather_stats(page, md, pmd_dirty(pmd),
 				     HPAGE_PMD_SIZE/PAGE_SIZE);
 		spin_unlock(ptl);
 		return 0;
 	}
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 #endif
-	orig_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	orig_pte = pte = pte_offset_map_lock(walk->mm, pmdp, addr, &ptl);
 	do {
 		struct page *page = can_gather_numa_stats(*pte, vma, addr);
 		if (!page)
diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
index 6caf28aadafb..686b57e94a9f 100644
--- a/include/linux/pagewalk.h
+++ b/include/linux/pagewalk.h
@@ -41,7 +41,7 @@ struct mm_walk_ops {
 			 unsigned long next, struct mm_walk *walk);
 	int (*pud_entry)(pud_t pud, pud_t *pudp, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
-	int (*pmd_entry)(pmd_t *pmd, unsigned long addr,
+	int (*pmd_entry)(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pte_entry)(pte_t *pte, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
diff --git a/mm/madvise.c b/mm/madvise.c
index ae266dfede8a..16e7b8eadb13 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -183,14 +183,14 @@ static long madvise_behavior(struct vm_area_struct *vma,
 }
 
 #ifdef CONFIG_SWAP
-static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
+static int swapin_walk_pmd_entry(pmd_t pmd, pmd_t *pmdp, unsigned long start,
 	unsigned long end, struct mm_walk *walk)
 {
 	pte_t *orig_pte;
 	struct vm_area_struct *vma = walk->private;
 	unsigned long index;
 
-	if (pmd_none_or_trans_huge_or_clear_bad(pmd))
+	if (pmd_none_or_trans_huge_or_clear_bad(&pmd))
 		return 0;
 
 	for (index = start; index != end; index += PAGE_SIZE) {
@@ -199,7 +199,7 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
 		struct page *page;
 		spinlock_t *ptl;
 
-		orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);
+		orig_pte = pte_offset_map_lock(vma->vm_mm, pmdp, start, &ptl);
 		pte = *(orig_pte + ((index - start) / PAGE_SIZE));
 		pte_unmap_unlock(orig_pte, ptl);
 
@@ -304,7 +304,7 @@ static long madvise_willneed(struct vm_area_struct *vma,
 	return 0;
 }
 
-static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
+static int madvise_cold_or_pageout_pte_range(pmd_t pmd, pmd_t *pmdp,
 				unsigned long addr, unsigned long end,
 				struct mm_walk *walk)
 {
@@ -322,26 +322,29 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		return -EINTR;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (pmd_trans_huge(*pmd)) {
-		pmd_t orig_pmd;
+	if (pmd_trans_huge(pmd)) {
 		unsigned long next = pmd_addr_end(addr, end);
 
 		tlb_change_page_size(tlb, HPAGE_PMD_SIZE);
-		ptl = pmd_trans_huge_lock(pmd, vma);
+		ptl = pmd_trans_huge_lock(pmdp, vma);
 		if (!ptl)
 			return 0;
 
-		orig_pmd = *pmd;
-		if (is_huge_zero_pmd(orig_pmd))
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			goto huge_unlock;
+		}
+
+		if (is_huge_zero_pmd(pmd))
 			goto huge_unlock;
 
-		if (unlikely(!pmd_present(orig_pmd))) {
+		if (unlikely(!pmd_present(pmd))) {
 			VM_BUG_ON(thp_migration_supported() &&
-					!is_pmd_migration_entry(orig_pmd));
+					!is_pmd_migration_entry(pmd));
 			goto huge_unlock;
 		}
 
-		page = pmd_page(orig_pmd);
+		page = pmd_page(pmd);
 
 		/* Do not interfere with other mappings of this page */
 		if (page_mapcount(page) != 1)
@@ -361,12 +364,12 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			return 0;
 		}
 
-		if (pmd_young(orig_pmd)) {
-			pmdp_invalidate(vma, addr, pmd);
-			orig_pmd = pmd_mkold(orig_pmd);
+		if (pmd_young(pmd)) {
+			pmdp_invalidate(vma, addr, pmdp);
+			pmd = pmd_mkold(pmd);
 
-			set_pmd_at(mm, addr, pmd, orig_pmd);
-			tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
+			set_pmd_at(mm, addr, pmdp, pmd);
+			tlb_remove_pmd_tlb_entry(tlb, pmdp, addr);
 		}
 
 		ClearPageReferenced(page);
@@ -388,11 +391,11 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 	}
 
 regular_page:
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 #endif
 	tlb_change_page_size(tlb, PAGE_SIZE);
-	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmdp, addr, &ptl);
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	for (; addr < end; pte++, addr += PAGE_SIZE) {
@@ -424,12 +427,12 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			if (split_huge_page(page)) {
 				unlock_page(page);
 				put_page(page);
-				pte_offset_map_lock(mm, pmd, addr, &ptl);
+				pte_offset_map_lock(mm, pmdp, addr, &ptl);
 				break;
 			}
 			unlock_page(page);
 			put_page(page);
-			pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+			pte = pte_offset_map_lock(mm, pmdp, addr, &ptl);
 			pte--;
 			addr -= PAGE_SIZE;
 			continue;
@@ -566,7 +569,7 @@ static long madvise_pageout(struct vm_area_struct *vma,
 	return 0;
 }
 
-static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
+static int madvise_free_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 				  unsigned long end, struct mm_walk *walk)
 
 {
@@ -580,15 +583,15 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 	unsigned long next;
 
 	next = pmd_addr_end(addr, end);
-	if (pmd_trans_huge(*pmd))
-		if (madvise_free_huge_pmd(tlb, vma, pmd, addr, next))
+	if (pmd_trans_huge(pmd))
+		if (madvise_free_huge_pmd(tlb, vma, pmdp, addr, next))
 			goto next;
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
-	orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+	orig_pte = pte = pte_offset_map_lock(mm, pmdp, addr, &ptl);
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
@@ -634,12 +637,12 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 			if (split_huge_page(page)) {
 				unlock_page(page);
 				put_page(page);
-				pte_offset_map_lock(mm, pmd, addr, &ptl);
+				pte_offset_map_lock(mm, pmdp, addr, &ptl);
 				goto out;
 			}
 			unlock_page(page);
 			put_page(page);
-			pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+			pte = pte_offset_map_lock(mm, pmdp, addr, &ptl);
 			pte--;
 			addr -= PAGE_SIZE;
 			continue;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9c4a0851348f..b28f620c1c5b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5827,7 +5827,7 @@ static inline enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma,
 }
 #endif
 
-static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
+static int mem_cgroup_count_precharge_pte_range(pmd_t pmd, pmd_t *pmdp,
 					unsigned long addr, unsigned long end,
 					struct mm_walk *walk)
 {
@@ -5835,22 +5835,27 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
 		/*
 		 * Note their can not be MC_TARGET_DEVICE for now as we do not
 		 * support transparent huge page with MEMORY_DEVICE_PRIVATE but
 		 * this might change.
 */
-		if (get_mctgt_type_thp(vma, addr, *pmd, NULL) == MC_TARGET_PAGE)
+		if (get_mctgt_type_thp(vma, addr, pmd, NULL) == MC_TARGET_PAGE)
 			mc.precharge += HPAGE_PMD_NR;
 		spin_unlock(ptl);
 		return 0;
 	}
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = pte_offset_map_lock(vma->vm_mm, pmdp, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE)
 		if (get_mctgt_type(vma, addr, *pte, NULL))
 			mc.precharge++;	/* increment precharge temporarily */
@@ -6023,7 +6028,7 @@ static void mem_cgroup_cancel_attach(struct cgroup_taskset *tset)
 	mem_cgroup_clear_mc();
 }
 
-static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
+static int mem_cgroup_move_charge_pte_range(pmd_t pmd, pmd_t *pmdp,
 				unsigned long addr, unsigned long end,
 				struct mm_walk *walk)
 {
@@ -6035,13 +6040,18 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 	union mc_target target;
 	struct page *page;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
 		if (mc.precharge < HPAGE_PMD_NR) {
 			spin_unlock(ptl);
 			return 0;
 		}
-		target_type = get_mctgt_type_thp(vma, addr, *pmd, &target);
+		target_type = get_mctgt_type_thp(vma, addr, pmd, &target);
 		if (target_type == MC_TARGET_PAGE) {
 			page = target.page;
 			if (!isolate_lru_page(page)) {
@@ -6066,10 +6076,10 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 		return 0;
 	}
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 retry:
-	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = pte_offset_map_lock(vma->vm_mm, pmdp, addr, &ptl);
 	for (; addr != end; addr += PAGE_SIZE) {
 		pte_t ptent = *(pte++);
 		bool device = false;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index eddbe4e56c73..731a7710395f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -516,7 +516,7 @@ static int queue_pages_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
  * -EIO  - only MPOL_MF_STRICT was specified and an existing page was already
  *         on a node that does not follow the policy.
 */
-static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
+static int queue_pages_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 			unsigned long end, struct mm_walk *walk)
 {
 	struct vm_area_struct *vma = walk->vma;
@@ -528,18 +528,23 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
-		ret = queue_pages_pmd(pmd, ptl, addr, end, walk);
+		if (memcmp(pmdp, &pmd, sizeof(pmd)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
+		ret = queue_pages_pmd(pmdp, ptl, addr, end, walk);
 		if (ret != 2)
 			return ret;
 	}
 	/* THP was split, fall through to pte walk */
 
-	if (pmd_trans_unstable(pmd))
+	if (pmd_trans_unstable(&pmd))
 		return 0;
 
-	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	pte = pte_offset_map_lock(walk->mm, pmdp, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
 		if (!pte_present(*pte))
 			continue;
diff --git a/mm/mincore.c b/mm/mincore.c
index 02db1a834021..168661f32aaa 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -96,8 +96,8 @@ static int mincore_unmapped_range(unsigned long addr, unsigned long end,
 	return 0;
 }
 
-static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
-			struct mm_walk *walk)
+static int mincore_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
+			unsigned long end, struct mm_walk *walk)
 {
 	spinlock_t *ptl;
 	struct vm_area_struct *vma = walk->vma;
@@ -105,19 +105,19 @@ static int mincore_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 	unsigned char *vec = walk->private;
 	int nr = (end - addr) >> PAGE_SHIFT;
 
-	ptl = pmd_trans_huge_lock(pmd, vma);
+	ptl = pmd_trans_huge_lock(pmdp, vma);
 	if (ptl) {
 		memset(vec, 1, nr);
 		spin_unlock(ptl);
 		goto out;
 	}
 
-	if (pmd_trans_unstable(pmd)) {
+	if (pmd_trans_unstable(&pmd)) {
 		__mincore_unmapped_range(addr, end, vma, vec);
 		goto out;
 	}
 
-	ptep = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+	ptep = pte_offset_map_lock(walk->mm, pmdp, addr, &ptl);
 	for (; addr != end; ptep++, addr += PAGE_SIZE) {
 		pte_t pte = *ptep;
 
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 15d1e423b4a3..a3752c82a7b2 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -61,17 +61,19 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 static int walk_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 			  struct mm_walk *walk)
 {
-	pmd_t *pmd;
+	pmd_t *pmdp;
+	pmd_t pmd;
 	unsigned long next;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
 	int depth = real_depth(3);
 
-	pmd = pmd_offset(&pud, addr);
+	pmdp = pmd_offset(&pud, addr);
 	do {
 again:
+		pmd = READ_ONCE(*pmdp);
 		next = pmd_addr_end(addr, end);
-		if (pmd_none(*pmd) || (!walk->vma && !walk->no_vma)) {
+		if (pmd_none(pmd) || (!walk->vma && !walk->no_vma)) {
 			if (ops->pte_hole)
 				err = ops->pte_hole(addr, next, depth, walk);
 			if (err)
@@ -86,7 +88,7 @@ static int walk_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 		 * needs to know about pmd_trans_huge() pmds
 		 */
 		if (ops->pmd_entry)
-			err = ops->pmd_entry(pmd, addr, next, walk);
+			err = ops->pmd_entry(pmd, pmdp, addr, next, walk);
 		if (err)
 			break;
 
@@ -97,21 +99,22 @@ static int walk_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 		 * Check this here so we only break down trans_huge
 		 * pages when we _need_ to
 		 */
-		if ((!walk->vma && (pmd_leaf(*pmd) || !pmd_present(*pmd))) ||
+		if ((!walk->vma && (pmd_leaf(pmd) || !pmd_present(pmd))) ||
 		    walk->action == ACTION_CONTINUE ||
 		    !(ops->pte_entry))
 			continue;
 		if (walk->vma) {
-			split_huge_pmd(walk->vma, pmd, addr);
-			if (pmd_trans_unstable(pmd))
+			split_huge_pmd(walk->vma, pmdp, addr);
+			pmd = READ_ONCE(*pmdp);
+			if (pmd_trans_unstable(&pmd))
 				goto again;
 		}
 
-		err = walk_pte_range(pmd, addr, next, walk);
+		err = walk_pte_range(pmdp, addr, next, walk);
 		if (err)
 			break;
-	} while (pmd++, addr = next, addr != end);
+	} while (pmdp++, addr = next, addr != end);
 
 	return err;
 }
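Note the division of labor the conversion makes explicit: predicates such as
pmd_trans_unstable() now take the address of the local snapshot (&pmd), while
operations that must act on the live page table entry (pte_offset_map_lock(),
pmdp_invalidate(), set_pmd_at(), split_huge_pmd()) continue to take pmdp.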
From patchwork Mon Sep 28 17:54:01 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804445

From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 03/30] mm: thp: use single linked list for THP page table page deposit.
Date: Mon, 28 Sep 2020 13:54:01 -0400
Message-Id: <20200928175428.4110504-4-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

The old design uses the doubly linked list page->lru to chain all deposited
page table pages when creating a THP, and page->pmd_huge_pte to point to
the first page of the list.

Because the second pointer in page->lru overlaps with page->pmd_huge_pte,
that design cannot support multi-level page table page deposit, which is
needed for PUD and higher-level THPs.

The new design uses a singly linked list instead: deposit_head points to a
list of deposited pages, and deposit_node lets the page itself be deposited
onto another list. For example, this allows one PUD page to point to a list
of PMD pages, each of which points to a list of PTE pages, to support
PUD-level THP.
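An illustrative fragment (not from the patch; parent_head is a hypothetical
list head, locking elided) of why two llist fields enable multi-level
deposit:

	/* Deposit a PTE page onto a PMD page's own deposit list... */
	llist_add(&pte_page->deposit_node, &pmd_page->deposit_head);
	/* ...and still deposit the PMD page itself onto another list,
	 * since its deposit_node is independent of its deposit_head. */
	llist_add(&pmd_page->deposit_node, &parent_head);

With page->lru this cannot work: the second pointer of page->lru shares its
storage with page->pmd_huge_pte, so a page chained on a list cannot also
carry a list head of its own.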
Signed-off-by: Zi Yan
---
 include/linux/mm.h       |  9 +++++----
 include/linux/mm_types.h |  8 +++++---
 kernel/fork.c            |  4 ++--
 mm/pgtable-generic.c     | 15 +++++----------
 4 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 17e712207d74..01b62da34794 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include <linux/llist.h>
 #include
 #include
 #include
@@ -2249,7 +2250,7 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 static inline bool pmd_ptlock_init(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	page->pmd_huge_pte = NULL;
+	init_llist_head(&page->deposit_head);
 #endif
 	return ptlock_init(page);
 }
@@ -2257,12 +2258,12 @@ static inline bool pmd_ptlock_init(struct page *page)
 static inline void pmd_ptlock_free(struct page *page)
 {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	VM_BUG_ON_PAGE(page->pmd_huge_pte, page);
+	VM_BUG_ON_PAGE(!llist_empty(&page->deposit_head), page);
 #endif
 	ptlock_free(page);
 }
 
-#define pmd_huge_pte(mm, pmd) (pmd_to_page(pmd)->pmd_huge_pte)
+#define huge_pmd_deposit_head(mm, pmd) (pmd_to_page(pmd)->deposit_head)
 
 #else
 
@@ -2274,7 +2275,7 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
 static inline bool pmd_ptlock_init(struct page *page) { return true; }
 static inline void pmd_ptlock_free(struct page *page) {}
 
-#define pmd_huge_pte(mm, pmd) ((mm)->pmd_huge_pte)
+#define huge_pmd_deposit_head(mm, pmd) ((mm)->deposit_head_pmd)
 
 #endif
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 496c3ff97cce..be842926577a 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -6,6 +6,7 @@
 
 #include
 #include
+#include <linux/llist.h>
 #include
 #include
 #include
@@ -143,8 +144,8 @@ struct page {
 			struct list_head deferred_list;
 		};
 		struct {	/* Page table pages */
-			unsigned long _pt_pad_1;	/* compound_head */
-			pgtable_t pmd_huge_pte; /* protected by page->ptl */
+			struct llist_head deposit_head; /* pgtable deposit list head */
+			struct llist_node deposit_node; /* pgtable deposit list node */
 			unsigned long _pt_pad_2;	/* mapping */
 			union {
 				struct mm_struct *pt_mm; /* x86 pgds only */
@@ -511,7 +512,8 @@ struct mm_struct {
 		struct mmu_notifier_subscriptions *notifier_subscriptions;
 #endif
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-		pgtable_t pmd_huge_pte; /* protected by page_table_lock */
+		/* pgtable deposit list head, protected by page_table_lock */
+		struct llist_head deposit_head_pmd;
 #endif
 #ifdef CONFIG_NUMA_BALANCING
 		/*
diff --git a/kernel/fork.c b/kernel/fork.c
index 138cd6ca50da..9c8e880538de 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -661,7 +661,7 @@ static void check_mm(struct mm_struct *mm)
 				mm_pgtables_bytes(mm));
 
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-	VM_BUG_ON_MM(mm->pmd_huge_pte, mm);
+	VM_BUG_ON_MM(!llist_empty(&mm->deposit_head_pmd), mm);
 #endif
 }
 
@@ -1022,7 +1022,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mmu_notifier_subscriptions_init(mm);
 	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
-	mm->pmd_huge_pte = NULL;
+	init_llist_head(&mm->deposit_head_pmd);
 #endif
 	mm_init_uprobes_state(mm);
 
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 9578db83e312..dbb0154165f1 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -164,11 +164,7 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 	assert_spin_locked(pmd_lockptr(mm, pmdp));
 
 	/* FIFO */
-	if (!pmd_huge_pte(mm, pmdp))
-		INIT_LIST_HEAD(&pgtable->lru);
-	else
-		list_add(&pgtable->lru, &pmd_huge_pte(mm, pmdp)->lru);
-	pmd_huge_pte(mm, pmdp) = pgtable;
+	llist_add(&pgtable->deposit_node, &huge_pmd_deposit_head(mm, pmdp));
 }
 #endif
 
@@ -180,12 +176,11 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 
 	assert_spin_locked(pmd_lockptr(mm, pmdp));
 
+	/* only withdraw from a non empty list */
+	VM_BUG_ON(llist_empty(&huge_pmd_deposit_head(mm, pmdp)));
 	/* FIFO */
-	pgtable = pmd_huge_pte(mm, pmdp);
-	pmd_huge_pte(mm, pmdp) = list_first_entry_or_null(&pgtable->lru,
-							  struct page, lru);
-	if (pmd_huge_pte(mm, pmdp))
-		list_del(&pgtable->lru);
+	pgtable = llist_entry(llist_del_first(&huge_pmd_deposit_head(mm, pmdp)),
+			      struct page, deposit_node);
 	return pgtable;
 }
 #endif
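A note on the withdraw path above: llist_del_first() returns NULL on an
empty list, and llist_entry() would turn that NULL into a bogus page
pointer, so the rewrite states the non-empty precondition explicitly with
VM_BUG_ON() rather than checking for it at each call.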
From patchwork Mon Sep 28 17:54:02 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804441
Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 04/30] mm: add new helper functions to allocate one PMD page with 512 PTE pages. Date: Mon, 28 Sep 2020 13:54:02 -0400 Message-Id: <20200928175428.4110504-5-zi.yan@sent.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zi Yan This prepares for PUD THP support, which allocates 512 of such PMD pages when creating a PUD THP. These page table pages will be withdrawn during THP split. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgalloc.h | 60 ++++++++++++++++++++++++++++++++++ arch/x86/mm/pgtable.c | 25 ++++++++++++++ include/linux/huge_mm.h | 3 ++ 3 files changed, 88 insertions(+) diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h index 62ad61d6fefc..b24284522973 100644 --- a/arch/x86/include/asm/pgalloc.h +++ b/arch/x86/include/asm/pgalloc.h @@ -52,6 +52,19 @@ extern pgd_t *pgd_alloc(struct mm_struct *); extern void pgd_free(struct mm_struct *mm, pgd_t *pgd); extern pgtable_t pte_alloc_one(struct mm_struct *); +extern pgtable_t pte_alloc_order(struct mm_struct *mm, unsigned long address, + int order); + +static inline void pte_free_order(struct mm_struct *mm, struct page *pte, + int order) +{ + int i; + + for (i = 0; i < (1< 2 +static inline pmd_t *pmd_alloc_one_page_with_ptes(struct mm_struct *mm, unsigned long addr) +{ + pgtable_t pte_pgtables; + pmd_t *pmd; + spinlock_t *pmd_ptl; + int i; + + pte_pgtables = pte_alloc_order(mm, addr, + HPAGE_PUD_ORDER - HPAGE_PMD_ORDER); + if (!pte_pgtables) + return NULL; + + pmd = pmd_alloc_one(mm, addr); + if (unlikely(!pmd)) { + pte_free_order(mm, pte_pgtables, + HPAGE_PUD_ORDER - HPAGE_PMD_ORDER); + return NULL; + } + pmd_ptl = pmd_lock(mm, pmd); + + for (i = 0; i < (1<<(HPAGE_PUD_ORDER - HPAGE_PMD_ORDER)); i++) + pgtable_trans_huge_deposit(mm, pmd, pte_pgtables + i); + + spin_unlock(pmd_ptl); + + return pmd; +} + +static inline void pmd_free_page_with_ptes(struct mm_struct *mm, pmd_t *pmd) +{ + spinlock_t *pmd_ptl; + int i; + + BUG_ON((unsigned long)pmd & (PAGE_SIZE-1)); + pmd_ptl = pmd_lock(mm, pmd); + + for (i = 0; i < (1<<(HPAGE_PUD_ORDER - HPAGE_PMD_ORDER)); i++) { + pgtable_t pte_pgtable; + + pte_pgtable = pgtable_trans_huge_withdraw(mm, pmd); + pte_free(mm, pte_pgtable); + } + + spin_unlock(pmd_ptl); + pmd_free(mm, pmd); +} + extern void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd); static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd, diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c index dfd82f51ba66..7be73aee6183 100644 --- a/arch/x86/mm/pgtable.c +++ b/arch/x86/mm/pgtable.c @@ -33,6 +33,31 @@ pgtable_t pte_alloc_one(struct mm_struct *mm) return __pte_alloc_one(mm, __userpte_alloc_gfp); } +pgtable_t pte_alloc_order(struct mm_struct *mm, unsigned long address, int order) +{ + struct page *pte; + int i; + + pte = alloc_pages(__userpte_alloc_gfp, order); + if (!pte) + return NULL; + split_page(pte, order); + for (i = 1; i < (1 << order); i++) + set_page_private(pte + i, 0); + + for (i = 0; i < (1<= 0) { + 
+				pgtable_pte_page_dtor(&pte[i]);
+				__free_page(&pte[i]);
+			}
+			return NULL;
+		}
+	}
+	return pte;
+}
+
 static int __init setup_userpte(char *arg)
 {
 	if (!arg)
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 8a8bc46a2432..e9d228d4fc69 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -115,6 +115,9 @@ extern struct kobj_attribute shmem_enabled_attr;
 #define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
 #define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
 
+#define HPAGE_PUD_ORDER (HPAGE_PUD_SHIFT-PAGE_SHIFT)
+#define HPAGE_PUD_NR (1<<HPAGE_PUD_ORDER)
+
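[Not part of the patch] To see pte_alloc_order()'s allocate/construct/roll-back
pattern in isolation, here is a minimal userspace C model; fake_page, ctor()
and dtor() are illustrative stand-ins for struct page,
pgtable_pte_page_ctor() and pgtable_pte_page_dtor():

/* Userspace sketch of the order-N allocation with mid-loop rollback. */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_page { bool constructed; };

static bool ctor(struct fake_page *p) { p->constructed = true; return true; }
static void dtor(struct fake_page *p) { p->constructed = false; }

static struct fake_page *alloc_order(int order)
{
	int i, n = 1 << order;
	struct fake_page *pages = calloc(n, sizeof(*pages));

	if (!pages)
		return NULL;
	for (i = 0; i < n; i++) {
		if (!ctor(&pages[i])) {
			/* undo only the pages already constructed,
			 * like the while (--i >= 0) loop in the patch */
			while (--i >= 0)
				dtor(&pages[i]);
			free(pages);
			return NULL;
		}
	}
	return pages;
}

int main(void)
{
	/* HPAGE_PUD_ORDER - HPAGE_PMD_ORDER is 9 on x86-64: 512 PTE pages */
	struct fake_page *ptes = alloc_order(9);

	printf("allocated %d fake PTE pages\n", 1 << 9);
	free(ptes);
	return 0;
}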
From patchwork Mon Sep 28 17:54:03 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804451
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Cc: Kirill A. Shutemov, Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 05/30] mm: thp: add page table deposit/withdraw
 functions for PUD THP.
Date: Mon, 28 Sep 2020 13:54:03 -0400
Message-Id: <20200928175428.4110504-6-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

For each PUD THP we deposit one PMD page table page, which has 512 PTE page
table pages deposited in its ->deposit_head, onto mm->deposit_head_pud.
They will be withdrawn and used when the PUD THP is split into 512 PMD THPs.
In this way, when any of the 512 PMD THPs is split further, we will use the
existing code path to withdraw PTE pages for use.

Signed-off-by: Zi Yan

---
 include/linux/mm.h       |  2 ++
 include/linux/mm_types.h |  3 +++
 include/linux/pgtable.h  |  3 +++
 kernel/fork.c            |  6 ++++++
 mm/pgtable-generic.c     | 23 +++++++++++++++++++++++
 5 files changed, 37 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 01b62da34794..8f54f06c8eb6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2321,6 +2321,8 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 	return ptl;
 }
 
+#define huge_pud_deposit_head(mm, pud) ((mm)->deposit_head_pud)
+
 extern void __init pagecache_init(void);
 extern void __init free_area_init_memoryless_node(int nid);
 extern void free_initmem(void);
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index be842926577a..5ff4dd6a3e32 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -515,6 +515,9 @@ struct mm_struct {
 		/* pgtable deposit list head, protected by page_table_lock */
 		struct llist_head deposit_head_pmd;
 #endif
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+		struct llist_head deposit_head_pud; /* protected by page_table_lock */
+#endif
 #ifdef CONFIG_NUMA_BALANCING
 		/*
 		 * numa_next_scan is the next time that the PTEs will be marked
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 177eab8e1c31..1f6d46465c54 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -465,10 +465,13 @@ static inline pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
 #ifndef __HAVE_ARCH_PGTABLE_DEPOSIT
 extern void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 				       pgtable_t pgtable);
+extern void pgtable_trans_huge_pud_deposit(struct mm_struct *mm, pud_t *pudp,
+				       pgtable_t pgtable);
 #endif
 
 #ifndef __HAVE_ARCH_PGTABLE_WITHDRAW
 extern pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp);
+extern pgtable_t pgtable_trans_huge_pud_withdraw(struct mm_struct *mm, pud_t *pudp);
 #endif
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
diff --git a/kernel/fork.c b/kernel/fork.c
index 9c8e880538de..86fbeec751ef 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -663,6 +663,9 @@ static void check_mm(struct mm_struct *mm)
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	VM_BUG_ON_MM(!llist_empty(&mm->deposit_head_pmd), mm);
 #endif
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	VM_BUG_ON_MM(!llist_empty(&mm->deposit_head_pud), mm);
+#endif
 }
 
 #define allocate_mm()	(kmem_cache_alloc(mm_cachep, GFP_KERNEL))
@@ -1023,6 +1026,9 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	init_llist_head(&mm->deposit_head_pmd);
+#endif
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	init_llist_head(&mm->deposit_head_pud);
 #endif
 	mm_init_uprobes_state(mm);
 
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index dbb0154165f1..a014cf847067 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -166,6 +166,15 @@ void pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 	/* FIFO */
 	llist_add(&pgtable->deposit_node, &huge_pmd_deposit_head(mm, pmdp));
 }
+
+void pgtable_trans_huge_pud_deposit(struct mm_struct *mm, pud_t *pudp,
+				pgtable_t pgtable)
+{
+	assert_spin_locked(pud_lockptr(mm, pudp));
+
+	/* FIFO */
+	llist_add(&pgtable->deposit_node, &huge_pud_deposit_head(mm, pudp));
+}
 #endif
 
 #ifndef __HAVE_ARCH_PGTABLE_WITHDRAW
@@ -183,6 +192,20 @@ pgtable_t pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 			struct page, deposit_node);
 	return pgtable;
 }
+
+pgtable_t pgtable_trans_huge_pud_withdraw(struct mm_struct *mm, pud_t *pudp)
+{
+	pgtable_t pgtable;
+
+	assert_spin_locked(pud_lockptr(mm, pudp));
+
+	/* only withdraw from a non empty list */
+	VM_BUG_ON(llist_empty(&huge_pud_deposit_head(mm, pudp)));
+	/* FIFO */
+	pgtable = llist_entry(llist_del_first(&huge_pud_deposit_head(mm, pudp)),
+			struct page, deposit_node);
+	return pgtable;
+}
 #endif
 
 #ifndef __HAVE_ARCH_PMDP_INVALIDATE
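[Not part of the patch] The deposit/withdraw pairing above can be modeled in
plain C. The hand-rolled list below stands in for struct llist_head, and,
like llist_add()/llist_del_first(), both operations work at the head of the
list, so the most recently deposited table is withdrawn first:

/* Userspace sketch of the PUD-level page table deposit/withdraw pair. */
#include <assert.h>
#include <stdio.h>

struct pgtable { struct pgtable *next; int id; };
struct fake_mm { struct pgtable *deposit_head_pud; };

static void pud_deposit(struct fake_mm *mm, struct pgtable *pg)
{
	pg->next = mm->deposit_head_pud;	/* push at the head */
	mm->deposit_head_pud = pg;
}

static struct pgtable *pud_withdraw(struct fake_mm *mm)
{
	struct pgtable *pg = mm->deposit_head_pud;

	assert(pg);	/* mirrors VM_BUG_ON(llist_empty(...)) */
	mm->deposit_head_pud = pg->next;	/* pop from the head */
	return pg;
}

int main(void)
{
	struct fake_mm mm = { 0 };
	struct pgtable pmd_table = { .next = 0, .id = 1 };

	pud_deposit(&mm, &pmd_table);		/* at PUD THP creation */
	printf("withdrew table %d\n", pud_withdraw(&mm)->id); /* at split */
	return 0;
}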
From patchwork Mon Sep 28 17:54:04 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804457
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Cc: Kirill A. Shutemov, Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 06/30] mm: change thp_order and thp_nr as we will
 have not just PMD THPs.
Date: Mon, 28 Sep 2020 13:54:04 -0400
Message-Id: <20200928175428.4110504-7-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

As PUD THP is going to be added in the following patches, thp_order and
thp_nr_pages can also return HPAGE_PUD_ORDER and HPAGE_PUD_NR, respectively.
Read them from the compound page instead of hard-coding the PMD values.

Signed-off-by: Zi Yan

---
 include/linux/huge_mm.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e9d228d4fc69..addd206150e2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -279,7 +279,7 @@ static inline unsigned int thp_order(struct page *page)
 {
 	VM_BUG_ON_PGFLAGS(PageTail(page), page);
 	if (PageHead(page))
-		return HPAGE_PMD_ORDER;
+		return page[1].compound_order;
 	return 0;
 }
@@ -291,7 +291,7 @@ static inline int thp_nr_pages(struct page *page)
 {
 	VM_BUG_ON_PGFLAGS(PageTail(page), page);
 	if (PageHead(page))
-		return HPAGE_PMD_NR;
+		return (1<<thp_order(page));
 	return 0;
 }
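[Not part of the patch] A quick userspace check of the order/page-count
relation that thp_order() and thp_nr_pages() now encode, using the x86-64
values (order 9 for PMD THPs, order 18 for PUD THPs, 4KB base pages):

#include <stdio.h>

int main(void)
{
	int pmd_order = 9, pud_order = 18;	/* x86-64 with 4KB pages */

	/* thp_nr_pages() is 1 << thp_order(); sizes follow from that */
	printf("PMD THP: %d pages (%ld MB)\n",
	       1 << pmd_order, (1L << pmd_order) * 4096 / (1024 * 1024));
	printf("PUD THP: %d pages (%ld MB)\n",
	       1 << pud_order, (1L << pud_order) * 4096 / (1024 * 1024));
	return 0;	/* prints 512 pages / 2 MB and 262144 pages / 1024 MB */
}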
Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 07/30] mm: thp: add anonymous PUD THP page fault support without enabling it. Date: Mon, 28 Sep 2020 13:54:05 -0400 Message-Id: <20200928175428.4110504-8-zi.yan@sent.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zi Yan This adds PUD THP support for anonymous pages. Applications will be able to get PUD pages during page faults when their VMAs are larger than PUD page size after the page fault path is enabled. No shared zero PUD THP is created and shared by all read-only zero PUD THPs, different zero read-only PMD THPs. We do not want to reserve so much physical memory for this use, assuming the case will be rare. New PUD THP related events are added too. Signed-off-by: Zi Yan --- arch/x86/include/asm/pgtable.h | 2 + drivers/base/node.c | 3 + fs/proc/meminfo.c | 2 + include/linux/huge_mm.h | 6 ++ include/linux/mmzone.h | 1 + include/linux/vm_event_item.h | 3 + mm/huge_memory.c | 105 +++++++++++++++++++++++++++++++++ mm/page_alloc.c | 3 +- mm/rmap.c | 24 ++++++-- mm/vmstat.c | 4 ++ 10 files changed, 147 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h index a02c67291cfc..199de6be2f6d 100644 --- a/arch/x86/include/asm/pgtable.h +++ b/arch/x86/include/asm/pgtable.h @@ -1141,6 +1141,8 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm, unsigned long return native_pmdp_get_and_clear(pmdp); } +#define mk_pud(page, pgprot) pfn_pud(page_to_pfn(page), (pgprot)) + #define __HAVE_ARCH_PUDP_HUGE_GET_AND_CLEAR static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm, unsigned long addr, pud_t *pudp) diff --git a/drivers/base/node.c b/drivers/base/node.c index 9426b0f1f660..fe809c914be0 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -428,6 +428,7 @@ static ssize_t node_read_meminfo(struct device *dev, "Node %d SUnreclaim: %8lu kB\n" #ifdef CONFIG_TRANSPARENT_HUGEPAGE "Node %d AnonHugePages: %8lu kB\n" + "Node %d AnonHugePUDPages: %8lu kB\n" "Node %d ShmemHugePages: %8lu kB\n" "Node %d ShmemPmdMapped: %8lu kB\n" "Node %d FileHugePages: %8lu kB\n" @@ -457,6 +458,8 @@ static ssize_t node_read_meminfo(struct device *dev, , nid, K(node_page_state(pgdat, NR_ANON_THPS) * HPAGE_PMD_NR), + nid, K(node_page_state(pgdat, NR_ANON_THPS_PUD) * + HPAGE_PUD_NR), nid, K(node_page_state(pgdat, NR_SHMEM_THPS) * HPAGE_PMD_NR), nid, K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED) * diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c index 887a5532e449..b60e0c241015 100644 --- a/fs/proc/meminfo.c +++ b/fs/proc/meminfo.c @@ -130,6 +130,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v) #ifdef CONFIG_TRANSPARENT_HUGEPAGE show_val_kb(m, "AnonHugePages: ", global_node_page_state(NR_ANON_THPS) * HPAGE_PMD_NR); + show_val_kb(m, "AnonHugePUDPages: ", + global_node_page_state(NR_ANON_THPS_PUD) * HPAGE_PUD_NR); show_val_kb(m, "ShmemHugePages: ", global_node_page_state(NR_SHMEM_THPS) * HPAGE_PMD_NR); show_val_kb(m, "ShmemPmdMapped: ", diff --git 
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index addd206150e2..7528652400e4 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -18,10 +18,15 @@ extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
 extern void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud);
+extern int do_huge_pud_anonymous_page(struct vm_fault *vmf);
 #else
 static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud)
 {
 }
+static inline int do_huge_pud_anonymous_page(struct vm_fault *vmf)
+{
+	return VM_FAULT_FALLBACK;
+}
 #endif
 
 extern vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd);
@@ -323,6 +328,7 @@ struct page *mm_get_huge_zero_page(struct mm_struct *mm);
 void mm_put_huge_zero_page(struct mm_struct *mm);
 
 #define mk_huge_pmd(page, prot) pmd_mkhuge(mk_pmd(page, prot))
+#define mk_huge_pud(page, prot) pud_mkhuge(mk_pud(page, prot))
 
 static inline bool thp_migration_supported(void)
 {
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7e0ea3fe95ca..cbc768d364fd 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -196,6 +196,7 @@ enum node_stat_item {
 	NR_FILE_THPS,
 	NR_FILE_PMDMAPPED,
 	NR_ANON_THPS,
+	NR_ANON_THPS_PUD,
 	NR_VMSCAN_WRITE,
 	NR_VMSCAN_IMMEDIATE,	/* Prioritise for reclaim when writeback ends */
 	NR_DIRTIED,		/* page dirtyings since bootup */
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 18e75974d4e3..416d9966fa3f 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -93,6 +93,9 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_DEFERRED_SPLIT_PAGE,
 		THP_SPLIT_PMD,
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+		THP_FAULT_ALLOC_PUD,
+		THP_FAULT_FALLBACK_PUD,
+		THP_FAULT_FALLBACK_PUD_CHARGE,
 		THP_SPLIT_PUD,
 #endif
 		THP_ZERO_PAGE_ALLOC,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b1c7dc8a6f96..20a3d393d451 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -933,6 +933,111 @@ vm_fault_t vmf_insert_pfn_pud_prot(struct vm_fault *vmf, pfn_t pfn,
 	return VM_FAULT_NOPAGE;
 }
 EXPORT_SYMBOL_GPL(vmf_insert_pfn_pud_prot);
+
+static int __do_huge_pud_anonymous_page(struct vm_fault *vmf, struct page *page,
+		gfp_t gfp)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	pmd_t *pmd_pgtable;
+	unsigned long haddr = vmf->address & HPAGE_PUD_MASK;
+	int ret = 0;
+
+	VM_BUG_ON_PAGE(!PageCompound(page), page);
+
+	if (mem_cgroup_charge(page, vma->vm_mm, gfp)) {
+		put_page(page);
+		count_vm_event(THP_FAULT_FALLBACK_PUD);
+		count_vm_event(THP_FAULT_FALLBACK_PUD_CHARGE);
+		return VM_FAULT_FALLBACK;
+	}
+	cgroup_throttle_swaprate(page, gfp);
+
+	pmd_pgtable = pmd_alloc_one_page_with_ptes(vma->vm_mm, haddr);
+	if (unlikely(!pmd_pgtable)) {
+		ret = VM_FAULT_OOM;
+		goto release;
+	}
+
+	clear_huge_page(page, vmf->address, HPAGE_PUD_NR);
+	/*
+	 * The memory barrier inside __SetPageUptodate makes sure that
+	 * clear_huge_page writes become visible before the set_pud_at()
+	 * write.
+	 */
+	__SetPageUptodate(page);
+
+	vmf->ptl = pud_lock(vma->vm_mm, vmf->pud);
+	if (unlikely(!pud_none(*vmf->pud))) {
+		goto unlock_release;
+	} else {
+		pud_t entry;
+		int i;
+
+		ret = check_stable_address_space(vma->vm_mm);
+		if (ret)
+			goto unlock_release;
+
+		/* Deliver the page fault to userland */
+		if (userfaultfd_missing(vma)) {
+			vm_fault_t ret2;
+
+			spin_unlock(vmf->ptl);
+			put_page(page);
+			pmd_free_page_with_ptes(vma->vm_mm, pmd_pgtable);
+			ret2 = handle_userfault(vmf, VM_UFFD_MISSING);
+			VM_BUG_ON(ret2 & VM_FAULT_FALLBACK);
+			return ret2;
+		}
+
+		entry = mk_huge_pud(page, vma->vm_page_prot);
+		entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
+		page_add_new_anon_rmap(page, vma, haddr, true);
+		lru_cache_add_inactive_or_unevictable(page, vma);
+		pgtable_trans_huge_pud_deposit(vma->vm_mm, vmf->pud,
+				virt_to_page(pmd_pgtable));
+		set_pud_at(vma->vm_mm, haddr, vmf->pud, entry);
+		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PUD_NR);
+		mm_inc_nr_pmds(vma->vm_mm);
+		for (i = 0; i < (1<<(HPAGE_PUD_ORDER - HPAGE_PMD_ORDER)); i++)
+			mm_inc_nr_ptes(vma->vm_mm);
+		spin_unlock(vmf->ptl);
+		count_vm_event(THP_FAULT_ALLOC_PUD);
+	}
+
+	return 0;
+unlock_release:
+	spin_unlock(vmf->ptl);
+release:
+	if (pmd_pgtable)
+		pmd_free_page_with_ptes(vma->vm_mm, pmd_pgtable);
+	put_page(page);
+	return ret;
+
+}
+
+int do_huge_pud_anonymous_page(struct vm_fault *vmf)
+{
+	struct vm_area_struct *vma = vmf->vma;
+	gfp_t gfp;
+	struct page *page;
+	unsigned long haddr = vmf->address & HPAGE_PUD_MASK;
+
+	if (haddr < vma->vm_start || haddr + HPAGE_PUD_SIZE > vma->vm_end)
+		return VM_FAULT_FALLBACK;
+	if (unlikely(anon_vma_prepare(vma)))
+		return VM_FAULT_OOM;
+	/* no khugepaged_enter, since PUD THP is not supported by khugepaged */
+
+	gfp = alloc_hugepage_direct_gfpmask(vma);
+	page = alloc_hugepage_vma(gfp, vma, haddr, HPAGE_PUD_ORDER);
+	if (unlikely(!page)) {
+		count_vm_event(THP_FAULT_FALLBACK_PUD);
+		return VM_FAULT_FALLBACK;
+	}
+	prep_transhuge_page(page);
+	return __do_huge_pud_anonymous_page(vmf, page, gfp);
+}
+
 #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
 
 static void touch_pmd(struct vm_area_struct *vma, unsigned long addr,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6b1b4a331792..29abeff09fcc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5434,7 +5434,8 @@ void show_free_areas(unsigned int filter, nodemask_t *nodemask)
 			K(node_page_state(pgdat, NR_SHMEM_THPS) * HPAGE_PMD_NR),
 			K(node_page_state(pgdat, NR_SHMEM_PMDMAPPED)
 					* HPAGE_PMD_NR),
-			K(node_page_state(pgdat, NR_ANON_THPS) * HPAGE_PMD_NR),
+			K(node_page_state(pgdat, NR_ANON_THPS) * HPAGE_PMD_NR +
+			  node_page_state(pgdat, NR_ANON_THPS_PUD) * HPAGE_PUD_NR),
 #endif
 			K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
 			node_page_state(pgdat, NR_KERNEL_STACK_KB),
diff --git a/mm/rmap.c b/mm/rmap.c
index 1b84945d655c..5683f367a792 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -726,6 +726,7 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 	pgd_t *pgd;
 	p4d_t *p4d;
 	pud_t *pud;
+	pud_t pude;
 	pmd_t *pmd = NULL;
 	pmd_t pmde;
 
@@ -738,7 +739,10 @@ pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address)
 		goto out;
 
 	pud = pud_offset(p4d, address);
-	if (!pud_present(*pud))
+
+	pude = *pud;
+	barrier();
+	if (!pud_present(pude) || pud_trans_huge(pude))
 		goto out;
 
 	pmd = pmd_offset(pud, address);
@@ -1137,8 +1141,12 @@ void do_page_add_anon_rmap(struct page *page,
 		 * pte lock(a spinlock) is held, which implies preemption
 		 * disabled.
 		 */
-		if (compound)
-			__inc_lruvec_page_state(page, NR_ANON_THPS);
+		if (compound) {
+			if (nr == HPAGE_PMD_NR)
+				__inc_lruvec_page_state(page, NR_ANON_THPS);
+			else
+				__inc_lruvec_page_state(page, NR_ANON_THPS_PUD);
+		}
 		__mod_lruvec_page_state(page, NR_ANON_MAPPED, nr);
 	}
 
@@ -1180,7 +1188,10 @@ void page_add_new_anon_rmap(struct page *page,
 		if (hpage_pincount_available(page))
 			atomic_set(compound_pincount_ptr(page), 0);
 
-		__inc_lruvec_page_state(page, NR_ANON_THPS);
+		if (nr == HPAGE_PMD_NR)
+			__inc_lruvec_page_state(page, NR_ANON_THPS);
+		else
+			__inc_lruvec_page_state(page, NR_ANON_THPS_PUD);
 	} else {
 		/* Anon THP always mapped first with PMD */
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
@@ -1286,7 +1297,10 @@ static void page_remove_anon_compound_rmap(struct page *page)
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		return;
 
-	__dec_lruvec_page_state(page, NR_ANON_THPS);
+	if (thp_nr_pages(page) == HPAGE_PMD_NR)
+		__dec_lruvec_page_state(page, NR_ANON_THPS);
+	else
+		__dec_lruvec_page_state(page, NR_ANON_THPS_PUD);
 
 	if (TestClearPageDoubleMap(page)) {
 		/*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 79e5cd0abd0e..a9e50ef6a40d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1209,6 +1209,7 @@ const char * const vmstat_text[] = {
 	"nr_file_hugepages",
 	"nr_file_pmdmapped",
 	"nr_anon_transparent_hugepages",
+	"nr_anon_transparent_pud_hugepages",
 	"nr_vmscan_write",
 	"nr_vmscan_immediate_reclaim",
 	"nr_dirtied",
@@ -1326,6 +1327,9 @@ const char * const vmstat_text[] = {
 	"thp_deferred_split_page",
 	"thp_split_pmd",
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	"thp_fault_alloc_pud",
+	"thp_fault_fallback_pud",
+	"thp_fault_fallback_pud_charge",
 	"thp_split_pud",
 #endif
 	"thp_zero_page_alloc",
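[Not part of the patch] Assuming a kernel built with this series, the new
counters can be watched from userspace with something like the following;
the strings matched are exactly the names the patch adds to vmstat_text and
node_stat_item:

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* "thp_fault_fallback_pud" also matches the _charge event */
		if (strstr(line, "thp_fault_alloc_pud") ||
		    strstr(line, "thp_fault_fallback_pud") ||
		    strstr(line, "nr_anon_transparent_pud_hugepages"))
			fputs(line, stdout);
	}
	fclose(f);
	return 0;
}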
From patchwork Mon Sep 28 17:54:06 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804455
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Cc: Kirill A. Shutemov, Roman Gushchin, Rik van Riel, Matthew Wilcox,
 Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
 David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
 David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 08/30] mm: thp: add PUD THP support for copy_huge_pud.
Date: Mon, 28 Sep 2020 13:54:06 -0400
Message-Id: <20200928175428.4110504-9-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

copy_huge_pud needs to allocate 1 PMD page table page and 512 PTE page
table pages and deposit them when copying a PUD THP. It is similar to what
we do at PUD THP page faults.

Signed-off-by: Zi Yan

---
 mm/huge_memory.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 20a3d393d451..ea9fbedcda26 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1264,7 +1264,12 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 {
 	spinlock_t *dst_ptl, *src_ptl;
 	pud_t pud;
-	int ret;
+	pmd_t *pmd_pgtable = NULL;
+	int ret = -ENOMEM;
+
+	pmd_pgtable = pmd_alloc_one_page_with_ptes(vma->vm_mm, addr);
+	if (unlikely(!pmd_pgtable))
+		goto out;
 
 	dst_ptl = pud_lock(dst_mm, dst_pud);
 	src_ptl = pud_lockptr(src_mm, src_pud);
@@ -1272,16 +1277,30 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 
 	ret = -EAGAIN;
 	pud = *src_pud;
-	if (unlikely(!pud_trans_huge(pud) && !pud_devmap(pud)))
-		goto out_unlock;
 
 	/*
-	 * When page table lock is held, the huge zero pud should not be
-	 * under splitting since we don't split the page itself, only pud to
-	 * a page table.
+	 * only transparent huge pud page needs extra page table pages for
+	 * possible huge page split
 	 */
-	if (is_huge_zero_pud(pud)) {
-		/* No huge zero pud yet */
+	if (!pud_trans_huge(pud))
+		pmd_free_page_with_ptes(dst_mm, pmd_pgtable);
+
+	if (unlikely(!pud_trans_huge(pud) && !pud_devmap(pud)))
+		goto out_unlock;
+
+	if (pud_trans_huge(pud)) {
+		struct page *src_page;
+		int i;
+
+		src_page = pud_page(pud);
+		VM_BUG_ON_PAGE(!PageHead(src_page), src_page);
+		get_page(src_page);
+		page_dup_rmap(src_page, true);
+		add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PUD_NR);
+		mm_inc_nr_pmds(dst_mm);
+		for (i = 0; i < (1<<(HPAGE_PUD_ORDER - HPAGE_PMD_ORDER)); i++)
+			mm_inc_nr_ptes(dst_mm);
+		pgtable_trans_huge_pud_deposit(dst_mm, dst_pud, virt_to_page(pmd_pgtable));
 	}
 
 	pudp_set_wrprotect(src_mm, addr, src_pud);
@@ -1292,6 +1311,7 @@ int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 out_unlock:
 	spin_unlock(src_ptl);
 	spin_unlock(dst_ptl);
+out:
 	return ret;
 }
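[Not part of the patch] copy_huge_pud() above follows a common kernel
pattern: preallocate the page-table bundle before taking the page-table
locks, then free it on the paths that turn out not to need it. A minimal
userspace sketch of that pattern (struct bundle and copy_one() are
illustrative stand-ins only):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

struct bundle { int dummy; };	/* stands in for the PMD+PTE table pages */

static int copy_one(int src_is_huge)
{
	struct bundle *b = malloc(sizeof(*b));	/* before any lock is taken */

	if (!b)
		return -ENOMEM;
	/* ...dst_ptl/src_ptl would be taken here in the real code... */
	if (!src_is_huge) {
		free(b);	/* not a huge PUD: the bundle is unused */
		return -EAGAIN;	/* mirrors ret = -EAGAIN in the patch */
	}
	printf("kept preallocated bundle via deposit\n");
	return 0;
}

int main(void)
{
	copy_one(1);
	copy_one(0);
	return 0;
}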
Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 09/30] mm: thp: add PUD THP support to zap_huge_pud. Date: Mon, 28 Sep 2020 13:54:07 -0400 Message-Id: <20200928175428.4110504-10-zi.yan@sent.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zi Yan Preallocated 513 (1 PMD and 512 PTE) page table pages need to be freed when PUD THP is removed. zap_pud_deposited_table is added to perform the action. Signed-off-by: Zi Yan --- mm/huge_memory.c | 48 +++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 45 insertions(+), 3 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index ea9fbedcda26..76069affebef 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2013,11 +2013,27 @@ spinlock_t *__pud_trans_huge_lock(pud_t *pud, struct vm_area_struct *vma) } #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD +static inline void zap_pud_deposited_table(struct mm_struct *mm, pud_t *pud) +{ + pgtable_t pgtable; + int i; + + pgtable = pgtable_trans_huge_pud_withdraw(mm, pud); + pmd_free_page_with_ptes(mm, (pmd_t *)page_address(pgtable)); + + mm_dec_nr_pmds(mm); + for (i = 0; i < (1<<(HPAGE_PUD_ORDER - HPAGE_PMD_ORDER)); i++) + mm_dec_nr_ptes(mm); +} + int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, pud_t *pud, unsigned long addr) { + pud_t orig_pud; spinlock_t *ptl; + tlb_change_page_size(tlb, HPAGE_PUD_SIZE); + ptl = __pud_trans_huge_lock(pud, vma); if (!ptl) return 0; @@ -2027,14 +2043,40 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, * pgtable_trans_huge_withdraw after finishing pudp related * operations. 
 	 */
-	pudp_huge_get_and_clear_full(tlb->mm, addr, pud, tlb->fullmm);
+	orig_pud = pudp_huge_get_and_clear_full(tlb->mm, addr, pud,
+			tlb->fullmm);
 	tlb_remove_pud_tlb_entry(tlb, pud, addr);
 	if (vma_is_special_huge(vma)) {
 		spin_unlock(ptl);
 		/* No zero page support yet */
+	} else if (is_huge_zero_pud(orig_pud)) {
+		zap_pud_deposited_table(tlb->mm, pud);
+		spin_unlock(ptl);
+		tlb_remove_page_size(tlb, pud_page(orig_pud), HPAGE_PUD_SIZE);
 	} else {
-		/* No support for anonymous PUD pages yet */
-		BUG();
+		struct page *page = NULL;
+		int flush_needed = 1;
+
+		if (pud_present(orig_pud)) {
+			page = pud_page(orig_pud);
+			page_remove_rmap(page, true);
+			VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
+			VM_BUG_ON_PAGE(!PageHead(page), page);
+		} else
+			WARN_ONCE(1, "Non present huge pud without pud migration enabled!");
+
+		if (PageAnon(page)) {
+			zap_pud_deposited_table(tlb->mm, pud);
+			add_mm_counter(tlb->mm, MM_ANONPAGES, -HPAGE_PUD_NR);
+		} else {
+			if (arch_needs_pgtable_deposit())
+				zap_pud_deposited_table(tlb->mm, pud);
+			add_mm_counter(tlb->mm, MM_FILEPAGES, -HPAGE_PUD_NR);
+		}
+
+		spin_unlock(ptl);
+		if (flush_needed)
+			tlb_remove_page_size(tlb, page, HPAGE_PUD_SIZE);
 	}
 	return 1;
 }
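[Not part of the patch] The 513-page figure in the commit message is easy to
verify with x86-64 sizes; per 1GB PUD THP, zap_pud_deposited_table() gives
back one PMD table plus one PTE table per PMD entry:

#include <stdio.h>

int main(void)
{
	long page_kb = 4, entries = 512;	/* 4KB pages, 512 entries/table */
	long pte_pages = entries;		/* one PTE table per PMD entry */
	long total = 1 + pte_pages;		/* plus the PMD table itself */

	printf("%ld page-table pages (%ld KB) per 1GB PUD THP\n",
	       total, total * page_kb);	/* prints 513 pages, 2052 KB */
	return 0;
}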
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH v2 10/30] fs: proc: add PUD THP kpageflag.
Date: Mon, 28 Sep 2020 13:54:08 -0400
Message-Id: <20200928175428.4110504-11-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

Bit 27 is used to identify PUD THPs.

Signed-off-by: Zi Yan <zi.yan@sent.com>
---
 fs/proc/page.c                         | 2 ++
 include/uapi/linux/kernel-page-flags.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/fs/proc/page.c b/fs/proc/page.c
index f3b39a7d2bf3..e4e2ad3612c9 100644
--- a/fs/proc/page.c
+++ b/fs/proc/page.c
@@ -161,6 +161,8 @@ u64 stable_page_flags(struct page *page)
 				u |= BIT_ULL(KPF_ZERO_PAGE);
 			u |= BIT_ULL(KPF_THP);
 		}
+		if (compound_order(head) == HPAGE_PUD_ORDER)
+			u |= BIT_ULL(KPF_PUD_THP);
 	} else if (is_zero_pfn(page_to_pfn(page)))
 		u |= BIT_ULL(KPF_ZERO_PAGE);

diff --git a/include/uapi/linux/kernel-page-flags.h b/include/uapi/linux/kernel-page-flags.h
index 6f2f2720f3ac..62c5fc70909b 100644
--- a/include/uapi/linux/kernel-page-flags.h
+++ b/include/uapi/linux/kernel-page-flags.h
@@ -36,5 +36,6 @@
 #define KPF_ZERO_PAGE		24
 #define KPF_IDLE		25
 #define KPF_PGTABLE		26
+#define KPF_PUD_THP		27

 #endif /* _UAPILINUX_KERNEL_PAGE_FLAGS_H */
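The new bit can be observed from userspace via /proc/kpageflags (one u64 of KPF_* bits per page frame), with the PFN taken from /proc/self/pagemap (one u64 per virtual page; bits 0-54 hold the PFN, bit 63 means present). A hedged sketch — error handling trimmed, and it assumes a 4 KiB page size and root privileges, since pagemap hides PFNs from unprivileged readers:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define KPF_PUD_THP 27			/* bit added by this patch */

int main(int argc, char **argv)
{
	if (argc < 2)
		return 1;
	uint64_t vaddr = strtoull(argv[1], NULL, 0);
	uint64_t pme, flags;

	/* pagemap: one u64 per virtual page of this process. */
	int pm = open("/proc/self/pagemap", O_RDONLY);
	pread(pm, &pme, 8, (vaddr / 4096) * 8);
	if (!(pme & (1ULL << 63))) {	/* bit 63: page present */
		puts("page not present");
		return 1;
	}
	uint64_t pfn = pme & ((1ULL << 55) - 1);

	/* kpageflags: one u64 of KPF_* bits per page frame. */
	int kf = open("/proc/kpageflags", O_RDONLY);
	pread(kf, &flags, 8, pfn * 8);
	printf("KPF_PUD_THP: %d\n", !!(flags & (1ULL << KPF_PUD_THP)));
	return 0;
}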
From patchwork Mon Sep 28 17:54:09 2020
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH v2 11/30] mm: thp: handling PUD THP reference bit.
Date: Mon, 28 Sep 2020 13:54:09 -0400
Message-Id: <20200928175428.4110504-12-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

Add PUD-level TLB flush ops and teach page_vma_mapped_walk about PUD THPs.

Signed-off-by: Zi Yan <zi.yan@sent.com>
---
 arch/x86/include/asm/pgtable.h |  3 +++
 arch/x86/mm/pgtable.c          | 13 +++++++++++++
 include/linux/mmu_notifier.h   | 13 +++++++++++++
 include/linux/pgtable.h        | 14 ++++++++++++++
 include/linux/rmap.h           |  1 +
 mm/page_vma_mapped.c           | 33 +++++++++++++++++++++++++++++----
 mm/rmap.c                      | 12 +++++++++---
 7 files changed, 82 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 199de6be2f6d..8bf7bfd71a46 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1127,6 +1127,9 @@ extern int pudp_test_and_clear_young(struct vm_area_struct *vma,
 extern int pmdp_clear_flush_young(struct vm_area_struct *vma,
 				  unsigned long address, pmd_t *pmdp);
+#define __HAVE_ARCH_PUDP_CLEAR_YOUNG_FLUSH
+extern int pudp_clear_flush_young(struct vm_area_struct *vma,
+				  unsigned long address, pud_t *pudp);

 #define pmd_write pmd_write
 static inline int pmd_write(pmd_t pmd)

diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 7be73aee6183..e4a2dffcc418 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -633,6 +633,19 @@ int pmdp_clear_flush_young(struct vm_area_struct *vma,
 	return young;
 }
+
+int pudp_clear_flush_young(struct vm_area_struct *vma,
+			   unsigned long address, pud_t *pudp)
+{
+	int young;
+
+	VM_BUG_ON(address & ~HPAGE_PUD_MASK);
+
+	young = pudp_test_and_clear_young(vma, address, pudp);
+	if (young)
+		flush_tlb_range(vma, address, address + HPAGE_PUD_SIZE);
+
+	return young;
+}
 #endif

 /**

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b8200782dede..4ffa179e654f 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -557,6 +557,19 @@ static inline void mmu_notifier_range_init_migrate(
 	__young;							\
 })

+#define pudp_clear_flush_young_notify(__vma, __address, __pudp)	\
+({									\
+	int __young;							\
+	struct vm_area_struct *___vma = __vma;				\
+	unsigned long ___address = __address;				\
+	__young = pudp_clear_flush_young(___vma, ___address, __pudp);	\
+	__young |= mmu_notifier_clear_flush_young(___vma->vm_mm,	\
+						  ___address,		\
+						  ___address +		\
+							PUD_SIZE);	\
+	__young;							\
+})
+
 #define ptep_clear_young_notify(__vma, __address, __ptep)		\
 ({									\
 	int __young;							\

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 1f6d46465c54..bb163504fb01 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -243,6 +243,20 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma,
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif

+#ifndef __HAVE_ARCH_PUDP_CLEAR_YOUNG_FLUSH
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+extern int pudp_clear_flush_young(struct vm_area_struct *vma,
+				  unsigned long address, pud_t *pudp);
+#else
+int pudp_clear_flush_young(struct vm_area_struct *vma,
+			   unsigned long address, pud_t *pudp)
+{
+	BUILD_BUG();
+	return 0;
+}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 3a6adfa70fb0..0af61dd193d2 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -206,6 +206,7 @@ struct page_vma_mapped_walk {
 	struct page *page;
 	struct vm_area_struct *vma;
 	unsigned long address;
+	pud_t *pud;
 	pmd_t *pmd;
 	pte_t *pte;
 	spinlock_t *ptl;

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 5e77b269c330..f88e845ad5e6 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -145,9 +145,12 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	struct page *page = pvmw->page;
 	pgd_t *pgd;
 	p4d_t *p4d;
-	pud_t *pud;
+	pud_t pude;
 	pmd_t pmde;

+	if (!pvmw->pte && !pvmw->pmd && pvmw->pud)
+		return not_found(pvmw);
+
 	/* The only possible pmd mapping has been handled on last iteration */
 	if (pvmw->pmd && !pvmw->pte)
 		return not_found(pvmw);
@@ -174,10 +177,32 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	p4d = p4d_offset(pgd, pvmw->address);
 	if (!p4d_present(*p4d))
 		return false;
-	pud = pud_offset(p4d, pvmw->address);
-	if (!pud_present(*pud))
+	pvmw->pud = pud_offset(p4d, pvmw->address);
+
+	/*
+	 * Make sure the pud value isn't cached in a register by the
+	 * compiler and used as a stale value after we've observed a
+	 * subsequent update.
+	 */
+	pude = READ_ONCE(*pvmw->pud);
+	if (pud_trans_huge(pude)) {
+		pvmw->ptl = pud_lock(mm, pvmw->pud);
+		if (likely(pud_trans_huge(*pvmw->pud))) {
+			if (pvmw->flags & PVMW_MIGRATION)
+				return not_found(pvmw);
+			if (pud_page(*pvmw->pud) != page)
+				return not_found(pvmw);
+			return true;
+		} else if (!pud_present(*pvmw->pud))
+			return not_found(pvmw);
+
+		/* THP pud was split under us: handle on pmd level */
+		spin_unlock(pvmw->ptl);
+		pvmw->ptl = NULL;
+	} else if (!pud_present(pude))
 		return false;
-	pvmw->pmd = pmd_offset(pud, pvmw->address);
+
+	pvmw->pmd = pmd_offset(pvmw->pud, pvmw->address);
 	/*
 	 * Make sure the pmd value isn't cached in a register by the
 	 * compiler and used as a stale value after we've observed a

diff --git a/mm/rmap.c b/mm/rmap.c
index 5683f367a792..629f8fe7ffac 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -803,9 +803,15 @@ static bool page_referenced_one(struct page *page, struct vm_area_struct *vma,
 				referenced++;
 		}
 	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
-		if (pmdp_clear_flush_young_notify(vma, address,
-					pvmw.pmd))
-			referenced++;
+		if (pvmw.pmd) {
+			if (pmdp_clear_flush_young_notify(vma, address,
+						pvmw.pmd))
+				referenced++;
+		} else if (pvmw.pud) {
+			if (pudp_clear_flush_young_notify(vma, address,
+						pvmw.pud))
+				referenced++;
+		}
 	} else {
 		/* unexpected pmd-mapped page? */
 		WARN_ON_ONCE(1);
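The READ_ONCE()-then-relock sequence in page_vma_mapped_walk() above is the usual optimistic-check idiom: sample the entry without the lock, and before acting on it, take the lock and re-check, because a concurrent split may have changed the PUD in between. A userspace analogue using C11 atomics and a pthread mutex (illustrative only — the names and sentinel values here are invented, not kernel API; build with -pthread):

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned long entry = 1;		/* 1: "huge mapping", 2: "split to table" */
static pthread_mutex_t ptl = PTHREAD_MUTEX_INITIALIZER;

static void *splitter(void *arg)
{
	pthread_mutex_lock(&ptl);
	atomic_store(&entry, 2);		/* the split happens under the lock */
	pthread_mutex_unlock(&ptl);
	return NULL;
}

int main(void)
{
	pthread_t t;
	pthread_create(&t, NULL, splitter, NULL);

	unsigned long snap = atomic_load(&entry);	/* READ_ONCE() analogue */
	if (snap == 1) {
		pthread_mutex_lock(&ptl);
		if (atomic_load(&entry) == 1)		/* re-check under the lock */
			puts("still huge: handle at PUD level");
		else
			puts("split under us: fall through to PMD level");
		pthread_mutex_unlock(&ptl);
	} else {
		puts("already a table: walk at PMD level");
	}
	pthread_join(t, NULL);
	return 0;
}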
From patchwork Mon Sep 28 17:54:10 2020
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH v2 12/30] mm: rmap: add mapped/unmapped page order to anonymous page rmap functions.
Date: Mon, 28 Sep 2020 13:54:10 -0400
Message-Id: <20200928175428.4110504-13-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

page_add_anon_rmap, do_page_add_anon_rmap, page_add_new_anon_rmap, and
page_remove_rmap now take the mapped page order as a parameter. This
prepares for PMD-mapped PUD THP: a PUD THP can be mapped in three
different ways (PTEs, PMDs, and PUDs), so the original boolean compound
parameter can no longer record which level is being mapped or unmapped.
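Concretely, on x86_64 with 4 KiB base pages the three mapping levels correspond to orders 0 (PTE), 9 (PMD), and 18 (PUD), which is why an int order carries strictly more information than the old bool. A throwaway sketch of the values involved — shift constants assumed for x86_64, not taken from kernel headers:

#include <stdio.h>

#define PAGE_SHIFT 12
#define PMD_SHIFT  21	/* 2 MiB */
#define PUD_SHIFT  30	/* 1 GiB */

int main(void)
{
	printf("PTE map order: 0\n");
	printf("PMD map order: %d\n", PMD_SHIFT - PAGE_SHIFT);	/* 9, HPAGE_PMD_ORDER  */
	printf("PUD map order: %d\n", PUD_SHIFT - PAGE_SHIFT);	/* 18, HPAGE_PUD_ORDER */
	return 0;
}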
Signed-off-by: Zi Yan <zi.yan@sent.com>
---
 include/linux/rmap.h    |  8 ++++----
 kernel/events/uprobes.c |  4 ++--
 mm/huge_memory.c        | 16 ++++++++--------
 mm/hugetlb.c            |  4 ++--
 mm/khugepaged.c         |  6 +++---
 mm/ksm.c                |  4 ++--
 mm/memory.c             | 16 ++++++++--------
 mm/migrate.c            | 10 +++++-----
 mm/rmap.c               | 22 +++++++++++++---------
 mm/swapfile.c           |  4 ++--
 mm/userfaultfd.c        |  2 +-
 11 files changed, 50 insertions(+), 46 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0af61dd193d2..1244549f3eaf 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -171,13 +171,13 @@ struct anon_vma *page_get_anon_vma(struct page *page);
  */
 void page_move_anon_rmap(struct page *, struct vm_area_struct *);
 void page_add_anon_rmap(struct page *, struct vm_area_struct *,
-		unsigned long, bool);
+		unsigned long, int);
 void do_page_add_anon_rmap(struct page *, struct vm_area_struct *,
-		unsigned long, int);
+		unsigned long, int, int);
 void page_add_new_anon_rmap(struct page *, struct vm_area_struct *,
-		unsigned long, bool);
+		unsigned long, int);
 void page_add_file_rmap(struct page *, bool);
-void page_remove_rmap(struct page *, bool);
+void page_remove_rmap(struct page *, int);
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long);

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 0e18aaf23a7b..21b85bac881d 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -183,7 +183,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	if (new_page) {
 		get_page(new_page);
-		page_add_new_anon_rmap(new_page, vma, addr, false);
+		page_add_new_anon_rmap(new_page, vma, addr, 0);
 		lru_cache_add_inactive_or_unevictable(new_page, vma);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
@@ -200,7 +200,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	set_pte_at_notify(mm, addr, pvmw.pte,
 			  mk_pte(new_page, vma->vm_page_prot));

-	page_remove_rmap(old_page, false);
+	page_remove_rmap(old_page, 0);
 	if (!page_mapped(old_page))
 		try_to_free_swap(old_page);
 	page_vma_mapped_walk_done(&pvmw);

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76069affebef..6716c5286494 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -618,7 +618,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		entry = mk_huge_pmd(page, vma->vm_page_prot);
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
-		page_add_new_anon_rmap(page, vma, haddr, true);
+		page_add_new_anon_rmap(page, vma, haddr, HPAGE_PMD_ORDER);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
@@ -991,7 +991,7 @@ static int __do_huge_pud_anonymous_page(struct vm_fault *vmf, struct page *page,
 	entry = mk_huge_pud(page, vma->vm_page_prot);
 	entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
-	page_add_new_anon_rmap(page, vma, haddr, true);
+	page_add_new_anon_rmap(page, vma, haddr, HPAGE_PUD_ORDER);
 	lru_cache_add_inactive_or_unevictable(page, vma);
 	pgtable_trans_huge_pud_deposit(vma->vm_mm, vmf->pud,
 			virt_to_page(pmd_pgtable));
@@ -1773,7 +1773,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		if (pmd_present(orig_pmd)) {
 			page = pmd_page(orig_pmd);
-			page_remove_rmap(page, true);
+			page_remove_rmap(page, HPAGE_PMD_ORDER);
 			VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
 			VM_BUG_ON_PAGE(!PageHead(page), page);
 		} else if (thp_migration_supported()) {
 			swp_entry_t entry;
@@ -2059,7 +2059,7 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma,
 		if (pud_present(orig_pud)) {
 			page = pud_page(orig_pud);
-			page_remove_rmap(page, true);
+			page_remove_rmap(page, HPAGE_PUD_ORDER);
 			VM_BUG_ON_PAGE(page_mapcount(page) < 0, page);
 			VM_BUG_ON_PAGE(!PageHead(page), page);
 		} else
@@ -2187,7 +2187,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			set_page_dirty(page);
 		if (!PageReferenced(page) && pmd_young(_pmd))
 			SetPageReferenced(page);
-		page_remove_rmap(page, true);
+		page_remove_rmap(page, HPAGE_PMD_ORDER);
 		put_page(page);
 		add_mm_counter(mm, mm_counter_file(page), -HPAGE_PMD_NR);
 		return;
@@ -2319,7 +2319,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	if (freeze) {
 		for (i = 0; i < HPAGE_PMD_NR; i++) {
-			page_remove_rmap(page + i, false);
+			page_remove_rmap(page + i, 0);
 			put_page(page + i);
 		}
 	}
@@ -3089,7 +3089,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 	if (pmd_soft_dirty(pmdval))
 		pmdswp = pmd_swp_mksoft_dirty(pmdswp);
 	set_pmd_at(mm, address, pvmw->pmd, pmdswp);
-	page_remove_rmap(page, true);
+	page_remove_rmap(page, HPAGE_PMD_ORDER);
 	put_page(page);
 }
@@ -3115,7 +3115,7 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 	flush_cache_range(vma, mmun_start, mmun_start + HPAGE_PMD_SIZE);
 	if (PageAnon(new))
-		page_add_anon_rmap(new, vma, mmun_start, true);
+		page_add_anon_rmap(new, vma, mmun_start, HPAGE_PMD_ORDER);
 	else
 		page_add_file_rmap(new, true);
 	set_pmd_at(mm, mmun_start, pvmw->pmd, pmde);

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 61469fd3ad92..25674d7b1e5f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4007,7 +4007,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			set_page_dirty(page);

 		hugetlb_count_sub(pages_per_huge_page(h), mm);
-		page_remove_rmap(page, true);
+		page_remove_rmap(page, huge_page_order(h));

 		spin_unlock(ptl);
 		tlb_remove_page_size(tlb, page, huge_page_size(h));
@@ -4232,7 +4232,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma,
 		mmu_notifier_invalidate_range(mm, range.start, range.end);
 		set_huge_pte_at(mm, haddr, ptep,
 				make_huge_pte(vma, new_page, 1));
-		page_remove_rmap(old_page, true);
+		page_remove_rmap(old_page, huge_page_order(h));
 		hugepage_add_new_anon_rmap(new_page, vma, haddr);
 		set_page_huge_active(new_page);
 		/* Make the old page be freed below */

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f1d5f6dde47c..636a0f32b09e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -765,7 +765,7 @@ static void __collapse_huge_page_copy(pte_t *pte, struct page *page,
 				 * superfluous.
 				 */
 				pte_clear(vma->vm_mm, address, _pte);
-				page_remove_rmap(src_page, false);
+				page_remove_rmap(src_page, 0);
 				spin_unlock(ptl);
 				free_page_and_swap_cache(src_page);
 			}
@@ -1175,7 +1175,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	spin_lock(pmd_ptl);
 	BUG_ON(!pmd_none(*pmd));
-	page_add_new_anon_rmap(new_page, vma, address, true);
+	page_add_new_anon_rmap(new_page, vma, address, HPAGE_PMD_ORDER);
 	lru_cache_add_inactive_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
@@ -1478,7 +1478,7 @@ void collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr)
 		if (pte_none(*pte))
 			continue;
 		page = vm_normal_page(vma, addr, *pte);
-		page_remove_rmap(page, false);
+		page_remove_rmap(page, 0);
 	}

 	pte_unmap_unlock(start_pte, ptl);

diff --git a/mm/ksm.c b/mm/ksm.c
index 9afccc36dbd2..f32bdfe768b4 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1153,7 +1153,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	 */
 	if (!is_zero_pfn(page_to_pfn(kpage))) {
 		get_page(kpage);
-		page_add_anon_rmap(kpage, vma, addr, false);
+		page_add_anon_rmap(kpage, vma, addr, 0);
 		newpte = mk_pte(kpage, vma->vm_page_prot);
 	} else {
 		newpte = pte_mkspecial(pfn_pte(page_to_pfn(kpage),
@@ -1177,7 +1177,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	ptep_clear_flush(vma, addr, ptep);
 	set_pte_at_notify(mm, addr, ptep, newpte);

-	page_remove_rmap(page, false);
+	page_remove_rmap(page, 0);
 	if (!page_mapped(page))
 		try_to_free_swap(page);
 	put_page(page);

diff --git a/mm/memory.c b/mm/memory.c
index 05789aa4af12..37e206a7d213 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1090,7 +1090,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 					mark_page_accessed(page);
 			}
 			rss[mm_counter(page)]--;
-			page_remove_rmap(page, false);
+			page_remove_rmap(page, 0);
 			if (unlikely(page_mapcount(page) < 0))
 				print_bad_pte(vma, addr, ptent, page);
 			if (unlikely(__tlb_remove_page(tlb, page))) {
@@ -1118,7 +1118,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
 			rss[mm_counter(page)]--;
-			page_remove_rmap(page, false);
+			page_remove_rmap(page, 0);
 			put_page(page);
 			continue;
 		}
@@ -2726,7 +2726,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		 * thread doing COW.
 		 */
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
-		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
+		page_add_new_anon_rmap(new_page, vma, vmf->address, 0);
 		lru_cache_add_inactive_or_unevictable(new_page, vma);
 		/*
 		 * We call the notify macro here because, when using secondary
@@ -2758,7 +2758,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 			 * mapcount is visible. So transitively, TLBs to
 			 * old page will be flushed before it can be reused.
 			 */
-			page_remove_rmap(old_page, false);
+			page_remove_rmap(old_page, 0);
 		}

 		/* Free the old page.. */
@@ -3249,10 +3249,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)

 	/* ksm created a completely new copy */
 	if (unlikely(page != swapcache && swapcache)) {
-		page_add_new_anon_rmap(page, vma, vmf->address, false);
+		page_add_new_anon_rmap(page, vma, vmf->address, 0);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
-		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
+		do_page_add_anon_rmap(page, vma, vmf->address, exclusive, 0);
 	}

 	swap_free(entry);
@@ -3396,7 +3396,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	}

 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
-	page_add_new_anon_rmap(page, vma, vmf->address, false);
+	page_add_new_anon_rmap(page, vma, vmf->address, 0);
 	lru_cache_add_inactive_or_unevictable(page, vma);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
@@ -3655,7 +3655,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page)
 	/* copy-on-write page */
 	if (write && !(vma->vm_flags & VM_SHARED)) {
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
-		page_add_new_anon_rmap(page, vma, vmf->address, false);
+		page_add_new_anon_rmap(page, vma, vmf->address, 0);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));

diff --git a/mm/migrate.c b/mm/migrate.c
index 3ab965f83029..a7320e9d859c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -270,7 +270,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 		set_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);

 	if (PageAnon(new))
-		page_add_anon_rmap(new, vma, pvmw.address, false);
+		page_add_anon_rmap(new, vma, pvmw.address, 0);
 	else
 		page_add_file_rmap(new, false);
 }
@@ -2194,7 +2194,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	 * new page and page_add_new_anon_rmap guarantee the copy is
 	 * visible before the pagetable update.
 	 */
-	page_add_anon_rmap(new_page, vma, start, true);
+	page_add_anon_rmap(new_page, vma, start, HPAGE_PMD_ORDER);
 	/*
 	 * At this point the pmd is numa/protnone (i.e. non present) and the TLB
 	 * has already been flushed globally. So no TLB can be currently
@@ -2211,7 +2211,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	page_ref_unfreeze(page, 2);
 	mlock_migrate_page(new_page, page);
-	page_remove_rmap(page, true);
+	page_remove_rmap(page, HPAGE_PMD_ORDER);
 	set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);

 	spin_unlock(ptl);
@@ -2455,7 +2455,7 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
 			 * drop page refcount. Page won't be freed, as we took
 			 * a reference just above.
 			 */
-			page_remove_rmap(page, false);
+			page_remove_rmap(page, 0);
 			put_page(page);

 			if (pte_present(pte))
@@ -2940,7 +2940,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 		goto unlock_abort;

 	inc_mm_counter(mm, MM_ANONPAGES);
-	page_add_new_anon_rmap(page, vma, addr, false);
+	page_add_new_anon_rmap(page, vma, addr, 0);
 	if (!is_zone_device_page(page))
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	get_page(page);

diff --git a/mm/rmap.c b/mm/rmap.c
index 629f8fe7ffac..0d922e5fb38c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1100,7 +1100,7 @@ static void __page_check_anon_rmap(struct page *page,
  * @page:	the page to add the mapping to
  * @vma:	the vm area in which the mapping is added
  * @address:	the user virtual address mapped
- * @compound:	charge the page as compound or small page
+ * @map_order:	the order of the charged page
  *
  * The caller needs to hold the pte lock, and the page must be locked in
  * the anon_vma case: to serialize mapping,index checking after setting,
@@ -1108,9 +1108,10 @@ static void __page_check_anon_rmap(struct page *page,
  * (but PageKsm is never downgraded to PageAnon).
  */
 void page_add_anon_rmap(struct page *page,
-	struct vm_area_struct *vma, unsigned long address, bool compound)
+	struct vm_area_struct *vma, unsigned long address, int map_order)
 {
-	do_page_add_anon_rmap(page, vma, address, compound ? RMAP_COMPOUND : 0);
+	do_page_add_anon_rmap(page, vma, address,
+			      map_order > 0 ? RMAP_COMPOUND : 0, map_order);
 }

 /*
@@ -1119,7 +1120,8 @@ void page_add_anon_rmap(struct page *page,
  * Everybody else should continue to use page_add_anon_rmap above.
  */
 void do_page_add_anon_rmap(struct page *page,
-	struct vm_area_struct *vma, unsigned long address, int flags)
+	struct vm_area_struct *vma, unsigned long address, int flags,
+	int map_order)
 {
 	bool compound = flags & RMAP_COMPOUND;
 	bool first;
@@ -1174,15 +1176,16 @@ void do_page_add_anon_rmap(struct page *page,
  * @page:	the page to add the mapping to
  * @vma:	the vm area in which the mapping is added
  * @address:	the user virtual address mapped
- * @compound:	charge the page as compound or small page
+ * @map_order:	the order of the charged page
  *
  * Same as page_add_anon_rmap but must only be called on *new* pages.
  * This means the inc-and-test can be bypassed.
  * Page does not have to be locked.
  */
 void page_add_new_anon_rmap(struct page *page,
-	struct vm_area_struct *vma, unsigned long address, bool compound)
+	struct vm_area_struct *vma, unsigned long address, int map_order)
 {
+	bool compound = map_order > 0;
 	int nr = compound ? thp_nr_pages(page) : 1;

 	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
@@ -1339,12 +1342,13 @@ static void page_remove_anon_compound_rmap(struct page *page)
 /**
  * page_remove_rmap - take down pte mapping from a page
  * @page:	page to remove mapping from
- * @compound:	uncharge the page as compound or small page
+ * @map_order:	the order of the uncharged page
  *
  * The caller needs to hold the pte lock.
  */
-void page_remove_rmap(struct page *page, bool compound)
+void page_remove_rmap(struct page *page, int map_order)
 {
+	bool compound = map_order > 0;
+
 	lock_page_memcg(page);

 	if (!PageAnon(page)) {
@@ -1734,7 +1738,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, PageHuge(page));
+		page_remove_rmap(subpage, compound_order(page));
 		put_page(page);
 	}

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 20012c0c0252..495ecdbd7859 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1919,9 +1919,9 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	set_pte_at(vma->vm_mm, addr, pte,
 		   pte_mkold(mk_pte(page, vma->vm_page_prot)));
 	if (page == swapcache) {
-		page_add_anon_rmap(page, vma, addr, false);
+		page_add_anon_rmap(page, vma, addr, 0);
 	} else { /* ksm created a completely new copy */
-		page_add_new_anon_rmap(page, vma, addr, false);
+		page_add_new_anon_rmap(page, vma, addr, 0);
 		lru_cache_add_inactive_or_unevictable(page, vma);
 	}
 	swap_free(entry);

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9a3d451402d7..4979e64d7e47 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -122,7 +122,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 		goto out_release_uncharge_unlock;

 	inc_mm_counter(dst_mm, MM_ANONPAGES);
-	page_add_new_anon_rmap(page, dst_vma, dst_addr, false);
+	page_add_new_anon_rmap(page, dst_vma, dst_addr, 0);
 	lru_cache_add_inactive_or_unevictable(page, dst_vma);

 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);

From patchwork Mon Sep 28 17:54:11 2020
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH v2 13/30] mm: rmap: add map_order to page_remove_anon_compound_rmap.
Date: Mon, 28 Sep 2020 13:54:11 -0400
Message-Id: <20200928175428.4110504-14-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

Once the upcoming commits enable PMD-mapped PUD THPs, a PUD THP can be
unmapped at PMD granularity, and such an unmap must be accounted as
NR_ANON_THPS rather than NR_ANON_THPS_PUD. The page size alone can no
longer tell the two cases apart; the added map_order parameter records
which mapping level is being removed.

Signed-off-by: Zi Yan <zi.yan@sent.com>
---
 mm/rmap.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 0d922e5fb38c..7fc0bf07b9bc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1292,7 +1292,7 @@ static void page_remove_file_rmap(struct page *page, bool compound)
 		clear_page_mlock(page);
 }

-static void page_remove_anon_compound_rmap(struct page *page)
+static void page_remove_anon_compound_rmap(struct page *page, int map_order)
 {
 	int i, nr;
@@ -1306,7 +1306,7 @@ static void page_remove_anon_compound_rmap(struct page *page)
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		return;

-	if (thp_nr_pages(page) == HPAGE_PMD_NR)
+	if (map_order == HPAGE_PMD_ORDER)
 		__dec_lruvec_page_state(page, NR_ANON_THPS);
 	else
 		__dec_lruvec_page_state(page, NR_ANON_THPS_PUD);
@@ -1357,7 +1357,7 @@ void page_remove_rmap(struct page *page, int map_order)
 	}

 	if (compound) {
-		page_remove_anon_compound_rmap(page);
+		page_remove_anon_compound_rmap(page, map_order);
 		goto out;
 	}
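The accounting decision above now depends only on map_order, not on the page's own size. A toy restatement of the rule in userspace C — order constants assume x86_64 with 4 KiB pages, and the function name is illustrative, not a kernel symbol:

#include <stdio.h>

#define HPAGE_PMD_ORDER 9	/* 2 MiB mapping */
#define HPAGE_PUD_ORDER 18	/* 1 GiB mapping */

/* Which per-node counter a compound anon unmap decrements. */
static const char *anon_thp_counter(int map_order)
{
	return map_order == HPAGE_PMD_ORDER ? "NR_ANON_THPS"
					    : "NR_ANON_THPS_PUD";
}

int main(void)
{
	/* A PMD-mapped PUD THP: the page is 1 GiB, but the mapping removed is 2 MiB. */
	printf("order %2d -> %s\n", HPAGE_PMD_ORDER, anon_thp_counter(HPAGE_PMD_ORDER));
	printf("order %2d -> %s\n", HPAGE_PUD_ORDER, anon_thp_counter(HPAGE_PUD_ORDER));
	return 0;
}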
From patchwork Mon Sep 28 17:54:12 2020
From: Zi Yan <zi.yan@sent.com>
To: linux-mm@kvack.org
Subject: [RFC PATCH v2 14/30] mm: thp: add PUD THP split_huge_pud_page() function.
Date: Mon, 28 Sep 2020 13:54:12 -0400
Message-Id: <20200928175428.4110504-15-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>

It mimics the PMD-level THP split. In addition, to support PMD-mapped
PUD THPs, PMDPageInPUD() is added to identify the first page of each
PMD-sized, PMD-aligned subregion of a PUD THP. On x86_64, for example,
page[0], page[512], page[1024], ... are regarded as PMDPageInPUD pages.

For the mapcount of a PMD-mapped PUD THP, sub_compound_mapcount() is
added; it uses (PMDPageInPUD + 3).compound_mapcount as the mapcount,
since each base page's _mapcount is used for PTE mappings, the first
tail page's compound_mapcount is already in use, and the second tail
page's compound_mapcount overlaps with the in-use deferred_list.

PagePUDDoubleMap() is added to indicate a PUD THP mapped at both PUD
and PMD levels. PageDoubleMap() keeps its original meaning, indicating
a THP mapped at both PMD and PTE levels.
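A small sketch of the geometry just described, assuming x86_64 with 4 KiB base pages (so a PUD THP spans 512 x 512 subpages); it only prints indices, and everything here is illustrative rather than kernel code:

#include <stdio.h>

#define HPAGE_PMD_NR 512		/* base pages per PMD mapping */
#define HPAGE_PUD_NR (512 * 512)	/* base pages per PUD THP */

int main(void)
{
	/* PMDPageInPUD pages sit at every HPAGE_PMD_NR-th index; the
	 * sub-compound mapcount described above lives three pages later. */
	for (int k = 0; k < 4; k++)
		printf("PMD slot %3d: PMDPageInPUD = page[%6d], sub mapcount in page[%6d]\n",
		       k, k * HPAGE_PMD_NR, k * HPAGE_PMD_NR + 3);
	printf("... through slot %d (page[%d])\n",
	       HPAGE_PUD_NR / HPAGE_PMD_NR - 1, HPAGE_PUD_NR - HPAGE_PMD_NR);
	return 0;
}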
Signed-off-by: Zi Yan <zi.yan@sent.com>
---
 arch/x86/include/asm/pgalloc.h |   9 +
 arch/x86/include/asm/pgtable.h |  21 ++
 include/linux/huge_mm.h        |  31 +-
 include/linux/memcontrol.h     |   5 +
 include/linux/mm.h             |  25 +-
 include/linux/page-flags.h     |  48 +++
 include/linux/pgtable.h        |  17 ++
 include/linux/rmap.h           |   1 +
 include/linux/swap.h           |   2 +
 include/linux/vm_event_item.h  |   4 +
 mm/huge_memory.c               | 525 +++++++++++++++++++++++++++++++--
 mm/memcontrol.c                |  13 +
 mm/memory.c                    |   2 +-
 mm/page_alloc.c                |  21 +-
 mm/pagewalk.c                  |   2 +-
 mm/pgtable-generic.c           |  11 +
 mm/rmap.c                      |  93 +++++-
 mm/swap.c                      |  30 ++
 mm/util.c                      |  22 +-
 mm/vmstat.c                    |   4 +
 20 files changed, 832 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index b24284522973..f6926725c379 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -99,6 +99,15 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,

 #define pmd_pgtable(pmd) pmd_page(pmd)

+static inline void pud_populate_with_pgtable(struct mm_struct *mm, pud_t *pud,
+					     struct page *pte)
+{
+	unsigned long pfn = page_to_pfn(pte);
+
+	paravirt_alloc_pmd(mm, pfn);
+	set_pud(pud, __pud(((pteval_t)pfn << PAGE_SHIFT) | _PAGE_TABLE));
+}
+
 #if CONFIG_PGTABLE_LEVELS > 2
 static inline pmd_t *pmd_alloc_one_page_with_ptes(struct mm_struct *mm, unsigned long addr)
 {

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 8bf7bfd71a46..575c349e08b2 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -630,6 +630,12 @@ static inline pmd_t pmd_mkinvalid(pmd_t pmd)
 		__pgprot(pmd_flags(pmd) & ~(_PAGE_PRESENT|_PAGE_PROTNONE)));
 }

+static inline pud_t pud_mknotpresent(pud_t pud)
+{
+	return pfn_pud(pud_pfn(pud),
+		__pgprot(pud_flags(pud) & ~(_PAGE_PRESENT|_PAGE_PROTNONE)));
+}
+
 static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask);

 static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
@@ -1246,6 +1252,21 @@ static inline p4d_t *user_to_kernel_p4dp(p4d_t *p4dp)
 }
 #endif /* CONFIG_PAGE_TABLE_ISOLATION */

+#ifndef pudp_establish
+#define pudp_establish pudp_establish
+static inline pud_t pudp_establish(struct vm_area_struct *vma,
+		unsigned long address, pud_t *pudp, pud_t pud)
+{
+	if (IS_ENABLED(CONFIG_SMP)) {
+		return xchg(pudp, pud);
+	} else {
+		pud_t old = *pudp;
+		*pudp = pud;
+		return old;
+	}
+}
+#endif
+
 /*
  * clone_pgd_range(pgd_t *dst, pgd_t *src, int count);
  *

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 7528652400e4..e5c68e680907 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -222,17 +222,27 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
 		bool freeze, struct page *page);

+bool can_split_huge_pud_page(struct page *page, int *pextra_pins);
+int split_huge_pud_page_to_list(struct page *page, struct list_head *list);
+static inline int split_huge_pud_page(struct page *page)
+{
+	return split_huge_pud_page_to_list(page, NULL);
+}
 void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud,
-		unsigned long address);
+		unsigned long address, bool freeze, struct page *page);

 #define split_huge_pud(__vma, __pud, __address)				\
 	do {								\
 		pud_t *____pud = (__pud);				\
 		if (pud_trans_huge(*____pud)				\
 					|| pud_devmap(*____pud))	\
-			__split_huge_pud(__vma, __pud, __address);	\
+			__split_huge_pud(__vma, __pud, __address,	\
+						false, NULL);		\
 	}  while (0)

+void split_huge_pud_address(struct vm_area_struct *vma, unsigned long address,
+		bool freeze, struct page *page);
+
 extern int hugepage_madvise(struct vm_area_struct *vma,
 			    unsigned long *vm_flags, int advice);
 extern void vma_adjust_trans_huge(struct vm_area_struct *vma,
@@ -422,8 +432,25 @@ static inline void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 static inline void split_huge_pmd_address(struct vm_area_struct *vma,
 		unsigned long address, bool freeze, struct page *page) {}
+static inline bool
+can_split_huge_pud_page(struct page *page, int *pextra_pins)
+{
+	BUILD_BUG();
+	return false;
+}
+static inline int
+split_huge_pud_page_to_list(struct page *page, struct list_head *list)
+{
+	return 0;
+}
+static inline int split_huge_pud_page(struct page *page)
+{
+	return 0;
+}

 #define split_huge_pud(__vma, __pmd, __address)	\
 	do { } while (0)
+static inline void split_huge_pud_address(struct vm_area_struct *vma,
+		unsigned long address, bool freeze, struct page *page) {}

 static inline int hugepage_madvise(struct vm_area_struct *vma,
 				   unsigned long *vm_flags, int advice)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e391e3c56de5..a7622510d43d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -932,6 +932,7 @@ static inline void memcg_memory_event_mm(struct mm_struct *mm,

 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 void mem_cgroup_split_huge_fixup(struct page *head);
+void mem_cgroup_split_huge_pud_fixup(struct page *head);
 #endif

 #else /* CONFIG_MEMCG */
@@ -1264,6 +1265,10 @@ static inline void mem_cgroup_split_huge_fixup(struct page *head)
 {
 }

+static inline void mem_cgroup_split_huge_pud_fixup(struct page *head)
+{
+}
+
 static inline void count_memcg_events(struct mem_cgroup *memcg,
 				      enum vm_event_item idx,
 				      unsigned long count)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8f54f06c8eb6..51b75ffa6a6c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -801,6 +801,24 @@ static inline int compound_mapcount(struct page *page)
 	return head_compound_mapcount(page);
 }

+static inline unsigned int compound_order(struct page *page);
+static inline atomic_t *sub_compound_mapcount_ptr(struct page *page, int sub_level)
+{
+	struct page *head = compound_head(page);
+
+	VM_BUG_ON_PAGE(!PageCompound(page), page);
+	VM_BUG_ON_PAGE(compound_order(head) != HPAGE_PUD_ORDER, page);
+	VM_BUG_ON_PAGE((page - head) % HPAGE_PMD_NR, page);
+	VM_BUG_ON_PAGE(sub_level != 1, page);
+	return &page[2 + sub_level].compound_mapcount;
+}
+
+/* Only works for PUD pages */
+static inline int sub_compound_mapcount(struct page *page)
+{
+	return atomic_read(sub_compound_mapcount_ptr(page, 1)) + 1;
+}
+
 /*
  * The atomic page->_mapcount, starts from -1: so that transitions
  * both from it and to it can be tracked, using atomic_inc_and_test
@@ -893,13 +911,6 @@ static inline void destroy_compound_page(struct page *page)
 	compound_page_dtors[page[1].compound_dtor](page);
 }

-static inline unsigned int compound_order(struct page *page)
-{
-	if (!PageHead(page))
-		return 0;
-	return page[1].compound_order;
-}
-
 static inline bool hpage_pincount_available(struct page *page)
 {
 	/*

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index fbbb841a9346..f1bfb02622cf 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -235,6 +235,9 @@ static inline void page_init_poison(struct page *page, size_t size)
  *
  * PF_SECOND:
  *     the page flag is stored in the first tail page.
+ *
+ * PF_THIRD:
+ *     the page flag is stored in the second tail page.
 */ #define PF_POISONED_CHECK(page) ({ \ VM_BUG_ON_PGFLAGS(PagePoisoned(page), page); \ page; }) @@ -253,6 +256,9 @@ static inline void page_init_poison(struct page *page, size_t size) #define PF_SECOND(page, enforce) ({ \ VM_BUG_ON_PGFLAGS(!PageHead(page), page); \ PF_POISONED_CHECK(&page[1]); }) +#define PF_THIRD(page, enforce) ({ \ VM_BUG_ON_PGFLAGS(!PageHead(page), page); \ PF_POISONED_CHECK(&page[2]); }) /* * Macros to create function definitions for page flags @@ -674,6 +680,30 @@ static inline int PageTransTail(struct page *page) return PageTail(page); } +#define HPAGE_PMD_SHIFT PMD_SHIFT +#define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT) +#define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER) + +#define HPAGE_PUD_SHIFT PUD_SHIFT +#define HPAGE_PUD_ORDER (HPAGE_PUD_SHIFT-PAGE_SHIFT) +#define HPAGE_PUD_NR (1<<HPAGE_PUD_ORDER) + +/* + * PMDPageInPUD() is true for the first base page of each PMD-sized subpage + * of a PUD THP. + */ +static inline int PMDPageInPUD(struct page *page) +{ + struct page *head = compound_head(page); + + return (PageCompound(page) && + compound_order(head) == HPAGE_PUD_ORDER && + ((page - head) % HPAGE_PMD_NR == 0)); +} PAGEFLAG(DoubleMap, double_map, PF_SECOND) TESTSCFLAG(DoubleMap, double_map, PF_SECOND) +/* + * PUDDoubleMap indicates that the PUD page is mapped with PMDs as well as a + * PUD. For the head page, it means page->_mapcount in all sub-PMD pages is + * offset up by one. This reference will go away with last sub_compound_mapcount. + * + * See also __split_huge_pud_locked() and page_remove_anon_compound_rmap(). + */ +PAGEFLAG(PUDDoubleMap, double_map, PF_THIRD) + TESTSCFLAG(PUDDoubleMap, double_map, PF_THIRD) #else TESTPAGEFLAG_FALSE(TransHuge) TESTPAGEFLAG_FALSE(TransCompound) TESTPAGEFLAG_FALSE(TransCompoundMap) TESTPAGEFLAG_FALSE(TransTail) +TESTPAGEFLAG_FALSE(PMDPageInPUD) PAGEFLAG_FALSE(DoubleMap) TESTSCFLAG_FALSE(DoubleMap) +PAGEFLAG_FALSE(PUDDoubleMap) + TESTSETFLAG_FALSE(PUDDoubleMap) #endif /* diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index bb163504fb01..02279a97e170 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -508,6 +508,11 @@ extern pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp); #endif +#ifndef __HAVE_ARCH_PUDP_INVALIDATE +extern pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, + pud_t *pudp); +#endif + #ifndef __HAVE_ARCH_PTE_SAME static inline int pte_same(pte_t pte_a, pte_t pte_b) { @@ -1161,6 +1166,18 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp) } #endif +#ifndef pud_read_atomic +static inline pud_t pud_read_atomic(pud_t *pudp) +{ + /* + * Depend on compiler for an atomic pud read. NOTE: this is + * only going to work, if the pudval_t isn't larger than + * an unsigned long.
+ */ + return *pudp; +} +#endif + #ifndef arch_needs_pgtable_deposit #define arch_needs_pgtable_deposit() (false) #endif diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 1244549f3eaf..0680b9fff2b3 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -99,6 +99,7 @@ enum ttu_flags { TTU_RMAP_LOCKED = 0x80, /* do not grab rmap lock: * caller holds it */ TTU_SPLIT_FREEZE = 0x100, /* freeze pte under splitting thp */ + TTU_SPLIT_HUGE_PUD = 0x200, /* split huge PUD if any */ }; #ifdef CONFIG_MMU diff --git a/include/linux/swap.h b/include/linux/swap.h index f32804e2fad5..dee400a56e84 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -340,6 +340,8 @@ extern void lru_note_cost_page(struct page *); extern void lru_cache_add(struct page *); extern void lru_add_page_tail(struct page *page, struct page *page_tail, struct lruvec *lruvec, struct list_head *head); +extern void lru_add_pud_page_tail(struct page *page, struct page *page_tail, + struct lruvec *lruvec, struct list_head *head); extern void mark_page_accessed(struct page *); extern void lru_add_drain(void); extern void lru_add_drain_cpu(int cpu); diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h index 416d9966fa3f..cf2b5632b96c 100644 --- a/include/linux/vm_event_item.h +++ b/include/linux/vm_event_item.h @@ -97,6 +97,10 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT, THP_FAULT_FALLBACK_PUD, THP_FAULT_FALLBACK_PUD_CHARGE, THP_SPLIT_PUD, + THP_SPLIT_PUD_PAGE, + THP_SPLIT_PUD_PAGE_FAILED, + THP_ZERO_PUD_PAGE_ALLOC, + THP_ZERO_PUD_PAGE_ALLOC_FAILED, #endif THP_ZERO_PAGE_ALLOC, THP_ZERO_PAGE_ALLOC_FAILED, diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 6716c5286494..4a899e856088 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1775,7 +1775,7 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma, page = pmd_page(orig_pmd); page_remove_rmap(page, HPAGE_PMD_ORDER); VM_BUG_ON_PAGE(page_mapcount(page) < 0, page); - VM_BUG_ON_PAGE(!PageHead(page), page); + VM_BUG_ON_PAGE(!PageHead(page) && !PMDPageInPUD(page), page); } else if (thp_migration_supported()) { swp_entry_t entry; @@ -2082,8 +2082,16 @@ int zap_huge_pud(struct mmu_gather *tlb, struct vm_area_struct *vma, } static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud, - unsigned long haddr) + unsigned long haddr, bool freeze) { + struct mm_struct *mm = vma->vm_mm; + struct page *page; + pgtable_t pgtable; + pud_t _pud, old_pud; + bool young, write, dirty, soft_dirty; + unsigned long addr; + int i; + VM_BUG_ON(haddr & ~HPAGE_PUD_MASK); VM_BUG_ON_VMA(vma->vm_start > haddr, vma); VM_BUG_ON_VMA(vma->vm_end < haddr + HPAGE_PUD_SIZE, vma); @@ -2091,23 +2099,141 @@ static void __split_huge_pud_locked(struct vm_area_struct *vma, pud_t *pud, count_vm_event(THP_SPLIT_PUD); - pudp_huge_clear_flush_notify(vma, haddr, pud); + if (!vma_is_anonymous(vma)) { + _pud = pudp_huge_clear_flush_notify(vma, haddr, pud); + /* + * We are going to unmap this huge page. 
So + * just go ahead and zap it + */ + if (arch_needs_pgtable_deposit()) + zap_pud_deposited_table(mm, pud); + if (vma_is_dax(vma)) + return; + page = pud_page(_pud); + if (!PageReferenced(page) && pud_young(_pud)) + SetPageReferenced(page); + page_remove_rmap(page, HPAGE_PUD_ORDER); + put_page(page); + add_mm_counter(mm, MM_FILEPAGES, -HPAGE_PUD_NR); + return; + } + + /* See the comment above pmdp_invalidate() in __split_huge_pmd_locked() */ + old_pud = pudp_invalidate(vma, haddr, pud); + + page = pud_page(old_pud); + VM_BUG_ON_PAGE(!page_count(page), page); + page_ref_add(page, (1<<(HPAGE_PUD_ORDER-HPAGE_PMD_ORDER)) - 1); + if (pud_dirty(old_pud)) + SetPageDirty(page); + write = pud_write(old_pud); + young = pud_young(old_pud); + dirty = pud_dirty(old_pud); + soft_dirty = pud_soft_dirty(old_pud); + + pgtable = pgtable_trans_huge_pud_withdraw(mm, pud); + pud_populate_with_pgtable(mm, &_pud, pgtable); + + for (i = 0, addr = haddr; i < HPAGE_PUD_NR; + i += HPAGE_PMD_NR, addr += PMD_SIZE) { + pmd_t entry, *pmd; + /* + * Note that NUMA hinting access restrictions are not + * transferred to avoid any possibility of altering + * permissions across VMAs. + */ + if (freeze) { + swp_entry_t swp_entry; + + swp_entry = make_migration_entry(page + i, write); + entry = swp_entry_to_pmd(swp_entry); + if (soft_dirty) + entry = pmd_swp_mksoft_dirty(entry); + } else { + entry = mk_huge_pmd(page + i, READ_ONCE(vma->vm_page_prot)); + entry = maybe_pmd_mkwrite(entry, vma); + if (!write) + entry = pmd_wrprotect(entry); + if (!young) + entry = pmd_mkold(entry); + if (soft_dirty) + entry = pmd_mksoft_dirty(entry); + } + pmd = pmd_offset(&_pud, addr); + VM_BUG_ON(!pmd_none(*pmd)); + set_pmd_at(mm, addr, pmd, entry); + /* distinguish between pud compound_mapcount and pmd compound_mapcount */ + if (atomic_inc_and_test(sub_compound_mapcount_ptr(&page[i], 1))) { + /* first pmd-mapped pud page */ + lock_page_memcg(page); + __inc_lruvec_page_state(page, NR_ANON_THPS); + unlock_page_memcg(page); + } + } + + /* + * Set PG_double_map before dropping compound_mapcount to avoid + * false-negative page_mapped(). + */ + if (compound_mapcount(page) > 1 && !TestSetPagePUDDoubleMap(page)) { + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + /* distinguish between pud compound_mapcount and pmd compound_mapcount */ + atomic_inc(sub_compound_mapcount_ptr(&page[i], 1)); + } + + lock_page_memcg(page); + if (atomic_add_negative(-1, compound_mapcount_ptr(page))) { + /* Last compound_mapcount is gone. 
+ */ + __dec_lruvec_page_state(page, NR_ANON_THPS_PUD); + if (TestClearPagePUDDoubleMap(page)) { + /* The mapcount reference is not needed anymore */ + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + /* distinguish between pud compound_mapcount and pmd compound_mapcount */ + atomic_dec(sub_compound_mapcount_ptr(&page[i], 1)); + } + } + unlock_page_memcg(page); + + smp_wmb(); /* make pmds visible before pud */ + pud_populate_with_pgtable(mm, pud, pgtable); + + if (freeze) { + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) { + page_remove_rmap(page + i, HPAGE_PMD_ORDER); + put_page(page + i); + } + } } void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, - unsigned long address) + unsigned long address, bool freeze, struct page *page) { spinlock_t *ptl; + struct mm_struct *mm = vma->vm_mm; + unsigned long haddr = address & HPAGE_PUD_MASK; struct mmu_notifier_range range; mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, vma, vma->vm_mm, address & HPAGE_PUD_MASK, (address & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE); mmu_notifier_invalidate_range_start(&range); - ptl = pud_lock(vma->vm_mm, pud); - if (unlikely(!pud_trans_huge(*pud) && !pud_devmap(*pud))) + ptl = pud_lock(mm, pud); + + /* + * If the caller asks to set up migration entries, we need a page to + * check the pud against. Otherwise we can end up replacing the wrong page. + */ + VM_BUG_ON(freeze && !page); + if (page && page != pud_page(*pud)) + goto out; + + if (pud_trans_huge(*pud)) { + page = pud_page(*pud); + if (PageMlocked(page)) + clear_page_mlock(page); + } else if (unlikely(!pud_devmap(*pud))) goto out; - __split_huge_pud_locked(vma, pud, range.start); + __split_huge_pud_locked(vma, pud, haddr, freeze); out: spin_unlock(ptl); @@ -2117,6 +2243,280 @@ void __split_huge_pud(struct vm_area_struct *vma, pud_t *pud, */ mmu_notifier_invalidate_range_only_end(&range); } + +void split_huge_pud_address(struct vm_area_struct *vma, unsigned long address, + bool freeze, struct page *page) +{ + pgd_t *pgd; + p4d_t *p4d; + pud_t *pud; + + pgd = pgd_offset(vma->vm_mm, address); + if (!pgd_present(*pgd)) + return; + + p4d = p4d_offset(pgd, address); + if (!p4d_present(*p4d)) + return; + + pud = pud_offset(p4d, address); + + __split_huge_pud(vma, pud, address, freeze, page); +} + +static void unmap_pud_page(struct page *page) +{ + enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS | + TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PUD; + bool unmap_success; + + VM_BUG_ON_PAGE(!PageHead(page), page); + + if (PageAnon(page)) + ttu_flags |= TTU_SPLIT_FREEZE; + + unmap_success = try_to_unmap(page, ttu_flags); + VM_BUG_ON_PAGE(!unmap_success, page); +} + +static void remap_pud_page(struct page *page) +{ + int i; + + VM_BUG_ON(!PageTransHuge(page)); + if (compound_order(page) == HPAGE_PUD_ORDER) { + remove_migration_ptes(page, page, true); + } else if (compound_order(page) == HPAGE_PMD_ORDER) { + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + remove_migration_ptes(page + i, page + i, true); + } else + VM_BUG_ON_PAGE(1, page); +} + +static void __split_huge_pud_page_tail(struct page *head, int tail, + struct lruvec *lruvec, struct list_head *list) +{ + struct page *page_tail = head + tail; + + VM_BUG_ON_PAGE(page_ref_count(page_tail) != 0, page_tail); + + /* + * Clone page flags before unfreezing refcount. + * + * After successful get_page_unless_zero() might follow flags change, + * for example lock_page() which sets PG_waiters. + */ + + page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP; + page_tail->flags |= (head->flags & + ((1L << PG_referenced) | + (1L << PG_swapbacked) | + (1L << PG_swapcache) | + (1L << PG_mlocked) | + (1L << PG_uptodate) | + (1L << PG_active) | + (1L << PG_locked) | + (1L << PG_unevictable) | + (1L << PG_dirty) | + /* preserve THP */ + (1L << PG_head))); + + /* ->mapping in first tail page is compound_mapcount */ + VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING, + page_tail); + page_tail->mapping = head->mapping; + page_tail->index = head->index + tail; + + /* Page flags also must be visible before we make the page PMD-compound. */ + smp_wmb(); + + clear_compound_head(page_tail); + prep_compound_page(page_tail, HPAGE_PMD_ORDER); + prep_transhuge_page(page_tail); + + /* Finally unfreeze refcount. Additional reference from page cache. */ + page_ref_unfreeze(page_tail, 1 + (!PageAnon(head) || + PageSwapCache(head))); + + if (page_is_young(head)) + set_page_young(page_tail); + if (page_is_idle(head)) + set_page_idle(page_tail); + + page_cpupid_xchg_last(page_tail, page_cpupid_last(head)); + lru_add_pud_page_tail(head, page_tail, lruvec, list); +} + +static void __split_huge_pud_page(struct page *page, struct list_head *list, + unsigned long flags) +{ + struct page *head = compound_head(page); + pg_data_t *pgdat = page_pgdat(head); + struct lruvec *lruvec; + int i; + + lruvec = mem_cgroup_page_lruvec(head, pgdat); + + /* complete memcg works before adding pages to the LRU */ + mem_cgroup_split_huge_pud_fixup(head); + + /* no file-backed page support yet */ + VM_BUG_ON(!PageAnon(page)); + + for (i = HPAGE_PUD_NR - HPAGE_PMD_NR; i >= 1; i -= HPAGE_PMD_NR) + __split_huge_pud_page_tail(head, i, lruvec, list); + + /* reset head page order */ + prep_compound_page(head, HPAGE_PMD_ORDER); + prep_transhuge_page(head); + + page_ref_inc(head); + + spin_unlock_irqrestore(&pgdat->lru_lock, flags); + + remap_pud_page(head); + + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) { + struct page *subpage = head + i; + + if (subpage == page) + continue; + unlock_page(subpage); + + /* + * Subpages may be freed if there wasn't any mapping + * like if add_to_swap() is running on a lru page that + * had its mapping zapped. And freeing these pages + * requires taking the lru_lock so we do the put_page + * of the tail pages after the split is complete. + */ + put_page(subpage); + } +} +/* Racy check whether the huge page can be split */ +bool can_split_huge_pud_page(struct page *page, int *pextra_pins) +{ + int extra_pins; + + VM_BUG_ON(!PageAnon(page)); + + extra_pins = PageSwapCache(page) ? HPAGE_PUD_NR : 0; + + if (pextra_pins) + *pextra_pins = extra_pins; + return total_mapcount(page) == page_count(page) - extra_pins - 1; +} + +/* + * This function splits a huge PUD page into PMD pages. @page can point to any + * subpage of the huge page to split. The split doesn't change the position of @page. + * + * The caller must hold a pin on the @page, otherwise the split fails with -EBUSY. + * The huge page must be locked. + * + * If @list is null, tail pages will be added to the LRU list, otherwise to @list. + * + * Both the head page and the tail pages will inherit mapping, flags, and so on + * from the hugepage. + * + * The GUP pin and PG_locked are transferred to @page. The remaining subpages + * can be freed if they are not mapped. + * + * Returns 0 if the hugepage is split successfully. + * Returns -EBUSY if the page is pinned or if anon_vma disappeared from under + * us. + */ +int split_huge_pud_page_to_list(struct page *page, struct list_head *list) +{ + struct page *head = compound_head(page); + struct pglist_data *pgdata = NODE_DATA(page_to_nid(head)); + struct deferred_split *ds_queue = get_deferred_split_queue(head); + struct anon_vma *anon_vma = NULL; + struct address_space *mapping = NULL; + int count, mapcount, extra_pins, ret; + bool mlocked; + unsigned long flags; + + VM_BUG_ON_PAGE(is_huge_zero_page(page), page); + VM_BUG_ON_PAGE(!PageLocked(page), page); + VM_BUG_ON_PAGE(!PageCompound(page), page); + VM_BUG_ON_PAGE(!PageAnon(page), page); + + if (PageWriteback(page)) + return -EBUSY; + + /* + * The caller does not necessarily hold an mmap_sem that would + * prevent the anon_vma disappearing, so we first take a + * reference to it and then lock the anon_vma for write. This + * is similar to page_lock_anon_vma_read except the write lock + * is taken to serialise against parallel split or collapse + * operations. + */ + anon_vma = page_get_anon_vma(head); + if (!anon_vma) { + ret = -EBUSY; + goto out; + } + mapping = NULL; + anon_vma_lock_write(anon_vma); + /* + * Racy check whether we can split the page before unmap_pud_page() + * splits the PUDs. + */ + if (!can_split_huge_pud_page(head, &extra_pins)) { + ret = -EBUSY; + goto out_unlock; + } + + mlocked = PageMlocked(page); + unmap_pud_page(head); + VM_BUG_ON_PAGE(compound_mapcount(head), head); + + /* Make sure the page is not on a per-CPU pagevec, as that takes a pin */ + if (mlocked) + lru_add_drain(); + + /* prevent PageLRU from going away from under us, and freeze lru stats */ + spin_lock_irqsave(&pgdata->lru_lock, flags); + + /* Prevent deferred_split_scan() touching ->_refcount */ + spin_lock(&ds_queue->split_queue_lock); + count = page_count(head); + mapcount = total_mapcount(head); + if (!mapcount && page_ref_freeze(head, 1 + extra_pins)) { + if (!list_empty(page_deferred_list(head))) { + ds_queue->split_queue_len--; + list_del(page_deferred_list(head)); + } + if (mapping) + __dec_node_page_state(page, NR_SHMEM_THPS); + spin_unlock(&ds_queue->split_queue_lock); + __split_huge_pud_page(page, list, flags); + ret = 0; + } else { + if (IS_ENABLED(CONFIG_DEBUG_VM) && mapcount) { + pr_alert("total_mapcount: %u, page_count(): %u\n", + mapcount, count); + if (PageTail(page)) + dump_page(head, NULL); + dump_page(page, "total_mapcount(head) > 0"); + } + spin_unlock(&ds_queue->split_queue_lock); + spin_unlock_irqrestore(&pgdata->lru_lock, flags); + remap_pud_page(head); + ret = -EBUSY; + } + +out_unlock: + if (anon_vma) { + anon_vma_unlock_write(anon_vma); + put_anon_vma(anon_vma); + } +out: + count_vm_event(!ret ? THP_SPLIT_PUD_PAGE : THP_SPLIT_PUD_PAGE_FAILED); + return ret; +} #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ static void __split_huge_zero_page_pmd(struct vm_area_struct *vma, @@ -2157,7 +2557,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, unsigned long haddr, bool freeze) { struct mm_struct *mm = vma->vm_mm; - struct page *page; + struct page *page, *head; pgtable_t pgtable; pmd_t old_pmd, _pmd; bool young, write, soft_dirty, pmd_migration = false, uffd_wp = false; @@ -2246,7 +2646,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, uffd_wp = pmd_uffd_wp(old_pmd); } VM_BUG_ON_PAGE(!page_count(page), page); - page_ref_add(page, HPAGE_PMD_NR - 1); + head = compound_head(page); + page_ref_add(head, HPAGE_PMD_NR - 1); /* * Withdraw the table only after we mark the pmd entry invalid.
@@ -2294,15 +2695,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, /* * Set PG_double_map before dropping compound_mapcount to avoid * false-negative page_mapped(). + * Don't set it if the PUD page is mapped at PUD level, since + * page_mapped() is true in that case. */ - if (compound_mapcount(page) > 1 && - !TestSetPageDoubleMap(page)) { + if (((PMDPageInPUD(page) && + sub_compound_mapcount(page) > + (1 + PagePUDDoubleMap(compound_head(page)))) || + (!PMDPageInPUD(page) && + compound_mapcount(page) > 1)) + && !TestSetPageDoubleMap(page)) { for (i = 0; i < HPAGE_PMD_NR; i++) atomic_inc(&page[i]._mapcount); } lock_page_memcg(page); - if (atomic_add_negative(-1, compound_mapcount_ptr(page))) { + + if ((PMDPageInPUD(page) && + atomic_add_negative(-1, sub_compound_mapcount_ptr(page, 1))) || + (!PMDPageInPUD(page) && + atomic_add_negative(-1, compound_mapcount_ptr(page)))) { /* Last compound_mapcount is gone. */ __dec_lruvec_page_state(page, NR_ANON_THPS); if (TestClearPageDoubleMap(page)) { @@ -2430,6 +2841,11 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma, * previously contain an hugepage: check if we need to split * an huge pmd. */ + if (start & ~HPAGE_PUD_MASK && + (start & HPAGE_PUD_MASK) >= vma->vm_start && + (start & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE <= vma->vm_end) + split_huge_pud_address(vma, start, false, NULL); + if (start & ~HPAGE_PMD_MASK && (start & HPAGE_PMD_MASK) >= vma->vm_start && (start & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE <= vma->vm_end) @@ -2440,6 +2856,11 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma, * previously contain an hugepage: check if we need to split * an huge pmd. */ + if (end & ~HPAGE_PUD_MASK && + (end & HPAGE_PUD_MASK) >= vma->vm_start && + (end & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE <= vma->vm_end) + split_huge_pud_address(vma, end, false, NULL); + if (end & ~HPAGE_PMD_MASK && (end & HPAGE_PMD_MASK) >= vma->vm_start && (end & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE <= vma->vm_end) @@ -2454,6 +2875,11 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma, struct vm_area_struct *next = vma->vm_next; unsigned long nstart = next->vm_start; nstart += adjust_next; + if (nstart & ~HPAGE_PUD_MASK && + (nstart & HPAGE_PUD_MASK) >= next->vm_start && + (nstart & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE <= next->vm_end) + split_huge_pud_address(next, nstart, false, NULL); + if (nstart & ~HPAGE_PMD_MASK && (nstart & HPAGE_PMD_MASK) >= next->vm_start && (nstart & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE <= next->vm_end) @@ -2645,12 +3071,23 @@ int total_mapcount(struct page *page) if (PageHuge(page)) return compound; ret = compound; - for (i = 0; i < nr; i++) - ret += atomic_read(&page[i]._mapcount) + 1; + /* if PMD, read all base pages; if PUD, also read the sub_compound_mapcount() */ + if (compound_order(page) == HPAGE_PMD_ORDER) { + for (i = 0; i < nr; i++) + ret += atomic_read(&page[i]._mapcount) + 1; + } else if (compound_order(page) == HPAGE_PUD_ORDER) { + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + ret += sub_compound_mapcount(&page[i]); + for (i = 0; i < nr; i++) + ret += atomic_read(&page[i]._mapcount) + 1; + /* a PMD THP has HPAGE_PMD_NR base pages, a PUD THP HPAGE_PMD_NR sub-PMD pages */ + nr = HPAGE_PMD_NR; + } else + VM_BUG_ON_PAGE(1, page); /* File pages has compound_mapcount included in _mapcount */ if (!PageAnon(page)) return ret - compound * nr; - if (PageDoubleMap(page)) + if (PagePUDDoubleMap(page) || PageDoubleMap(page)) ret -= nr; return ret; } @@ -2681,7 +3118,7 @@ int total_mapcount(struct page *page) */ int page_trans_huge_mapcount(struct page *page, int *total_mapcount) { - int i, ret, _total_mapcount, mapcount; + int i, ret, _total_mapcount, mapcount, nr; /* hugetlbfs shouldn't call it */ VM_BUG_ON_PAGE(PageHuge(page), page); @@ -2696,14 +3133,41 @@ int page_trans_huge_mapcount(struct page *page, int *total_mapcount) page = compound_head(page); _total_mapcount = ret = 0; - for (i = 0; i < thp_nr_pages(page); i++) { - mapcount = atomic_read(&page[i]._mapcount) + 1; - ret = max(ret, mapcount); - _total_mapcount += mapcount; - } - if (PageDoubleMap(page)) { + nr = thp_nr_pages(page); + /* if PMD, read all base pages; if PUD, also read the sub_compound_mapcount() */ + if (compound_order(page) == HPAGE_PMD_ORDER) { + for (i = 0; i < nr; i++) { + mapcount = atomic_read(&page[i]._mapcount) + 1; + ret = max(ret, mapcount); + _total_mapcount += mapcount; + } + } else if (compound_order(page) == HPAGE_PUD_ORDER) { + for (i = 0; i < nr; i += HPAGE_PMD_NR) { + int j; + + mapcount = sub_compound_mapcount(&page[i]); + ret = max(ret, mapcount); + _total_mapcount += mapcount; + + /* a PUD THP can additionally be mapped at base page size */ + for (j = 0; j < HPAGE_PMD_NR; j++) { + mapcount = atomic_read(&page[i + j]._mapcount) + 1; + ret = max(ret, mapcount); + _total_mapcount += mapcount; + } + + if (PageDoubleMap(&page[i])) { + ret -= 1; + _total_mapcount -= HPAGE_PMD_NR; + } + } + /* a PMD THP has HPAGE_PMD_NR base pages, a PUD THP HPAGE_PMD_NR sub-PMD pages */ + nr = HPAGE_PMD_NR; + } else + VM_BUG_ON_PAGE(1, page); + if (PageDoubleMap(page) || PagePUDDoubleMap(page)) { ret -= 1; - _total_mapcount -= thp_nr_pages(page); + _total_mapcount -= nr; } mapcount = compound_mapcount(page); ret += mapcount; @@ -2948,6 +3412,9 @@ static unsigned long deferred_split_count(struct shrinker *shrink, return READ_ONCE(ds_queue->split_queue_len); } +#define deferred_list_entry(x) (compound_head(list_entry((void *)x, \ + struct page, mapping))) + static unsigned long deferred_split_scan(struct shrinker *shrink, struct shrink_control *sc) { @@ -2981,12 +3448,18 @@ static unsigned long deferred_split_scan(struct shrinker *shrink, spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); list_for_each_safe(pos, next, &list) { - page = list_entry((void *)pos, struct page, mapping); + page = deferred_list_entry(pos); if (!trylock_page(page)) goto next; /* split_huge_page() removes page from list on success */ - if (!split_huge_page(page)) - split++; + if (compound_order(page) == HPAGE_PUD_ORDER) { + if (!split_huge_pud_page(page)) + split++; + } else if (compound_order(page) == HPAGE_PMD_ORDER) { + if (!split_huge_page(page)) + split++; + } else + VM_BUG_ON_PAGE(1, page); unlock_page(page); next: put_page(page); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b28f620c1c5b..ed75ef95b24a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3281,6 +3281,19 @@ void mem_cgroup_split_huge_fixup(struct page *head) head[i].mem_cgroup = memcg; } } + +void mem_cgroup_split_huge_pud_fixup(struct page *head) +{ + int i; + + if (mem_cgroup_disabled()) + return; + + for (i = HPAGE_PMD_NR; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + head[i].mem_cgroup = head->mem_cgroup; + + /*__mod_memcg_state(head->mem_cgroup, MEMCG_RSS_HUGE, -HPAGE_PUD_NR);*/ +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ #ifdef CONFIG_MEMCG_SWAP diff --git a/mm/memory.c b/mm/memory.c index 37e206a7d213..e0e0459c0caf 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4133,7 +4133,7 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf) } split: /* COW or write-notify not handled on PUD level: split pud.*/ - __split_huge_pud(vmf->vma, vmf->pud, vmf->address); +
__split_huge_pud(vmf->vma, vmf->pud, vmf->address, false, NULL); #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ return VM_FAULT_FALLBACK; } diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 29abeff09fcc..6bdb38a8fb48 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -679,6 +679,9 @@ void prep_compound_page(struct page *page, unsigned int order) atomic_set(compound_mapcount_ptr(page), -1); if (hpage_pincount_available(page)) atomic_set(compound_pincount_ptr(page), 0); + if (order == HPAGE_PUD_ORDER) + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + atomic_set(sub_compound_mapcount_ptr(&page[i], 1), -1); } #ifdef CONFIG_DEBUG_PAGEALLOC @@ -1132,6 +1135,16 @@ static int free_tail_pages_check(struct page *head_page, struct page *page) */ break; default: + /* sub_compound_map_ptr store here */ + if (compound_order(head_page) == HPAGE_PUD_ORDER && + (page - head_page) % HPAGE_PMD_NR == 3) { + if (unlikely(atomic_read(&page->compound_mapcount) != -1)) { + pr_err("sub_compound_mapcount: %d\n", + atomic_read(&page->compound_mapcount) + 1); + bad_page(page, "nonzero sub_compound_mapcount"); + } + break; + } if (page->mapping != TAIL_MAPPING) { bad_page(page, "corrupted mapping in tail page"); goto out; @@ -1183,8 +1196,14 @@ static __always_inline bool free_pages_prepare(struct page *page, VM_BUG_ON_PAGE(compound && compound_order(page) != order, page); - if (compound) + if (compound) { ClearPageDoubleMap(page); + if (order == HPAGE_PUD_ORDER) { + ClearPagePUDDoubleMap(page); + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) + ClearPageDoubleMap(&page[i]); + } + } for (i = 1; i < (1 << order); i++) { if (compound) bad += free_tail_pages_check(page, page + i); diff --git a/mm/pagewalk.c b/mm/pagewalk.c index a3752c82a7b2..c190140637c9 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -160,7 +160,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end, if (walk->vma) { split_huge_pud(walk->vma, pudp, addr); pud = READ_ONCE(*pudp); - if (pud_none(pud)) + if (pud_trans_unstable(&pud)) goto again; } diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c index a014cf847067..2b83dd4807e5 100644 --- a/mm/pgtable-generic.c +++ b/mm/pgtable-generic.c @@ -218,6 +218,17 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address, } #endif +#ifndef __HAVE_ARCH_PUDP_INVALIDATE +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address, + pud_t *pudp) +{ + pud_t old = pudp_establish(vma, address, pudp, pud_mknotpresent(*pudp)); + + flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE); + return old; +} +#endif + #ifndef pmdp_collapse_flush pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address, pmd_t *pmdp) diff --git a/mm/rmap.c b/mm/rmap.c index 7fc0bf07b9bc..b4950f7a0978 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1132,10 +1132,21 @@ void do_page_add_anon_rmap(struct page *page, VM_BUG_ON_PAGE(!PageLocked(page), page); if (compound) { - atomic_t *mapcount; + atomic_t *mapcount = NULL; VM_BUG_ON_PAGE(!PageLocked(page), page); VM_BUG_ON_PAGE(!PageTransHuge(page), page); - mapcount = compound_mapcount_ptr(page); + if (compound_order(page) == HPAGE_PUD_ORDER) { + if (map_order == HPAGE_PUD_ORDER) { + mapcount = compound_mapcount_ptr(page); + } else if (map_order == HPAGE_PMD_ORDER) { + VM_BUG_ON(!PMDPageInPUD(page)); + mapcount = sub_compound_mapcount_ptr(page, 1); + } else + VM_BUG_ON(1); + } else if (compound_order(page) == HPAGE_PMD_ORDER) { + mapcount = compound_mapcount_ptr(page); + } else + VM_BUG_ON(1); first = 
atomic_inc_and_test(mapcount); } else { first = atomic_inc_and_test(&page->_mapcount); } @@ -1150,7 +1161,7 @@ void do_page_add_anon_rmap(struct page *page, * disabled. */ if (compound) { - if (nr == HPAGE_PMD_NR) + if (map_order == HPAGE_PMD_ORDER) __inc_lruvec_page_state(page, NR_ANON_THPS); else __inc_lruvec_page_state(page, NR_ANON_THPS_PUD); @@ -1197,10 +1208,15 @@ void page_add_new_anon_rmap(struct page *page, if (hpage_pincount_available(page)) atomic_set(compound_pincount_ptr(page), 0); - if (nr == HPAGE_PMD_NR) - __inc_lruvec_page_state(page, NR_ANON_THPS); - else + if (map_order == HPAGE_PUD_ORDER) { + VM_BUG_ON(compound_order(page) != HPAGE_PUD_ORDER); + /* Anon THP always mapped first with PMD */ __inc_lruvec_page_state(page, NR_ANON_THPS_PUD); + } else if (map_order == HPAGE_PMD_ORDER) { + VM_BUG_ON(compound_order(page) != HPAGE_PMD_ORDER); + __inc_lruvec_page_state(page, NR_ANON_THPS); + } else + VM_BUG_ON(1); } else { /* Anon THP always mapped first with PMD */ VM_BUG_ON_PAGE(PageTransCompound(page), page); @@ -1294,10 +1310,38 @@ static void page_remove_file_rmap(struct page *page, bool compound) static void page_remove_anon_compound_rmap(struct page *page, int map_order) { - int i, nr; + int i, nr = 0; + struct page *head = compound_head(page); + + if (compound_order(head) == HPAGE_PUD_ORDER) { + if (map_order == HPAGE_PMD_ORDER) { + VM_BUG_ON(!PMDPageInPUD(page)); + if (atomic_add_negative(-1, sub_compound_mapcount_ptr(page, 1))) { + if (TestClearPageDoubleMap(page)) { + /* + * Subpages can be mapped with PTEs too. Check how many of + * them are still mapped. + */ + for (i = 0; i < HPAGE_PMD_NR; i++) { + if (atomic_add_negative(-1, &page[i]._mapcount)) + nr++; + } + } + __dec_node_page_state(page, NR_ANON_THPS); + } + nr += HPAGE_PMD_NR; + __mod_node_page_state(page_pgdat(head), NR_ANON_MAPPED, -nr); + return; + } - if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) - return; + VM_BUG_ON(map_order != HPAGE_PUD_ORDER); + if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) + return; + } else if (compound_order(head) == HPAGE_PMD_ORDER) { + if (!atomic_add_negative(-1, compound_mapcount_ptr(page))) + return; + } else + VM_BUG_ON_PAGE(1, page); /* Hugepages are not counted in NR_ANON_PAGES for now. */ if (unlikely(PageHuge(page))) @@ -1308,10 +1352,31 @@ static void page_remove_anon_compound_rmap(struct page *page, int map_order) if (map_order == HPAGE_PMD_ORDER) __dec_lruvec_page_state(page, NR_ANON_THPS); - else + else if (map_order == HPAGE_PUD_ORDER) __dec_lruvec_page_state(page, NR_ANON_THPS_PUD); + else + VM_BUG_ON(1); - if (TestClearPageDoubleMap(page)) { + /* PMD-mapped PUD THP is handled above */ + if (TestClearPagePUDDoubleMap(head)) { + VM_BUG_ON(!(compound_order(head) == HPAGE_PUD_ORDER || head == page)); + /* + * Subpages can be mapped with PMDs too. Check how many of + * them are still mapped. + */ + for (i = 0, nr = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) { + if (atomic_add_negative(-1, sub_compound_mapcount_ptr(&head[i], 1))) + nr += HPAGE_PMD_NR; + } + /* + * Queue the page for deferred split if at least one pmd page + * of the pud compound page is unmapped, but at least one + * pmd page is still mapped. + */ + if (nr && nr < thp_nr_pages(head)) + deferred_split_huge_page(head); + } else if (TestClearPageDoubleMap(head)) { + VM_BUG_ON(compound_order(head) != HPAGE_PMD_ORDER); /* * Subpages can be mapped with PTEs too. Check how many of * them are still mapped. @@ -1335,8 +1400,10 @@ static void page_remove_anon_compound_rmap(struct page *page, int map_order) if (unlikely(PageMlocked(page))) clear_page_mlock(page); - if (nr) - __mod_lruvec_page_state(page, NR_ANON_MAPPED, -nr); + if (nr) { + __mod_lruvec_page_state(head, NR_ANON_MAPPED, -nr); + deferred_split_huge_page(head); + } } /** diff --git a/mm/swap.c b/mm/swap.c index 7e79829a2e73..43c18e5b6916 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -1005,6 +1005,36 @@ void lru_add_page_tail(struct page *page, struct page *page_tail, page_lru(page_tail)); } } + +/* used by __split_huge_pud_page_tail() */ +void lru_add_pud_page_tail(struct page *page, struct page *page_tail, + struct lruvec *lruvec, struct list_head *list) +{ + VM_BUG_ON_PAGE(!PageHead(page), page); + VM_BUG_ON_PAGE(PageLRU(page_tail), page); + lockdep_assert_held(&lruvec_pgdat(lruvec)->lru_lock); + + if (!list) + SetPageLRU(page_tail); + + if (likely(PageLRU(page))) + list_add_tail(&page_tail->lru, &page->lru); + else if (list) { + /* page reclaim is reclaiming a huge page */ + get_page(page_tail); + list_add_tail(&page_tail->lru, list); + } else { + /* + * Head page has not yet been counted, as an hpage, + * so we must account for each subpage individually. + * + * Put page_tail on the list at the correct position + * so they all end up in order. + */ + add_page_to_lru_list_tail(page_tail, lruvec, + page_lru(page_tail)); + } +} #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec, diff --git a/mm/util.c b/mm/util.c index bb902f5a6582..e22d04d9e020 100644 --- a/mm/util.c +++ b/mm/util.c @@ -653,6 +653,12 @@ bool page_mapped(struct page *page) page = compound_head(page); if (atomic_read(compound_mapcount_ptr(page)) >= 0) return true; + if (compound_order(page) == HPAGE_PUD_ORDER) { + for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR) { + if (sub_compound_mapcount(page + i) > 0) + return true; + } + } if (PageHuge(page)) return false; for (i = 0; i < compound_nr(page); i++) { @@ -713,17 +719,27 @@ struct address_space *page_mapping_file(struct page *page) int __page_mapcount(struct page *page) { int ret; + struct page *head = compound_head(page); + /* base page mapping */ ret = atomic_read(&page->_mapcount) + 1; + + /* PMD-level (PMDPageInPUD) mapping */ + if (compound_order(head) == HPAGE_PUD_ORDER) { + struct page *sub_compound_page = head + + (((page - head) / HPAGE_PMD_NR) * HPAGE_PMD_NR); + + ret += sub_compound_mapcount(sub_compound_page); + } /* * For file THP page->_mapcount contains the total number of mappings * of the page: no need to look into compound_mapcount.
 */ if (!PageAnon(page) && !PageHuge(page)) return ret; - page = compound_head(page); - ret += atomic_read(compound_mapcount_ptr(page)) + 1; - if (PageDoubleMap(page)) + /* highest compound mapping */ + ret += atomic_read(compound_mapcount_ptr(head)) + 1; + if (PageDoubleMap(head)) ret--; return ret; } diff --git a/mm/vmstat.c b/mm/vmstat.c index a9e50ef6a40d..2bb702d79f01 100644 --- a/mm/vmstat.c +++ b/mm/vmstat.c @@ -1331,6 +1331,10 @@ const char * const vmstat_text[] = { "thp_fault_fallback_pud", "thp_fault_fallback_pud_charge", "thp_split_pud", + "thp_split_pud_page", + "thp_split_pud_page_failed", + "thp_zero_pud_page_alloc", + "thp_zero_pud_page_alloc_failed", #endif "thp_zero_page_alloc", "thp_zero_page_alloc_failed",
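To make the mapcount layout used throughout the patch above easier to follow, here is a minimal standalone sketch in plain C. It is a toy userspace model, not kernel code: the sizes assume x86-64 (4KB base pages, 2MB PMD, 1GB PUD), the toy_* names are invented, and the DoubleMap/PUDDoubleMap corrections of the real total_mapcount() are deliberately ignored.

#include <stdio.h>
#include <stdlib.h>

#define HPAGE_PMD_NR 512		/* base pages per PMD page */
#define HPAGE_PUD_NR (512 * 512)	/* base pages per PUD page */

struct toy_page {
	int _mapcount;			/* PTE-level mappings, starts at -1 */
	int compound_mapcount;		/* only used in tail pages */
};

/* Mirrors sub_compound_mapcount_ptr(page, 1): the PMD-level mapcount of a
 * 2MB chunk lives in the chunk's third tail page (first page + 2 + 1). */
static int *toy_sub_mapcount_ptr(struct toy_page *chunk)
{
	return &chunk[3].compound_mapcount;
}

/* Rough shape of total_mapcount() for an anonymous PUD THP. */
static int toy_total_mapcount(struct toy_page *head)
{
	int i, ret = head[1].compound_mapcount + 1;		/* PUD level */

	for (i = 0; i < HPAGE_PUD_NR; i += HPAGE_PMD_NR)
		ret += *toy_sub_mapcount_ptr(&head[i]) + 1;	/* PMD level */
	for (i = 0; i < HPAGE_PUD_NR; i++)
		ret += head[i]._mapcount + 1;			/* PTE level */
	return ret;
}

int main(void)
{
	struct toy_page *head = malloc(sizeof(*head) * HPAGE_PUD_NR);
	int i;

	for (i = 0; i < HPAGE_PUD_NR; i++)
		head[i]._mapcount = head[i].compound_mapcount = -1;

	head[1].compound_mapcount++;		/* one PUD-level mapping */
	(*toy_sub_mapcount_ptr(&head[0]))++;	/* one PMD mapping of chunk 0 */
	printf("total_mapcount = %d\n", toy_total_mapcount(head));	/* prints 2 */
	free(head);
	return 0;
}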
From patchwork Mon Sep 28 17:54:13 2020 X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11804467 From: Zi Yan To: linux-mm@kvack.org Cc: "Kirill A . Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 15/30] mm: thp: add PUD THP to deferred split list when PUD mapping is gone. Date: Mon, 28 Sep 2020 13:54:13 -0400 Message-Id: <20200928175428.4110504-16-zi.yan@sent.com> In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan From: Zi Yan When the PUD mapping is gone, there is no need to keep the PUD THP intact. Add it to the deferred split list, so that the THP will be split when memory pressure comes. Signed-off-by: Zi Yan --- mm/rmap.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/mm/rmap.c b/mm/rmap.c index b4950f7a0978..424322807966 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1329,6 +1329,9 @@ static void page_remove_anon_compound_rmap(struct page *page, int map_order) } __dec_node_page_state(page, NR_ANON_THPS); } + /* defer splitting the huge PUD page once its PUD mapping is gone */ + if (!compound_mapcount(head)) + deferred_split_huge_page(head); nr += HPAGE_PMD_NR; __mod_node_page_state(page_pgdat(head), NR_ANON_MAPPED, -nr); return;
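The hunk above is small, so a compact sketch of the policy it implements may help; this is a toy model with invented names, not the kernel's data structures:

/* Toy model of patch 15's rule in page_remove_anon_compound_rmap(): when a
 * PMD-level mapping of a PUD THP goes away and no PUD-level mapping
 * remains, queue the THP so the shrinker can split it under pressure. */
struct toy_pud_thp {
	int pud_mapcount;	/* PUD-level mappings, real count (>= 0) */
	int pmd_mapcount;	/* PMD-level mappings across all 2MB chunks */
	int on_deferred_list;
};

static void toy_remove_pmd_mapping(struct toy_pud_thp *thp)
{
	thp->pmd_mapcount--;
	/* the added hunk: if (!compound_mapcount(head)) deferred_split... */
	if (thp->pud_mapcount == 0)
		thp->on_deferred_list = 1;	/* deferred_split_huge_page() */
}

int main(void)
{
	struct toy_pud_thp thp = { .pud_mapcount = 0, .pmd_mapcount = 3 };

	toy_remove_pmd_mapping(&thp);
	return thp.on_deferred_list ? 0 : 1;	/* queued: no PUD mapping left */
}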
Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 16/30] mm: debug: adapt dump_page to PUD THP. Date: Mon, 28 Sep 2020 13:54:14 -0400 Message-Id: <20200928175428.4110504-17-zi.yan@sent.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Zi Yan Since the order of a PUD THP is greater than MAX_ORDER, do not consider its tail pages corrupted. Also print sub_compound_mapcount when dumping a PMDPageInPUD. Signed-off-by: Zi Yan --- mm/debug.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/debug.c b/mm/debug.c index ccca576b2899..f5b035dc620d 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -68,7 +68,9 @@ void __dump_page(struct page *page, const char *reason) goto hex_only; } - if (page < head || (page >= head + MAX_ORDER_NR_PAGES)) { + if (page < head || + (page >= head + max_t(unsigned long, compound_nr(head), + (unsigned long)MAX_ORDER_NR_PAGES))) { /* * Corrupt page, so we cannot call page_mapping. Instead, do a * safe subset of the steps that page_mapping() does. Caution: @@ -109,6 +111,8 @@ void __dump_page(struct page *page, const char *reason) head, compound_order(head), head_compound_mapcount(head)); } + if (compound_order(head) == HPAGE_PUD_ORDER && PMDPageInPUD(page)) + pr_warn("sub_compound_mapcount:%d\n", sub_compound_mapcount(page)); } if (PageKsm(page)) type = "ksm "; From patchwork Mon Sep 28 17:54:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11804473 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 896F5618 for ; Mon, 28 Sep 2020 17:56:06 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 3FC532184D for ; Mon, 28 Sep 2020 17:56:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=sent.com header.i=@sent.com header.b="Hn+ay/U2"; dkim=temperror (0-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="Ut2Dy29W" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3FC532184D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=sent.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 80D4B900012; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 4257290000D; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2C51790000B; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0028.hostedemail.com [216.40.44.28]) by kanga.kvack.org 
From patchwork Mon Sep 28 17:54:15 2020 X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11804473 From: Zi Yan To: linux-mm@kvack.org Cc: "Kirill A . Shutemov" , Roman Gushchin , Rik van Riel , Matthew Wilcox , Shakeel Butt , Yang Shi , Jason Gunthorpe , Mike Kravetz , Michal Hocko , David Hildenbrand , William Kucharski , Andrea Arcangeli , John Hubbard , David Nellans , linux-kernel@vger.kernel.org, Zi Yan Subject: [RFC PATCH v2 17/30] mm: thp: PUD THP COW splits PUD page and falls back to PMD page. Date: Mon, 28 Sep 2020 13:54:15 -0400 Message-Id: <20200928175428.4110504-18-zi.yan@sent.com> In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com> References: <20200928175428.4110504-1-zi.yan@sent.com> Reply-To: Zi Yan From: Zi Yan COW on a PUD THP behaves the same way as COW on a PMD THP, to avoid high COW overhead: instead of copying 1GB, the PUD mapping is split and the fault falls back to the PMD level. As a result, do_huge_pmd_wp_page() will see PMD-mapped PUD THPs, so it needs PUD mappings counted in the total mapcount when reuse_swap_page() calls page_trans_huge_map_swapcount(), to avoid a false positive. Change page_trans_huge_map_swapcount() to get it right. Signed-off-by: Zi Yan --- include/linux/huge_mm.h | 5 +++++ mm/huge_memory.c | 13 +++++++++++++ mm/memory.c | 3 +-- mm/swapfile.c | 7 ++++++- 4 files changed, 25 insertions(+), 3 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index e5c68e680907..589e5af5a1c2 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -19,6 +19,7 @@ extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm, #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD extern void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud); extern int do_huge_pud_anonymous_page(struct vm_fault *vmf); +extern vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud); #else static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud) { @@ -27,6 +28,10 @@ extern int do_huge_pud_anonymous_page(struct vm_fault *vmf) { return VM_FAULT_FALLBACK; } +static inline vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud) +{ + return VM_FAULT_FALLBACK; +} #endif extern vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd); diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 4a899e856088..9aa19aa643cd 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1335,6 +1335,19 @@ void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud) unlock: spin_unlock(vmf->ptl); } + +vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud) +{ + struct vm_area_struct *vma = vmf->vma; + + /* + * split pud directly.
a whole pud page is not swappable, so there is + * no need to try reuse_swap_page + */ + __split_huge_pud(vma, vmf->pud, vmf->address, false, NULL); + return VM_FAULT_FALLBACK; +} + #endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ void huge_pmd_set_accessed(struct vm_fault *vmf, pmd_t orig_pmd) diff --git a/mm/memory.c b/mm/memory.c index e0e0459c0caf..ab80d13807aa 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -4141,9 +4141,8 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf) static vm_fault_t wp_huge_pud(struct vm_fault *vmf, pud_t orig_pud) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE - /* No support for anonymous transparent PUD pages yet */ if (vma_is_anonymous(vmf->vma)) - return VM_FAULT_FALLBACK; + return do_huge_pud_wp_page(vmf, orig_pud); if (vmf->vma->vm_ops->huge_fault) return vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD); #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ diff --git a/mm/swapfile.c b/mm/swapfile.c index 495ecdbd7859..a6989b0c4d44 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1635,7 +1635,12 @@ static int page_trans_huge_map_swapcount(struct page *page, int *total_mapcount, /* hugetlbfs shouldn't call it */ VM_BUG_ON_PAGE(PageHuge(page), page); - if (!IS_ENABLED(CONFIG_THP_SWAP) || likely(!PageTransCompound(page))) { + if (!IS_ENABLED(CONFIG_THP_SWAP) || likely(!PageTransCompound(page)) || + /* + * PMD-mapped PUD THP need to take PUD mappings into account by + * using page_trans_huge_mapcount + */ + unlikely(thp_order(page) == HPAGE_PUD_ORDER)) { mapcount = page_trans_huge_mapcount(page, total_mapcount); if (PageSwapCache(page)) swapcount = page_swapcount(page); From patchwork Mon Sep 28 17:54:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11804475 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0238C618 for ; Mon, 28 Sep 2020 17:56:08 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A3D77221E7 for ; Mon, 28 Sep 2020 17:56:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=sent.com header.i=@sent.com header.b="FlooIRVk"; dkim=temperror (0-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="kHadzIx4" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A3D77221E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=sent.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C5AE6900010; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id C369590000D; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A3C96900015; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0035.hostedemail.com [216.40.44.35]) by kanga.kvack.org (Postfix) with ESMTP id 7184E900010 for ; Mon, 28 Sep 2020 13:55:28 -0400 (EDT) Received: from smtpin22.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 27B182471 for ; Mon, 28 
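The fallback above has a user-visible effect: after fork(), the first write to
huge-page-backed anonymous memory takes a COW fault, which the kernel resolves
by splitting at the higher level and retrying lower. A minimal userspace sketch
of that COW cycle, assuming nothing beyond stock mmap()/madvise(); on kernels
without this series it simply exercises the existing PMD THP COW path:

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define REGION (16UL << 20)   /* 16 MB, room for several PMD THPs */

    int main(void)
    {
        /* Anonymous mapping; MADV_HUGEPAGE asks the fault path for THPs. */
        char *p = mmap(NULL, REGION, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        madvise(p, REGION, MADV_HUGEPAGE);
        memset(p, 0xab, REGION);              /* populate (huge) pages */

        pid_t pid = fork();
        if (pid == 0) {
            p[0] = 1;   /* COW fault: the kernel splits or copies the huge mapping */
            _exit(p[1] == (char)0xab ? 0 : 2);
        }
        int status;
        waitpid(pid, &status, 0);
        printf("child saw correct data after COW: %s\n",
               WEXITSTATUS(status) == 0 ? "yes" : "no");
        return 0;
    }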
From patchwork Mon Sep 28 17:54:16 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 18/30] mm: thp: PUD THP follow_p*d_page() support.
Date: Mon, 28 Sep 2020 13:54:16 -0400
Message-Id: <20200928175428.4110504-19-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

Add follow_page() support for PUD THPs.

Signed-off-by: Zi Yan
---
 include/linux/huge_mm.h | 11 +++++++
 mm/gup.c                | 60 ++++++++++++++++++++++++++++++++-
 mm/huge_memory.c        | 73 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 142 insertions(+), 2 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 589e5af5a1c2..c7bc40c4a5e2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -20,6 +20,10 @@ extern int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 extern void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud);
 extern int do_huge_pud_anonymous_page(struct vm_fault *vmf);
 extern vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud);
+extern struct page *follow_trans_huge_pud(struct vm_area_struct *vma,
+                                          unsigned long addr,
+                                          pud_t *pud,
+                                          unsigned int flags);
 #else
 static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud)
 {
@@ -32,6 +36,13 @@ extern vm_fault_t do_huge_pud_wp_page(struct vm_fault *vmf, pud_t orig_pud)
 {
     return VM_FAULT_FALLBACK;
 }
+struct page *follow_trans_huge_pud(struct vm_area_struct *vma,
+                                   unsigned long addr,
+                                   pud_t *pud,
+                                   unsigned int flags)
+{
+    return NULL;
+}
 #endif
 
 extern vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd);
diff --git a/mm/gup.c b/mm/gup.c
index b21cc220f036..972cca69f228 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -696,10 +696,68 @@ static struct page *follow_pud_mask(struct vm_area_struct *vma,
         if (page)
             return page;
     }
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+    if (likely(!pud_trans_huge(*pud))) {
+        if (unlikely(pud_bad(*pud)))
+            return no_page_table(vma, flags);
+        return follow_pmd_mask(vma, address, pud, flags, ctx);
+    }
+
+    ptl = pud_lock(mm, pud);
+
+    if (unlikely(!pud_trans_huge(*pud))) {
+        spin_unlock(ptl);
+        if (unlikely(pud_bad(*pud)))
+            return no_page_table(vma, flags);
+        return follow_pmd_mask(vma, address, pud, flags, ctx);
+    }
+
+    if (flags & FOLL_SPLIT) {
+        int ret;
+        pmd_t *pmd = NULL;
+
+        page = pud_page(*pud);
+        if (is_huge_zero_page(page)) {
+
+            spin_unlock(ptl);
+            ret = 0;
+            split_huge_pud(vma, pud, address);
+            pmd = pmd_offset(pud, address);
+            split_huge_pmd(vma, pmd, address);
+            if (pmd_trans_unstable(pmd))
+                ret = -EBUSY;
+        } else {
+            get_page(page);
+            spin_unlock(ptl);
+            lock_page(page);
+            ret = split_huge_pud_page(page);
+            if (!ret)
+                ret = split_huge_page(page);
+            else {
+                unlock_page(page);
+                put_page(page);
+                goto out;
+            }
+            unlock_page(page);
+            put_page(page);
+            if (pud_none(*pud))
+                return no_page_table(vma, flags);
+            pmd = pmd_offset(pud, address);
+        }
+out:
+        return ret ? ERR_PTR(ret) :
+            follow_page_pte(vma, address, pmd, flags, &ctx->pgmap);
+    }
+    page = follow_trans_huge_pud(vma, address, pud, flags);
+    spin_unlock(ptl);
+    ctx->page_mask = HPAGE_PUD_NR - 1;
+    return page;
+#else
     if (unlikely(pud_bad(*pud)))
         return no_page_table(vma, flags);
-
     return follow_pmd_mask(vma, address, pud, flags, ctx);
+#endif
 }
 
 static struct page *follow_p4d_mask(struct vm_area_struct *vma,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9aa19aa643cd..61ae7a0ded84 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1258,6 +1258,77 @@ struct page *follow_devmap_pud(struct vm_area_struct *vma, unsigned long addr,
     return page;
 }
 
+/*
+ * FOLL_FORCE can write to even unwritable pmd's, but only
+ * after we've gone through a COW cycle and they are dirty.
+ */
+static inline bool can_follow_write_pud(pud_t pud, unsigned int flags)
+{
+    return pud_write(pud) ||
+           ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pud_dirty(pud));
+}
+
+struct page *follow_trans_huge_pud(struct vm_area_struct *vma,
+                                   unsigned long addr,
+                                   pud_t *pud,
+                                   unsigned int flags)
+{
+    struct mm_struct *mm = vma->vm_mm;
+    struct page *page = NULL;
+
+    assert_spin_locked(pud_lockptr(mm, pud));
+
+    if (flags & FOLL_WRITE && !can_follow_write_pud(*pud, flags))
+        goto out;
+
+    /* Avoid dumping huge zero page */
+    if ((flags & FOLL_DUMP) && is_huge_zero_pud(*pud))
+        return ERR_PTR(-EFAULT);
+
+    /* Full NUMA hinting faults to serialise migration in fault paths */
+    /* && pud_protnone(*pud) */
+    if ((flags & FOLL_NUMA))
+        goto out;
+
+    page = pud_page(*pud);
+    VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page);
+    if (flags & FOLL_TOUCH)
+        touch_pud(vma, addr, pud, flags);
+    if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
+        /*
+         * We don't mlock() pte-mapped THPs. This way we can avoid
+         * leaking mlocked pages into non-VM_LOCKED VMAs.
+         *
+         * For anon THP:
+         *
+         * We do the same thing as PMD-level THP.
+         *
+         * For file THP:
+         *
+         * No support yet.
+         *
+         */
+
+        if (PageAnon(page) && compound_mapcount(page) != 1)
+            goto skip_mlock;
+        if (PagePUDDoubleMap(page) || !page->mapping)
+            goto skip_mlock;
+        if (!trylock_page(page))
+            goto skip_mlock;
+        lru_add_drain();
+        if (page->mapping && !PagePUDDoubleMap(page))
+            mlock_vma_page(page);
+        unlock_page(page);
+    }
+skip_mlock:
+    page += (addr & ~HPAGE_PUD_MASK) >> PAGE_SHIFT;
+    VM_BUG_ON_PAGE(!PageCompound(page) && !is_zone_device_page(page), page);
+    if (flags & FOLL_GET)
+        get_page(page);
+
+out:
+    return page;
+}
+
 int copy_huge_pud(struct mm_struct *dst_mm, struct mm_struct *src_mm,
           pud_t *dst_pud, pud_t *src_pud, unsigned long addr,
           struct vm_area_struct *vma)
@@ -1462,7 +1533,7 @@ struct page *follow_trans_huge_pmd(struct vm_area_struct *vma,
         goto out;
 
     page = pmd_page(*pmd);
-    VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page), page);
+    VM_BUG_ON_PAGE(!PageHead(page) && !is_zone_device_page(page) && !PMDPageInPUD(page), page);
 
     if (!try_grab_page(page, flags))
         return ERR_PTR(-ENOMEM);
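can_follow_write_pud() reuses the PMD-level FOLL_FORCE rule: forced writes
through an unwritable entry are allowed only after a COW cycle has left the
entry dirty. A standalone toy rendering of just that predicate; the flag
values and the pud stand-in struct are invented for illustration, and only
the boolean logic mirrors the patch:

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative bits; the real FOLL_* values live in include/linux/mm.h. */
    #define FOLL_WRITE 0x01
    #define FOLL_FORCE 0x10
    #define FOLL_COW   0x4000

    struct toy_pud { bool write; bool dirty; };

    static bool can_follow_write_pud(struct toy_pud pud, unsigned int flags)
    {
        return pud.write ||
               ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pud.dirty);
    }

    int main(void)
    {
        struct toy_pud ro_clean = { false, false };
        struct toy_pud ro_dirty = { false, true };
        struct toy_pud rw = { true, false };

        assert(can_follow_write_pud(rw, FOLL_WRITE));
        /* FOLL_FORCE alone is not enough ... */
        assert(!can_follow_write_pud(ro_dirty, FOLL_WRITE | FOLL_FORCE));
        /* ... it also needs FOLL_COW (a COW cycle happened) and a dirty entry. */
        assert(can_follow_write_pud(ro_dirty, FOLL_WRITE | FOLL_FORCE | FOLL_COW));
        assert(!can_follow_write_pud(ro_clean, FOLL_WRITE | FOLL_FORCE | FOLL_COW));
        puts("predicate behaves as expected");
        return 0;
    }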
From patchwork Mon Sep 28 17:54:17 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 19/30] mm: stats: make smap stats understand PUD THPs.
Date: Mon, 28 Sep 2020 13:54:17 -0400
Message-Id: <20200928175428.4110504-20-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

Signed-off-by: Zi Yan
---
 fs/proc/task_mmu.c | 68 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 63 insertions(+), 5 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index a21484b1414d..077196182288 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -430,10 +430,9 @@ static void smaps_page_accumulate(struct mem_size_stats *mss,
 }
 
 static void smaps_account(struct mem_size_stats *mss, struct page *page,
-        bool compound, bool young, bool dirty, bool locked)
+        unsigned long size, bool young, bool dirty, bool locked)
 {
-    int i, nr = compound ? compound_nr(page) : 1;
-    unsigned long size = nr * PAGE_SIZE;
+    int i, nr = size / PAGE_SIZE;
 
     /*
      * First accumulate quantities that depend only on |size| and the type
@@ -530,7 +529,7 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
     if (!page)
         return;
 
-    smaps_account(mss, page, false, pte_young(*pte), pte_dirty(*pte), locked);
+    smaps_account(mss, page, PAGE_SIZE, pte_young(*pte), pte_dirty(*pte), locked);
 }
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -561,8 +560,44 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
         /* pass */;
     else
         mss->file_thp += HPAGE_PMD_SIZE;
-    smaps_account(mss, page, true, pmd_young(*pmd), pmd_dirty(*pmd), locked);
+    smaps_account(mss, page, HPAGE_PMD_SIZE, pmd_young(*pmd),
+              pmd_dirty(*pmd), locked);
 }
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static void smaps_pud_entry(pud_t *pud, unsigned long addr,
+        struct mm_walk *walk)
+{
+    struct mem_size_stats *mss = walk->private;
+    struct vm_area_struct *vma = walk->vma;
+    bool locked = !!(vma->vm_flags & VM_LOCKED);
+    struct page *page = NULL;
+
+    if (pud_present(*pud)) {
+        /* FOLL_DUMP will return -EFAULT on huge zero page */
+        page = follow_trans_huge_pud(vma, addr, pud, FOLL_DUMP);
+    }
+    if (IS_ERR_OR_NULL(page))
+        return;
+    if (PageAnon(page))
+        mss->anonymous_thp += HPAGE_PUD_SIZE;
+    else if (PageSwapBacked(page))
+        mss->shmem_thp += HPAGE_PUD_SIZE;
+    else if (is_zone_device_page(page))
+        /* pass */;
+    else
+        mss->file_thp += HPAGE_PUD_SIZE;
+    smaps_account(mss, page, HPAGE_PUD_SIZE, pud_young(*pud),
+              pud_dirty(*pud), locked);
+}
+#else
+static void smaps_pud_entry(pud_t *pud, unsigned long addr,
+        struct mm_walk *walk)
+{
+}
+#endif
+
 #else
 static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
         struct mm_walk *walk)
@@ -570,6 +605,28 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
 }
 #endif
 
+static int smaps_pud_range(pud_t pud, pud_t *pudp, unsigned long addr,
+        unsigned long end, struct mm_walk *walk)
+{
+    struct vm_area_struct *vma = walk->vma;
+    spinlock_t *ptl;
+
+    ptl = pud_trans_huge_lock(pudp, vma);
+    if (ptl) {
+        if (memcmp(pudp, &pud, sizeof(pud)) != 0) {
+            walk->action = ACTION_AGAIN;
+            spin_unlock(ptl);
+            return 0;
+        }
+        smaps_pud_entry(pudp, addr, walk);
+        spin_unlock(ptl);
+        walk->action = ACTION_CONTINUE;
+    }
+
+    cond_resched();
+    return 0;
+}
+
 static int smaps_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
         unsigned long end, struct mm_walk *walk)
 {
@@ -712,6 +769,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
 #endif /* HUGETLB_PAGE */
 
 static const struct mm_walk_ops smaps_walk_ops = {
+    .pud_entry        = smaps_pud_range,
     .pmd_entry        = smaps_pte_range,
     .hugetlb_entry        = smaps_hugetlb_range,
 };
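The counters updated here surface through /proc/<pid>/smaps. A small
self-contained check that faults in an madvise(MADV_HUGEPAGE) region and sums
the long-standing AnonHugePages field; on kernels with this series applied and
PUD THP support, PUD-sized mappings would simply contribute larger values to
the same counters:

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 8UL << 20;
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }
        madvise(p, len, MADV_HUGEPAGE);
        memset(p, 1, len);                    /* fault the region in */

        FILE *f = fopen("/proc/self/smaps", "r");
        if (!f) { perror("fopen"); return 1; }
        char line[256];
        long kb = 0, total = 0;
        while (fgets(line, sizeof(line), f))
            if (sscanf(line, "AnonHugePages: %ld", &kb) == 1)
                total += kb;                  /* field is reported in kB */
        fclose(f);
        printf("AnonHugePages across all VMAs: %ld kB\n", total);
        return 0;
    }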
From patchwork Mon Sep 28 17:54:18 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 20/30] mm: page_vma_walk: teach it about PMD-mapped PUD THP.
Date: Mon, 28 Sep 2020 13:54:18 -0400
Message-Id: <20200928175428.4110504-21-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

We now have PMD-mapped and PTE-mapped PUD THPs; page_vma_mapped_walk()
should handle both properly.

Signed-off-by: Zi Yan
---
 mm/page_vma_mapped.c | 152 +++++++++++++++++++++++++++++++++----------
 1 file changed, 118 insertions(+), 34 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index f88e845ad5e6..5a3c1b561ff5 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -7,6 +7,12 @@
 
 #include "internal.h"
 
+enum check_pmd_result {
+    PVM_NOT_MAPPED = 0,
+    PVM_LEAF_ENTRY,
+    PVM_NONLEAF_ENTRY,
+};
+
 static inline bool not_found(struct page_vma_mapped_walk *pvmw)
 {
     page_vma_mapped_walk_done(pvmw);
@@ -52,6 +58,22 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
     return true;
 }
 
+static bool map_pmd(struct page_vma_mapped_walk *pvmw)
+{
+    pmd_t pmde;
+
+    pvmw->pmd = pmd_offset(pvmw->pud, pvmw->address);
+    pmde = READ_ONCE(*pvmw->pmd);
+    if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+        pvmw->ptl = pmd_lock(pvmw->vma->vm_mm, pvmw->pmd);
+        return true;
+    } else if (!pmd_present(pmde))
+        return false;
+
+    pvmw->ptl = pmd_lock(pvmw->vma->vm_mm, pvmw->pmd);
+    return true;
+}
+
 static inline bool pfn_is_match(struct page *page, unsigned long pfn)
 {
     unsigned long page_pfn = page_to_pfn(page);
@@ -115,6 +137,57 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
     return pfn_is_match(pvmw->page, pfn);
 }
 
+/**
+ * check_pmd - check if @pvmw->page is mapped at the @pvmw->pmd
+ *
+ * page_vma_mapped_walk() found a place where @pvmw->page is *potentially*
+ * mapped. check_pmd() has to validate this.
+ *
+ * @pvmw->pmd may point to an empty PMD, a migration PMD, a PMD pointing to
+ * an arbitrary huge page, or a PMD pointing to a PTE page table page.
+ *
+ * If the PVMW_MIGRATION flag is set, returns PVM_LEAF_ENTRY if @pvmw->pmd
+ * contains a migration entry that points to @pvmw->page.
+ *
+ * If the PVMW_MIGRATION flag is not set, returns PVM_LEAF_ENTRY if
+ * @pvmw->pmd points to @pvmw->page.
+ *
+ * If @pvmw->pmd points to a PTE page table page, returns PVM_NONLEAF_ENTRY.
+ *
+ * Otherwise, returns PVM_NOT_MAPPED.
+ */
+static enum check_pmd_result check_pmd(struct page_vma_mapped_walk *pvmw)
+{
+    unsigned long pfn;
+
+    if (likely(pmd_trans_huge(*pvmw->pmd))) {
+        if (pvmw->flags & PVMW_MIGRATION)
+            return PVM_NOT_MAPPED;
+        pfn = pmd_pfn(*pvmw->pmd);
+        if (!pfn_is_match(pvmw->page, pfn))
+            return PVM_NOT_MAPPED;
+        return PVM_LEAF_ENTRY;
+    } else if (!pmd_present(*pvmw->pmd)) {
+        if (thp_migration_supported()) {
+            if (!(pvmw->flags & PVMW_MIGRATION))
+                return PVM_NOT_MAPPED;
+            if (is_migration_entry(pmd_to_swp_entry(*pvmw->pmd))) {
+                swp_entry_t entry = pmd_to_swp_entry(*pvmw->pmd);
+
+                pfn = migration_entry_to_pfn(entry);
+                if (!pfn_is_match(pvmw->page, pfn))
+                    return PVM_NOT_MAPPED;
+                return PVM_LEAF_ENTRY;
+            }
+        }
+        return PVM_NOT_MAPPED;
+    }
+    /* THP pmd was split under us: handle on pte level */
+    spin_unlock(pvmw->ptl);
+    pvmw->ptl = NULL;
+    return PVM_NONLEAF_ENTRY;
+}
 
 /**
  * page_vma_mapped_walk - check if @pvmw->page is mapped in @pvmw->vma at
  * @pvmw->address
@@ -146,14 +219,14 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
     pgd_t *pgd;
     p4d_t *p4d;
     pud_t pude;
-    pmd_t pmde;
+    enum check_pmd_result pmd_check_res;
 
     if (!pvmw->pte && !pvmw->pmd && pvmw->pud)
         return not_found(pvmw);
 
     /* The only possible pmd mapping has been handled on last iteration */
     if (pvmw->pmd && !pvmw->pte)
-        return not_found(pvmw);
+        goto next_pmd;
 
     if (pvmw->pte)
         goto next_pte;
@@ -202,42 +275,47 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
     } else if (!pud_present(pude))
         return false;
 
-    pvmw->pmd = pmd_offset(pvmw->pud, pvmw->address);
-    /*
-     * Make sure the pmd value isn't cached in a register by the
-     * compiler and used as a stale value after we've observed a
-     * subsequent update.
-     */
-    pmde = READ_ONCE(*pvmw->pmd);
-    if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
-        pvmw->ptl = pmd_lock(mm, pvmw->pmd);
-        if (likely(pmd_trans_huge(*pvmw->pmd))) {
-            if (pvmw->flags & PVMW_MIGRATION)
-                return not_found(pvmw);
-            if (pmd_page(*pvmw->pmd) != page)
-                return not_found(pvmw);
+    if (!map_pmd(pvmw))
+        goto next_pmd;
+    /* pmd locked after map_pmd */
+    while (1) {
+        pmd_check_res = check_pmd(pvmw);
+        if (pmd_check_res == PVM_LEAF_ENTRY)
             return true;
-        } else if (!pmd_present(*pvmw->pmd)) {
-            if (thp_migration_supported()) {
-                if (!(pvmw->flags & PVMW_MIGRATION))
-                    return not_found(pvmw);
-                if (is_migration_entry(pmd_to_swp_entry(*pvmw->pmd))) {
-                    swp_entry_t entry = pmd_to_swp_entry(*pvmw->pmd);
-
-                    if (migration_entry_to_page(entry) != page)
-                        return not_found(pvmw);
-                    return true;
+        else if (pmd_check_res == PVM_NONLEAF_ENTRY)
+            goto pte_level;
+next_pmd:
+        /* Only a PMD-mapped PUD THP has a next pmd. */
+        if (!(PageTransHuge(pvmw->page) &&
+              compound_order(pvmw->page) == HPAGE_PUD_ORDER))
+            return not_found(pvmw);
+        do {
+            pvmw->address += HPAGE_PMD_SIZE;
+            if (pvmw->address >= pvmw->vma->vm_end ||
+                pvmw->address >=
+                __vma_address(pvmw->page, pvmw->vma) +
+                thp_nr_pages(pvmw->page) * PAGE_SIZE)
+                return not_found(pvmw);
+            /* Did we cross page table boundary? */
+            if (pvmw->address % PUD_SIZE == 0) {
+                /*
+                 * Reset pmd here, so we will not stay at PMD
+                 * level after restart.
+                 */
+                pvmw->pmd = NULL;
+                if (pvmw->ptl) {
+                    spin_unlock(pvmw->ptl);
+                    pvmw->ptl = NULL;
                 }
+                goto restart;
+            } else {
+                pvmw->pmd++;
             }
-            return not_found(pvmw);
-        } else {
-            /* THP pmd was split under us: handle on pte level */
-            spin_unlock(pvmw->ptl);
-            pvmw->ptl = NULL;
-        }
-    } else if (!pmd_present(pmde)) {
-        return false;
+        } while (pmd_none(*pvmw->pmd));
+
+        if (!pvmw->ptl)
+            pvmw->ptl = pmd_lock(mm, pvmw->pmd);
     }
+pte_level:
     if (!map_pte(pvmw))
         goto next_pte;
     while (1) {
@@ -257,6 +335,12 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
         /* Did we cross page table boundary? */
         if (pvmw->address % PMD_SIZE == 0) {
             pte_unmap(pvmw->pte);
+            /*
+             * In the case of PTE-mapped PUD THP, the next entry
+             * can be a PMD. Reset pte here, so we will not
+             * stay at PTE level after restart.
+             */
+            pvmw->pte = NULL;
             if (pvmw->ptl) {
                 spin_unlock(pvmw->ptl);
                 pvmw->ptl = NULL;
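The boundary-crossing logic is the subtle part: whenever the address crosses a
PUD_SIZE (or, one level down, PMD_SIZE) boundary, the cached pmd (or pte)
cursor must be dropped and the walk restarted from the top. A toy model with
made-up sizes, showing only that bookkeeping:

    #include <stdio.h>

    #define TOY_PMD_SIZE 16UL      /* pretend a "PMD" covers 16 bytes */
    #define TOY_PUD_SIZE 64UL      /* and a "PUD" covers 4 of them */

    int main(void)
    {
        unsigned long addr, end = 2 * TOY_PUD_SIZE;
        int pmd_slot = 0;          /* plays the role of the pvmw->pmd cursor */

        for (addr = TOY_PMD_SIZE; addr < end; addr += TOY_PMD_SIZE) {
            if (addr % TOY_PUD_SIZE == 0) {
                /* Crossed a "page table" boundary: restart the walk from
                 * the upper level instead of bumping the stale cursor. */
                pmd_slot = 0;
                printf("addr %3lu: restart at upper level\n", addr);
            } else {
                pmd_slot++;        /* plain pvmw->pmd++ within one table */
                printf("addr %3lu: advance to pmd slot %d\n", addr, pmd_slot);
            }
        }
        return 0;
    }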
From patchwork Mon Sep 28 17:54:19 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 21/30] mm: thp: PUD THP support in try_to_unmap().
Date: Mon, 28 Sep 2020 13:54:19 -0400
Message-Id: <20200928175428.4110504-22-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

Unmap the different subpages of different-sized THPs properly in
try_to_unmap(). pvmw.pte, pvmw.pmd, and pvmw.pud are used to identify the
unmapped page size:

1. pvmw.pte != NULL: PTE pages or PageHuge.
2. pvmw.pte == NULL and pvmw.pmd != NULL: PMD pages.
3. pvmw.pte == NULL and pvmw.pmd == NULL and pvmw.pud != NULL: PUD pages.

Signed-off-by: Zi Yan
---
 mm/migrate.c |   2 +-
 mm/rmap.c    | 156 ++++++++++++++++++++++++++++++++++++++------------
 2 files changed, 117 insertions(+), 41 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index a7320e9d859c..d0e6afe682aa 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -225,7 +225,7 @@ static bool remove_migration_pte(struct page *page, struct vm_area_struct *vma,
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
         /* PMD-mapped THP migration entry */
-        if (!pvmw.pte) {
+        if (!pvmw.pte && pvmw.pmd) {
             VM_BUG_ON_PAGE(PageHuge(page) ||
                        !PageTransCompound(page), page);
             remove_migration_pmd(&pvmw, new);
             continue;
diff --git a/mm/rmap.c b/mm/rmap.c
index 424322807966..32f2e0312e16 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1125,6 +1125,7 @@ void do_page_add_anon_rmap(struct page *page,
 {
     bool compound = flags & RMAP_COMPOUND;
     bool first;
+    struct page *head = compound_head(page);
 
     if (unlikely(PageKsm(page)))
         lock_page_memcg(page);
@@ -1134,7 +1135,7 @@ void do_page_add_anon_rmap(struct page *page,
     if (compound) {
         atomic_t *mapcount = NULL;
         VM_BUG_ON_PAGE(!PageLocked(page), page);
-        VM_BUG_ON_PAGE(!PageTransHuge(page), page);
+        VM_BUG_ON_PAGE(!PMDPageInPUD(page) && !PageTransHuge(page), page);
         if (compound_order(page) == HPAGE_PUD_ORDER) {
             if (map_order == HPAGE_PUD_ORDER) {
                 mapcount = compound_mapcount_ptr(page);
@@ -1143,7 +1144,7 @@ void do_page_add_anon_rmap(struct page *page,
                 mapcount = sub_compound_mapcount_ptr(page, 1);
             } else
                 VM_BUG_ON(1);
-        } else if (compound_order(page) == HPAGE_PMD_ORDER) {
+        } else if (compound_order(head) == HPAGE_PMD_ORDER) {
             mapcount = compound_mapcount_ptr(page);
         } else
             VM_BUG_ON(1);
@@ -1153,7 +1154,7 @@ void do_page_add_anon_rmap(struct page *page,
     }
 
     if (first) {
-        int nr = compound ? thp_nr_pages(page) : 1;
+        int nr = 1 << map_order;
[...]
 ...(vma->vm_flags & VM_LOCKED))
@@ -1487,6 +1491,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
         is_zone_device_page(page) && !is_device_private_page(page))
         return true;
 
+    if (flags & TTU_SPLIT_HUGE_PUD) {
+        split_huge_pud_address(vma, address,
+                       flags & TTU_SPLIT_FREEZE, page);
+    }
+
     if (flags & TTU_SPLIT_HUGE_PMD) {
         split_huge_pmd_address(vma, address,
                        flags & TTU_SPLIT_FREEZE, page);
@@ -1519,7 +1528,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
     while (page_vma_mapped_walk(&pvmw)) {
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
         /* PMD-mapped THP migration entry */
-        if (!pvmw.pte && (flags & TTU_MIGRATION)) {
+        if (!pvmw.pte && pvmw.pmd && (flags & TTU_MIGRATION)) {
             VM_BUG_ON_PAGE(PageHuge(page) ||
                        !PageTransCompound(page), page);
             set_pmd_migration_entry(&pvmw, page);
@@ -1551,9 +1560,25 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
         }
 
         /* Unexpected PMD-mapped THP? */
-        VM_BUG_ON_PAGE(!pvmw.pte, page);
 
-        subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+        if (pvmw.pte) {
+            subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+            /*
+             * PageHuge always uses pvmw.pte to store the relevant
+             * page table entry
+             */
+            if (PageHuge(page))
+                map_order = compound_order(page);
+            else
+                map_order = 0;
+        } else if (!pvmw.pte && pvmw.pmd) {
+            subpage = page - page_to_pfn(page) + pmd_pfn(*pvmw.pmd);
+            map_order = HPAGE_PMD_ORDER;
+        } else if (!pvmw.pte && !pvmw.pmd && pvmw.pud) {
+            subpage = page - page_to_pfn(page) + pud_pfn(*pvmw.pud);
+            map_order = HPAGE_PUD_ORDER;
+        }
+        VM_BUG_ON(!subpage);
         address = pvmw.address;
 
         if (PageHuge(page)) {
@@ -1631,8 +1656,12 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
         }
 
         if (!(flags & TTU_IGNORE_ACCESS)) {
-            if (ptep_clear_flush_young_notify(vma, address,
-                        pvmw.pte)) {
+            if ((pvmw.pte &&
+                 ptep_clear_flush_young_notify(vma, address, pvmw.pte)) ||
+                ((!pvmw.pte && pvmw.pmd) &&
+                 pmdp_clear_flush_young_notify(vma, address, pvmw.pmd)) ||
+                ((!pvmw.pte && !pvmw.pmd && pvmw.pud) &&
+                 pudp_clear_flush_young_notify(vma, address, pvmw.pud))) {
                 ret = false;
                 page_vma_mapped_walk_done(&pvmw);
                 break;
@@ -1640,7 +1669,12 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
         }
 
         /* Nuke the page table entry. */
-        flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+        if (pvmw.pte)
+            flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+        else if (!pvmw.pte && pvmw.pmd)
+            flush_cache_page(vma, address, pmd_pfn(*pvmw.pmd));
+        else if (!pvmw.pte && !pvmw.pmd && pvmw.pud)
+            flush_cache_page(vma, address, pud_pfn(*pvmw.pud));
         if (should_defer_flush(mm, flags)) {
             /*
              * We clear the PTE but do not flush so potentially
              * transition on a cached TLB entry is written through
              * and traps if the PTE is unmapped.
              */
-            pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+            if (pvmw.pte) {
+                pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+
+                set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+            } else if (!pvmw.pte && pvmw.pmd) {
+                pmdval = pmdp_huge_get_and_clear(mm, address, pvmw.pmd);
 
-            set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+                set_tlb_ubc_flush_pending(mm, pmd_dirty(pmdval));
+            } else if (!pvmw.pte && !pvmw.pmd && pvmw.pud) {
+                pudval = pudp_huge_get_and_clear(mm, address, pvmw.pud);
+
+                set_tlb_ubc_flush_pending(mm, pud_dirty(pudval));
+            }
         } else {
-            pteval = ptep_clear_flush(vma, address, pvmw.pte);
+            if (pvmw.pte)
+                pteval = ptep_clear_flush(vma, address, pvmw.pte);
+            else if (!pvmw.pte && pvmw.pmd)
+                pmdval = pmdp_huge_clear_flush(vma, address, pvmw.pmd);
+            else if (!pvmw.pte && !pvmw.pmd && pvmw.pud)
+                pudval = pudp_huge_clear_flush(vma, address, pvmw.pud);
         }
 
         /* Move the dirty bit to the page. Now the pte is gone. */
-        if (pte_dirty(pteval))
-            set_page_dirty(page);
+        if ((pvmw.pte && pte_dirty(pteval)) ||
+            ((!pvmw.pte && pvmw.pmd) && pmd_dirty(pmdval)) ||
+            ((!pvmw.pte && !pvmw.pmd && pvmw.pud) && pud_dirty(pudval)))
+            set_page_dirty(page);
 
         /* Update high watermark before we lower rss */
         update_hiwater_rss(mm);
@@ -1694,35 +1746,59 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
         } else if (IS_ENABLED(CONFIG_MIGRATION) &&
                 (flags & (TTU_MIGRATION|TTU_SPLIT_FREEZE))) {
             swp_entry_t entry;
-            pte_t swp_pte;
 
-            if (arch_unmap_one(mm, vma, address, pteval) < 0) {
-                set_pte_at(mm, address, pvmw.pte, pteval);
-                ret = false;
-                page_vma_mapped_walk_done(&pvmw);
-                break;
-            }
+            if (pvmw.pte) {
+                pte_t swp_pte;
 
-            /*
-             * Store the pfn of the page in a special migration
-             * pte. do_swap_page() will wait until the migration
-             * pte is removed and then restart fault handling.
-             */
-            entry = make_migration_entry(subpage,
-                             pte_write(pteval));
-            swp_pte = swp_entry_to_pte(entry);
-            if (pte_soft_dirty(pteval))
-                swp_pte = pte_swp_mksoft_dirty(swp_pte);
-            if (pte_uffd_wp(pteval))
-                swp_pte = pte_swp_mkuffd_wp(swp_pte);
-            set_pte_at(mm, address, pvmw.pte, swp_pte);
-            /*
-             * No need to invalidate here it will synchronize on
-             * against the special swap migration pte.
-             */
+                if (arch_unmap_one(mm, vma, address, pteval) < 0) {
+                    set_pte_at(mm, address, pvmw.pte, pteval);
+                    ret = false;
+                    page_vma_mapped_walk_done(&pvmw);
+                    break;
+                }
+
+                /*
+                 * Store the pfn of the page in a special migration
+                 * pte. do_swap_page() will wait until the migration
+                 * pte is removed and then restart fault handling.
+                 */
+                entry = make_migration_entry(subpage,
+                                 pte_write(pteval));
+                swp_pte = swp_entry_to_pte(entry);
+                if (pte_soft_dirty(pteval))
+                    swp_pte = pte_swp_mksoft_dirty(swp_pte);
+                if (pte_uffd_wp(pteval))
+                    swp_pte = pte_swp_mkuffd_wp(swp_pte);
+                set_pte_at(mm, address, pvmw.pte, swp_pte);
+                /*
+                 * No need to invalidate here, it will synchronize
+                 * against the special swap migration pte.
+                 */
+            } else if (!pvmw.pte && pvmw.pmd) {
+                pmd_t swp_pmd;
+                /*
+                 * Store the pfn of the page in a special migration
+                 * pmd. do_swap_page() will wait until the migration
+                 * pmd is removed and then restart fault handling.
+                 */
+                entry = make_migration_entry(subpage,
+                                 pmd_write(pmdval));
+                swp_pmd = swp_entry_to_pmd(entry);
+                if (pmd_soft_dirty(pmdval))
+                    swp_pmd = pmd_swp_mksoft_dirty(swp_pmd);
+                set_pmd_at(mm, address, pvmw.pmd, swp_pmd);
+                /*
+                 * No need to invalidate here, it will synchronize
+                 * against the special swap migration pte.
+                 */
+            } else if (!pvmw.pte && !pvmw.pmd && pvmw.pud) {
+                VM_BUG_ON(1);
+            }
         } else if (PageAnon(page)) {
             swp_entry_t entry = { .val = page_private(subpage) };
             pte_t swp_pte;
+
+            VM_BUG_ON(!pvmw.pte);
             /*
              * Store the swap location in the pte.
              * See handle_pte_fault() ...
@@ -1808,7 +1884,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
          *
          * See Documentation/vm/mmu_notifier.rst
          */
-        page_remove_rmap(subpage, compound_order(page));
+        page_remove_rmap(subpage, map_order);
 
         put_page(page);
     }
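The three numbered cases in the commit message reduce to a small decision
function from (pte, pmd, pud) presence to a mapping order. A standalone
rendering, with the HPAGE_* orders hard-coded to their usual x86_64 values
purely for illustration (PageHuge, which reuses pvmw.pte at its own order,
is left out):

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define HPAGE_PMD_ORDER 9    /* 2 MB with 4 KB base pages */
    #define HPAGE_PUD_ORDER 18   /* 1 GB */

    /* Mirrors how try_to_unmap_one() picks map_order from the pvmw state. */
    static int map_order(bool has_pte, bool has_pmd, bool has_pud)
    {
        if (has_pte)
            return 0;                  /* PTE-mapped page */
        if (has_pmd)
            return HPAGE_PMD_ORDER;    /* PMD-mapped THP */
        if (has_pud)
            return HPAGE_PUD_ORDER;    /* PUD-mapped THP */
        return -1;                     /* not mapped here */
    }

    int main(void)
    {
        assert(map_order(true, false, false) == 0);
        assert(map_order(false, true, false) == HPAGE_PMD_ORDER);
        assert(map_order(false, false, true) == HPAGE_PUD_ORDER);
        printf("pages per mapping: %d / %d / %d\n",
               1 << 0, 1 << HPAGE_PMD_ORDER, 1 << HPAGE_PUD_ORDER);
        return 0;
    }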
From patchwork Mon Sep 28 17:54:20 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox, Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko, David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard, David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 22/30] mm: thp: split PUD THPs at page reclaim.
Date: Mon, 28 Sep 2020 13:54:20 -0400
Message-Id: <20200928175428.4110504-23-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

We cannot swap PUD THPs, so split them before swapping them out.
PUD THPs will be split into PMD THPs, so that if THP_SWAP is enabled,
PMD THPs can be swapped out as a whole.

Signed-off-by: Zi Yan
---
 mm/swap_slots.c |  2 ++
 mm/vmscan.c     | 33 +++++++++++++++++++++++++++------
 2 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 3e6453573a89..65b8742a0446 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -312,6 +312,8 @@ swp_entry_t get_swap_page(struct page *page)
 	entry.val = 0;

 	if (PageTransHuge(page)) {
+		if (compound_order(page) == HPAGE_PUD_ORDER)
+			return entry;
 		if (IS_ENABLED(CONFIG_THP_SWAP))
 			get_swap_pages(1, &entry, HPAGE_PMD_NR);
 		goto out;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eae57d092931..12e169af663c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1244,7 +1244,21 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 			if (!PageSwapCache(page)) {
 				if (!(sc->gfp_mask & __GFP_IO))
 					goto keep_locked;
-				if (PageTransHuge(page)) {
+				if (!PageTransHuge(page))
+					goto try_to_swap;
+				if (compound_order(page) == HPAGE_PUD_ORDER) {
+					/* cannot split THP, skip it */
+					if (!can_split_huge_pud_page(page, NULL))
+						goto activate_locked;
+					/* Split PUD THPs before swapping */
+					if (split_huge_pud_page_to_list(page, page_list))
+						goto activate_locked;
+					else {
+						sc->nr_scanned -= (nr_pages - HPAGE_PMD_NR);
+						nr_pages = HPAGE_PMD_NR;
+					}
+				}
+				if (compound_order(page) == HPAGE_PMD_ORDER) {
 					/* cannot split THP, skip it */
 					if (!can_split_huge_page(page, NULL))
 						goto activate_locked;
@@ -1254,14 +1268,17 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 					 * tail pages can be freed without IO.
 					 */
 					if (!compound_mapcount(page) &&
-					    split_huge_page_to_list(page,
-								    page_list))
+						split_huge_page_to_list(page,
+								page_list))
 						goto activate_locked;
 				}
+try_to_swap:
 				if (!add_to_swap(page)) {
 					if (!PageTransHuge(page))
 						goto activate_locked_split;
 					/* Fallback to swap normal pages */
+					VM_BUG_ON_PAGE(compound_order(page) != HPAGE_PMD_ORDER,
+						       page);
 					if (split_huge_page_to_list(page,
 								    page_list))
 						goto activate_locked;
@@ -1278,6 +1295,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 					mapping = page_mapping(page);
 				}
 			} else if (unlikely(PageTransHuge(page))) {
+				VM_BUG_ON_PAGE(compound_order(page) != HPAGE_PMD_ORDER, page);
 				/* Split file THP */
 				if (split_huge_page_to_list(page, page_list))
 					goto keep_locked;
@@ -1303,9 +1321,12 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 				enum ttu_flags flags = ttu_flags | TTU_BATCH_FLUSH;
 				bool was_swapbacked = PageSwapBacked(page);

-				if (unlikely(PageTransHuge(page)))
-					flags |= TTU_SPLIT_HUGE_PMD;
-
+				if (unlikely(PageTransHuge(page))) {
+					if (compound_order(page) == HPAGE_PMD_ORDER)
+						flags |= TTU_SPLIT_HUGE_PMD;
+					else if (compound_order(page) == HPAGE_PUD_ORDER)
+						flags |= TTU_SPLIT_HUGE_PUD;
+				}
 				if (!try_to_unmap(page, flags)) {
 					stat->nr_unmap_fail += nr_pages;
 					if (!was_swapbacked && PageSwapBacked(page))
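[Editor's illustration, not part of the patch: a standalone userspace C model
of the decision ladder shrink_page_list() implements above. The orders are the
x86-64 values (PMD THP = 2MB, PUD THP = 1GB) and are an assumption of the
sketch; the kernel itself uses HPAGE_PMD_ORDER/HPAGE_PUD_ORDER.]

#include <stdio.h>
#include <stdbool.h>

enum action { SWAP_AS_WHOLE_PMD, SPLIT_PUD_TO_PMDS, SPLIT_TO_BASE_PAGES };

#define PMD_THP_ORDER 9   /* 2MB with 4KB base pages */
#define PUD_THP_ORDER 18  /* 1GB with 4KB base pages */

static enum action reclaim_plan(int order, bool thp_swap)
{
	if (order == PUD_THP_ORDER)
		return SPLIT_PUD_TO_PMDS;   /* get_swap_page() refuses PUD THPs */
	if (order == PMD_THP_ORDER && thp_swap)
		return SWAP_AS_WHOLE_PMD;   /* CONFIG_THP_SWAP path */
	return SPLIT_TO_BASE_PAGES;
}

int main(void)
{
	printf("PUD THP     -> %d (split to PMD THPs first)\n",
	       reclaim_plan(PUD_THP_ORDER, true));
	printf("PMD THP     -> %d (swapped out as a whole)\n",
	       reclaim_plan(PMD_THP_ORDER, true));
	printf("no THP_SWAP -> %d (split to base pages)\n",
	       reclaim_plan(PMD_THP_ORDER, false));
	return 0;
}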
reason="signature verification failed" (2048-bit key) header.d=sent.com header.i=@sent.com header.b="SothXaKe"; dkim=temperror (0-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="KJiZvA+m" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 17598221E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=sent.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 779F6900019; Mon, 28 Sep 2020 13:55:30 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id 701EA90000D; Mon, 28 Sep 2020 13:55:30 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5C98C900019; Mon, 28 Sep 2020 13:55:30 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0060.hostedemail.com [216.40.44.60]) by kanga.kvack.org (Postfix) with ESMTP id 44C9890000D for ; Mon, 28 Sep 2020 13:55:30 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 0B55C181AE86D for ; Mon, 28 Sep 2020 17:55:30 +0000 (UTC) X-FDA: 77313222420.18.net14_0e031a227183 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin18.hostedemail.com (Postfix) with ESMTP id DBB1B1016E295 for ; Mon, 28 Sep 2020 17:55:29 +0000 (UTC) X-Spam-Summary: 1,0,0,2439975788510d0f,d41d8cd98f00b204,zi.yan@sent.com,,RULES_HIT:41:355:379:541:800:960:973:988:989:1260:1261:1311:1314:1345:1359:1437:1515:1535:1542:1711:1730:1747:1777:1792:2393:2559:2562:3138:3139:3140:3141:3142:3354:3865:3867:3870:3871:4117:4321:5007:6119:6120:6261:6653:6742:7576:7901:8660:9036:10004:11026:11473:11658:11914:12043:12291:12296:12438:12555:12679:12683:12895:12986:13148:13230:13894:14096:14110:14181:14721:21080:21627:21939:21990:30054:30064,0,RBL:64.147.123.17:@sent.com:.lbl8.mailshell.net-64.100.201.100 62.18.0.100;04y8oity3mtg5cxjcx6z4xjkbi6yiycijb5qbdmmbk18jimy7mehz9sf3ac7wri.pc3as5kasmcnd89r84q5guewaazbe4rx1d1q1rjse3pskdjtid9cfaxot1xzmh5.n-lbl8.mailshell.net-223.238.255.100,CacheIP:none,Bayesian:0.5,0.5,0.5,Netcheck:none,DomainCache:0,MSF:not bulk,SPF:fp,MSBL:0,DNSBL:neutral,Custom_rules:0:0:0,LFtime:23,LUA_SUMMARY:none X-HE-Tag: net14_0e031a227183 X-Filterd-Recvd-Size: 6618 Received: from wnew3-smtp.messagingengine.com (wnew3-smtp.messagingengine.com [64.147.123.17]) by imf19.hostedemail.com (Postfix) with ESMTP for ; Mon, 28 Sep 2020 17:55:29 +0000 (UTC) Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailnew.west.internal (Postfix) with ESMTP id 29242EAA; Mon, 28 Sep 2020 13:55:27 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 28 Sep 2020 13:55:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sent.com; h=from :to:cc:subject:date:message-id:in-reply-to:references:reply-to :mime-version:content-transfer-encoding; s=fm1; bh=wDKwPLy4VFcLp paBAPPIxv3ZweIVcEaNjmSuTChBG+s=; b=SothXaKew0HL/ALe/TnCgWjpYQ8pC tbv2CUOxJpzhRqjkYJEqAITyyGHHLYx7HlbnSDIOgqOP05pVD20Fx/8RA//b2x2/ Z1xVLe9KUT8sEtlUHEqqf5SFtimaePtA51rd30UoAsDR9vGF2J38J+VerOqycBi2 31m5zQUUEVEnvAsH9pE8VxCrkwWFaWAIi55dcO1K/llRiEPi4EZoRx+Aar4dmIiI Zxlyj2p9rQnMWwWlp2BJ7SikpQ4pc3G8rzCYJRkay8qyaSZShbYnSMKqvnyaycC/ 
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 23/30] mm: support PUD THP pagemap support.
Date: Mon, 28 Sep 2020 13:54:21 -0400
Message-Id: <20200928175428.4110504-24-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

pagemap_pud_range() is added so that pud page flags are reported properly.
Signed-off-by: Zi Yan
---
 fs/proc/task_mmu.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 077196182288..04a3158d0d5b 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1553,6 +1553,68 @@ static int pagemap_pmd_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
 	return err;
 }

+static int pagemap_pud_range(pud_t pud, pud_t *pudp, unsigned long addr,
+			     unsigned long end, struct mm_walk *walk)
+{
+	struct vm_area_struct *vma = walk->vma;
+	struct pagemapread *pm = walk->private;
+	spinlock_t *ptl;
+	int err = 0;
+
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	ptl = pud_trans_huge_lock(pudp, vma);
+	if (ptl) {
+		u64 flags = 0, frame = 0;
+		struct page *page = NULL;
+
+		if (memcmp(pudp, &pud, sizeof(pud)) != 0) {
+			walk->action = ACTION_AGAIN;
+			spin_unlock(ptl);
+			return 0;
+		}
+		if (vma->vm_flags & VM_SOFTDIRTY)
+			flags |= PM_SOFT_DIRTY;
+
+		if (pud_present(pud)) {
+			page = pud_page(pud);
+
+			flags |= PM_PRESENT;
+			if (pud_soft_dirty(pud))
+				flags |= PM_SOFT_DIRTY;
+			if (pm->show_pfn)
+				frame = pud_pfn(pud) +
+					((addr & ~PUD_MASK) >> PAGE_SHIFT);
+		}
+
+		if (page && page_mapcount(page) == 1)
+			flags |= PM_MMAP_EXCLUSIVE;
+
+		for (; addr != end; addr += PAGE_SIZE) {
+			pagemap_entry_t pme = make_pme(frame, flags);
+
+			err = add_to_pagemap(addr, &pme, pm);
+			if (err)
+				break;
+			if (pm->show_pfn) {
+				if (flags & PM_PRESENT)
+					frame++;
+				else if (flags & PM_SWAP)
+					frame += (1 << MAX_SWAPFILES_SHIFT);
+			}
+		}
+		spin_unlock(ptl);
+		walk->action = ACTION_CONTINUE;
+		return err;
+	}
+
+	if (pud_trans_unstable(&pud)) {
+		walk->action = ACTION_AGAIN;
+		return 0;
+	}
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+	return err;
+}
+
 #ifdef CONFIG_HUGETLB_PAGE
 /* This function walks within one hugetlb entry in the single call */
 static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
@@ -1603,6 +1665,7 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
 #endif /* HUGETLB_PAGE */

 static const struct mm_walk_ops pagemap_ops = {
+	.pud_entry	= pagemap_pud_range,
 	.pmd_entry	= pagemap_pmd_range,
 	.pte_hole	= pagemap_pte_hole,
 	.hugetlb_entry	= pagemap_hugetlb_range,
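[Editor's illustration, not part of the patch: a minimal userspace reader for
the pagemap interface the new walker feeds. With this patch applied, addresses
backed by a PUD THP report PM_PRESENT and per-page PFNs just like PMD THPs.]

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define PM_PFN_BITS	55		/* bits 0-54: PFN (needs CAP_SYS_ADMIN) */
#define PM_SWAP		(1ULL << 62)
#define PM_PRESENT	(1ULL << 63)

int main(int argc, char **argv)
{
	if (argc < 2) {
		fprintf(stderr, "usage: %s <vaddr>\n", argv[0]);
		return 1;
	}
	unsigned long vaddr = strtoul(argv[1], NULL, 0);
	long pagesize = sysconf(_SC_PAGESIZE);
	uint64_t entry;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	if (fd < 0)
		return 1;
	/* one 64-bit entry per base page, indexed by virtual page number */
	if (pread(fd, &entry, sizeof(entry),
		  (off_t)(vaddr / pagesize) * sizeof(entry)) != sizeof(entry))
		return 1;
	printf("present=%d swap=%d pfn=0x%llx\n",
	       !!(entry & PM_PRESENT), !!(entry & PM_SWAP),
	       (unsigned long long)(entry & ((1ULL << PM_PFN_BITS) - 1)));
	close(fd);
	return 0;
}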
From patchwork Mon Sep 28 17:54:22 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 24/30] mm: madvise: add page size options to MADV_HUGEPAGE and MADV_NOHUGEPAGE.
Date: Mon, 28 Sep 2020 13:54:22 -0400
Message-Id: <20200928175428.4110504-25-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

This allows the user to specify, via madvise, up to what page size the
kernel will generate THPs to back a memory range. Because we now have PMD
and PUD THPs, which require different amounts of kernel effort to
generate, we want to keep the user from incurring long page fault latency
that would result from always trying to allocate PUD THPs first.

Signed-off-by: Zi Yan
---
 include/uapi/asm-generic/mman-common.h | 23 +++++++++++++++++++++++
 mm/khugepaged.c                        |  1 +
 mm/madvise.c                           | 17 +++++++++++++++--
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index f94f65d429be..8009acb55fca 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -6,6 +6,7 @@
  Author: Michael S. Tsirkin , Mellanox Technologies Ltd.
  Based on: asm-xxx/mman.h
 */
+#include <asm-generic/hugetlb_encode.h>

 #define PROT_READ	0x1		/* page can be read */
 #define PROT_WRITE	0x2		/* page can be written */
@@ -80,4 +81,26 @@
 #define PKEY_ACCESS_MASK	(PKEY_DISABLE_ACCESS |\
 				 PKEY_DISABLE_WRITE)

+
+/*
+ * Huge page size encoding when MADV_HUGEPAGE is specified, and a huge page
+ * size other than the default is desired. See hugetlb_encode.h.
+ */
+#define MADV_HUGEPAGE_SHIFT	HUGETLB_FLAG_ENCODE_SHIFT
+#define MADV_HUGEPAGE_MASK	HUGETLB_FLAG_ENCODE_MASK
+#define MADV_BEHAVIOR_MASK	((1<<MADV_HUGEPAGE_SHIFT)-1)
diff --git a/mm/madvise.c b/mm/madvise.c
  * Any behaviour which results in changes to the vma->vm_flags needs to
  * take mmap_lock for writing. Others, which simply traverse vmas, need
@@ -74,7 +87,7 @@ static long madvise_behavior(struct vm_area_struct *vma,
 	pgoff_t pgoff;
 	unsigned long new_flags = vma->vm_flags;

-	switch (behavior) {
+	switch (get_behavior(behavior)) {
 	case MADV_NORMAL:
 		new_flags = new_flags & ~VM_RAND_READ & ~VM_SEQ_READ;
 		break;
@@ -953,7 +966,7 @@ madvise_vma(struct vm_area_struct *vma, struct vm_area_struct **prev,
 static bool
 madvise_behavior_valid(int behavior)
 {
-	switch (behavior) {
+	switch (get_behavior(behavior)) {
 	case MADV_DOFORK:
 	case MADV_DONTFORK:
 	case MADV_NORMAL:
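[Editor's illustration, not part of the patch: what the new interface looks
like from userspace. The numeric value of MADV_HUGEPAGE_1GB below is an
assumption, derived from the hugetlb size encoding this patch reuses
(log2(1GB) = 30 shifted into the bit field at MADV_HUGEPAGE_SHIFT = 26); the
definition in this archive was mangled, and the flag is not in libc headers.]

#include <stdio.h>
#include <sys/mman.h>

#ifndef MADV_HUGEPAGE_1GB
#define MADV_HUGEPAGE_1GB	(30U << 26)	/* assumed encoding */
#endif

int main(void)
{
	size_t len = 2UL << 30;		/* 2GB, so a 1GB-aligned GB fits */
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;
	/* Ask for THPs up to 1GB on this range; the kernel may still
	 * fall back to PMD THPs or base pages. */
	if (madvise(p, len, MADV_HUGEPAGE | MADV_HUGEPAGE_1GB))
		perror("madvise");
	return 0;
}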
From patchwork Mon Sep 28 17:54:23 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 25/30] mm: vma: add VM_HUGEPAGE_PUD to vm_flags at bit 37.
Date: Mon, 28 Sep 2020 13:54:23 -0400
Message-Id: <20200928175428.4110504-26-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

madvise can set this bit via MADV_HUGEPAGE | MADV_HUGEPAGE_1GB and clear
it via MADV_NOHUGEPAGE | MADV_HUGEPAGE_1GB. Later, the kernel will check
this bit to decide whether to allocate PUD THPs on a VMA when the global
PUD THP knob is set to madvise.

Signed-off-by: Zi Yan
---
 include/linux/mm.h |  6 ++++++
 mm/khugepaged.c    |  9 +++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 51b75ffa6a6c..78bee63c64da 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -305,11 +305,13 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_HIGH_ARCH_BIT_2	34	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_3	35	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_BIT_4	36	/* bit only usable on 64-bit architectures */
+#define VM_HIGH_ARCH_BIT_5	37	/* bit only usable on 64-bit architectures */
 #define VM_HIGH_ARCH_0	BIT(VM_HIGH_ARCH_BIT_0)
 #define VM_HIGH_ARCH_1	BIT(VM_HIGH_ARCH_BIT_1)
 #define VM_HIGH_ARCH_2	BIT(VM_HIGH_ARCH_BIT_2)
 #define VM_HIGH_ARCH_3	BIT(VM_HIGH_ARCH_BIT_3)
 #define VM_HIGH_ARCH_4	BIT(VM_HIGH_ARCH_BIT_4)
+#define VM_HIGH_ARCH_5	BIT(VM_HIGH_ARCH_BIT_5)
 #endif /* CONFIG_ARCH_USES_HIGH_VMA_FLAGS */

 #ifdef CONFIG_ARCH_HAS_PKEYS
@@ -325,6 +327,10 @@ extern unsigned int kobjsize(const void *objp);
 #endif
 #endif /* CONFIG_ARCH_HAS_PKEYS */

+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+#define VM_HUGEPAGE_PUD	VM_HIGH_ARCH_5
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+
 #if defined(CONFIG_X86)
 # define VM_PAT		VM_ARCH_1	/* PAT reserves whole VMA at once (x86) */
 #elif defined(CONFIG_PPC)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b34c78085017..f085c218ea84 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -345,6 +345,9 @@ struct attribute_group khugepaged_attr_group = {
 int hugepage_madvise(struct vm_area_struct *vma,
 		     unsigned long *vm_flags, int advice)
 {
+	/* only support 1GB PUD THP on x86 now */
+	bool use_pud_page = advice & MADV_HUGEPAGE_1GB;
+	advice = advice & MADV_BEHAVIOR_MASK;
 	switch (advice) {
 	case MADV_HUGEPAGE:
@@ -359,6 +362,9 @@ int hugepage_madvise(struct vm_area_struct *vma,
 #endif
 		*vm_flags &= ~VM_NOHUGEPAGE;
 		*vm_flags |= VM_HUGEPAGE;
+
+		if (use_pud_page)
+			*vm_flags |= VM_HUGEPAGE_PUD;
 		/*
 		 * If the vma become good for khugepaged to scan,
 		 * register it here without waiting a page fault that
@@ -371,6 +377,9 @@ int hugepage_madvise(struct vm_area_struct *vma,
 	case MADV_NOHUGEPAGE:
 		*vm_flags &= ~VM_HUGEPAGE;
 		*vm_flags |= VM_NOHUGEPAGE;
+
+		if (use_pud_page)
+			*vm_flags &= ~VM_HUGEPAGE_PUD;
 		/*
 		 * Setting VM_NOHUGEPAGE will prevent khugepaged from scanning
 		 * this vma even if we leave the mm registered in khugepaged if
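[Editor's illustration, not part of the patch: a toy userspace C model of the
vm_flags transitions hugepage_madvise() performs above. The flag and advice
values are illustrative stand-ins, not the kernel's real bit positions.]

#include <stdio.h>

#define VM_HUGEPAGE		0x1
#define VM_NOHUGEPAGE		0x2
#define VM_HUGEPAGE_PUD		0x4
#define MADV_HUGEPAGE		14
#define MADV_NOHUGEPAGE		15
#define MADV_HUGEPAGE_1GB	(30U << 26)	/* assumed encoding */
#define MADV_BEHAVIOR_MASK	((1U << 26) - 1)

static void apply_advice(unsigned long *vm_flags, unsigned int advice)
{
	int want_pud = advice & MADV_HUGEPAGE_1GB;

	switch (advice & MADV_BEHAVIOR_MASK) {
	case MADV_HUGEPAGE:
		*vm_flags &= ~VM_NOHUGEPAGE;
		*vm_flags |= VM_HUGEPAGE;
		if (want_pud)
			*vm_flags |= VM_HUGEPAGE_PUD;
		break;
	case MADV_NOHUGEPAGE:
		*vm_flags &= ~VM_HUGEPAGE;
		*vm_flags |= VM_NOHUGEPAGE;
		if (want_pud)
			*vm_flags &= ~VM_HUGEPAGE_PUD;
		break;
	}
}

int main(void)
{
	unsigned long flags = 0;

	apply_advice(&flags, MADV_HUGEPAGE | MADV_HUGEPAGE_1GB);
	printf("after enable:  %#lx\n", flags); /* VM_HUGEPAGE|VM_HUGEPAGE_PUD */
	apply_advice(&flags, MADV_NOHUGEPAGE | MADV_HUGEPAGE_1GB);
	printf("after disable: %#lx\n", flags); /* VM_NOHUGEPAGE only */
	return 0;
}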
From patchwork Mon Sep 28 17:54:24 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 26/30] mm: thp: add a global knob to enable/disable PUD THPs.
Date: Mon, 28 Sep 2020 13:54:24 -0400
Message-Id: <20200928175428.4110504-27-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

Like the existing global PMD THP knob, this allows the user to enable or
disable PUD THPs. PUD THP is disabled by default, because using it has
performance tradeoffs the user should understand first, such as longer
first-touch page faults due to larger page zeroing and longer page
allocation times when memory is fragmented. Experienced users can enable
it to benefit from fewer page faults and TLB misses.

* always means PUD THPs will be allocated on all VMAs if possible.
* madvise means PUD THPs will be allocated only if vm_flags has
  VM_HUGEPAGE_PUD set via the madvise syscall using
  MADV_HUGEPAGE | MADV_HUGEPAGE_PUD.
* never means PUD THPs will not be allocated on any VMA.
Signed-off-by: Zi Yan
---
 include/linux/huge_mm.h | 14 ++++++++++++++
 mm/huge_memory.c        | 38 ++++++++++++++++++++++++++++++++++++++
 mm/memory.c             |  2 +-
 3 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index c7bc40c4a5e2..0d0f9cf25aeb 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -119,6 +119,8 @@ enum transparent_hugepage_flag {
 #ifdef CONFIG_DEBUG_VM
 	TRANSPARENT_HUGEPAGE_DEBUG_COW_FLAG,
 #endif
+	TRANSPARENT_PUD_HUGEPAGE_FLAG,
+	TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG,
 };

 struct kobject;
@@ -184,6 +186,18 @@ static inline bool __transparent_hugepage_enabled(struct vm_area_struct *vma)
 }

 bool transparent_hugepage_enabled(struct vm_area_struct *vma);
+static inline bool transparent_pud_hugepage_enabled(struct vm_area_struct *vma)
+{
+	if (transparent_hugepage_enabled(vma)) {
+		if (transparent_hugepage_flags & (1 << TRANSPARENT_PUD_HUGEPAGE_FLAG))
+			return true;
+		if (transparent_hugepage_flags &
+				(1 << TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG))
+			return !!(vma->vm_flags & VM_HUGEPAGE_PUD);
+	}
+
+	return false;
+}

 #define HPAGE_CACHE_INDEX_MASK	(HPAGE_PMD_NR - 1)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 61ae7a0ded84..1965753b31a2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -199,6 +199,43 @@ static ssize_t enabled_store(struct kobject *kobj,
 static struct kobj_attribute enabled_attr =
 	__ATTR(enabled, 0644, enabled_show, enabled_store);

+static ssize_t enabled_pud_thp_show(struct kobject *kobj,
+				    struct kobj_attribute *attr, char *buf)
+{
+	if (test_bit(TRANSPARENT_PUD_HUGEPAGE_FLAG, &transparent_hugepage_flags))
+		return sprintf(buf, "[always] madvise never\n");
+	else if (test_bit(TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags))
+		return sprintf(buf, "always [madvise] never\n");
+	else
+		return sprintf(buf, "always madvise [never]\n");
+}
+
+static ssize_t enabled_pud_thp_store(struct kobject *kobj,
+				     struct kobj_attribute *attr,
+				     const char *buf, size_t count)
+{
+	ssize_t ret = count;
+
+	if (!memcmp("always", buf,
+		    min(sizeof("always")-1, count))) {
+		clear_bit(TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+		set_bit(TRANSPARENT_PUD_HUGEPAGE_FLAG, &transparent_hugepage_flags);
+	} else if (!memcmp("madvise", buf,
+			   min(sizeof("madvise")-1, count))) {
+		clear_bit(TRANSPARENT_PUD_HUGEPAGE_FLAG, &transparent_hugepage_flags);
+		set_bit(TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+	} else if (!memcmp("never", buf,
+			   min(sizeof("never")-1, count))) {
+		clear_bit(TRANSPARENT_PUD_HUGEPAGE_FLAG, &transparent_hugepage_flags);
+		clear_bit(TRANSPARENT_PUD_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
+	} else
+		ret = -EINVAL;
+
+	return ret;
+}
+static struct kobj_attribute enabled_pud_thp_attr =
+	__ATTR(enabled_pud_thp, 0644, enabled_pud_thp_show, enabled_pud_thp_store);
+
 ssize_t single_hugepage_flag_show(struct kobject *kobj,
 				  struct kobj_attribute *attr, char *buf,
 				  enum transparent_hugepage_flag flag)
@@ -305,6 +342,7 @@ static struct kobj_attribute hpage_pmd_size_attr =

 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
+	&enabled_pud_thp_attr.attr,
 	&defrag_attr.attr,
 	&use_zero_page_attr.attr,
 	&hpage_pmd_size_attr.attr,
diff --git a/mm/memory.c b/mm/memory.c
index ab80d13807aa..9f7b509a3aa7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4282,7 +4282,7 @@ static vm_fault_t __handle_mm_fault(struct vm_area_struct *vma,
 	if (!vmf.pud)
 		return VM_FAULT_OOM;
 retry_pud:
-	if (pud_none(*vmf.pud) && __transparent_hugepage_enabled(vma)) {
+	if (pud_none(*vmf.pud) && transparent_pud_hugepage_enabled(vma)) {
 		ret = create_huge_pud(&vmf);
 		if (!(ret & VM_FAULT_FALLBACK))
 			return ret;
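[Editor's illustration, not part of the patch: exercising the new sysfs knob
from userspace C. Writing the knob requires root, and the file only exists on
a kernel with this series applied.]

#include <stdio.h>

#define KNOB "/sys/kernel/mm/transparent_hugepage/enabled_pud_thp"

int main(void)
{
	char line[64] = "";
	FILE *f = fopen(KNOB, "w");

	if (!f) {
		perror(KNOB);	/* no root, or kernel without this series */
		return 1;
	}
	fputs("madvise", f);	/* PUD THPs only for VM_HUGEPAGE_PUD VMAs */
	fclose(f);

	f = fopen(KNOB, "r");
	if (f && fgets(line, sizeof(line), f))
		printf("%s", line);	/* expect: always [madvise] never */
	if (f)
		fclose(f);
	return 0;
}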
From patchwork Mon Sep 28 17:54:25 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 27/30] mm: thp: make PUD THP size public.
Date: Mon, 28 Sep 2020 13:54:25 -0400
Message-Id: <20200928175428.4110504-28-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

Users can read the PUD THP size via
`cat /sys/kernel/mm/transparent_hugepage/hpage_pud_size`. This mirrors
how the PMD THP size is already made public.
Signed-off-by: Zi Yan
---
 Documentation/admin-guide/mm/transhuge.rst |  1 +
 mm/huge_memory.c                           | 13 +++++++++++++
 2 files changed, 14 insertions(+)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index b2acd0d395ca..11b173c2650e 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -159,6 +159,7 @@ Some userspace (such as a test program, or an optimized memory allocation
 library) may want to know the size (in bytes) of a transparent hugepage::

 	cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
+	cat /sys/kernel/mm/transparent_hugepage/hpage_pud_size

 khugepaged will be automatically started when
 transparent_hugepage/enabled is set to "always" or "madvise, and it'll
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 1965753b31a2..20ecffc27396 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -340,12 +340,25 @@ static ssize_t hpage_pmd_size_show(struct kobject *kobj,
 static struct kobj_attribute hpage_pmd_size_attr =
 	__ATTR_RO(hpage_pmd_size);

+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+static ssize_t hpage_pud_size_show(struct kobject *kobj,
+				   struct kobj_attribute *attr, char *buf)
+{
+	return sprintf(buf, "%lu\n", HPAGE_PUD_SIZE);
+}
+static struct kobj_attribute hpage_pud_size_attr =
+	__ATTR_RO(hpage_pud_size);
+#endif
+
 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
 	&enabled_pud_thp_attr.attr,
 	&defrag_attr.attr,
 	&use_zero_page_attr.attr,
 	&hpage_pmd_size_attr.attr,
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
+	&hpage_pud_size_attr.attr,
+#endif
 #ifdef CONFIG_SHMEM
 	&shmem_enabled_attr.attr,
 #endif
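[Editor's illustration, not part of the patch: reading the new sysfs file from
a C program. On x86-64 with this series, the expected value is 1073741824
(1GB); on kernels without the file, the program reports unsupported.]

#include <stdio.h>

int main(void)
{
	unsigned long pud_size = 0;
	FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/hpage_pud_size",
			"r");

	if (!f || fscanf(f, "%lu", &pud_size) != 1) {
		fprintf(stderr, "PUD THP not supported on this kernel\n");
		return 1;
	}
	fclose(f);
	printf("PUD THP size: %lu bytes (%lu MiB)\n",
	       pud_size, pud_size >> 20);
	return 0;
}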
From patchwork Mon Sep 28 17:54:26 2020
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A . Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 28/30] hugetlb: cma: move cma reserve function to cma.c.
Date: Mon, 28 Sep 2020 13:54:26 -0400
Message-Id: <20200928175428.4110504-29-zi.yan@sent.com>
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>

From: Zi Yan

The CMA reserve function will be used by other allocations, such as the
1GB THP allocation in the upcoming commit.

Signed-off-by: Zi Yan
---
 .../admin-guide/kernel-parameters.txt |  2 +-
 arch/arm64/mm/hugetlbpage.c           |  2 +-
 arch/powerpc/mm/hugetlbpage.c         |  2 +-
 arch/x86/kernel/setup.c               |  8 +-
 include/linux/cma.h                   | 15 +++
 include/linux/hugetlb.h               | 12 ---
 mm/cma.c                              | 88 ++++++++++++++++++
 mm/hugetlb.c                          | 92 ++-----------------
 8 files changed, 120 insertions(+), 101 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7fbfc1a3e1e1..3f8f3199f4fc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1524,7 +1524,7 @@
 	hpet_mmap=	[X86, HPET_MMAP] Allow userspace to mmap HPET
 			registers.  Default set by CONFIG_HPET_MMAP_DEFAULT.

-	hugetlb_cma=	[HW] The size of a cma area used for allocation
+	hugepage_cma=	[HW] The size of a cma area used for allocation
 			of gigantic hugepages.
 			Format: nn[KMGTPE]

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 55ecf6de9ff7..8a3ad7eaae49 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -52,7 +52,7 @@ void __init arm64_hugetlb_cma_reserve(void)
 	 * breaking this assumption.
 	 */
 	WARN_ON(order <= MAX_ORDER);
-	hugetlb_cma_reserve(order);
+	hugepage_cma_reserve(order);
 }
 #endif /* CONFIG_CMA */

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 36c3800769fb..6c1e61251df2 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -713,6 +713,6 @@ void __init gigantic_hugetlb_cma_reserve(void)

 	if (order) {
 		VM_WARN_ON(order < MAX_ORDER);
-		hugetlb_cma_reserve(order);
+		hugepage_cma_reserve(order);
 	}
 }
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ad8f909b5dc8..a732ead4985a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include <linux/hugetlb.h>
+#include <linux/cma.h>
 #include
 #include
 #include
@@ -641,7 +641,7 @@ static void __init trim_snb_memory(void)
 	 * already been reserved.
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 7fbfc1a3e1e1..3f8f3199f4fc 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1524,7 +1524,7 @@
 	hpet_mmap=	[X86, HPET_MMAP] Allow userspace to mmap HPET
 			registers.  Default set by CONFIG_HPET_MMAP_DEFAULT.
 
-	hugetlb_cma=	[HW] The size of a cma area used for allocation
+	hugepage_cma=	[HW] The size of a cma area used for allocation
 			of gigantic hugepages.
 			Format: nn[KMGTPE]
 
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 55ecf6de9ff7..8a3ad7eaae49 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -52,7 +52,7 @@ void __init arm64_hugetlb_cma_reserve(void)
 	 * breaking this assumption.
 	 */
 	WARN_ON(order <= MAX_ORDER);
-	hugetlb_cma_reserve(order);
+	hugepage_cma_reserve(order);
 }
 #endif /* CONFIG_CMA */
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 36c3800769fb..6c1e61251df2 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -713,6 +713,6 @@ void __init gigantic_hugetlb_cma_reserve(void)
 
 	if (order) {
 		VM_WARN_ON(order < MAX_ORDER);
-		hugetlb_cma_reserve(order);
+		hugepage_cma_reserve(order);
 	}
 }
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ad8f909b5dc8..a732ead4985a 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include <linux/hugetlb.h>
+#include <linux/cma.h>
 #include
 #include
 #include
@@ -641,7 +641,7 @@ static void __init trim_snb_memory(void)
 	 * already been reserved.
 	 */
 	memblock_reserve(0, 1<<20);
-
+
 	for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
 		if (memblock_reserve(bad_pages[i], PAGE_SIZE))
 			printk(KERN_WARNING "failed to reserve 0x%08lx\n",
@@ -733,7 +733,7 @@ static void __init trim_low_memory_range(void)
 {
 	memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
 }
-
+
 /*
  * Dump out kernel offset information on panic.
  */
@@ -1144,7 +1144,7 @@ void __init setup_arch(char **cmdline_p)
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
 	if (boot_cpu_has(X86_FEATURE_GBPAGES))
-		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+		hugepage_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
 
 	/*
 	 * Reserve memory for crash kernel after SRAT is parsed so that it
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 217999c8a762..9989d580c2a7 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -49,4 +49,19 @@ extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
+
+extern void cma_reserve(int min_order, unsigned long requested_size,
+			const char *name, struct cma *cma_struct[MAX_NUMNODES]);
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+extern void __init hugepage_cma_reserve(int order);
+extern void __init hugepage_cma_check(void);
+#else
+static inline void __init hugepage_cma_check(void)
+{
+}
+static inline void __init hugepage_cma_reserve(int order)
+{
+}
+#endif
+
 #endif
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d5cc5f802dd4..087d13a1dc24 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -935,16 +935,4 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
 	return ptl;
 }
 
-#if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
-extern void __init hugetlb_cma_reserve(int order);
-extern void __init hugetlb_cma_check(void);
-#else
-static inline __init void hugetlb_cma_reserve(int order)
-{
-}
-static inline __init void hugetlb_cma_check(void)
-{
-}
-#endif
-
 #endif /* _LINUX_HUGETLB_H */
diff --git a/mm/cma.c b/mm/cma.c
index 7f415d7cda9f..1a9d997fa5ab 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -38,6 +38,10 @@
 struct cma cma_areas[MAX_CMA_AREAS];
 unsigned cma_area_count;
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+struct cma *hugepage_cma[MAX_NUMNODES];
+#endif
+unsigned long hugepage_cma_size __initdata;
 static DEFINE_MUTEX(cma_mutex);
 
 phys_addr_t cma_get_base(const struct cma *cma)
@@ -541,3 +545,87 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 
 	return 0;
 }
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
+/*
+ * cma_reserve() - reserve CMA for gigantic pages on nodes with memory
+ *
+ * Must be called after free_area_init(), which updates N_MEMORY via
+ * node_set_state(). cma_reserve() scans over the N_MEMORY nodemask and
+ * hence expects the platform to have initialized the N_MEMORY state.
+ */
+void __init cma_reserve(int min_order, unsigned long requested_size,
+			const char *name, struct cma *cma_struct[MAX_NUMNODES])
+{
+	unsigned long size, reserved, per_node;
+	int nid;
+
+	if (!requested_size)
+		return;
+
+	if (requested_size < (PAGE_SIZE << min_order)) {
+		pr_warn("%s_cma: cma area should be at least %lu MiB\n",
+			name, (PAGE_SIZE << min_order) / SZ_1M);
+		return;
+	}
+
+	/*
+	 * If a 3 GB area is requested on a machine with 4 NUMA nodes,
+	 * allocate 1 GB on the first three nodes and ignore the last one.
+	 */
+	per_node = DIV_ROUND_UP(requested_size, nr_online_nodes);
+	pr_info("%s_cma: reserve %lu MiB, up to %lu MiB per node\n",
+		name, requested_size / SZ_1M, per_node / SZ_1M);
+
+	reserved = 0;
+	for_each_node_state(nid, N_ONLINE) {
+		int res;
+		char node_name[CMA_MAX_NAME];
+
+		size = min(per_node, requested_size - reserved);
+		size = round_up(size, PAGE_SIZE << min_order);
+
+		snprintf(node_name, sizeof(node_name), "%s%d", name, nid);
+		res = cma_declare_contiguous_nid(0, size, 0,
+						 PAGE_SIZE << min_order,
+						 0, false, node_name,
+						 &cma_struct[nid], nid);
+		if (res) {
+			pr_warn("%s_cma: reservation failed: err %d, node %d\n",
+				name, res, nid);
+			continue;
+		}
+
+		reserved += size;
+		pr_info("%s_cma: reserved %lu MiB on node %d\n",
+			name, size / SZ_1M, nid);
+
+		if (reserved >= requested_size)
+			break;
+	}
+}
+
+static bool hugepage_cma_reserve_called __initdata;
+
+static int __init cmdline_parse_hugepage_cma(char *p)
+{
+	hugepage_cma_size = memparse(p, &p);
+	return 0;
+}
+
+early_param("hugepage_cma", cmdline_parse_hugepage_cma);
+
+void __init hugepage_cma_reserve(int order)
+{
+	hugepage_cma_reserve_called = true;
+	cma_reserve(order, hugepage_cma_size, "hugepage", hugepage_cma);
+}
+
+void __init hugepage_cma_check(void)
+{
+	if (!hugepage_cma_size || hugepage_cma_reserve_called)
+		return;
+
+	pr_warn("hugepage_cma: the option isn't supported by current arch\n");
+}
+#endif
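Putting numbers to the example in the comment above: booting a hypothetical 4-node machine with hugepage_cma=3G and 1 GiB gigantic pages (PAGE_SIZE << min_order == SZ_1G) gives per_node = DIV_ROUND_UP(3072 MiB, 4) = 768 MiB, which round_up() then raises to the 1 GiB gigantic-page size. The resulting boot messages take this shape (derived from the pr_info() format strings above; values illustrative):

hugepage_cma: reserve 3072 MiB, up to 768 MiB per node
hugepage_cma: reserved 1024 MiB on node 0
hugepage_cma: reserved 1024 MiB on node 1
hugepage_cma: reserved 1024 MiB on node 2

Note that the rounding can push the actual per-node reservation above the advertised per-node figure, and once reserved >= requested_size the remaining nodes (node 3 here) are skipped.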
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 25674d7b1e5f..871f1c315c48 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -48,9 +48,9 @@ unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 #ifdef CONFIG_CMA
-static struct cma *hugetlb_cma[MAX_NUMNODES];
+extern struct cma *hugepage_cma[MAX_NUMNODES];
 #endif
-static unsigned long hugetlb_cma_size __initdata;
+extern unsigned long hugepage_cma_size __initdata;
 
 /*
  * Minimum page order among possible hugepage sizes, set to a proper value
@@ -1227,7 +1227,7 @@ static void free_gigantic_page(struct page *page, unsigned int order)
 	 * cma_release() returns false.
 	 */
 #ifdef CONFIG_CMA
-	if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+	if (cma_release(hugepage_cma[page_to_nid(page)], page, 1 << order))
 		return;
 #endif
 
@@ -1247,8 +1247,8 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 	struct page *page;
 	int node;
 
-	if (hugetlb_cma[nid]) {
-		page = cma_alloc(hugetlb_cma[nid], nr_pages,
+	if (hugepage_cma[nid]) {
+		page = cma_alloc(hugepage_cma[nid], nr_pages,
 				 huge_page_order(h), true);
 		if (page)
 			return page;
@@ -1256,10 +1256,10 @@ static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
 
 	if (!(gfp_mask & __GFP_THISNODE)) {
 		for_each_node_mask(node, *nodemask) {
-			if (node == nid || !hugetlb_cma[node])
+			if (node == nid || !hugepage_cma[node])
 				continue;
 
-			page = cma_alloc(hugetlb_cma[node], nr_pages,
+			page = cma_alloc(hugepage_cma[node], nr_pages,
 					 huge_page_order(h), true);
 			if (page)
 				return page;
@@ -2554,8 +2554,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 	for (i = 0; i < h->max_huge_pages; ++i) {
 		if (hstate_is_gigantic(h)) {
-			if (hugetlb_cma_size) {
-				pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
+			if (hugepage_cma_size) {
+				pr_warn_once("HugeTLB: hugepage_cma is enabled, skip boot time allocation\n");
 				break;
 			}
 			if (!alloc_bootmem_huge_page(h))
@@ -3231,7 +3231,7 @@ static int __init hugetlb_init(void)
 		}
 	}
 
-	hugetlb_cma_check();
+	hugepage_cma_check();
 	hugetlb_init_hstates();
 	gather_bootmem_prealloc();
 	report_hugepages();
@@ -5665,75 +5665,3 @@ void move_hugetlb_state(struct page *oldpage, struct page *newpage, int reason)
 		spin_unlock(&hugetlb_lock);
 	}
 }
-
-#ifdef CONFIG_CMA
-static bool cma_reserve_called __initdata;
-
-static int __init cmdline_parse_hugetlb_cma(char *p)
-{
-	hugetlb_cma_size = memparse(p, &p);
-	return 0;
-}
-
-early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
-
-void __init hugetlb_cma_reserve(int order)
-{
-	unsigned long size, reserved, per_node;
-	int nid;
-
-	cma_reserve_called = true;
-
-	if (!hugetlb_cma_size)
-		return;
-
-	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
-		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
-			(PAGE_SIZE << order) / SZ_1M);
-		return;
-	}
-
-	/*
-	 * If 3 GB area is requested on a machine with 4 numa nodes,
-	 * let's allocate 1 GB on first three nodes and ignore the last one.
- */ - per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes); - pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n", - hugetlb_cma_size / SZ_1M, per_node / SZ_1M); - - reserved = 0; - for_each_node_state(nid, N_ONLINE) { - int res; - char name[CMA_MAX_NAME]; - - size = min(per_node, hugetlb_cma_size - reserved); - size = round_up(size, PAGE_SIZE << order); - - snprintf(name, sizeof(name), "hugetlb%d", nid); - res = cma_declare_contiguous_nid(0, size, 0, PAGE_SIZE << order, - 0, false, name, - &hugetlb_cma[nid], nid); - if (res) { - pr_warn("hugetlb_cma: reservation failed: err %d, node %d", - res, nid); - continue; - } - - reserved += size; - pr_info("hugetlb_cma: reserved %lu MiB on node %d\n", - size / SZ_1M, nid); - - if (reserved >= hugetlb_cma_size) - break; - } -} - -void __init hugetlb_cma_check(void) -{ - if (!hugetlb_cma_size || cma_reserve_called) - return; - - pr_warn("hugetlb_cma: the option isn't supported by current arch\n"); -} - -#endif /* CONFIG_CMA */ From patchwork Mon Sep 28 17:54:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Zi Yan X-Patchwork-Id: 11804497 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 91039618 for ; Mon, 28 Sep 2020 17:56:40 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 40499221E7 for ; Mon, 28 Sep 2020 17:56:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=sent.com header.i=@sent.com header.b="gkWZtUub"; dkim=temperror (0-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="Leevw8rq" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 40499221E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=sent.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id AAFFA90001A; Mon, 28 Sep 2020 13:55:32 -0400 (EDT) Delivered-To: linux-mm-outgoing@kvack.org Received: by kanga.kvack.org (Postfix, from userid 40) id A5A8890001F; Mon, 28 Sep 2020 13:55:32 -0400 (EDT) X-Original-To: int-list-linux-mm@kvack.org X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8D95E900020; Mon, 28 Sep 2020 13:55:32 -0400 (EDT) X-Original-To: linux-mm@kvack.org X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0124.hostedemail.com [216.40.44.124]) by kanga.kvack.org (Postfix) with ESMTP id 6F6A790001F for ; Mon, 28 Sep 2020 13:55:32 -0400 (EDT) Received: from smtpin02.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 36323824999B for ; Mon, 28 Sep 2020 17:55:32 +0000 (UTC) X-FDA: 77313222504.02.watch04_1300cf827183 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin02.hostedemail.com (Postfix) with ESMTP id 03AB6101FA54A for ; Mon, 28 Sep 2020 17:55:31 +0000 (UTC) X-Spam-Summary: 
From patchwork Mon Sep 28 17:54:27 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804497
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 29/30] mm: thp: use cma reservation for pud thp allocation.
Date: Mon, 28 Sep 2020 13:54:27 -0400
Message-Id: <20200928175428.4110504-30-zi.yan@sent.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>
Reply-To: Zi Yan
MIME-Version: 1.0

From: Zi Yan

Share the hugepage_cma reservation with hugetlb for PUD THP allocation.
The reserved CMA regions can still be used for movable page allocations.
During a 1GB page split, all subpages are cleared from the CMA bitmap,
since they are no longer part of a 1GB page and will be freed via the
normal path instead of cma_release().

Signed-off-by: Zi Yan
---
 include/linux/cma.h     |  3 +++
 include/linux/huge_mm.h | 10 ++++++++++
 mm/cma.c                | 31 +++++++++++++++++++++++++++++++
 mm/huge_memory.c        | 34 ++++++++++++++++++++++++++++++++++
 mm/hugetlb.c            | 21 +--------------------
 mm/mempolicy.c          | 14 +++++++++++++-
 mm/page_alloc.c         | 29 +++++++++++++++++++++++++++++
 7 files changed, 121 insertions(+), 21 deletions(-)
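Before the diff, a sketch of the split-time contract just described, using the helper this patch introduces. The identifiers come from this series; the wrapper function itself is illustrative:

/* Illustrative use of cma_clear_bitmap_if_in_range() at split time;
 * head is a 1GB THP head page allocated from hugepage_cma.
 */
static void example_untrack_split_thp(struct page *head)
{
	struct cma *cma = hugepage_cma[page_to_nid(head)];

	/* After this, the former subpages are plain pages: they are
	 * freed through the normal page allocator path instead of
	 * cma_release(). */
	if (!cma_clear_bitmap_if_in_range(cma, head, thp_nr_pages(head)))
		pr_warn("pfn %lx: not from hugepage_cma\n",
			page_to_pfn(head));
}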
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 9989d580c2a7..c299b62b3a7a 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -48,6 +48,9 @@ extern struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
 			      bool no_warn);
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned int count);
 
+extern bool cma_clear_bitmap_if_in_range(struct cma *cma, const struct page *page,
+					 unsigned int count);
+
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
 
 extern void cma_reserve(int min_order, unsigned long requested_size,
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 0d0f9cf25aeb..163b244d9acd 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -24,6 +24,8 @@ extern struct page *follow_trans_huge_pud(struct vm_area_struct *vma,
 					  unsigned long addr,
 					  pud_t *pud,
 					  unsigned int flags);
+extern struct page *alloc_thp_pud_page(int nid);
+extern bool free_thp_pud_page(struct page *page, int order);
 #else
 static inline void huge_pud_set_accessed(struct vm_fault *vmf, pud_t orig_pud)
 {
@@ -43,6 +45,14 @@ struct page *follow_trans_huge_pud(struct vm_area_struct *vma,
 {
 	return NULL;
 }
+static inline struct page *alloc_thp_pud_page(int nid)
+{
+	return NULL;
+}
+static inline bool free_thp_pud_page(struct page *page, int order)
+{
+	return false;
+}
 #endif
 
 extern vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd);
diff --git a/mm/cma.c b/mm/cma.c
index 1a9d997fa5ab..c595aad61f58 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -532,6 +532,37 @@ bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
 	return true;
 }
 
+/**
+ * cma_clear_bitmap_if_in_range() - clear bitmap for a given page
+ * @cma: Contiguous memory region for which the allocation was performed.
+ * @pages: Allocated pages.
+ * @count: Number of allocated pages.
+ *
+ * This function clears the bitmap of memory allocated by cma_alloc().
+ * It returns false when the provided pages do not belong to the
+ * contiguous area and true otherwise.
+ */
+bool cma_clear_bitmap_if_in_range(struct cma *cma, const struct page *pages,
+				  unsigned int count)
+{
+	unsigned long pfn;
+
+	if (!cma || !pages)
+		return false;
+
+	pfn = page_to_pfn(pages);
+
+	if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count)
+		return false;
+
+	if (pfn + count > cma->base_pfn + cma->count)
+		return false;
+
+	cma_clear_bitmap(cma, pfn, count);
+
+	return true;
+}
+
 int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 {
 	int i;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 20ecffc27396..910e51f35910 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -33,6 +33,7 @@
 #include
 #include
 #include
+#include <linux/cma.h>
 
 #include
 #include
@@ -62,6 +63,10 @@ static struct shrinker deferred_split_shrinker;
 static atomic_t huge_zero_refcount;
 struct page *huge_zero_page __read_mostly;
 
+#ifdef CONFIG_CMA
+extern struct cma *hugepage_cma[MAX_NUMNODES];
+#endif
+
 bool transparent_hugepage_enabled(struct vm_area_struct *vma)
 {
 	/* The addr is used to check if the vma size fits */
@@ -2498,6 +2503,17 @@ static void __split_huge_pud_page(struct page *page, struct list_head *list,
 	/* no file-back page support yet */
 	VM_BUG_ON(!PageAnon(page));
 
+	/*
+	 * Clear the CMA bitmap when we split a PUD page so the subpages
+	 * can be freed as normal pages. Keep the call outside VM_BUG_ON():
+	 * with CONFIG_DEBUG_VM=n, VM_BUG_ON() does not evaluate its
+	 * argument, and the bitmap would never be cleared.
+	 */
+	if (IS_ENABLED(CONFIG_CMA)) {
+		struct cma *cma = hugepage_cma[page_to_nid(head)];
+		bool cleared = cma_clear_bitmap_if_in_range(cma, head,
+							    thp_nr_pages(head));
+
+		VM_BUG_ON(!cleared);
+	}
+
 	for (i = HPAGE_PUD_NR - HPAGE_PMD_NR; i >= 1; i -= HPAGE_PMD_NR)
 		__split_huge_pud_page_tail(head, i, lruvec, list);
 
@@ -3732,3 +3748,21 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
 	update_mmu_cache_pmd(vma, address, pvmw->pmd);
 }
 #endif
+
+struct page *alloc_thp_pud_page(int nid)
+{
+	struct page *page = NULL;
+#ifdef CONFIG_CMA
+	page = cma_alloc(hugepage_cma[nid], HPAGE_PUD_NR, HPAGE_PUD_ORDER, true);
+#endif
+	return page;
+}
+
+bool free_thp_pud_page(struct page *page, int order)
+{
+	bool ret = false;
+#ifdef CONFIG_CMA
+	ret = cma_release(hugepage_cma[page_to_nid(page)], page, 1 << order);
+#endif
+	return ret;
+}
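A minimal sketch of how the two new helpers pair up; illustrative only, with HPAGE_PUD_ORDER taken from earlier patches in this series:

/* Illustrative pairing of the helpers added above; not part of the
 * patch itself.
 */
static void example_cycle_pud_page(int nid)
{
	struct page *page = alloc_thp_pud_page(nid);

	if (!page)
		return;		/* CMA exhausted: caller would fall back */

	/* ... use the 1GB page ... */

	/* Freeing an intact PUD THP goes back through cma_release();
	 * a page no longer tracked by CMA (e.g. after a split cleared
	 * the bitmap) is freed via the normal allocator instead. */
	if (!free_thp_pud_page(page, HPAGE_PUD_ORDER))
		__free_pages(page, HPAGE_PUD_ORDER);
}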
From patchwork Mon Sep 28 17:54:28 2020
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 11804499
From: Zi Yan
To: linux-mm@kvack.org
Cc: "Kirill A. Shutemov", Roman Gushchin, Rik van Riel, Matthew Wilcox,
    Shakeel Butt, Yang Shi, Jason Gunthorpe, Mike Kravetz, Michal Hocko,
    David Hildenbrand, William Kucharski, Andrea Arcangeli, John Hubbard,
    David Nellans, linux-kernel@vger.kernel.org, Zi Yan
Subject: [RFC PATCH v2 30/30] mm: thp: enable anonymous PUD THP at page fault path.
Date: Mon, 28 Sep 2020 13:54:28 -0400
Message-Id: <20200928175428.4110504-31-zi.yan@sent.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200928175428.4110504-1-zi.yan@sent.com>
References: <20200928175428.4110504-1-zi.yan@sent.com>
Reply-To: Zi Yan
MIME-Version: 1.0

From: Zi Yan

All the previous commits have anonymous PUD THP support ready, so
anonymous PUD THP page faults can be enabled now.

Signed-off-by: Zi Yan
---
 mm/memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 9f7b509a3aa7..dc285d9872fc 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4122,16 +4122,15 @@ static vm_fault_t create_huge_pud(struct vm_fault *vmf)
 {
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) &&			\
 	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
-	/* No support for anonymous transparent PUD pages yet */
 	if (vma_is_anonymous(vmf->vma))
-		goto split;
+		return do_huge_pud_anonymous_page(vmf);
 	if (vmf->vma->vm_ops->huge_fault) {
 		vm_fault_t ret = vmf->vma->vm_ops->huge_fault(vmf, PE_SIZE_PUD);
 
 		if (!(ret & VM_FAULT_FALLBACK))
 			return ret;
 	}
-split:
+
 	/* COW or write-notify not handled on PUD level: split pud. */
 	__split_huge_pud(vmf->vma, vmf->pud, vmf->address, false, NULL);
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 	return VM_FAULT_FALLBACK;
 }
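For completeness, a user-space sketch of one way to exercise the new fault path. It assumes THP is enabled for the mapping (via MADV_HUGEPAGE here) and that the series' alignment requirement is the PMD THP rule scaled to 1 GiB; both assumptions are illustrative, not stated by the patch:

/* User-space sketch: touch a 1 GiB-aligned anonymous region so the
 * first write fault can reach do_huge_pud_anonymous_page().
 */
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define PUD_SIZE (1UL << 30)	/* 1 GiB on x86-64; an assumption */

int main(void)
{
	/* Over-map 2 GiB so a 1 GiB-aligned window exists inside. */
	size_t len = 2 * PUD_SIZE;
	char *raw = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (raw == MAP_FAILED)
		return 1;

	char *aligned = (char *)(((uintptr_t)raw + PUD_SIZE - 1) &
				 ~(PUD_SIZE - 1));
	madvise(aligned, PUD_SIZE, MADV_HUGEPAGE);

	/* First write fault on the aligned window can now be served
	 * with a single 1 GiB page. */
	memset(aligned, 1, PUD_SIZE);
	return 0;
}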