From patchwork Mon Oct 26 14:51:00 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 11857459
From: Muchun Song
To: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 viro@zeniv.linux.org.uk, akpm@linux-foundation.org, paulmck@kernel.org,
 mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
 rdunlap@infradead.org, oneukum@suse.com, anshuman.khandual@arm.com,
 jroedel@suse.de, almasrymina@google.com, rientjes@google.com,
 willy@infradead.org
Cc: duanxiongchun@bytedance.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, Muchun Song
Subject: [PATCH v2 05/19]
 mm/hugetlb: Introduce pgtable allocation/freeing helpers
Date: Mon, 26 Oct 2020 22:51:00 +0800
Message-Id: <20201026145114.59424-6-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.21.0 (Apple Git-122)
In-Reply-To: <20201026145114.59424-1-songmuchun@bytedance.com>
References: <20201026145114.59424-1-songmuchun@bytedance.com>

On some architectures, the vmemmap area is mapped with huge pages.
To free the unused vmemmap pages, we must first split the huge PMD.
Pre-allocate the page tables needed for that split, and add helpers
to allocate and free them.

Signed-off-by: Muchun Song
---
 arch/x86/include/asm/hugetlb.h |   5 ++
 include/linux/hugetlb.h        |  17 +++++
 mm/hugetlb.c                   | 117 +++++++++++++++++++++++++++++++++
 3 files changed, 139 insertions(+)

diff --git a/arch/x86/include/asm/hugetlb.h b/arch/x86/include/asm/hugetlb.h
index 1721b1aadeb1..f5e882f999cd 100644
--- a/arch/x86/include/asm/hugetlb.h
+++ b/arch/x86/include/asm/hugetlb.h
@@ -5,6 +5,11 @@
 #include
 #include
 
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+#define VMEMMAP_HPAGE_SHIFT			PMD_SHIFT
+#define arch_vmemmap_support_huge_mapping()	boot_cpu_has(X86_FEATURE_PSE)
+#endif
+
 #define hugepages_supported() boot_cpu_has(X86_FEATURE_PSE)
 
 #endif /* _ASM_X86_HUGETLB_H */
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eed3dd3bd626..ace304a6196c 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -593,6 +593,23 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
 
 #include
 
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+#ifndef arch_vmemmap_support_huge_mapping
+static inline bool arch_vmemmap_support_huge_mapping(void)
+{
+	return false;
+}
+#endif
+
+#ifndef VMEMMAP_HPAGE_SHIFT
+#define VMEMMAP_HPAGE_SHIFT		PMD_SHIFT
+#endif
+#define VMEMMAP_HPAGE_ORDER		(VMEMMAP_HPAGE_SHIFT - PAGE_SHIFT)
+#define VMEMMAP_HPAGE_NR		(1 << VMEMMAP_HPAGE_ORDER)
+#define VMEMMAP_HPAGE_SIZE		((1UL) << VMEMMAP_HPAGE_SHIFT)
+#define VMEMMAP_HPAGE_MASK		(~(VMEMMAP_HPAGE_SIZE - 1))
+#endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
+
 #ifndef is_hugepage_only_range
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					unsigned long addr, unsigned long len)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f1b2b733b49b..d6ae9b6876be 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1295,11 +1295,108 @@ static inline void destroy_compound_gigantic_page(struct page *page,
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 #define RESERVE_VMEMMAP_NR	2U
 
+#define page_huge_pte(page)	((page)->pmd_huge_pte)
+
 static inline unsigned int nr_free_vmemmap(struct hstate *h)
 {
 	return h->nr_free_vmemmap_pages;
 }
 
+static inline unsigned int nr_vmemmap(struct hstate *h)
+{
+	return nr_free_vmemmap(h) + RESERVE_VMEMMAP_NR;
+}
+
+static inline unsigned long nr_vmemmap_size(struct hstate *h)
+{
+	return (unsigned long)nr_vmemmap(h) << PAGE_SHIFT;
+}
+
+static inline unsigned int nr_pgtable(struct hstate *h)
+{
+	unsigned long vmemmap_size = nr_vmemmap_size(h);
+
+	if (!arch_vmemmap_support_huge_mapping())
+		return 0;
+
+	/*
+	 * No need to pre-allocate page tables when there are no vmemmap
+	 * pages to free.
+	 */
+	if (!nr_free_vmemmap(h))
+		return 0;
+
+	return ALIGN(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
+}
+
+static inline void vmemmap_pgtable_init(struct page *page)
+{
+	page_huge_pte(page) = NULL;
+}
+
+static void vmemmap_pgtable_deposit(struct page *page, pte_t *pte_p)
+{
+	pgtable_t pgtable = virt_to_page(pte_p);
+
+	/* FIFO */
+	if (!page_huge_pte(page))
+		INIT_LIST_HEAD(&pgtable->lru);
+	else
+		list_add(&pgtable->lru, &page_huge_pte(page)->lru);
+	page_huge_pte(page) = pgtable;
+}
+
+static pte_t *vmemmap_pgtable_withdraw(struct page *page)
+{
+	pgtable_t pgtable;
+
+	/* FIFO */
+	pgtable = page_huge_pte(page);
+	if (unlikely(!pgtable))
+		return NULL;
+	page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru,
+						       struct page, lru);
+	if (page_huge_pte(page))
+		list_del(&pgtable->lru);
+	return page_to_virt(pgtable);
+}
+
+static int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	int i;
+	pte_t *pte_p;
+	unsigned int nr = nr_pgtable(h);
+
+	if (!nr)
+		return 0;
+
+	vmemmap_pgtable_init(page);
+
+	for (i = 0; i < nr; i++) {
+		pte_p = pte_alloc_one_kernel(&init_mm);
+		if (!pte_p)
+			goto out;
+		vmemmap_pgtable_deposit(page, pte_p);
+	}
+
+	return 0;
+out:
+	while (i-- && (pte_p = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, pte_p);
+	return -ENOMEM;
+}
+
+static inline void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+	pte_t *pte_p;
+
+	if (!nr_pgtable(h))
+		return;
+
+	while ((pte_p = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, pte_p);
+}
+
 static void __init hugetlb_vmemmap_init(struct hstate *h)
 {
 	unsigned int order = huge_page_order(h);
@@ -1323,6 +1420,15 @@ static void __init hugetlb_vmemmap_init(struct hstate *h)
 static inline void hugetlb_vmemmap_init(struct hstate *h)
 {
 }
+
+static inline int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	return 0;
+}
+
+static inline void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+}
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
@@ -1531,6 +1637,9 @@ void free_huge_page(struct page *page)
 
 static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 {
+	/* Must be called before the initialization of @page->lru */
+	vmemmap_pgtable_free(h, page);
+
 	INIT_LIST_HEAD(&page->lru);
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
 	set_hugetlb_cgroup(page, NULL);
@@ -1783,6 +1892,14 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
 	if (!page)
 		return NULL;
 
+	if (vmemmap_pgtable_prealloc(h, page)) {
+		if (hstate_is_gigantic(h))
+			free_gigantic_page(page, huge_page_order(h));
+		else
+			put_page(page);
+		return NULL;
+	}
+
 	if (hstate_is_gigantic(h))
 		prep_compound_gigantic_page(page, huge_page_order(h));
 	prep_new_huge_page(h, page, page_to_nid(page));
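
Note on the nr_pgtable() arithmetic: the following is a minimal userspace sketch, not kernel code, of how the number of pre-allocated page tables is derived from the number of vmemmap pages. It assumes x86-64 with 4 KiB base pages (PAGE_SHIFT of 12, PMD_SHIFT of 21). The name nr_pgtable_sketch(), the ALIGN_UP macro, and the huge_mapping_supported parameter are hypothetical stand-ins for nr_pgtable(), ALIGN() and arch_vmemmap_support_huge_mapping(). How nr_free_vmemmap_pages is computed is not part of this patch, so it is taken as an input here.

```c
/* Assumed values for x86-64 with 4 KiB base pages; not taken from the patch. */
#define PAGE_SHIFT		12
#define VMEMMAP_HPAGE_SHIFT	21	/* PMD_SHIFT on x86-64 */
#define VMEMMAP_HPAGE_SIZE	(1UL << VMEMMAP_HPAGE_SHIFT)

/* Same value as RESERVE_VMEMMAP_NR in the patch. */
#define RESERVE_VMEMMAP_NR	2U

/* Round x up to a multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a)		(((x) + (a) - 1) & ~((a) - 1))

/*
 * Hypothetical userspace mirror of nr_pgtable(): one PTE page table is
 * needed for every PMD-sized chunk of vmemmap that may have to be split.
 */
static unsigned int nr_pgtable_sketch(unsigned int nr_free_vmemmap_pages,
				      int huge_mapping_supported)
{
	unsigned long vmemmap_size;

	/* Nothing to split if vmemmap is not huge-mapped or nothing is freed. */
	if (!huge_mapping_supported || !nr_free_vmemmap_pages)
		return 0;

	/* Total vmemmap bytes for one huge page: freed pages plus reserved. */
	vmemmap_size = (unsigned long)(nr_free_vmemmap_pages +
				       RESERVE_VMEMMAP_NR) << PAGE_SHIFT;

	return ALIGN_UP(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
}
```

With these assumptions, an input of 6 freeable pages (8 vmemmap pages, 32 KiB in total) rounds up to a single 2 MiB chunk and yields 1 page table, while an input of 4094 freeable pages (4096 vmemmap pages, 16 MiB in total) yields 8. The inputs 6 and 4094 are illustrative only.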