From patchwork Sun Nov 8 14:10:57 2020
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 11889601
From: Muchun Song <songmuchun@bytedance.com>
To: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
 dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
 viro@zeniv.linux.org.uk, akpm@linux-foundation.org, paulmck@kernel.org,
 mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
 rdunlap@infradead.org, oneukum@suse.com, anshuman.khandual@arm.com,
 jroedel@suse.de, almasrymina@google.com, rientjes@google.com,
 willy@infradead.org, osalvador@suse.de, mhocko@suse.com
Cc: duanxiongchun@bytedance.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH v3 05/21] mm/hugetlb: Introduce pgtable allocation/freeing helpers
Date: Sun, 8 Nov 2020 22:10:57 +0800
Message-Id: <20201108141113.65450-6-songmuchun@bytedance.com>
X-Mailer: git-send-email 2.21.0 (Apple Git-122)
In-Reply-To: <20201108141113.65450-1-songmuchun@bytedance.com>
References: <20201108141113.65450-1-songmuchun@bytedance.com>
MIME-Version: 1.0

On x86_64, vmemmap is always PMD mapped if the machine has hugepage
support and we have 2MB contiguous, PMD-aligned pages. If we want to
free the unused vmemmap pages, we have to split the huge PMD first,
so we pre-allocate the page tables needed to split a PMD mapping into
PTE mappings.
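As a worked example of how much this pre-allocation amounts to, here is
a minimal userspace sketch of the sizing logic. The 64-byte
sizeof(struct page) and the 4KB/2MB shifts are x86_64 assumptions, and
pgtables_per_hpage() is a hypothetical stand-in for the
pgtable_pages_to_prealloc_per_hpage() helper added below:

/*
 * Userspace sketch, not kernel code: how many page-table pages must be
 * pre-allocated per hugepage so the PMD-mapped vmemmap can later be
 * split into PTEs without allocating memory.
 */
#include <stdio.h>

#define PAGE_SHIFT		12	/* 4KB base pages (assumed) */
#define VMEMMAP_HPAGE_SHIFT	21	/* vmemmap is PMD (2MB) mapped */
#define VMEMMAP_HPAGE_SIZE	(1UL << VMEMMAP_HPAGE_SHIFT)
#define STRUCT_PAGE_SIZE	64UL	/* sizeof(struct page), assumed */
#define ALIGN(x, a)		(((x) + (a) - 1) & ~((a) - 1))

static unsigned long pgtables_per_hpage(unsigned int hpage_shift)
{
	/* struct pages describing one hugepage, and the bytes they occupy */
	unsigned long nr_struct_pages = 1UL << (hpage_shift - PAGE_SHIFT);
	unsigned long vmemmap_size = nr_struct_pages * STRUCT_PAGE_SIZE;

	/* one PTE page per PMD-mapped 2MB block of vmemmap */
	return ALIGN(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
}

int main(void)
{
	printf("2MB hugepage: %lu pgtable page(s)\n", pgtables_per_hpage(21));
	printf("1GB hugepage: %lu pgtable page(s)\n", pgtables_per_hpage(30));
	return 0;
}

This prints 1 for a 2MB hugepage (its 32KB of vmemmap fits in one
PMD-mapped block) and 8 for a 1GB hugepage, so the later PMD split can
proceed without allocating in the freeing path.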
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 include/linux/hugetlb.h |  10 +++++
 mm/hugetlb.c            | 111 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 121 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eed3dd3bd626..d81c262418db 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -593,6 +593,16 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
 
 #include <asm/hugetlb.h>
 
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+#ifndef VMEMMAP_HPAGE_SHIFT
+#define VMEMMAP_HPAGE_SHIFT		HPAGE_SHIFT
+#endif
+#define VMEMMAP_HPAGE_ORDER		(VMEMMAP_HPAGE_SHIFT - PAGE_SHIFT)
+#define VMEMMAP_HPAGE_NR		(1 << VMEMMAP_HPAGE_ORDER)
+#define VMEMMAP_HPAGE_SIZE		((1UL) << VMEMMAP_HPAGE_SHIFT)
+#define VMEMMAP_HPAGE_MASK		(~(VMEMMAP_HPAGE_SIZE - 1))
+#endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
+
 #ifndef is_hugepage_only_range
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					unsigned long addr, unsigned long len)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a0007902fafb..5c7be2ee7e15 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1303,6 +1303,108 @@ static inline void destroy_compound_gigantic_page(struct page *page,
  */
 #define RESERVE_VMEMMAP_NR	2U
 
+#define page_huge_pte(page)	((page)->pmd_huge_pte)
+
+static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
+{
+	return h->nr_free_vmemmap_pages;
+}
+
+static inline unsigned int vmemmap_pages_per_hpage(struct hstate *h)
+{
+	return free_vmemmap_pages_per_hpage(h) + RESERVE_VMEMMAP_NR;
+}
+
+static inline unsigned long vmemmap_pages_size_per_hpage(struct hstate *h)
+{
+	return (unsigned long)vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
+}
+
+static inline unsigned int pgtable_pages_to_prealloc_per_hpage(struct hstate *h)
+{
+	unsigned long vmemmap_size = vmemmap_pages_size_per_hpage(h);
+
+	/*
+	 * No need to pre-allocate page tables when there are no vmemmap pages
+	 * to free.
+	 */
+	if (!free_vmemmap_pages_per_hpage(h))
+		return 0;
+
+	return ALIGN(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
+}
+
+static inline void vmemmap_pgtable_init(struct page *page)
+{
+	page_huge_pte(page) = NULL;
+}
+
+static void vmemmap_pgtable_deposit(struct page *page, pgtable_t pgtable)
+{
+	/* FIFO */
+	if (!page_huge_pte(page))
+		INIT_LIST_HEAD(&pgtable->lru);
+	else
+		list_add(&pgtable->lru, &page_huge_pte(page)->lru);
+	page_huge_pte(page) = pgtable;
+}
+
+static pgtable_t vmemmap_pgtable_withdraw(struct page *page)
+{
+	pgtable_t pgtable;
+
+	/* FIFO */
+	pgtable = page_huge_pte(page);
+	page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru,
+						       struct page, lru);
+	if (page_huge_pte(page))
+		list_del(&pgtable->lru);
+
+	return pgtable;
+}
+
+static int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	int i;
+	pgtable_t pgtable;
+	unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
+
+	if (!nr)
+		return 0;
+
+	vmemmap_pgtable_init(page);
+
+	for (i = 0; i < nr; i++) {
+		pte_t *pte_p;
+
+		pte_p = pte_alloc_one_kernel(&init_mm);
+		if (!pte_p)
+			goto out;
+		vmemmap_pgtable_deposit(page, virt_to_page(pte_p));
+	}
+
+	return 0;
+out:
+	while (i-- && (pgtable = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, page_to_virt(pgtable));
+	return -ENOMEM;
+}
+
+static void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+	pgtable_t pgtable;
+	unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
+
+	if (!nr)
+		return;
+
+	pgtable = page_huge_pte(page);
+	if (!pgtable)
+		return;
+
+	while (nr-- && (pgtable = vmemmap_pgtable_withdraw(page)))
+		pte_free_kernel(&init_mm, page_to_virt(pgtable));
+}
+
 static void __init hugetlb_vmemmap_init(struct hstate *h)
 {
 	unsigned int order = huge_page_order(h);
@@ -1326,6 +1428,15 @@ static void __init hugetlb_vmemmap_init(struct hstate *h)
 static inline void hugetlb_vmemmap_init(struct hstate *h)
 {
 }
+
+static inline int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	return 0;
+}
+
+static inline void vmemmap_pgtable_free(struct hstate *h, struct page *page)
+{
+}
 #endif
 
 static void update_and_free_page(struct hstate *h, struct page *page)
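Note that nothing in this patch calls the new helpers yet; later patches
in the series are expected to wire them into the hugepage lifecycle. A
minimal, hypothetical sketch of such a call site follows, assuming the
existing alloc_buddy_huge_page() and prep_new_huge_page() helpers in
mm/hugetlb.c (the actual hook points and error handling in the series
may differ):

/*
 * Hypothetical call-site sketch, not part of this patch: pre-allocate
 * the PTE pages on a fresh hugepage so a later vmemmap PMD split needs
 * no allocation, and give the page back if pre-allocation fails.
 */
static struct page *alloc_fresh_huge_page_sketch(struct hstate *h,
						 gfp_t gfp_mask, int nid,
						 nodemask_t *nmask)
{
	struct page *page;

	page = alloc_buddy_huge_page(h, gfp_mask, nid, nmask, NULL);
	if (!page)
		return NULL;

	/* Stash one PTE page per PMD-mapped vmemmap block (FIFO list). */
	if (vmemmap_pgtable_prealloc(h, page)) {
		put_page(page);
		return NULL;
	}

	prep_new_huge_page(h, page, page_to_nid(page));
	return page;
}

On the freeing side, vmemmap_pgtable_free() would release any PTE pages
still deposited on the hugepage when it is returned to the buddy
allocator.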