From patchwork Fri Nov 20 06:43:09 2020
X-Patchwork-Submitter: Muchun Song <songmuchun@bytedance.com>
X-Patchwork-Id: 11919571
From: Muchun Song <songmuchun@bytedance.com>
To: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de, mingo@redhat.com,
    bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com,
    luto@kernel.org, peterz@infradead.org, viro@zeniv.linux.org.uk,
    akpm@linux-foundation.org, paulmck@kernel.org, mchehab+huawei@kernel.org,
    pawan.kumar.gupta@linux.intel.com, rdunlap@infradead.org, oneukum@suse.com,
    anshuman.khandual@arm.com, jroedel@suse.de, almasrymina@google.com,
    rientjes@google.com, willy@infradead.org, osalvador@suse.de, mhocko@suse.com,
    song.bao.hua@hisilicon.com
Cc: duanxiongchun@bytedance.com, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>
Subject: [PATCH v5 05/21] mm/hugetlb: Introduce pgtable allocation/freeing helpers
Date: Fri, 20 Nov 2020 14:43:09 +0800
Message-Id: <20201120064325.34492-6-songmuchun@bytedance.com>
In-Reply-To: <20201120064325.34492-1-songmuchun@bytedance.com>
References: <20201120064325.34492-1-songmuchun@bytedance.com>

On x86_64, vmemmap is always PMD mapped if the machine has hugepages
support and we have 2MB of contiguous pages with PMD alignment. If we
want to free the unused vmemmap pages, we first have to split the huge
PMD. So pre-allocate the page tables needed to split the PMD mapping
into PTE mappings.
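To make the sizing concrete, here is a small standalone sketch of the
arithmetic (userspace C; the 64-byte struct page and 2MB PMD size are
hardcoded illustrative assumptions for x86_64, whereas the real helpers
introduced below derive everything from the hstate):

#include <stdio.h>

int main(void)
{
	/* Illustrative x86_64 assumptions: 64-byte struct page, 2MB PMD. */
	const unsigned long struct_page_size = 64;
	const unsigned long pmd_size = 2UL << 20;

	/* vmemmap bytes per hugepage = number of base pages * sizeof(struct page). */
	const unsigned long vmemmap_2m = 512UL * struct_page_size;	/* 32KB */
	const unsigned long vmemmap_1g = 262144UL * struct_page_size;	/* 16MB */

	/* One page table page is needed per PMD covering the vmemmap range. */
	printf("2MB hugepage: %lu pgtable page(s)\n",
	       (vmemmap_2m + pmd_size - 1) / pmd_size);	/* -> 1 */
	printf("1GB hugepage: %lu pgtable page(s)\n",
	       (vmemmap_1g + pmd_size - 1) / pmd_size);	/* -> 8 */

	return 0;
}

Under those assumptions a 2MB hugepage needs a single pre-allocated page
table and a 1GB hugepage needs eight, which is the quantity that
pgtable_pages_to_prealloc_per_hpage() computes below.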
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Suggested-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 mm/hugetlb_vmemmap.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h | 11 ++++++++
 2 files changed, 87 insertions(+)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 1afe245395e5..ec70980000d8 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -99,6 +99,8 @@
  */
 #define pr_fmt(fmt)	"HugeTLB Vmemmap: " fmt
 
+#include <linux/list.h>
+#include <asm/pgalloc.h>
 #include "hugetlb_vmemmap.h"
 
 /*
@@ -111,6 +113,80 @@
  */
 #define RESERVE_VMEMMAP_NR	2U
 
+#ifndef VMEMMAP_HPAGE_SHIFT
+#define VMEMMAP_HPAGE_SHIFT	HPAGE_SHIFT
+#endif
+#define VMEMMAP_HPAGE_ORDER	(VMEMMAP_HPAGE_SHIFT - PAGE_SHIFT)
+#define VMEMMAP_HPAGE_NR	(1 << VMEMMAP_HPAGE_ORDER)
+#define VMEMMAP_HPAGE_SIZE	((1UL) << VMEMMAP_HPAGE_SHIFT)
+#define VMEMMAP_HPAGE_MASK	(~(VMEMMAP_HPAGE_SIZE - 1))
+
+static inline unsigned int free_vmemmap_pages_per_hpage(struct hstate *h)
+{
+	return h->nr_free_vmemmap_pages;
+}
+
+static inline unsigned int vmemmap_pages_per_hpage(struct hstate *h)
+{
+	return free_vmemmap_pages_per_hpage(h) + RESERVE_VMEMMAP_NR;
+}
+
+static inline unsigned long vmemmap_pages_size_per_hpage(struct hstate *h)
+{
+	return (unsigned long)vmemmap_pages_per_hpage(h) << PAGE_SHIFT;
+}
+
+static inline unsigned int pgtable_pages_to_prealloc_per_hpage(struct hstate *h)
+{
+	unsigned long vmemmap_size = vmemmap_pages_size_per_hpage(h);
+
+	/*
+	 * No need pre-allocate page tables when there is no vmemmap pages
+	 * to free.
+	 */
+	if (!free_vmemmap_pages_per_hpage(h))
+		return 0;
+
+	return ALIGN(vmemmap_size, VMEMMAP_HPAGE_SIZE) >> VMEMMAP_HPAGE_SHIFT;
+}
+
+void vmemmap_pgtable_free(struct page *page)
+{
+	struct page *pte_page, *t_page;
+
+	list_for_each_entry_safe(pte_page, t_page, &page->lru, lru) {
+		list_del(&pte_page->lru);
+		pte_free_kernel(&init_mm, page_to_virt(pte_page));
+	}
+}
+
+int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
+
+	/*
+	 * Use the huge page lru list to temporarily store the preallocated
+	 * pages. The preallocated pages are used and the list is emptied
+	 * before the huge page is put into use. When the huge page is put
+	 * into use by prep_new_huge_page() the list will be reinitialized.
+	 */
+	INIT_LIST_HEAD(&page->lru);
+
+	while (nr--) {
+		pte_t *pte_p;
+
+		pte_p = pte_alloc_one_kernel(&init_mm);
+		if (!pte_p)
+			goto out;
+		list_add(&virt_to_page(pte_p)->lru, &page->lru);
+	}
+
+	return 0;
+out:
+	vmemmap_pgtable_free(page);
+	return -ENOMEM;
+}
+
 void __init hugetlb_vmemmap_init(struct hstate *h)
 {
 	unsigned int order = huge_page_order(h);
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 40c0c7dfb60d..9eca6879c0a4 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -12,9 +12,20 @@
 
 #ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
 void __init hugetlb_vmemmap_init(struct hstate *h);
+int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page);
+void vmemmap_pgtable_free(struct page *page);
 #else
 static inline void hugetlb_vmemmap_init(struct hstate *h)
 {
 }
+
+static inline int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
+{
+	return 0;
+}
+
+static inline void vmemmap_pgtable_free(struct page *page)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE_FREE_VMEMMAP */
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */
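For context, a minimal sketch of how a caller might use the two helpers
(the function below is hypothetical and not part of this patch; the real
call sites are expected to be wired up by later patches in the series):

static int example_prepare_huge_page(struct hstate *h, struct page *page)
{
	/* Reserve enough page tables to split the PMD-mapped vmemmap. */
	if (vmemmap_pgtable_prealloc(h, page))
		return -ENOMEM;

	/*
	 * ... split the vmemmap PMDs and free the unused vmemmap pages
	 * here, consuming page tables from page->lru as needed ...
	 */

	/* Release any pre-allocated page tables that were not consumed. */
	vmemmap_pgtable_free(page);
	return 0;
}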