From patchwork Fri Nov 20 06:43:18 2020
From: Muchun Song
To: corbet@lwn.net, mike.kravetz@oracle.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, viro@zeniv.linux.org.uk, akpm@linux-foundation.org, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, rdunlap@infradead.org, oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de, almasrymina@google.com, rientjes@google.com, willy@infradead.org, osalvador@suse.de, mhocko@suse.com, song.bao.hua@hisilicon.com
Cc: duanxiongchun@bytedance.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Muchun Song
Subject: [PATCH v5 14/21] mm/hugetlb: Support freeing vmemmap pages of gigantic page
Date: Fri, 20 Nov 2020 14:43:18 +0800
Message-Id: <20201120064325.34492-15-songmuchun@bytedance.com>
In-Reply-To: <20201120064325.34492-1-songmuchun@bytedance.com>
References: <20201120064325.34492-1-songmuchun@bytedance.com>

Gigantic pages are allocated from bootmem. If we want to free their unused
vmemmap pages, we also need to allocate page tables, so allocate those page
tables from bootmem as well.
Signed-off-by: Muchun Song
---
 include/linux/hugetlb.h |  3 +++
 mm/hugetlb.c            |  5 +++++
 mm/hugetlb_vmemmap.c    | 60 +++++++++++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h    | 13 +++++++++++
 4 files changed, 81 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index eed3dd3bd626..da18fc9ed152 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -506,6 +506,9 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+#ifdef CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
+	pte_t *vmemmap_pte;
+#endif
 };
 
 struct page *alloc_huge_page(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ba927ae7f9bd..055604d07046 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2607,6 +2607,7 @@ static void __init gather_bootmem_prealloc(void)
 		WARN_ON(page_count(page) != 1);
 		prep_compound_huge_page(page, h->order);
 		WARN_ON(PageReserved(page));
+		gather_vmemmap_pgtable_init(m, page);
 		prep_new_huge_page(h, page, page_to_nid(page));
 		put_page(page); /* free it into the hugepage allocator */
 
@@ -2659,6 +2660,10 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 			break;
 		cond_resched();
 	}
+
+	if (hstate_is_gigantic(h))
+		i -= gather_vmemmap_pgtable_prealloc();
+
 	if (i < h->max_huge_pages) {
 		char buf[32];
 
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index e2ddc73ce25f..3629165d8158 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -103,6 +103,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "hugetlb_vmemmap.h"
@@ -204,6 +205,65 @@ int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
 	return -ENOMEM;
 }
 
+unsigned long __init gather_vmemmap_pgtable_prealloc(void)
+{
+	struct huge_bootmem_page *m, *tmp;
+	unsigned long nr_free = 0;
+
+	list_for_each_entry_safe(m, tmp, &huge_boot_pages, list) {
+		struct hstate *h = m->hstate;
+		unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
+		unsigned int pgtable_size;
+
+		if (!nr)
+			continue;
+
+		pgtable_size = nr << PAGE_SHIFT;
+		m->vmemmap_pte = memblock_alloc_try_nid(pgtable_size,
+				PAGE_SIZE, 0, MEMBLOCK_ALLOC_ACCESSIBLE,
+				NUMA_NO_NODE);
+		if (!m->vmemmap_pte) {
+			nr_free++;
+			list_del(&m->list);
+			memblock_free_early(__pa(m), huge_page_size(h));
+		}
+	}
+
+	return nr_free;
+}
+
+void __init gather_vmemmap_pgtable_init(struct huge_bootmem_page *m,
+					struct page *page)
+{
+	struct hstate *h = m->hstate;
+	unsigned long pte = (unsigned long)m->vmemmap_pte;
+	unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
+
+	/*
+	 * Use the huge page lru list to temporarily store the preallocated
+	 * pages. The preallocated pages are used and the list is emptied
+	 * before the huge page is put into use. When the huge page is put
+	 * into use by prep_new_huge_page() the list will be reinitialized.
+	 */
+	INIT_LIST_HEAD(&page->lru);
+
+	while (nr--) {
+		struct page *pte_page = virt_to_page(pte);
+
+		__ClearPageReserved(pte_page);
+		list_add(&pte_page->lru, &page->lru);
+		pte += PAGE_SIZE;
+	}
+
+	/*
+	 * If we had gigantic hugepages allocated at boot time, we need
+	 * to restore the 'stolen' pages to totalram_pages in order to
+	 * fix confusing memory reports from free(1) and another
+	 * side-effects, like CommitLimit going negative.
+	 */
+	adjust_managed_page_count(page, nr);
+}
+
 /*
  * Walk a vmemmap address to the pmd it maps.
  */
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 6dfa7ed6f88a..779d3cb9333f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -14,6 +14,9 @@ void __init hugetlb_vmemmap_init(struct hstate *h);
 int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page);
 void vmemmap_pgtable_free(struct page *page);
+unsigned long __init gather_vmemmap_pgtable_prealloc(void);
+void __init gather_vmemmap_pgtable_init(struct huge_bootmem_page *m,
+					struct page *page);
 void alloc_huge_page_vmemmap(struct hstate *h, struct page *head);
 void free_huge_page_vmemmap(struct hstate *h, struct page *head);
 
@@ -35,6 +38,16 @@ static inline void vmemmap_pgtable_free(struct page *page)
 {
 }
 
+static inline unsigned long gather_vmemmap_pgtable_prealloc(void)
+{
+	return 0;
+}
+
+static inline void gather_vmemmap_pgtable_init(struct huge_bootmem_page *m,
+					       struct page *page)
+{
+}
+
 static inline void alloc_huge_page_vmemmap(struct hstate *h, struct page *head)
 {
 }