From patchwork Mon Jan 27 23:21:58 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951859
Date: Mon, 27 Jan 2025 23:21:58 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog
Message-ID: <20250127232207.3888640-19-fvdl@google.com>
Subject: [PATCH 18/27] mm/hugetlb: add pre-HVO framework
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
	roman.gushchin@linux.dev, Frank van der Linden
Define flags for pre-HVOed bootmem hugetlb pages, and act on them.

The most important flag is the HVO flag, signalling that a bootmem
allocated gigantic page has already been HVO-ed. If this flag is seen
by the hugetlb bootmem gather code, the page is marked as HVO
optimized. The HVO code will then not try to optimize it again.
Instead, it will just map the tail page mirror pages read-only,
completing the HVO steps.

No functional change, as nothing sets the flags yet.

Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/hugetlbpage.c |  1 +
 include/linux/hugetlb.h       |  4 +++
 mm/hugetlb.c                  | 24 ++++++++++++++++-
 mm/hugetlb_vmemmap.c          | 50 +++++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.h          | 15 +++++++++++
 5 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 6b043180220a..d3c1b749dcfc 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate)
 	gpage_freearray[nr_gpages] = 0;
 	list_add(&m->list, &huge_boot_pages[0]);
 	m->hstate = hstate;
+	m->flags = 0;
 
 	return 1;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5061279e5f73..10a7ce2b95e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -681,8 +681,12 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+	unsigned long flags;
 };
 
+#define HUGE_BOOTMEM_HVO		0x0001
+#define HUGE_BOOTMEM_ZONES_VALID	0x0002
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2aa35c1d112b..05c5a65e605f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3220,6 +3220,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
+	m->flags = 0;
 
 	return 1;
 }
@@ -3287,7 +3288,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
@@ -3316,6 +3317,13 @@ static bool __init hugetlb_bootmem_page_zones_valid(int nid,
 	unsigned long start_pfn;
 	bool valid;
 
+	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
+		/*
+		 * Already validated, skip check.
+		 */
+		return true;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
@@ -3348,6 +3356,11 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3388,6 +3401,15 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		hugetlb_folio_init_vmemmap(folio, h,
 					   HUGETLB_VMEMMAP_RESERVE_PAGES);
 		init_new_hugetlb_folio(h, folio);
+
+		if (hugetlb_bootmem_page_prehvo(m))
+			/*
+			 * If pre-HVO was done, just set the
+			 * flag, the HVO code will then skip
+			 * this folio.
+			 */
+			folio_set_hugetlb_vmemmap_optimized(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 326cdf94192e..4eddf3c30d62 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *fol
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
+					      struct list_head *folio_list,
+					      bool boot)
 {
 	struct folio *folio;
+	int nr_to_optimize;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
+	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split_folio(h, folio);
+		int ret;
+		unsigned long spfn, epfn;
+
+		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
+			/*
+			 * Already optimized by pre-HVO, just map the
+			 * mirrored tail page structs RO.
+			 */
+			spfn = (unsigned long)&folio->page;
+			epfn = spfn + pages_per_huge_page(h);
+			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
+					&folio->page,
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			static_branch_inc(&hugetlb_optimize_vmemmap_key);
+			continue;
+		}
+
+		nr_to_optimize++;
+
+		ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
+	if (!nr_to_optimize)
+		/*
+		 * All pre-HVO folios, nothing left to do. It's ok if
+		 * there is a mix of pre-HVO and not yet HVO-ed folios
+		 * here, as __hugetlb_vmemmap_optimize_folio() will
+		 * skip any folios that already have the optimized flag
+		 * set, see vmemmap_should_optimize_folio().
+		 */
+		goto out;
+
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
@@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		}
 	}
 
+out:
 	flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
+}
+
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
+}
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 2fcae92d3359..a6354a27e63f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -64,6 +66,19 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list
 {
 }
 
+static inline void hugetlb_vmemmap_init_early(int nid)
+{
+}
+
+static inline void hugetlb_vmemmap_init_late(int nid)
+{
+}
+
+static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
+						struct list_head *folio_list)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
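
A note for readers following the series: nothing in this patch raises
HUGE_BOOTMEM_HVO or HUGE_BOOTMEM_ZONES_VALID (every allocation site
initializes m->flags to 0), so the flags are presumably set by later
patches, by whatever early-boot code pre-optimizes a bootmem gigantic
page. The program below is a minimal, self-contained user-space model
of the flag handshake, not kernel code: it reuses names from the patch
(struct huge_bootmem_page, HUGE_BOOTMEM_HVO,
hugetlb_bootmem_page_prehvo()), but the struct is reduced to its flags
member and the mark_prehvo() setter is hypothetical.

#include <stdbool.h>
#include <stdio.h>

#define HUGE_BOOTMEM_HVO		0x0001
#define HUGE_BOOTMEM_ZONES_VALID	0x0002

/* Model of the patch's struct; list and hstate members omitted. */
struct huge_bootmem_page {
	unsigned long flags;
};

/* Mirrors hugetlb_bootmem_page_prehvo() from the patch. */
static bool hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
{
	return m->flags & HUGE_BOOTMEM_HVO;
}

/*
 * Hypothetical setter: stands in for an early-boot path that has
 * already HVO-ed the page before the gather code runs.
 */
static void mark_prehvo(struct huge_bootmem_page *m)
{
	m->flags |= HUGE_BOOTMEM_HVO;
}

int main(void)
{
	struct huge_bootmem_page m = { .flags = 0 };

	mark_prehvo(&m);

	/*
	 * Gather-time decision, as in gather_bootmem_prealloc_node():
	 * a pre-HVOed page only gets the optimized flag set, so the
	 * bulk pass in __hugetlb_vmemmap_optimize_folios() skips it
	 * and just write-protects the mirrored tail page structs.
	 */
	if (hugetlb_bootmem_page_prehvo(&m))
		printf("pre-HVO: mark optimized, bulk HVO pass skips folio\n");
	else
		printf("not pre-HVO: folio is optimized by the bulk pass\n");

	return 0;
}

Note also that the boot argument to __hugetlb_vmemmap_optimize_folios()
keeps the pre-HVO branch out of the runtime path entirely: the existing
hugetlb_vmemmap_optimize_folios() entry point passes false and behaves
as before, while only the bootmem gather path passes true.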