From patchwork Mon Jan 27 23:21:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951850
Date: Mon, 27 Jan 2025 23:21:48 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog
Message-ID: <20250127232207.3888640-9-fvdl@google.com>
Subject: [PATCH 08/27] mm/hugetlb: convert cmdline parameters from setup to early
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Convert the cmdline parameters (hugepagesz, hugepages, default_hugepagesz
and hugetlb_free_vmemmap) to early parameters.

Since parse_early_param might run before MMU setups on some
platforms (powerpc), validation of huge page sizes as specified
in command line parameters would fail. So instead, for the
hstate-related values, just record them and parse them on demand,
from hugetlb_bootmem_alloc.

The allocation of hugetlb bootmem pages is now done in
hugetlb_bootmem_alloc, which is called explicitly at the start of
mm_core_init(). core_initcall would be too late, as that happens
with memblock already torn down.

This change will allow earlier allocation and initialization
of bootmem hugetlb pages later on.

No functional change intended.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   6 ++
 mm/hugetlb.c            | 133 +++++++++++++++++++++++++++++++---------
 mm/hugetlb_vmemmap.c    |   6 +-
 mm/mm_init.c            |   3 +
 4 files changed, 119 insertions(+), 29 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..9cd7c9dacb88 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -174,6 +174,8 @@ struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
 extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
+void hugetlb_bootmem_alloc(void);
+
 /* arch callbacks */
 
 #ifndef CONFIG_HIGHPTE
@@ -1250,6 +1252,10 @@ static inline bool hugetlbfs_pagecache_present(
 {
 	return false;
 }
+
+static inline void hugetlb_bootmem_alloc(void)
+{
+}
 #endif	/* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a67339ca65b4..a95ab44d5545 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -62,6 +63,24 @@ static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 
+/*
+ * Due to ordering constraints across the init code for various
+ * architectures, hugetlb hstate cmdline parameters can't simply
+ * be early_param. early_param might call the setup function
+ * before valid hugetlb page sizes are determined, leading to
+ * incorrect rejection of valid hugepagesz= options.
+ *
+ * So, record the parameters early and consume them whenever the
+ * init code is ready for them, by calling hugetlb_parse_params().
+ */
+
+/* one (hugepagesz=,hugepages=) pair per hstate, one default_hugepagesz */
+#define HUGE_MAX_CMDLINE_ARGS	(2 * HUGE_MAX_HSTATE + 1)
+struct hugetlb_cmdline {
+	char *val;
+	int (*setup)(char *val);
+};
+
 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
 static unsigned long __initdata default_hstate_max_huge_pages;
@@ -69,6 +88,20 @@ static bool __initdata parsed_valid_hugepagesz = true;
 static bool __initdata parsed_default_hugepagesz;
 static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
 
+static char hstate_cmdline_buf[COMMAND_LINE_SIZE] __initdata;
+static int hstate_cmdline_index __initdata;
+static struct hugetlb_cmdline hugetlb_params[HUGE_MAX_CMDLINE_ARGS] __initdata;
+static int hugetlb_param_index __initdata;
+static __init int hugetlb_add_param(char *s, int (*setup)(char *val));
+static __init void hugetlb_parse_params(void);
+
+#define hugetlb_early_param(str, func)	\
+static __init int func##args(char *s)	\
+{	\
+	return hugetlb_add_param(s, func);	\
+}	\
+early_param(str, func##args)
+
 /*
  * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
  * free_huge_pages, and surplus_huge_pages.
@@ -3488,6 +3521,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 		for (i = 0; i < MAX_NUMNODES; i++)
 			INIT_LIST_HEAD(&huge_boot_pages[i]);
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
 		initialized = true;
 	}
 
@@ -4550,8 +4585,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_online_node;
-	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);
 
@@ -4576,6 +4609,42 @@ static void __init hugepages_clear_pages_in_node(void)
 	}
 }
 
+static __init int hugetlb_add_param(char *s, int (*setup)(char *))
+{
+	size_t len;
+	char *p;
+
+	if (hugetlb_param_index >= HUGE_MAX_CMDLINE_ARGS)
+		return -EINVAL;
+
+	len = strlen(s) + 1;
+	if (len + hstate_cmdline_index > sizeof(hstate_cmdline_buf))
+		return -EINVAL;
+
+	p = &hstate_cmdline_buf[hstate_cmdline_index];
+	memcpy(p, s, len);
+	hstate_cmdline_index += len;
+
+	hugetlb_params[hugetlb_param_index].val = p;
+	hugetlb_params[hugetlb_param_index].setup = setup;
+
+	hugetlb_param_index++;
+
+	return 0;
+}
+
+static __init void hugetlb_parse_params(void)
+{
+	int i;
+	struct hugetlb_cmdline *hcp;
+
+	for (i = 0; i < hugetlb_param_index; i++) {
+		hcp = &hugetlb_params[i];
+
+		hcp->setup(hcp->val);
+	}
+}
+
 /*
  * hugepages command line processing
  * hugepages normally follows a valid hugepagsz or default_hugepagsz
@@ -4595,7 +4664,7 @@ static int __init hugepages_setup(char *s)
 	if (!parsed_valid_hugepagesz) {
 		pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return -EINVAL;
 	}
 
 	/*
@@ -4649,24 +4718,16 @@ static int __init hugepages_setup(char *s)
 		}
 	}
 
-	/*
-	 * Global state is always initialized later in hugetlb_init.
-	 * But we need to allocate gigantic hstates here early to still
-	 * use the bootmem allocator.
-	 */
-	if (hugetlb_max_hstate && hstate_is_gigantic(parsed_hstate))
-		hugetlb_hstate_alloc_pages(parsed_hstate);
-
 	last_mhp = mhp;
 
-	return 1;
+	return 0;
 
 invalid:
 	pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
 	hugepages_clear_pages_in_node();
-	return 1;
+	return -EINVAL;
 }
-__setup("hugepages=", hugepages_setup);
+hugetlb_early_param("hugepages", hugepages_setup);
 
 /*
  * hugepagesz command line processing
@@ -4685,7 +4746,7 @@ static int __init hugepagesz_setup(char *s)
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	h = size_to_hstate(size);
@@ -4700,7 +4761,7 @@ static int __init hugepagesz_setup(char *s)
 		if (!parsed_default_hugepagesz || h != &default_hstate ||
 		    default_hstate.max_huge_pages) {
 			pr_warn("HugeTLB: hugepagesz=%s specified twice, ignoring\n", s);
-			return 1;
+			return -EINVAL;
 		}
 
 		/*
@@ -4710,14 +4771,14 @@ static int __init hugepagesz_setup(char *s)
 		 */
 		parsed_hstate = h;
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return 0;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	parsed_valid_hugepagesz = true;
-	return 1;
+	return 0;
 }
-__setup("hugepagesz=", hugepagesz_setup);
+hugetlb_early_param("hugepagesz", hugepagesz_setup);
 
 /*
  * default_hugepagesz command line input
@@ -4731,14 +4792,14 @@ static int __init default_hugepagesz_setup(char *s)
 	parsed_valid_hugepagesz = false;
 	if (parsed_default_hugepagesz) {
 		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	size = (unsigned long)memparse(s, NULL);
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported default_hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
@@ -4755,17 +4816,33 @@ static int __init default_hugepagesz_setup(char *s)
 	 */
 	if (default_hstate_max_huge_pages) {
 		default_hstate.max_huge_pages = default_hstate_max_huge_pages;
-		for_each_online_node(i)
-			default_hstate.max_huge_pages_node[i] =
-				default_hugepages_in_node[i];
-		if (hstate_is_gigantic(&default_hstate))
-			hugetlb_hstate_alloc_pages(&default_hstate);
+		/*
+		 * Since this is an early parameter, we can't check
+		 * NUMA node state yet, so loop through MAX_NUMNODES.
+		 */
+		for (i = 0; i < MAX_NUMNODES; i++) {
+			if (default_hugepages_in_node[i] != 0)
+				default_hstate.max_huge_pages_node[i] =
+					default_hugepages_in_node[i];
+		}
 		default_hstate_max_huge_pages = 0;
 	}
 
-	return 1;
+	return 0;
+}
+hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
+
+void __init hugetlb_bootmem_alloc(void)
+{
+	struct hstate *h;
+
+	hugetlb_parse_params();
+
+	for_each_hstate(h) {
+		if (hstate_is_gigantic(h))
+			hugetlb_hstate_alloc_pages(h);
+	}
 }
-__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int allowed_mems_nr(struct hstate *h)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 57b7f591eee8..326cdf94192e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -444,7 +444,11 @@ DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
+static int __init hugetlb_vmemmap_optimize_param(char *buf)
+{
+	return kstrtobool(buf, &vmemmap_optimize_enabled);
+}
+early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_optimize_param);
 
 static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
 					   struct folio *folio, unsigned long flags)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2630cc30147e..d2dee53e95dd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2641,6 +2642,8 @@ static void __init mem_init_print_info(void)
  */
 void __init mm_core_init(void)
 {
+	hugetlb_bootmem_alloc();
+
 	/* Initializations relying on SMP setup */
 	BUILD_BUG_ON(MAX_ZONELISTS > 2);
 	build_all_zonelists(NULL);
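
Illustration for readers, not part of the patch: the record-then-replay scheme that hugetlb_add_param() and hugetlb_parse_params() implement can be sketched in plain userspace C. This is only a sketch under stated assumptions. The names deferred_add, deferred_replay, demo_size_setup, MAX_ARGS and BUF_SIZE are hypothetical stand-ins, and the demo handler accepts a fixed pair of strings where the kernel code would call arch_hugetlb_valid_size().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits; they stand in for HUGE_MAX_CMDLINE_ARGS and COMMAND_LINE_SIZE. */
#define MAX_ARGS 8
#define BUF_SIZE 64

struct deferred_param {
	char *val;               /* points into param_buf */
	int (*setup)(char *val); /* handler to run once validation is possible */
};

static char param_buf[BUF_SIZE];
static size_t param_buf_used;
static struct deferred_param params[MAX_ARGS];
static int param_count;

/* Record a value and its handler. Nothing is validated at this point. */
static int deferred_add(const char *s, int (*setup)(char *val))
{
	size_t len;
	char *p;

	if (param_count >= MAX_ARGS)
		return -1;

	len = strlen(s) + 1;
	if (len + param_buf_used > sizeof(param_buf))
		return -1;

	/* Copy the string: the caller's buffer may not outlive this call. */
	p = &param_buf[param_buf_used];
	memcpy(p, s, len);
	param_buf_used += len;

	params[param_count].val = p;
	params[param_count].setup = setup;
	param_count++;
	return 0;
}

/* Replay the recorded values in order; return how many handlers accepted. */
static int deferred_replay(void)
{
	int i, accepted = 0;

	for (i = 0; i < param_count; i++) {
		assert(params[i].setup != NULL);
		if (params[i].setup(params[i].val) == 0)
			accepted++;
	}
	return accepted;
}

/* Demo handler (assumption): only "2M" and "1G" count as valid sizes. */
static int demo_size_setup(char *val)
{
	return (strcmp(val, "2M") == 0 || strcmp(val, "1G") == 0) ? 0 : -1;
}
```

A caller would invoke deferred_add() once per option as it is seen, then deferred_replay() at the point where validation becomes possible; in the patch that later point is hugetlb_bootmem_alloc(). Preserving the recording order matters because hugepages= is interpreted relative to the preceding hugepagesz= or default_hugepagesz=.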