From patchwork Wed Jan 29 22:41:30 2025
Date: Wed, 29 Jan 2025 22:41:30 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-2-fvdl@google.com>
Subject: [PATCH v2 01/28] mm/cma: export total and free number of pages for CMA areas
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
In addition to the number of allocations and releases, system
management software may want to know the size of CMA areas, and how
many pages are available in them. This information is currently not
available, so export it in total_pages and available_pages,
respectively.

The name 'available_pages' was picked over 'free_pages' because
'free' implies that the pages are unused. But they might not be: they
just haven't been used by cma_alloc.

The number of available pages is tracked regardless of
CONFIG_CMA_SYSFS, allowing for a few minor shortcuts in the code,
avoiding bitmap operations.

Signed-off-by: Frank van der Linden
---
 Documentation/ABI/testing/sysfs-kernel-mm-cma | 13 +++++++++++
 mm/cma.c                                      | 22 ++++++++++++++-----
 mm/cma.h                                      |  1 +
 mm/cma_debug.c                                |  5 +----
 mm/cma_sysfs.c                                | 20 +++++++++++++++++
 5 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cma b/Documentation/ABI/testing/sysfs-kernel-mm-cma
index dfd755201142..aaf2a5d8b13b 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-cma
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cma
@@ -29,3 +29,16 @@ Date: Feb 2024
 Contact: Anshuman Khandual
 Description:
 		the number of pages CMA API succeeded to release
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/total_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden
+Description:
+		The size of the CMA area in pages.
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/available_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden
+Description:
+		The number of pages in the CMA area that are still
+		available for CMA allocation.
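As a usage sketch (not part of the patch), a minimal userspace C
program that reads the two new attributes; the area name "hugetlb"
below is a placeholder, since actual directory names depend on which
CMA areas a given system declares:

#include <stdio.h>

int main(void)
{
	unsigned long total, available;
	FILE *f;

	/* placeholder path; substitute a CMA area name from your system */
	f = fopen("/sys/kernel/mm/cma/hugetlb/total_pages", "r");
	if (!f || fscanf(f, "%lu", &total) != 1)
		return 1;
	fclose(f);

	f = fopen("/sys/kernel/mm/cma/hugetlb/available_pages", "r");
	if (!f || fscanf(f, "%lu", &available) != 1)
		return 1;
	fclose(f);

	printf("CMA used: %lu of %lu pages\n", total - available, total);
	return 0;
}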
diff --git a/mm/cma.c b/mm/cma.c
index de5bc0c81fc2..95a8788e54d3 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -86,6 +86,7 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
 
 	spin_lock_irqsave(&cma->lock, flags);
 	bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
+	cma->available_count += count;
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
@@ -133,7 +134,7 @@ static void __init cma_activate_area(struct cma *cma)
 		free_reserved_page(pfn_to_page(pfn));
 	}
 	totalcma_pages -= cma->count;
-	cma->count = 0;
+	cma->available_count = cma->count = 0;
 	pr_err("CMA area %s could not be activated\n", cma->name);
 }
 
@@ -206,7 +207,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
 
 	cma->base_pfn = PFN_DOWN(base);
-	cma->count = size >> PAGE_SHIFT;
+	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
 	*res_cma = cma;
 	cma_area_count++;
@@ -390,7 +391,7 @@ static void cma_debug_show_areas(struct cma *cma)
 {
 	unsigned long next_zero_bit, next_set_bit, nr_zero;
 	unsigned long start = 0;
-	unsigned long nr_part, nr_total = 0;
+	unsigned long nr_part;
 	unsigned long nbits = cma_bitmap_maxno(cma);
 
 	spin_lock_irq(&cma->lock);
@@ -402,12 +403,12 @@ static void cma_debug_show_areas(struct cma *cma)
 		next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit);
 		nr_zero = next_set_bit - next_zero_bit;
 		nr_part = nr_zero << cma->order_per_bit;
-		pr_cont("%s%lu@%lu", nr_total ? "+" : "", nr_part,
+		pr_cont("%s%lu@%lu", start ? "+" : "", nr_part,
 			next_zero_bit);
-		nr_total += nr_part;
 		start = next_zero_bit + nr_zero;
 	}
-	pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
+	pr_cont("=> %lu free of %lu total pages\n", cma->available_count,
+		cma->count);
 	spin_unlock_irq(&cma->lock);
 }
 
@@ -444,6 +445,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 
 	for (;;) {
 		spin_lock_irq(&cma->lock);
+		/*
+		 * If the request is larger than the available number
+		 * of pages, stop right away.
+		 */
+		if (count > cma->available_count) {
+			spin_unlock_irq(&cma->lock);
+			break;
+		}
 		bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
 				bitmap_maxno, start, bitmap_count, mask, offset);
 		if (bitmap_no >= bitmap_maxno) {
 			spin_unlock_irq(&cma->lock);
 			break;
 		}
 		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
+		cma->available_count -= count;
 		/*
 		 * It's safe to drop the lock here. We've marked this region for
 		 * our exclusive use. If the migration fails we will take the
 		 * lock again and unmark it.
 		 */
diff --git a/mm/cma.h b/mm/cma.h
index 8485ef893e99..3dd3376ae980 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -13,6 +13,7 @@ struct cma_kobject {
 struct cma {
 	unsigned long   base_pfn;
 	unsigned long   count;
+	unsigned long   available_count;
 	unsigned long   *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t	lock;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index 602fff89b15f..89236f22230a 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -34,13 +34,10 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_debugfs_fops, cma_debugfs_get, NULL, "%llu\n");
 static int cma_used_get(void *data, u64 *val)
 {
 	struct cma *cma = data;
-	unsigned long used;
 
 	spin_lock_irq(&cma->lock);
-	/* pages counter is smaller than sizeof(int) */
-	used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma));
+	*val = cma->count - cma->available_count;
 	spin_unlock_irq(&cma->lock);
-	*val = (u64)used << cma->order_per_bit;
 
 	return 0;
 }
diff --git a/mm/cma_sysfs.c b/mm/cma_sysfs.c
index f50db3973171..97acd3e5a6a5 100644
--- a/mm/cma_sysfs.c
+++ b/mm/cma_sysfs.c
@@ -62,6 +62,24 @@ static ssize_t release_pages_success_show(struct kobject *kobj,
 }
 CMA_ATTR_RO(release_pages_success);
 
+static ssize_t total_pages_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->count);
+}
+CMA_ATTR_RO(total_pages);
+
+static ssize_t available_pages_show(struct kobject *kobj,
+				    struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->available_count);
+}
+CMA_ATTR_RO(available_pages);
+
 static void cma_kobj_release(struct kobject *kobj)
 {
 	struct cma *cma = cma_from_kobj(kobj);
@@ -75,6 +93,8 @@ static struct attribute *cma_attrs[] = {
 	&alloc_pages_success_attr.attr,
 	&alloc_pages_fail_attr.attr,
 	&release_pages_success_attr.attr,
+	&total_pages_attr.attr,
+	&available_pages_attr.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(cma);
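To make the new bookkeeping concrete, a standalone model in plain C
(illustrative only, not kernel code): available_count mirrors the
number of pages not currently handed out, so an oversized request can
be rejected with a single comparison instead of a bitmap scan, which
is the shortcut __cma_alloc now takes:

#include <assert.h>
#include <stdbool.h>

struct toy_cma {
	unsigned long count;           /* total pages in the area */
	unsigned long available_count; /* pages not handed out */
};

static bool toy_alloc(struct toy_cma *cma, unsigned long pages)
{
	if (pages > cma->available_count)
		return false;          /* early exit, no bitmap walk */
	cma->available_count -= pages; /* done under cma->lock in the kernel */
	return true;
}

static void toy_release(struct toy_cma *cma, unsigned long pages)
{
	cma->available_count += pages;
	assert(cma->available_count <= cma->count);
}

int main(void)
{
	struct toy_cma cma = { .count = 1024, .available_count = 1024 };

	assert(toy_alloc(&cma, 512));
	assert(!toy_alloc(&cma, 600)); /* larger than what is left */
	toy_release(&cma, 512);
	assert(cma.available_count == 1024);
	return 0;
}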
From patchwork Wed Jan 29 22:41:31 2025
Date: Wed, 29 Jan 2025 22:41:31 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-3-fvdl@google.com>
Subject: [PATCH v2 02/28] mm, cma: support multiple contiguous ranges, if requested
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Currently, CMA manages one range of physically contiguous memory.
Creation of larger CMA areas with hugetlb_cma may run into gaps in
physical memory, so that memblock cannot allocate the requested
contiguous physical range when the CMA area is created. This can
happen, for example, on an AMD system with more than 1TB of memory,
where there will be a gap just below the 1TB (40-bit DMA) line. If
most of memory has been set aside for potential hugetlb CMA
allocation, cma_declare_contiguous_nid will fail.

hugetlb_cma doesn't need the entire area to be one physically
contiguous range. It just cares about being able to get physically
contiguous chunks of a certain size (e.g. 1G), and it is fine to have
the CMA area backed by multiple physical ranges, as long as it gets
1G contiguous allocations.
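The failure mode is easy to see with a toy calculation (the layout
below is invented for illustration): a single-range reservation is
limited by the largest free range, while a multi-range reservation is
only limited by the total.

#include <stdio.h>

int main(void)
{
	/* free physical ranges, e.g. with a hole below the 1 TiB line;
	 * values are in GiB and purely illustrative */
	unsigned long free_ranges[][2] = {
		{ 4, 1008 },    /* 4 GiB .. 1008 GiB */
		{ 1024, 2048 }, /* above the hole */
	};
	unsigned long want = 1500; /* GiB of CMA requested */
	unsigned long total = 0, largest = 0;

	for (int i = 0; i < 2; i++) {
		unsigned long sz = free_ranges[i][1] - free_ranges[i][0];
		total += sz;
		if (sz > largest)
			largest = sz;
	}
	printf("largest contiguous: %lu GiB, total free: %lu GiB\n",
	       largest, total);
	printf("single-range reservation of %lu GiB: %s\n", want,
	       want <= largest ? "ok" : "fails");
	printf("multi-range reservation of %lu GiB: %s\n", want,
	       want <= total ? "ok" : "fails");
	return 0;
}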
Multi-range support is implemented by introducing an array of ranges,
instead of just one big one. Each range has its own bitmap.
Effectively, the allocate and release operations work as before, just
per range: instead of going through one large bitmap, they now go
through a number of smaller ones. The maximum number of supported
ranges is 8, as defined in CMA_MAX_RANGES.

Since some current users of CMA expect a CMA area to use just one
physically contiguous range, multiple ranges are only allowed if the
new interface, cma_declare_contiguous_multi, is used. The other
interfaces work as before, creating only CMA areas with one range.

cma_declare_contiguous_multi works as follows, mimicking the default
"bottom-up, above 4G" reservation approach:

0) Try cma_declare_contiguous_nid, which will use only one region. If
   this succeeds, return. This makes sure that for all the cases that
   currently work, the behavior remains unchanged even if the caller
   switches from cma_declare_contiguous_nid to
   cma_declare_contiguous_multi.

1) Select the largest free memblock ranges above 4G, up to a maximum
   of CMA_MAX_RANGES ranges.

2) If the selected ranges do not add up to the total size requested,
   return -ENOMEM.

3) Sort the selected ranges by base address.

4) Reserve them bottom-up until the requested total size is reached.

Signed-off-by: Frank van der Linden
---
 include/linux/cma.h |   3 +
 mm/cma.c            | 604 +++++++++++++++++++++++++++++++++++---------
 mm/cma.h            |  27 +-
 mm/cma_debug.c      |  56 ++--
 4 files changed, 552 insertions(+), 138 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index d15b64f51336..863427c27dc2 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -40,6 +40,9 @@ static inline int __init cma_declare_contiguous(phys_addr_t base,
 	return cma_declare_contiguous_nid(base, size, limit, alignment,
 			order_per_bit, fixed, name, res_cma, NUMA_NO_NODE);
 }
+extern int __init cma_declare_contiguous_multi(phys_addr_t size,
+			phys_addr_t align, unsigned int order_per_bit,
+			const char *name, struct cma **res_cma, int nid);
 extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					const char *name,
diff --git a/mm/cma.c b/mm/cma.c
index 95a8788e54d3..c20255161642 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -18,6 +18,7 @@
 
 #include
 #include
+#include <linux/list.h>
 #include
 #include
 #include
@@ -35,9 +36,16 @@ struct cma cma_areas[MAX_CMA_AREAS];
 unsigned int cma_area_count;
 static DEFINE_MUTEX(cma_mutex);
 
+static int __init __cma_declare_contiguous_nid(phys_addr_t base,
+			phys_addr_t size, phys_addr_t limit,
+			phys_addr_t alignment, unsigned int order_per_bit,
+			bool fixed, const char *name, struct cma **res_cma,
+			int nid);
+
 phys_addr_t cma_get_base(const struct cma *cma)
 {
-	return PFN_PHYS(cma->base_pfn);
+	WARN_ON_ONCE(cma->nranges != 1);
+	return PFN_PHYS(cma->ranges[0].base_pfn);
 }
 
 unsigned long cma_get_size(const struct cma *cma)
@@ -63,9 +71,10 @@ static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
  * The value returned is represented in order_per_bits.
  */
 static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
+					       const struct cma_memrange *cmr,
 					       unsigned int align_order)
 {
-	return (cma->base_pfn & ((1UL << align_order) - 1))
+	return (cmr->base_pfn & ((1UL << align_order) - 1))
 		>> cma->order_per_bit;
 }
 
@@ -75,46 +84,57 @@ static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma,
 	return ALIGN(pages, 1UL << cma->order_per_bit) >> cma->order_per_bit;
 }
 
-static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
-			     unsigned long count)
+static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
+			     unsigned long pfn, unsigned long count)
 {
 	unsigned long bitmap_no, bitmap_count;
 	unsigned long flags;
 
-	bitmap_no = (pfn - cma->base_pfn) >> cma->order_per_bit;
+	bitmap_no = (pfn - cmr->base_pfn) >> cma->order_per_bit;
 	bitmap_count = cma_bitmap_pages_to_bits(cma, count);
 
 	spin_lock_irqsave(&cma->lock, flags);
-	bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
+	bitmap_clear(cmr->bitmap, bitmap_no, bitmap_count);
 	cma->available_count += count;
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
 static void __init cma_activate_area(struct cma *cma)
 {
-	unsigned long base_pfn = cma->base_pfn, pfn;
+	unsigned long pfn, base_pfn;
+	int allocrange, r;
 	struct zone *zone;
+	struct cma_memrange *cmr;
+
+	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
+		cmr = &cma->ranges[allocrange];
+		cmr->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma, cmr),
+					    GFP_KERNEL);
+		if (!cmr->bitmap)
+			goto cleanup;
+	}
 
-	cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
-	if (!cma->bitmap)
-		goto out_error;
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		base_pfn = cmr->base_pfn;
 
-	/*
-	 * alloc_contig_range() requires the pfn range specified to be in the
-	 * same zone. Simplify by forcing the entire CMA resv range to be in the
-	 * same zone.
-	 */
-	WARN_ON_ONCE(!pfn_valid(base_pfn));
-	zone = page_zone(pfn_to_page(base_pfn));
-	for (pfn = base_pfn + 1; pfn < base_pfn + cma->count; pfn++) {
-		WARN_ON_ONCE(!pfn_valid(pfn));
-		if (page_zone(pfn_to_page(pfn)) != zone)
-			goto not_in_zone;
-	}
+		/*
+		 * alloc_contig_range() requires the pfn range specified
+		 * to be in the same zone. Simplify by forcing the entire
+		 * CMA resv range to be in the same zone.
+		 */
+		WARN_ON_ONCE(!pfn_valid(base_pfn));
+		zone = page_zone(pfn_to_page(base_pfn));
+		for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) {
+			WARN_ON_ONCE(!pfn_valid(pfn));
+			if (page_zone(pfn_to_page(pfn)) != zone)
+				goto cleanup;
+		}
 
-	for (pfn = base_pfn; pfn < base_pfn + cma->count;
-	     pfn += pageblock_nr_pages)
-		init_cma_reserved_pageblock(pfn_to_page(pfn));
+		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+		     pfn += pageblock_nr_pages)
+			init_cma_reserved_pageblock(pfn_to_page(pfn));
+	}
 
 	spin_lock_init(&cma->lock);
 
@@ -125,13 +145,19 @@ static void __init cma_activate_area(struct cma *cma)
 
 	return;
 
-not_in_zone:
-	bitmap_free(cma->bitmap);
-out_error:
+cleanup:
+	for (r = 0; r < allocrange; r++)
+		bitmap_free(cma->ranges[r].bitmap);
+
 	/* Expose all pages to the buddy, they are useless for CMA. */
 	if (!cma->reserve_pages_on_error) {
-		for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++)
-			free_reserved_page(pfn_to_page(pfn));
+		for (r = 0; r < allocrange; r++) {
+			cmr = &cma->ranges[r];
+			for (pfn = cmr->base_pfn;
+			     pfn < cmr->base_pfn + cmr->count;
+			     pfn++)
+				free_reserved_page(pfn_to_page(pfn));
+		}
 	}
 	totalcma_pages -= cma->count;
 	cma->available_count = cma->count = 0;
@@ -154,6 +180,43 @@ void __init cma_reserve_pages_on_error(struct cma *cma)
 	cma->reserve_pages_on_error = true;
 }
 
+static int __init cma_new_area(const char *name, phys_addr_t size,
+			       unsigned int order_per_bit,
+			       struct cma **res_cma)
+{
+	struct cma *cma;
+
+	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
+		pr_err("Not enough slots for CMA reserved regions!\n");
+		return -ENOSPC;
+	}
+
+	/*
+	 * Each reserved area must be initialised later, when more kernel
+	 * subsystems (like slab allocator) are available.
+	 */
+	cma = &cma_areas[cma_area_count];
+	cma_area_count++;
+
+	if (name)
+		snprintf(cma->name, CMA_MAX_NAME, name);
+	else
+		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+
+	cma->available_count = cma->count = size >> PAGE_SHIFT;
+	cma->order_per_bit = order_per_bit;
+	*res_cma = cma;
+	totalcma_pages += cma->count;
+
+	return 0;
+}
+
+static void __init cma_drop_area(struct cma *cma)
+{
+	totalcma_pages -= cma->count;
+	cma_area_count--;
+}
+
 /**
  * cma_init_reserved_mem() - create custom contiguous area from reserved memory
  * @base: Base address of the reserved area
@@ -172,13 +235,9 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 				 struct cma **res_cma)
 {
 	struct cma *cma;
+	int ret;
 
 	/* Sanity checks */
-	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
-		pr_err("Not enough slots for CMA reserved regions!\n");
-		return -ENOSPC;
-	}
-
 	if (!size || !memblock_is_region_reserved(base, size))
 		return -EINVAL;
 
@@ -195,25 +254,261 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
 		return -EINVAL;
 
+	ret = cma_new_area(name, size, order_per_bit, &cma);
+	if (ret != 0)
+		return ret;
+
+	cma->ranges[0].base_pfn = PFN_DOWN(base);
+	cma->ranges[0].count = cma->count;
+	cma->nranges = 1;
+
+	*res_cma = cma;
+
+	return 0;
+}
+
+/*
+ * Structure used while walking physical memory ranges and finding out
+ * which one(s) to use for a CMA area.
+ */
+struct cma_init_memrange {
+	phys_addr_t base;
+	phys_addr_t size;
+	struct list_head list;
+};
+
+/*
+ * Work array used during CMA initialization.
+ */
+static struct cma_init_memrange memranges[CMA_MAX_RANGES] __initdata;
+
+static bool __init revsizecmp(struct cma_init_memrange *mlp,
+			      struct cma_init_memrange *mrp)
+{
+	return mlp->size > mrp->size;
+}
+
+static bool __init basecmp(struct cma_init_memrange *mlp,
+			   struct cma_init_memrange *mrp)
+{
+	return mlp->base < mrp->base;
+}
+
+/*
+ * Helper function to create sorted lists.
+ */
+static void __init list_insert_sorted(
+	struct list_head *ranges,
+	struct cma_init_memrange *mrp,
+	bool (*cmp)(struct cma_init_memrange *lh, struct cma_init_memrange *rh))
+{
+	struct list_head *mp;
+	struct cma_init_memrange *mlp;
+
+	if (list_empty(ranges))
+		list_add(&mrp->list, ranges);
+	else {
+		list_for_each(mp, ranges) {
+			mlp = list_entry(mp, struct cma_init_memrange, list);
+			if (cmp(mlp, mrp))
+				break;
+		}
+		__list_add(&mrp->list, mlp->list.prev, &mlp->list);
+	}
+}
+
+/*
+ * Create CMA areas with a total size of @total_size. A normal allocation
+ * for one area is tried first.
+ * If that fails, the biggest memblock ranges above 4G are selected,
+ * and allocated bottom up.
+ *
+ * The complexity here is not great, but this function will only be
+ * called during boot, and the lists operated on have fewer than
+ * CMA_MAX_RANGES elements (default value: 8).
+ */
+int __init cma_declare_contiguous_multi(phys_addr_t total_size,
+			phys_addr_t align, unsigned int order_per_bit,
+			const char *name, struct cma **res_cma, int nid)
+{
+	phys_addr_t start, end;
+	phys_addr_t size, sizesum, sizeleft;
+	struct cma_init_memrange *mrp, *mlp, *failed;
+	struct cma_memrange *cmrp;
+	LIST_HEAD(ranges);
+	LIST_HEAD(final_ranges);
+	struct list_head *mp, *next;
+	int ret, nr = 1;
+	u64 i;
+	struct cma *cma;
+
 	/*
-	 * Each reserved area must be initialised later, when more kernel
-	 * subsystems (like slab allocator) are available.
+	 * First, try it the normal way, producing just one range.
 	 */
-	cma = &cma_areas[cma_area_count];
+	ret = __cma_declare_contiguous_nid(0, total_size, 0, align,
+			order_per_bit, false, name, res_cma, nid);
+	if (ret != -ENOMEM)
+		goto out;
 
-	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, name);
-	else
-		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+	/*
+	 * Couldn't find one range that fits our needs, so try multiple
+	 * ranges.
+	 *
+	 * No need to do the alignment checks here, the call to
+	 * cma_declare_contiguous_nid above would have caught
+	 * any issues. With the checks, we know that:
+	 *
+	 * - @align is a power of 2
+	 * - @align is >= pageblock alignment
+	 * - @size is aligned to @align and to @order_per_bit
+	 *
+	 * So, as long as we create ranges that have a base
+	 * aligned to @align, and a size that is aligned to
+	 * both @align and @order_per_bit, things will work out.
+	 */
+	nr = 0;
+	sizesum = 0;
+	failed = NULL;
 
-	cma->base_pfn = PFN_DOWN(base);
-	cma->available_count = cma->count = size >> PAGE_SHIFT;
-	cma->order_per_bit = order_per_bit;
+	ret = cma_new_area(name, total_size, order_per_bit, &cma);
+	if (ret != 0)
+		goto out;
+
+	align = max_t(phys_addr_t, align, CMA_MIN_ALIGNMENT_BYTES);
+	/*
+	 * Create a list of ranges above 4G, largest range first.
+	 */
+	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &start, &end, NULL) {
+		if (start < SZ_4G)
+			continue;
+
+		start = ALIGN(start, align);
+		if (start >= end)
+			continue;
+
+		end = ALIGN_DOWN(end, align);
+		if (end <= start)
+			continue;
+
+		size = end - start;
+		size = ALIGN_DOWN(size, (PAGE_SIZE << order_per_bit));
+		if (!size)
+			continue;
+		sizesum += size;
+
+		pr_debug("consider %016llx - %016llx\n", (u64)start, (u64)end);
+
+		/*
+		 * If we have not yet used the maximum number of
+		 * areas, grab a new one.
+		 *
+		 * If we can't use any more, see if this range is not
+		 * smaller than the smallest one already recorded. If
+		 * not, re-use the smallest element.
+		 */
+		if (nr < CMA_MAX_RANGES)
+			mrp = &memranges[nr++];
+		else {
+			mrp = list_last_entry(&ranges,
+					      struct cma_init_memrange, list);
+			if (size < mrp->size)
+				continue;
+			list_del(&mrp->list);
+			sizesum -= mrp->size;
+			pr_debug("deleted %016llx - %016llx from the list\n",
+				 (u64)mrp->base, (u64)mrp->base + size);
+		}
+		mrp->base = start;
+		mrp->size = size;
+
+		/*
+		 * Now do a sorted insert.
+		 */
+		list_insert_sorted(&ranges, mrp, revsizecmp);
+		pr_debug("added %016llx - %016llx to the list\n",
+			 (u64)mrp->base, (u64)mrp->base + size);
+		pr_debug("total size now %llu\n", (u64)sizesum);
+	}
+
+	/*
+	 * There is not enough room in the CMA_MAX_RANGES largest
+	 * ranges, so bail out.
+	 */
+	if (sizesum < total_size) {
+		cma_drop_area(cma);
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/*
+	 * Found ranges that provide enough combined space.
+	 * Now, sort them by address, smallest first, because we
+	 * want to mimic a bottom-up memblock allocation.
+	 */
+	sizesum = 0;
+	list_for_each_safe(mp, next, &ranges) {
+		mlp = list_entry(mp, struct cma_init_memrange, list);
+		list_del(mp);
+		list_insert_sorted(&final_ranges, mlp, basecmp);
+		sizesum += mlp->size;
+		if (sizesum >= total_size)
+			break;
+	}
+
+	/*
+	 * Walk the final list, and add a CMA range for
+	 * each range, possibly not using the last one fully.
+	 */
+	nr = 0;
+	sizeleft = total_size;
+	list_for_each(mp, &final_ranges) {
+		mlp = list_entry(mp, struct cma_init_memrange, list);
+		size = min(sizeleft, mlp->size);
+		if (memblock_reserve(mlp->base, size)) {
+			/*
+			 * Unexpected error. Could go on to
+			 * the next one, but just abort to
+			 * be safe.
+			 */
+			failed = mlp;
+			break;
+		}
+
+		pr_debug("created region %d: %016llx - %016llx\n",
+			 nr, (u64)mlp->base, (u64)mlp->base + size);
+		cmrp = &cma->ranges[nr++];
+		cmrp->base_pfn = PHYS_PFN(mlp->base);
+		cmrp->count = size >> PAGE_SHIFT;
+
+		sizeleft -= size;
+		if (sizeleft == 0)
+			break;
+	}
+
+	if (failed) {
+		list_for_each(mp, &final_ranges) {
+			mlp = list_entry(mp, struct cma_init_memrange, list);
+			if (mlp == failed)
+				break;
+			memblock_phys_free(mlp->base, mlp->size);
+		}
+		cma_drop_area(cma);
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	cma->nranges = nr;
 	*res_cma = cma;
-	cma_area_count++;
-	totalcma_pages += cma->count;
 
-	return 0;
+out:
+	if (ret != 0)
+		pr_err("Failed to reserve %lu MiB\n",
+		       (unsigned long)total_size / SZ_1M);
+	else
+		pr_info("Reserved %lu MiB in %d range%s\n",
+			(unsigned long)total_size / SZ_1M, nr,
+			nr > 1 ? "s" : "");
+
+	return ret;
 }
 
 /**
@@ -241,6 +536,26 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 			phys_addr_t alignment, unsigned int order_per_bit,
 			bool fixed, const char *name, struct cma **res_cma,
 			int nid)
+{
+	int ret;
+
+	ret = __cma_declare_contiguous_nid(base, size, limit, alignment,
+			order_per_bit, fixed, name, res_cma, nid);
+	if (ret != 0)
+		pr_err("Failed to reserve %ld MiB\n",
+		       (unsigned long)size / SZ_1M);
+	else
+		pr_info("Reserved %ld MiB at %pa\n",
+			(unsigned long)size / SZ_1M, &base);
+
+	return ret;
+}
+
+static int __init __cma_declare_contiguous_nid(phys_addr_t base,
+			phys_addr_t size, phys_addr_t limit,
+			phys_addr_t alignment, unsigned int order_per_bit,
+			bool fixed, const char *name, struct cma **res_cma,
+			int nid)
 {
 	phys_addr_t memblock_end = memblock_end_of_DRAM();
 	phys_addr_t highmem_start;
@@ -273,10 +588,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 	/* Sanitise input arguments. */
 	alignment = max_t(phys_addr_t, alignment, CMA_MIN_ALIGNMENT_BYTES);
 	if (fixed && base & (alignment - 1)) {
-		ret = -EINVAL;
 		pr_err("Region at %pa must be aligned to %pa bytes\n",
 			&base, &alignment);
-		goto err;
+		return -EINVAL;
 	}
 	base = ALIGN(base, alignment);
 	size = ALIGN(size, alignment);
@@ -294,10 +608,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 	 * low/high memory boundary.
 	 */
 	if (fixed && base < highmem_start && base + size > highmem_start) {
-		ret = -EINVAL;
 		pr_err("Region at %pa defined on low/high memory boundary (%pa)\n",
 			&base, &highmem_start);
-		goto err;
+		return -EINVAL;
 	}
 
 	/*
@@ -309,18 +622,16 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		limit = memblock_end;
 
 	if (base + size > limit) {
-		ret = -EINVAL;
 		pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n",
 			&size, &base, &limit);
-		goto err;
+		return -EINVAL;
 	}
 
 	/* Reserve memory */
 	if (fixed) {
 		if (memblock_is_region_reserved(base, size) ||
 		    memblock_reserve(base, size) < 0) {
-			ret = -EBUSY;
-			goto err;
+			return -EBUSY;
 		}
 	} else {
 		phys_addr_t addr = 0;
@@ -357,10 +668,8 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		if (!addr) {
 			addr = memblock_alloc_range_nid(size, alignment, base,
 					limit, nid, true);
-			if (!addr) {
-				ret = -ENOMEM;
-				goto err;
-			}
+			if (!addr)
+				return -ENOMEM;
 		}
 
 		/*
@@ -373,75 +682,67 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 
 	ret = cma_init_reserved_mem(base, size, order_per_bit, name, res_cma);
 	if (ret)
-		goto free_mem;
-
-	pr_info("Reserved %ld MiB at %pa on node %d\n", (unsigned long)size / SZ_1M,
-		&base, nid);
-	return 0;
+		memblock_phys_free(base, size);
 
-free_mem:
-	memblock_phys_free(base, size);
-err:
-	pr_err("Failed to reserve %ld MiB on node %d\n", (unsigned long)size / SZ_1M,
-	       nid);
 	return ret;
 }
 
 static void cma_debug_show_areas(struct cma *cma)
 {
 	unsigned long next_zero_bit, next_set_bit, nr_zero;
-	unsigned long start = 0;
+	unsigned long start;
 	unsigned long nr_part;
-	unsigned long nbits = cma_bitmap_maxno(cma);
+	unsigned long nbits;
+	int r;
+	struct cma_memrange *cmr;
 
 	spin_lock_irq(&cma->lock);
 	pr_info("number of available pages: ");
-	for (;;) {
-		next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start);
-		if (next_zero_bit >= nbits)
-			break;
-		next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit);
-		nr_zero = next_set_bit - next_zero_bit;
-		nr_part = nr_zero << cma->order_per_bit;
-		pr_cont("%s%lu@%lu", start ? "+" : "", nr_part,
-			next_zero_bit);
-		start = next_zero_bit + nr_zero;
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+
+		start = 0;
+		nbits = cma_bitmap_maxno(cma, cmr);
+
+		pr_info("range %d: ", r);
+		for (;;) {
+			next_zero_bit = find_next_zero_bit(cmr->bitmap,
+							   nbits, start);
+			if (next_zero_bit >= nbits)
+				break;
+			next_set_bit = find_next_bit(cmr->bitmap, nbits,
+						     next_zero_bit);
+			nr_zero = next_set_bit - next_zero_bit;
+			nr_part = nr_zero << cma->order_per_bit;
+			pr_cont("%s%lu@%lu", start ? "+" : "", nr_part,
+				next_zero_bit);
+			start = next_zero_bit + nr_zero;
+		}
+		pr_info("\n");
 	}
 	pr_cont("=> %lu free of %lu total pages\n", cma->available_count,
 		cma->count);
 	spin_unlock_irq(&cma->lock);
 }
 
-static struct page *__cma_alloc(struct cma *cma, unsigned long count,
-				unsigned int align, gfp_t gfp)
+static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
+			   unsigned long count, unsigned int align,
			   struct page **pagep, gfp_t gfp)
 {
 	unsigned long mask, offset;
 	unsigned long pfn = -1;
 	unsigned long start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
-	unsigned long i;
+	int ret = -EBUSY;
 	struct page *page = NULL;
-	int ret = -ENOMEM;
-	const char *name = cma ? cma->name : NULL;
-
-	trace_cma_alloc_start(name, count, align);
-
-	if (!cma || !cma->count || !cma->bitmap)
-		return page;
-
-	pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__,
-		 (void *)cma, cma->name, count, align);
-
-	if (!count)
-		return page;
 
 	mask = cma_bitmap_aligned_mask(cma, align);
-	offset = cma_bitmap_aligned_offset(cma, align);
-	bitmap_maxno = cma_bitmap_maxno(cma);
+	offset = cma_bitmap_aligned_offset(cma, cmr, align);
+	bitmap_maxno = cma_bitmap_maxno(cma, cmr);
 	bitmap_count = cma_bitmap_pages_to_bits(cma, count);
 
 	if (bitmap_count > bitmap_maxno)
-		return page;
+		goto out;
 
 	for (;;) {
 		spin_lock_irq(&cma->lock);
@@ -453,14 +754,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 			spin_unlock_irq(&cma->lock);
 			break;
 		}
-		bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
+		bitmap_no = bitmap_find_next_zero_area_off(cmr->bitmap,
 				bitmap_maxno, start, bitmap_count, mask, offset);
 		if (bitmap_no >= bitmap_maxno) {
 			spin_unlock_irq(&cma->lock);
 			break;
 		}
-		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
+		bitmap_set(cmr->bitmap, bitmap_no, bitmap_count);
 		cma->available_count -= count;
 		/*
 		 * It's safe to drop the lock here. We've marked this region for
@@ -469,7 +770,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 		 */
 		spin_unlock_irq(&cma->lock);
 
-		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
+		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma_mutex);
 		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
 		mutex_unlock(&cma_mutex);
@@ -478,7 +779,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 			break;
 		}
 
-		cma_clear_bitmap(cma, pfn, count);
+		cma_clear_bitmap(cma, cmr, pfn, count);
 		if (ret != -EBUSY)
 			break;
 
@@ -490,6 +791,48 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 		/* try again with a bit different memory target */
 		start = bitmap_no + mask + 1;
 	}
+out:
+	*pagep = page;
+	return ret;
+}
+
+/**
+ * cma_alloc() - allocate pages from contiguous area
+ * @cma:   Contiguous memory region for which the allocation is performed.
+ * @count: Requested number of pages.
+ * @align: Requested alignment of pages (in PAGE_SIZE order).
+ * @no_warn: Avoid printing message about failed allocation
+ *
+ * This function allocates part of contiguous memory on specific
+ * contiguous memory area.
+ */
+static struct page *__cma_alloc(struct cma *cma, unsigned long count,
+				unsigned int align, gfp_t gfp)
+{
+	struct page *page = NULL;
+	int ret = -ENOMEM, r;
+	unsigned long i;
+	const char *name = cma ? cma->name : NULL;
+
+	trace_cma_alloc_start(name, count, align);
+
+	if (!cma || !cma->count)
+		return page;
+
+	pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__,
+		 (void *)cma, cma->name, count, align);
+
+	if (!count)
+		return page;
+
+	for (r = 0; r < cma->nranges; r++) {
+		page = NULL;
+
+		ret = cma_range_alloc(cma, &cma->ranges[r], count, align,
+				      &page, gfp);
+		if (ret != -EBUSY || page)
+			break;
+	}
 
 	/*
	 * CMA can allocate multiple page blocks, which results in different
@@ -508,7 +851,8 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	}
 
 	pr_debug("%s(): returned %p\n", __func__, page);
-	trace_cma_alloc_finish(name, pfn, page, count, align, ret);
+	trace_cma_alloc_finish(name, page ? page_to_pfn(page) : 0,
+			       page, count, align, ret);
 	if (page) {
 		count_vm_event(CMA_ALLOC_SUCCESS);
 		cma_sysfs_account_success_pages(cma, count);
@@ -551,20 +895,31 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 bool cma_pages_valid(struct cma *cma, const struct page *pages,
 		     unsigned long count)
 {
-	unsigned long pfn;
+	unsigned long pfn, end;
+	int r;
+	struct cma_memrange *cmr;
+	bool ret;
 
-	if (!cma || !pages)
+	if (!cma || !pages || count > cma->count)
 		return false;
 
 	pfn = page_to_pfn(pages);
+	ret = false;
 
-	if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) {
-		pr_debug("%s(page %p, count %lu)\n", __func__,
-			 (void *)pages, count);
-		return false;
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		end = cmr->base_pfn + cmr->count;
+		if (pfn >= cmr->base_pfn && pfn < end) {
+			ret = pfn + count <= end;
+			break;
+		}
 	}
 
-	return true;
+	if (!ret)
+		pr_debug("%s(page %p, count %lu)\n",
+			 __func__, (void *)pages, count);
+
+	return ret;
 }
 
 /**
@@ -580,19 +935,32 @@ bool cma_pages_valid(struct cma *cma, const struct page *pages,
 bool cma_release(struct cma *cma, const struct page *pages,
 		 unsigned long count)
 {
-	unsigned long pfn;
+	struct cma_memrange *cmr;
+	unsigned long pfn, end_pfn;
+	int r;
+
+	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
 
 	if (!cma_pages_valid(cma, pages, count))
 		return false;
 
-	pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count);
-
 	pfn = page_to_pfn(pages);
+	end_pfn = pfn + count;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		if (pfn >= cmr->base_pfn &&
+		    pfn < (cmr->base_pfn + cmr->count)) {
+			VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count);
+			break;
+		}
+	}
 
-	VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
+	if (r == cma->nranges)
+		return false;
 
 	free_contig_range(pfn, count);
-	cma_clear_bitmap(cma, pfn, count);
+	cma_clear_bitmap(cma, cmr, pfn, count);
 	cma_sysfs_account_release_pages(cma, count);
 	trace_cma_release(cma->name, pfn, pages, count);
 
diff --git a/mm/cma.h b/mm/cma.h
index 3dd3376ae980..5f39dd1aac91 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -10,19 +10,35 @@ struct cma_kobject {
 	struct cma *cma;
 };
 
+/*
+ * Multi-range support. This can be useful if the size of the allocation
+ * is not expected to be larger than the alignment (like with hugetlb_cma),
+ * and the total amount of memory requested, while smaller than the total
+ * amount of memory available, is large enough that it doesn't fit in a
+ * single physical memory range because of memory holes.
+ */
+struct cma_memrange {
+	unsigned long base_pfn;
+	unsigned long count;
+	unsigned long *bitmap;
+#ifdef CONFIG_CMA_DEBUGFS
+	struct debugfs_u32_array dfs_bitmap;
+#endif
+};
+#define CMA_MAX_RANGES 8
+
 struct cma {
-	unsigned long   base_pfn;
 	unsigned long   count;
 	unsigned long   available_count;
-	unsigned long   *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t	lock;
 #ifdef CONFIG_CMA_DEBUGFS
 	struct hlist_head mem_head;
 	spinlock_t mem_head_lock;
-	struct debugfs_u32_array dfs_bitmap;
 #endif
 	char name[CMA_MAX_NAME];
+	int nranges;
+	struct cma_memrange ranges[CMA_MAX_RANGES];
 #ifdef CONFIG_CMA_SYSFS
 	/* the number of CMA page successful allocations */
 	atomic64_t nr_pages_succeeded;
@@ -39,9 +55,10 @@ struct cma {
 extern struct cma cma_areas[MAX_CMA_AREAS];
 extern unsigned int cma_area_count;
 
-static inline unsigned long cma_bitmap_maxno(struct cma *cma)
+static inline unsigned long cma_bitmap_maxno(struct cma *cma,
+					     struct cma_memrange *cmr)
 {
-	return cma->count >> cma->order_per_bit;
+	return cmr->count >> cma->order_per_bit;
 }
 
 #ifdef CONFIG_CMA_SYSFS
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index 89236f22230a..400f589756ba 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -46,17 +46,26 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_used_fops, cma_used_get, NULL, "%llu\n");
 static int cma_maxchunk_get(void *data, u64 *val)
 {
 	struct cma *cma = data;
+	struct cma_memrange *cmr;
 	unsigned long maxchunk = 0;
-	unsigned long start, end = 0;
-	unsigned long bitmap_maxno = cma_bitmap_maxno(cma);
+	unsigned long start, end;
+	unsigned long bitmap_maxno;
+	int r;
 
 	spin_lock_irq(&cma->lock);
-	for (;;) {
-		start = find_next_zero_bit(cma->bitmap, bitmap_maxno, end);
-		if (start >= bitmap_maxno)
-			break;
-		end = find_next_bit(cma->bitmap, bitmap_maxno, start);
-		maxchunk = max(end - start, maxchunk);
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		bitmap_maxno = cma_bitmap_maxno(cma, cmr);
+		end = 0;
+		for (;;) {
+			start = find_next_zero_bit(cmr->bitmap,
+						   bitmap_maxno, end);
+			if (start >= bitmap_maxno)
+				break;
+			end = find_next_bit(cmr->bitmap, bitmap_maxno,
+					    start);
+			maxchunk = max(end - start, maxchunk);
+		}
 	}
 	spin_unlock_irq(&cma->lock);
 	*val = (u64)maxchunk << cma->order_per_bit;
@@ -159,24 +168,41 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_alloc_fops, NULL, cma_alloc_write, "%llu\n");
 
 static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry)
 {
-	struct dentry *tmp;
+	struct dentry *tmp, *dir, *rangedir;
+	int r;
+	char rdirname[3];
+	struct cma_memrange *cmr;
 
 	tmp = debugfs_create_dir(cma->name, root_dentry);
 
 	debugfs_create_file("alloc", 0200, tmp, cma, &cma_alloc_fops);
 	debugfs_create_file("free", 0200, tmp, cma, &cma_free_fops);
-	debugfs_create_file("base_pfn", 0444, tmp,
-			    &cma->base_pfn, &cma_debugfs_fops);
 	debugfs_create_file("count", 0444, tmp, &cma->count, &cma_debugfs_fops);
 	debugfs_create_file("order_per_bit", 0444, tmp,
			    &cma->order_per_bit, &cma_debugfs_fops);
 	debugfs_create_file("used", 0444, tmp, cma, &cma_used_fops);
 	debugfs_create_file("maxchunk", 0444, tmp, cma, &cma_maxchunk_fops);
 
-	cma->dfs_bitmap.array = (u32 *)cma->bitmap;
-	cma->dfs_bitmap.n_elements = DIV_ROUND_UP(cma_bitmap_maxno(cma),
-						  BITS_PER_BYTE * sizeof(u32));
-	debugfs_create_u32_array("bitmap", 0444, tmp, &cma->dfs_bitmap);
+	rangedir = debugfs_create_dir("ranges", tmp);
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		snprintf(rdirname, sizeof(rdirname), "%d", r);
+		dir = debugfs_create_dir(rdirname, rangedir);
debugfs_create_file("base_pfn", 0444, dir, + &cmr->base_pfn, &cma_debugfs_fops); + cmr->dfs_bitmap.array = (u32 *)cmr->bitmap; + cmr->dfs_bitmap.n_elements = + DIV_ROUND_UP(cma_bitmap_maxno(cma, cmr), + BITS_PER_BYTE * sizeof(u32)); + debugfs_create_u32_array("bitmap", 0444, dir, + &cmr->dfs_bitmap); + } + + /* + * Backward compatible symlinks to range 0 for base_pfn and bitmap. + */ + debugfs_create_symlink("base_pfn", tmp, "ranges/0/base_pfn"); + debugfs_create_symlink("bitmap", tmp, "ranges/0/bitmap"); } static int __init cma_debugfs_init(void) From patchwork Wed Jan 29 22:41:32 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13954200 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF24DC0218D for ; Wed, 29 Jan 2025 22:42:30 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4814D280095; Wed, 29 Jan 2025 17:42:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3BB6B280091; Wed, 29 Jan 2025 17:42:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0D48D280095; Wed, 29 Jan 2025 17:42:25 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id DBB17280091 for ; Wed, 29 Jan 2025 17:42:24 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 97AC81C7029 for ; Wed, 29 Jan 2025 22:42:24 +0000 (UTC) X-FDA: 83061964608.18.3F001CF Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) by imf12.hostedemail.com (Postfix) with ESMTP id C58E740006 for ; Wed, 29 Jan 2025 22:42:22 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="swBA/M54"; spf=pass (imf12.hostedemail.com: domain of 3za6aZwQKCNg9P7FAIIAF8.6IGFCHOR-GGEP46E.ILA@flex--fvdl.bounces.google.com designates 209.85.214.202 as permitted sender) smtp.mailfrom=3za6aZwQKCNg9P7FAIIAF8.6IGFCHOR-GGEP46E.ILA@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738190542; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=dXf4/6vVFsOQWOL4yP6AMC2Up5+2J9mezQuBbeYawcA=; b=uHHQq525Sxi5V/29uMVY1FYACI3xGIDUkiwyJ5vilf10H9jWMJqo4WhdzIQvqzN7Fyy9Tr jOvx9Kfh7zGSlPU8YTCQSTmmBWb0k4pnT8PLwaUtxJs5ugMb1QPcLiTgPdO6RV+BSNEHos hjXCNjzOpa/4qgKtdAzHdPY1iq9+dNY= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="swBA/M54"; spf=pass (imf12.hostedemail.com: domain of 3za6aZwQKCNg9P7FAIIAF8.6IGFCHOR-GGEP46E.ILA@flex--fvdl.bounces.google.com designates 209.85.214.202 as permitted sender) smtp.mailfrom=3za6aZwQKCNg9P7FAIIAF8.6IGFCHOR-GGEP46E.ILA@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738190542; a=rsa-sha256; cv=none; 
From patchwork Wed Jan 29 22:41:32 2025
Date: Wed, 29 Jan 2025 22:41:32 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-4-fvdl@google.com>
Subject: [PATCH v2 03/28] mm/cma: introduce cma_intersects function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, linux-s390@vger.kernel.org

Now that CMA areas can have multiple physical ranges, code can't assume a CMA struct represents a base_pfn plus a size, as returned from cma_get_base. Most cases are fine, though, since they all explicitly refer to CMA areas that were created using the existing interfaces (cma_declare_contiguous_nid or cma_init_reserved_mem), which guarantees that they have just one physical range.

An exception is the s390 code, which walks all CMA ranges to see if they intersect with a range of memory that is about to be hot-removed. So, in the future, it might run into multi-range areas. To keep this check working, define a cma_intersects function. This just checks whether a physaddr range intersects any of the ranges. Use it in the s390 check.

Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Alexander Gordeev
Cc: linux-s390@vger.kernel.org
Signed-off-by: Frank van der Linden
---
 arch/s390/mm/init.c | 13 +++++--------
 include/linux/cma.h |  1 +
 mm/cma.c            | 21 +++++++++++++++++++++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index f2298f7a3f21..d88cb1c13f7d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -239,16 +239,13 @@ struct s390_cma_mem_data {
 static int s390_cma_check_range(struct cma *cma, void *data)
 {
 	struct s390_cma_mem_data *mem_data;
-	unsigned long start, end;
 
 	mem_data = data;
-	start = cma_get_base(cma);
-	end = start + cma_get_size(cma);
-	if (end < mem_data->start)
-		return 0;
-	if (start >= mem_data->end)
-		return 0;
-	return -EBUSY;
+
+	if (cma_intersects(cma, mem_data->start, mem_data->end))
+		return -EBUSY;
+
+	return 0;
 }
 
 static int s390_cma_mem_notifier(struct notifier_block *nb,
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 863427c27dc2..03d85c100dcc 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -53,6 +53,7 @@ extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
+extern bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end);
 
 extern void cma_reserve_pages_on_error(struct cma *cma);
diff --git a/mm/cma.c b/mm/cma.c
index c20255161642..1704d5be6a07 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -988,3 +988,24 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 
 	return 0;
 }
+
+bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)
+{
+	int r;
+	struct cma_memrange *cmr;
+	unsigned long rstart, rend;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+
+		rstart = PFN_PHYS(cmr->base_pfn);
+		rend = PFN_PHYS(cmr->base_pfn + cmr->count);
+		if (end < rstart)
+			continue;
+		if (start >= rend)
+			continue;
+		return true;
+	}
+
+	return false;
+}
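As a usage sketch (a hypothetical caller, not part of this series): any code that must refuse to operate on memory backed by a CMA area can combine the existing cma_for_each_area() iterator with the new helper, along the same lines as the s390 hot-remove check above:

	/* hypothetical range descriptor; start/end are physical addresses */
	struct phys_range {
		unsigned long start;
		unsigned long end;
	};

	static int deny_if_cma(struct cma *cma, void *data)
	{
		struct phys_range *r = data;

		/* a nonzero return stops the walk and is passed back */
		return cma_intersects(cma, r->start, r->end) ? -EBUSY : 0;
	}

	/* ret = cma_for_each_area(deny_if_cma, &range); */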

From patchwork Wed Jan 29 22:41:33 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954201
Date: Wed, 29 Jan 2025 22:41:33 +0000
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-5-fvdl@google.com>
Subject: [PATCH v2 04/28] mm, hugetlb: use cma_declare_contiguous_multi

hugetlb_cma is fine with using multiple CMA ranges, as long as it can get its gigantic pages allocated from them. So, use cma_declare_contiguous_multi to allow for multiple ranges, increasing the chances of getting what we want on systems with gaps in physical memory.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3b25b69aa94f..bc8af09a3105 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7738,9 +7738,8 @@ void __init hugetlb_cma_reserve(int order)
 	 * may be returned to CMA allocator in the case of
 	 * huge page demotion.
 	 */
-	res = cma_declare_contiguous_nid(0, size, 0,
-					 PAGE_SIZE << order,
-					 HUGETLB_PAGE_ORDER, false, name,
+	res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
+					   HUGETLB_PAGE_ORDER, name,
 					 &hugetlb_cma[nid], nid);
 	if (res) {
 		pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
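For comparison, the two declaration interfaces line up as follows (argument names here are descriptive, inferred from the call site above; the multi-range variant drops the base, limit and fixed arguments, which only make sense when a single, possibly fixed, placement is requested):

	/* single range, optionally at a fixed base address: */
	res = cma_declare_contiguous_nid(base, size, limit, alignment,
					 order_per_bit, fixed, name,
					 &cma_area, nid);

	/* one or more ranges, placement left entirely to the allocator: */
	res = cma_declare_contiguous_multi(size, alignment, order_per_bit,
					   name, &cma_area, nid);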

From patchwork Wed Jan 29 22:41:34 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954202
Date: Wed, 29 Jan 2025 22:41:34 +0000
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden, Zhenguo Yao
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-6-fvdl@google.com>
Subject: [PATCH v2 05/28] mm/hugetlb: fix round-robin bootmem allocation

Commit b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation") changed the NUMA_NO_NODE round-robin allocation behavior in case of a failure to allocate from one NUMA node. The code originally moved on to the next node to try again, but now it immediately breaks out of the loop. Restore the original behavior.

Fixes: b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Cc: Zhenguo Yao
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bc8af09a3105..18d308d5df6d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3156,16 +3156,13 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		m = memblock_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
 				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
-		/*
-		 * Use the beginning of the huge page to store the
-		 * huge_bootmem_page struct (until gather_bootmem
-		 * puts them into the mem_map).
-		 */
-		if (!m)
-			return 0;
-		goto found;
+		if (m)
+			break;
 	}
+	if (!m)
+		return 0;
+
 found:
 	/*
@@ -3177,7 +3174,14 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	 */
 	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
 			huge_page_size(h) - PAGE_SIZE);
-	/* Put them into a private list first because mem_map is not up yet */
+	/*
+	 * Use the beginning of the huge page to store the
+	 * huge_bootmem_page struct (until gather_bootmem
+	 * puts them into the mem_map).
+	 *
+	 * Put them into a private list first because mem_map
+	 * is not up yet.
+	 */
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
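Distilled to its control flow, the behavior this patch restores looks like the following sketch (hypothetical helper names, not the literal kernel code):

	/* regression: the first node that fails ends the whole attempt */
	for_each_candidate_node(node) {
		m = try_alloc_on_node(node);
		if (!m)
			return 0;	/* premature give-up */
		goto found;
	}

	/* restored: keep round-robining until some node succeeds */
	for_each_candidate_node(node) {
		m = try_alloc_on_node(node);
		if (m)
			break;
	}
	if (!m)
		return 0;		/* fail only if every node failed */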

From patchwork Wed Jan 29 22:41:35 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954203
Date: Wed, 29 Jan 2025 22:41:35 +0000
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-7-fvdl@google.com>
Subject: [PATCH v2 06/28] mm/hugetlb: remove redundant __ClearPageReserved

In hugetlb_folio_init_tail_vmemmap, the reserved flag is cleared for the tail page just before it is zeroed out, which is redundant. Remove the __ClearPageReserved call.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 18d308d5df6d..196359254cfb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3202,7 +3202,6 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
 	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
 		struct page *page = pfn_to_page(pfn);
 
-		__ClearPageReserved(folio_page(folio, pfn - head_pfn));
 		__init_single_page(page, pfn, zone, nid);
 		prep_compound_tail((struct page *)folio, pfn - head_pfn);
 		ret = page_ref_freeze(page, 1);
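The redundancy has this shape: __init_single_page() (re)initializes the struct page from scratch, flags included, so a flag cleared immediately beforehand is overwritten anyway (a sketch of the pattern, not the full function):

	__ClearPageReserved(page);		  /* effect is lost ... */
	__init_single_page(page, pfn, zone, nid); /* ... in the re-init here */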

From patchwork Wed Jan 29 22:41:36 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954204
Date: Wed, 29 Jan 2025 22:41:36 +0000
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-8-fvdl@google.com>
Subject: [PATCH v2 07/28] mm/hugetlb: use online nodes for bootmem allocation

Later commits will move hugetlb bootmem allocation to earlier in init, when N_MEMORY has not yet been set on nodes. Use online nodes instead. At most, this wastes just a few cycles once during boot (and most likely none).

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 196359254cfb..20d54eaf2bad 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3152,7 +3152,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		goto found;
 	}
 	/* allocate from next node when distributing huge pages */
-	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_MEMORY]) {
+	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_ONLINE]) {
 		m = memblock_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
 				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
@@ -4550,8 +4550,8 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_memory_node;
-	h->next_nid_to_free = first_memory_node;
+	h->next_nid_to_alloc = first_online_node;
+	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);
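The mask change is safe because N_ONLINE is a superset of N_MEMORY and is populated earlier in boot; per the commit message, the worst case is a few wasted cycles on online nodes that turn out to have no memory yet. A minimal sketch of the relationship, assuming the standard node_states semantics:

	#include <linux/nodemask.h>

	/*
	 * node_states[N_ONLINE]: nodes that are online.
	 * node_states[N_MEMORY]: online nodes that have memory; set
	 * later in init than N_ONLINE, and always a subset of it.
	 */
	WARN_ON(!nodes_subset(node_states[N_MEMORY], node_states[N_ONLINE]));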

From patchwork Wed Jan 29 22:41:37 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954205
Date: Wed, 29 Jan 2025 22:41:37 +0000
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-9-fvdl@google.com>
Subject: [PATCH v2 08/28] mm/hugetlb: convert cmdline parameters from setup to early

Convert the cmdline parameters (hugepagesz, hugepages, default_hugepagesz and hugetlb_free_vmemmap) to early parameters. Since parse_early_param might run before MMU setup on some platforms (powerpc), validation of huge page sizes as specified in command line parameters would fail. So instead, for the hstate-related values, just record them and parse them on demand, from hugetlb_bootmem_alloc.

The allocation of hugetlb bootmem pages is now done in hugetlb_bootmem_alloc, which is called explicitly at the start of mm_core_init(). core_initcall would be too late, as that happens with memblock already torn down.

This change will allow earlier allocation and initialization of bootmem hugetlb pages later on.

No functional change intended.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   6 ++
 mm/hugetlb.c            | 133 +++++++++++++++++++++++++++++++---------
 mm/hugetlb_vmemmap.c    |   6 +-
 mm/mm_init.c            |   3 +
 4 files changed, 119 insertions(+), 29 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..9cd7c9dacb88 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -174,6 +174,8 @@ struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
 extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
+void hugetlb_bootmem_alloc(void);
+
 /* arch callbacks */
 
 #ifndef CONFIG_HIGHPTE
@@ -1250,6 +1252,10 @@ static inline bool hugetlbfs_pagecache_present(
 {
 	return false;
 }
+
+static inline void hugetlb_bootmem_alloc(void)
+{
+}
 #endif	/* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 20d54eaf2bad..c16ed9790022 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -62,6 +63,24 @@ static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 
+/*
+ * Due to ordering constraints across the init code for various
+ * architectures, hugetlb hstate cmdline parameters can't simply
+ * be early_param. early_param might call the setup function
+ * before valid hugetlb page sizes are determined, leading to
+ * incorrect rejection of valid hugepagesz= options.
+ *
+ * So, record the parameters early and consume them whenever the
+ * init code is ready for them, by calling hugetlb_parse_params().
+ */
+
+/* one (hugepagesz=,hugepages=) pair per hstate, one default_hugepagesz */
+#define HUGE_MAX_CMDLINE_ARGS	(2 * HUGE_MAX_HSTATE + 1)
+struct hugetlb_cmdline {
+	char *val;
+	int (*setup)(char *val);
+};
+
 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
 static unsigned long __initdata default_hstate_max_huge_pages;
@@ -69,6 +88,20 @@ static bool __initdata parsed_valid_hugepagesz = true;
 static bool __initdata parsed_default_hugepagesz;
 static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
 
+static char hstate_cmdline_buf[COMMAND_LINE_SIZE] __initdata;
+static int hstate_cmdline_index __initdata;
+static struct hugetlb_cmdline hugetlb_params[HUGE_MAX_CMDLINE_ARGS] __initdata;
+static int hugetlb_param_index __initdata;
+static __init int hugetlb_add_param(char *s, int (*setup)(char *val));
+static __init void hugetlb_parse_params(void);
+
+#define hugetlb_early_param(str, func)					\
+static __init int func##args(char *s)					\
+{									\
+	return hugetlb_add_param(s, func);				\
+}									\
+early_param(str, func##args)
+
 /*
  * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
  * free_huge_pages, and surplus_huge_pages.
@@ -3488,6 +3521,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 		for (i = 0; i < MAX_NUMNODES; i++)
 			INIT_LIST_HEAD(&huge_boot_pages[i]);
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
 		initialized = true;
 	}
 
@@ -4550,8 +4585,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_online_node;
-	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);
 
@@ -4576,6 +4609,42 @@ static void __init hugepages_clear_pages_in_node(void)
 	}
 }
 
+static __init int hugetlb_add_param(char *s, int (*setup)(char *))
+{
+	size_t len;
+	char *p;
+
+	if (hugetlb_param_index >= HUGE_MAX_CMDLINE_ARGS)
+		return -EINVAL;
+
+	len = strlen(s) + 1;
+	if (len + hstate_cmdline_index > sizeof(hstate_cmdline_buf))
+		return -EINVAL;
+
+	p = &hstate_cmdline_buf[hstate_cmdline_index];
+	memcpy(p, s, len);
+	hstate_cmdline_index += len;
+
+	hugetlb_params[hugetlb_param_index].val = p;
+	hugetlb_params[hugetlb_param_index].setup = setup;
+
+	hugetlb_param_index++;
+
+	return 0;
+}
+
+static __init void hugetlb_parse_params(void)
+{
+	int i;
+	struct hugetlb_cmdline *hcp;
+
+	for (i = 0; i < hugetlb_param_index; i++) {
+		hcp = &hugetlb_params[i];
+
+		hcp->setup(hcp->val);
+	}
+}
+
 /*
  * hugepages command line processing
  * hugepages normally follows a valid hugepagsz or default_hugepagsz
@@ -4595,7 +4664,7 @@ static int __init hugepages_setup(char *s)
 	if (!parsed_valid_hugepagesz) {
 		pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return -EINVAL;
 	}
 
 	/*
@@ -4649,24 +4718,16 @@ static int __init hugepages_setup(char *s)
 		}
 	}
 
-	/*
-	 * Global state is always initialized later in hugetlb_init.
-	 * But we need to allocate gigantic hstates here early to still
-	 * use the bootmem allocator.
-	 */
-	if (hugetlb_max_hstate && hstate_is_gigantic(parsed_hstate))
-		hugetlb_hstate_alloc_pages(parsed_hstate);
 	last_mhp = mhp;
 
-	return 1;
+	return 0;
 
 invalid:
 	pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
 	hugepages_clear_pages_in_node();
-	return 1;
+	return -EINVAL;
 }
-__setup("hugepages=", hugepages_setup);
+hugetlb_early_param("hugepages", hugepages_setup);
 
 /*
  * hugepagesz command line processing
@@ -4685,7 +4746,7 @@ static int __init hugepagesz_setup(char *s)
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	h = size_to_hstate(size);
@@ -4700,7 +4761,7 @@ static int __init hugepagesz_setup(char *s)
 		if (!parsed_default_hugepagesz || h != &default_hstate ||
 		    default_hstate.max_huge_pages) {
 			pr_warn("HugeTLB: hugepagesz=%s specified twice, ignoring\n", s);
-			return 1;
+			return -EINVAL;
 		}
 
 		/*
@@ -4710,14 +4771,14 @@ static int __init hugepagesz_setup(char *s)
 		 */
 		parsed_hstate = h;
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return 0;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	parsed_valid_hugepagesz = true;
-	return 1;
+	return 0;
 }
-__setup("hugepagesz=", hugepagesz_setup);
+hugetlb_early_param("hugepagesz", hugepagesz_setup);
 
 /*
  * default_hugepagesz command line input
@@ -4731,14 +4792,14 @@ static int __init default_hugepagesz_setup(char *s)
 	parsed_valid_hugepagesz = false;
 	if (parsed_default_hugepagesz) {
 		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	size = (unsigned long)memparse(s, NULL);
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported default_hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
@@ -4755,17 +4816,33 @@ static int __init default_hugepagesz_setup(char *s)
 	 */
 	if (default_hstate_max_huge_pages) {
 		default_hstate.max_huge_pages = default_hstate_max_huge_pages;
-		for_each_online_node(i)
-			default_hstate.max_huge_pages_node[i] =
-				default_hugepages_in_node[i];
-		if (hstate_is_gigantic(&default_hstate))
-			hugetlb_hstate_alloc_pages(&default_hstate);
+		/*
+		 * Since this is an early parameter, we can't check
+		 * NUMA node state yet, so loop through MAX_NUMNODES.
+		 */
+		for (i = 0; i < MAX_NUMNODES; i++) {
+			if (default_hugepages_in_node[i] != 0)
+				default_hstate.max_huge_pages_node[i] =
+					default_hugepages_in_node[i];
+		}
 		default_hstate_max_huge_pages = 0;
 	}
 
-	return 1;
+	return 0;
+}
+hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
+
+void __init hugetlb_bootmem_alloc(void)
+{
+	struct hstate *h;
+
+	hugetlb_parse_params();
+
+	for_each_hstate(h) {
+		if (hstate_is_gigantic(h))
+			hugetlb_hstate_alloc_pages(h);
+	}
 }
-__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int allowed_mems_nr(struct hstate *h)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 7735972add01..5b484758f813 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -444,7 +444,11 @@ DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
+static int __init hugetlb_vmemmap_optimize_param(char *buf)
+{
+	return kstrtobool(buf, &vmemmap_optimize_enabled);
+}
+early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_optimize_param);
 
 static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
 					   struct folio *folio, unsigned long flags)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2630cc30147e..d2dee53e95dd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2641,6 +2642,8 @@ static void __init mem_init_print_info(void)
  */
 void __init mm_core_init(void)
 {
+	hugetlb_bootmem_alloc();
+
 	/* Initializations relying on SMP setup */
 	BUILD_BUG_ON(MAX_ZONELISTS > 2);
 	build_all_zonelists(NULL);
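Stripped of the hugetlb specifics, the record-and-replay idiom this patch introduces can be sketched as follows (minimal sketch with hypothetical names; the real patch additionally copies each value into a private buffer, hstate_cmdline_buf, rather than keeping the pointer handed to the early_param stub):

	#define MAX_DEFERRED	8

	struct deferred_param {
		char *val;
		int (*setup)(char *val);
	};

	static struct deferred_param deferred[MAX_DEFERRED] __initdata;
	static int ndeferred __initdata;

	/* called from the early_param stubs: just remember the request */
	static int __init record_param(char *s, int (*setup)(char *))
	{
		if (ndeferred >= MAX_DEFERRED)
			return -EINVAL;
		deferred[ndeferred].val = s;
		deferred[ndeferred].setup = setup;
		ndeferred++;
		return 0;
	}

	/* called once the rest of init can validate the values */
	static void __init replay_params(void)
	{
		int i;

		for (i = 0; i < ndeferred; i++)
			deferred[i].setup(deferred[i].val);
	}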
From patchwork Wed Jan 29 22:41:38 2025
Message-ID: <20250129224157.2046079-10-fvdl@google.com>
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Subject: [PATCH v2 09/28] x86/mm: make register_page_bootmem_memmap handle PTE mappings
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden, Dave Hansen, Andy Lutomirski, Peter Zijlstra

register_page_bootmem_memmap expects that vmemmap pages handed to it
are PMD-mapped, and that the number of pages to call get_page_bootmem
on is PMD-aligned. This is currently a correct assumption, but will no
longer be true once pre-HVO of hugetlb pages is implemented.

Make it handle PTE-mapped vmemmap pages and a nr_pages argument that
is not necessarily PAGES_PER_SECTION.
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Signed-off-by: Frank van der Linden
---
 arch/x86/mm/init_64.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 01ea7c6df303..e7572af639a4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1599,11 +1599,12 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 		}
 		get_page_bootmem(section_nr, pud_page(*pud), MIX_SECTION_INFO);
 
-		if (!boot_cpu_has(X86_FEATURE_PSE)) {
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd))
+			continue;
+
+		if (!boot_cpu_has(X86_FEATURE_PSE) || !pmd_leaf(*pmd)) {
 			next = (addr + PAGE_SIZE) & PAGE_MASK;
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;
 
 			get_page_bootmem(section_nr, pmd_page(*pmd),
 					 MIX_SECTION_INFO);
@@ -1614,12 +1615,7 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 					SECTION_INFO);
 		} else {
 			next = pmd_addr_end(addr, end);
-
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;
-
-			nr_pmd_pages = 1 << get_order(PMD_SIZE);
+			nr_pmd_pages = (next - addr) >> PAGE_SHIFT;
 			page = pmd_page(*pmd);
 			while (nr_pmd_pages--)
 				get_page_bootmem(section_nr, page++,
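One subtlety worth spelling out is the nr_pmd_pages change in the PMD-leaf
branch: pmd_addr_end() clips next to the end of the requested range, so the
number of page structs to mark can now be less than a full PMD. A worked
example as a sketch, assuming x86-64 constants (PAGE_SHIFT == 12, PMD_SIZE ==
2 MiB; the addresses are invented):

/*
 * addr = 0xffffea0000100000, end = 0xffffea0000180000 (half a PMD away)
 * next = pmd_addr_end(addr, end) = 0xffffea0000180000 (clipped to end)
 *
 * old: nr_pmd_pages = 1 << get_order(PMD_SIZE)
 *                   = 512 pages                  -> walks past 'end'
 * new: nr_pmd_pages = (next - addr) >> PAGE_SHIFT
 *                   = 0x80000 >> 12 = 128 pages  -> stops at 'end'
 */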
From patchwork Wed Jan 29 22:41:39 2025
Message-ID: <20250129224157.2046079-11-fvdl@google.com>
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Subject: [PATCH v2 10/28] mm/bootmem_info: export register_page_bootmem_memmap
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
If other mm code wants to use this function for early memmap
initialization (on the platforms that have it), it should be made
available properly, not just unconditionally in mm.h.

Make this function available for such cases.
Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/init_64.c    | 1 +
 include/linux/bootmem_info.h | 7 +++++++
 include/linux/mm.h           | 3 ---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d96bbc001e73..c2d99d68d40e 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
index d8a8d245824a..4c506e76a808 100644
--- a/include/linux/bootmem_info.h
+++ b/include/linux/bootmem_info.h
@@ -18,6 +18,8 @@ enum bootmem_type {
 
 #ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE
 void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
+void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
+				  unsigned long nr_pages);
 
 void get_page_bootmem(unsigned long info, struct page *page,
 		      enum bootmem_type type);
@@ -58,6 +60,11 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
 }
 
+static inline void register_page_bootmem_memmap(unsigned long section_nr,
+		struct page *map, unsigned long nr_pages)
+{
+}
+
 static inline void put_page_bootmem(struct page *page)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068ddcbb7..6dfc41b461af 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3918,9 +3918,6 @@ static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
 }
 #endif
 
-void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
-				  unsigned long nr_pages);
-
 enum mf_flags {
 	MF_COUNT_INCREASED = 1 << 0,
 	MF_ACTION_REQUIRED = 1 << 1,
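The bootmem_info.h change follows the usual kernel pattern for optional
functionality: a real prototype when CONFIG_HAVE_BOOTMEM_INFO_NODE is
enabled, and an empty static inline stub otherwise, so callers need no
#ifdef guards of their own. In generic form (CONFIG_FOO and do_foo() are
placeholder names, not part of the patch):

#ifdef CONFIG_FOO
void do_foo(int arg);			/* real implementation in foo.c */
#else
static inline void do_foo(int arg)	/* compiles to nothing if CONFIG_FOO=n */
{
}
#endif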
From patchwork Wed Jan 29 22:41:40 2025
Message-ID: <20250129224157.2046079-12-fvdl@google.com>
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Subject: [PATCH v2 11/28] mm/sparse: allow for alternate vmemmap section init at boot
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Add functions that are called just before the per-section memmap is
initialized and just before the memmap page structures are initialized.
They are called sparse_vmemmap_init_nid_early and
sparse_vmemmap_init_nid_late, respectively.

This allows for mm subsystems to add calls to initialize memmap and
page structures in a specific way, if using SPARSEMEM_VMEMMAP.
Specifically, hugetlb can pre-HVO bootmem allocated pages that way, so
that no time and resources are wasted on allocating vmemmap pages, only
to free them later (and possibly unnecessarily running the system out
of memory in the process).

Refactor some code and export a few convenience functions for external
use.

In sparse_init_nid, skip any sections that are already initialized,
e.g. they have been initialized by sparse_vmemmap_init_nid_early
already.

The hugetlb code to use these functions will be added in a later
commit.

Export section_map_size, as any alternate memmap init code will want
to use it.

The config option to enable this is SPARSEMEM_VMEMMAP_PREINIT, which
is dependent on an architecture-specific option,
ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT.
This is done because a section flag is used, and the number of flags
available is architecture-dependent (see mmzone.h). Architectures can
decide if there is room for the flag and enable the option. Fortunately,
as of right now, all sparse vmemmap-using architectures do have room.

Signed-off-by: Frank van der Linden
---
 include/linux/mm.h     |  1 +
 include/linux/mmzone.h | 35 +++++++++++++++++
 mm/Kconfig             |  8 ++++
 mm/bootmem_info.c      |  4 +-
 mm/mm_init.c           |  3 ++
 mm/sparse-vmemmap.c    | 23 +++++++++++
 mm/sparse.c            | 87 ++++++++++++++++++++++++++++++++----------
 7 files changed, 139 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6dfc41b461af..df83653ed6e3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3828,6 +3828,7 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 #endif
 
 void *sparse_buffer_alloc(unsigned long size);
+unsigned long section_map_size(void);
 struct page * __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9540b41894da..44ecb2f90db4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1933,6 +1933,9 @@ enum {
 	SECTION_IS_EARLY_BIT,
 #ifdef CONFIG_ZONE_DEVICE
 	SECTION_TAINT_ZONE_DEVICE_BIT,
+#endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+	SECTION_IS_VMEMMAP_PREINIT_BIT,
 #endif
 	SECTION_MAP_LAST_BIT,
 };
@@ -1944,6 +1947,9 @@ enum {
 #ifdef CONFIG_ZONE_DEVICE
 #define SECTION_TAINT_ZONE_DEVICE	BIT(SECTION_TAINT_ZONE_DEVICE_BIT)
 #endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+#define SECTION_IS_VMEMMAP_PREINIT	BIT(SECTION_IS_VMEMMAP_PREINIT_BIT)
+#endif
 #define SECTION_MAP_MASK	(~(BIT(SECTION_MAP_LAST_BIT) - 1))
 #define SECTION_NID_SHIFT	SECTION_MAP_LAST_BIT
@@ -1998,6 +2004,30 @@ static inline int online_device_section(struct mem_section *section)
 }
 #endif
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return (section &&
+		(section->section_mem_map & SECTION_IS_VMEMMAP_PREINIT));
+}
+
+void sparse_vmemmap_init_nid_early(int nid);
+void sparse_vmemmap_init_nid_late(int nid);
+
+#else
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return 0;
+}
+static inline void sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+static inline void sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
+
 static inline int online_section_nr(unsigned long nr)
 {
 	return online_section(__nr_to_section(nr));
@@ -2035,6 +2065,9 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 }
 #endif
 
+void sparse_init_early_section(int nid, struct page *map, unsigned long pnum,
+			       unsigned long flags);
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 /**
  * pfn_valid - check if there is a valid memory map entry for a PFN
@@ -2116,6 +2149,8 @@ void sparse_init(void);
 #else
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)	do {} while (0)
+#define sparse_vmemmap_init_nid_early(_nid, _use)	do {} while (0)
+#define sparse_vmemmap_init_nid_late(_nid)	do {} while (0)
 #define pfn_in_present_section pfn_valid
 #define subsection_map_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..f984dd928ce7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -489,6 +489,14 @@ config SPARSEMEM_VMEMMAP
 	  SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
 	  pfn_to_page and page_to_pfn operations.
 	  This is the most efficient option when sufficient kernel resources
 	  are available.
+
+config ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
+	bool
+
+config SPARSEMEM_VMEMMAP_PREINIT
+	bool "Early init of sparse memory virtual memmap"
+	depends on SPARSEMEM_VMEMMAP && ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
+	default y
 #
 # Select this config option from the architecture Kconfig, if it is preferred
 # to enable the feature of HugeTLB/dev_dax vmemmap optimization.
diff --git a/mm/bootmem_info.c b/mm/bootmem_info.c
index 95f288169a38..b0e2a9fa641f 100644
--- a/mm/bootmem_info.c
+++ b/mm/bootmem_info.c
@@ -88,7 +88,9 @@ static void __init register_page_bootmem_info_section(unsigned long start_pfn)
 
 	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
 
-	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
+	if (!preinited_vmemmap_section(ms))
+		register_page_bootmem_memmap(section_nr, memmap,
+					     PAGES_PER_SECTION);
 
 	usage = ms->usage;
 	page = virt_to_page(usage);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index d2dee53e95dd..9f1e41c3dde6 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1862,6 +1862,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 		}
 	}
 
+	for_each_node_state(nid, N_MEMORY)
+		sparse_vmemmap_init_nid_late(nid);
+
 	calc_nr_kernel_pages();
 	memmap_init();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 3287ebadd167..8751c46c35e4 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -470,3 +470,26 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
 
 	return pfn_to_page(pfn);
 }
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+/*
+ * This is called just before initializing sections for a NUMA node.
+ * Any special initialization that needs to be done before the
+ * generic initialization can be done from here. Sections that
+ * are initialized in hooks called from here will be skipped by
+ * the generic initialization.
+ */
+void __init sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+/*
+ * This is called just before the initialization of page structures
+ * through memmap_init. Zones are now initialized, so any work that
+ * needs to be done that needs zone information can be done from
+ * here.
+ */
+void __init sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
diff --git a/mm/sparse.c b/mm/sparse.c
index 133b033d0cba..ee0234a77c7f 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -408,13 +408,13 @@ static void __init check_usemap_section_nr(int nid,
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return ALIGN(sizeof(struct page) * PAGES_PER_SECTION, PMD_SIZE);
 }
 
 #else
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
 }
@@ -495,6 +495,44 @@ void __weak __meminit vmemmap_populate_print_last(void)
 {
 }
 
+static void *sparse_usagebuf __meminitdata;
+static void *sparse_usagebuf_end __meminitdata;
+
+/*
+ * Helper function that is used for generic section initialization, and
+ * can also be used by any hooks added above.
+ */
+void __init sparse_init_early_section(int nid, struct page *map,
+				      unsigned long pnum, unsigned long flags)
+{
+	BUG_ON(!sparse_usagebuf || sparse_usagebuf >= sparse_usagebuf_end);
+	check_usemap_section_nr(nid, sparse_usagebuf);
+	sparse_init_one_section(__nr_to_section(pnum), pnum, map,
+				sparse_usagebuf, SECTION_IS_EARLY | flags);
+	sparse_usagebuf = (void *)sparse_usagebuf + mem_section_usage_size();
+}
+
+static int __init sparse_usage_init(int nid, unsigned long map_count)
+{
+	unsigned long size;
+
+	size = mem_section_usage_size() * map_count;
+	sparse_usagebuf = sparse_early_usemaps_alloc_pgdat_section(
+			NODE_DATA(nid), size);
+	if (!sparse_usagebuf) {
+		sparse_usagebuf_end = NULL;
+		return -ENOMEM;
+	}
+
+	sparse_usagebuf_end = sparse_usagebuf + size;
+	return 0;
+}
+
+static void __init sparse_usage_fini(void)
+{
+	sparse_usagebuf = sparse_usagebuf_end = NULL;
+}
+
 /*
  * Initialize sparse on a specific node. The node spans [pnum_begin, pnum_end)
  * And number of present sections in this node is map_count.
@@ -503,47 +541,54 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	struct mem_section_usage *usage;
 	unsigned long pnum;
 	struct page *map;
+	struct mem_section *ms;
 
-	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-			mem_section_usage_size() * map_count);
-	if (!usage) {
+	if (sparse_usage_init(nid, map_count)) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
+
 	sparse_buffer_init(map_count * section_map_size(), nid);
+
+	sparse_vmemmap_init_nid_early(nid);
+
 	for_each_present_section_nr(pnum_begin, pnum) {
 		unsigned long pfn = section_nr_to_pfn(pnum);
 
 		if (pnum >= pnum_end)
 			break;
 
-		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
-				nid, NULL, NULL);
-		if (!map) {
-			pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
-			       __func__, nid);
-			pnum_begin = pnum;
-			sparse_buffer_fini();
-			goto failed;
+		ms = __nr_to_section(pnum);
+		if (!preinited_vmemmap_section(ms)) {
+			map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+					nid, NULL, NULL);
+			if (!map) {
+				pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
+				       __func__, nid);
+				pnum_begin = pnum;
+				sparse_usage_fini();
+				sparse_buffer_fini();
+				goto failed;
+			}
+			sparse_init_early_section(nid, map, pnum, 0);
 		}
-		check_usemap_section_nr(nid, usage);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage,
-				SECTION_IS_EARLY);
-		usage = (void *) usage + mem_section_usage_size();
 	}
+
+	sparse_usage_fini();
 	sparse_buffer_fini();
 	return;
 failed:
-	/* We failed to allocate, mark all the following pnums as not present */
+	/*
+	 * We failed to allocate, mark all the following pnums as not present,
+	 * except the ones already initialized earlier.
+	 */
 	for_each_present_section_nr(pnum_begin, pnum) {
-		struct mem_section *ms;
-
 		if (pnum >= pnum_end)
 			break;
 		ms = __nr_to_section(pnum);
-		ms->section_mem_map = 0;
+		if (!preinited_vmemmap_section(ms))
+			ms->section_mem_map = 0;
 	}
 }
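The two hooks are deliberately empty here; the hugetlb patches later in the
series fill them in, and an architecture opts in by selecting
ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT from its Kconfig. Purely as an
illustration of the intended flow, a subsystem early hook might look roughly
like this; my_subsystem_next_section() and my_subsystem_memmap() are invented
placeholders, while sparse_init_early_section() and
SECTION_IS_VMEMMAP_PREINIT are the real additions from this patch:

/* Sketch: pre-register sections whose memmap the subsystem already owns. */
void __init sparse_vmemmap_init_nid_early(int nid)
{
	unsigned long pnum;
	struct page *map;

	while (my_subsystem_next_section(nid, &pnum)) {	/* hypothetical */
		map = my_subsystem_memmap(nid, pnum);	/* hypothetical */
		if (!map)
			continue;
		/*
		 * Register the preallocated memmap and set the flag so
		 * that sparse_init_nid() skips this section.
		 */
		sparse_init_early_section(nid, map, pnum,
					  SECTION_IS_VMEMMAP_PREINIT);
	}
}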
From patchwork Wed Jan 29 22:41:41 2025
Message-ID: <20250129224157.2046079-13-fvdl@google.com>
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Subject: [PATCH v2 12/28] mm/hugetlb: set migratetype for bootmem folios
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
The pageblocks that back memblock-allocated hugetlb folios might not
have the migrate type set, in the CONFIG_DEFERRED_STRUCT_PAGE_INIT
case.

memblock-allocated hugetlb folios might be given to the buddy allocator
eventually (if nr_hugepages is lowered), so make sure that the migrate
type for the pageblocks contained in them is set when initializing
them. Set it to the default that memmap init also uses
(MIGRATE_MOVABLE).

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c16ed9790022..e5ca5cf2c6fd 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -36,6 +36,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -3258,6 +3259,26 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }
 
+/*
+ * memblock-allocated pageblocks might not have the migrate type set
+ * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABLE)
+ * here.
+ *
+ * Note that this will not write the page struct, it is ok (and necessary)
+ * to do this on vmemmap optimized folios.
+ */
+static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
+						    struct hstate *h)
+{
+	unsigned long nr_pages = pages_per_huge_page(h), i;
+
+	WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
+
+	for (i = 0; i < nr_pages; i += pageblock_nr_pages)
+		set_pageblock_migratetype(folio_page(folio, i),
+					  MIGRATE_MOVABLE);
+}
+
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					       struct list_head *folio_list)
 {
@@ -3279,6 +3300,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					HUGETLB_VMEMMAP_RESERVE_PAGES,
 					pages_per_huge_page(h));
 		}
+		hugetlb_bootmem_init_migratetype(folio, h);
 		/* Subdivide locks to achieve better parallel performance */
 		spin_lock_irqsave(&hugetlb_lock, flags);
 		__prep_account_new_huge_page(h, folio_nid(folio));
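set_pageblock_migratetype() operates on whole pageblocks, hence the stride of
pageblock_nr_pages in the loop and the WARN_ON_ONCE() if the folio does not
start on a pageblock boundary. Rough numbers, assuming a common x86-64
configuration (pageblock_order == 9, so a pageblock is 512 base pages, 2 MiB):

/*
 * 2 MiB hugetlb folio: nr_pages = 512    -> 1 loop iteration
 * 1 GiB hugetlb folio: nr_pages = 262144 -> 512 iterations, one
 *                      set_pageblock_migratetype() call per pageblock
 */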
From patchwork Wed Jan 29 22:41:42 2025
Message-ID: <20250129224157.2046079-14-fvdl@google.com>
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
Subject: [PATCH v2 13/28] mm: define __init_reserved_page_zone function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Sometimes page structs must be unconditionally initialized as reserved,
regardless of DEFERRED_STRUCT_PAGE_INIT.

Define a function, __init_reserved_page_zone, containing code that
already did all of the work in init_reserved_page, and make it
available for use.

Signed-off-by: Frank van der Linden
---
 mm/internal.h |  1 +
 mm/mm_init.c  | 38 +++++++++++++++++++++++---------------
 2 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 109ef30fee11..57662141930e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1448,6 +1448,7 @@ static inline bool pte_needs_soft_dirty_wp(struct vm_area_struct *vma, pte_t pte
 
 void __meminit __init_single_page(struct page *page, unsigned long pfn,
 				  unsigned long zone, int nid);
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid);
 
 /* shrinker related functions */
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 9f1e41c3dde6..925ed6564572 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -650,6 +650,28 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */
 
+/*
+ * Initialize a reserved page unconditionally, finding its zone first.
+ */
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid)
+{
+	pg_data_t *pgdat;
+	int zid;
+
+	pgdat = NODE_DATA(nid);
+
+	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+		struct zone *zone = &pgdat->node_zones[zid];
+
+		if (zone_spans_pfn(zone, pfn))
+			break;
+	}
+	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
+
+	if (pageblock_aligned(pfn))
+		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
 {
@@ -708,24 +730,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 
 static void __meminit init_reserved_page(unsigned long pfn, int nid)
 {
-	pg_data_t *pgdat;
-	int zid;
-
 	if (early_page_initialised(pfn, nid))
 		return;
 
-	pgdat = NODE_DATA(nid);
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		struct zone *zone = &pgdat->node_zones[zid];
-
-		if (zone_spans_pfn(zone, pfn))
-			break;
-	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
-
-	if (pageblock_aligned(pfn))
-		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+	__init_reserved_page_zone(pfn, nid);
 }
 #else
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
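A sketch of how a caller might use the newly exported helper to initialize
all page structs backing a reserved physical range; my_init_reserved_range()
is hypothetical, while PFN_UP()/PFN_DOWN() are the standard helpers from
linux/pfn.h:

#include <linux/pfn.h>

/* Sketch: mark every full page in [start, end) reserved-initialized. */
static void __init my_init_reserved_range(phys_addr_t start,
					  phys_addr_t end, int nid)
{
	unsigned long pfn;

	for (pfn = PFN_UP(start); pfn < PFN_DOWN(end); pfn++)
		__init_reserved_page_zone(pfn, nid);	/* looks up zone by pfn */
}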
(imf23.hostedemail.com: domain of 3366aZwQKCOoRhPXSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--fvdl.bounces.google.com designates 209.85.214.201 as permitted sender) smtp.mailfrom=3366aZwQKCOoRhPXSaaSXQ.OaYXUZgj-YYWhMOW.adS@flex--fvdl.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738190560; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Bphjz2B5H2n+p6toKCuCjUfnxwYaF6wPJ0biY9P/V2E=; b=Cvcj2RoW9o7DhLzpcNVlNbaC9rCNryZfQYBHtpoIFcAxEq6ZF3iPYM8qOGWm3uSOHhtAFC Cq8GPJBvbx9T19nFrbtr770/tBdhi/o3i+Kf4rczX02KLPZDo8PQAOo1meNnss/IBI4tUx 0YwxWGvE6k21hLbwvCQ5AJ2saJhrNcQ= Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-218cf85639eso4517595ad.3 for ; Wed, 29 Jan 2025 14:42:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738190559; x=1738795359; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Bphjz2B5H2n+p6toKCuCjUfnxwYaF6wPJ0biY9P/V2E=; b=iXwf8rZivoUtlh0TyyO+8+kZhX07AtEyqrBWHmbaCfBuhOMqhFADy+A1/0P4n8Urxl AeAZabMjMIraB+YxeDb+E53Z2oQXprgWbpiojjBoJuhXUTy2opNv4s3eIGu7ta+SQKCu Xby3gUVirQdkXciKDjRPckkHZ6/iQyf1uq2X3fDhgzvCce0/+M2fS28n37bp66SeqiFx MSr7UvkNu88bX6DrvThvV1ffds7mbnLvqBpWUixJeB2C7gl6Ji69I9XuJeJJ5z1OmSEY 31yQ2W9QXShOFpe9LpFzZasNjpJxOGj3h8evS2Qt6/MAflhHMt20Xb2Z+pJHIxtSIDV1 +UNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738190559; x=1738795359; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Bphjz2B5H2n+p6toKCuCjUfnxwYaF6wPJ0biY9P/V2E=; b=h03dzPcfHrzK35BdlLSHYQaAhs0RwIxF0TIoTINBgA8mY3ihu+RSahM6hEZVLsM+zE v4vuZbBpXGtLB2LJHN26VWcOslkhgPonQJ7r6r9Hhfrb9pBsJKFIFXc/9UHWCHxxxx9H D8xDfD4MeTaxpT0vMx3X+qyFqbn8REUbk7MfABTLeNEmPGeAAW3tvtMQUr73l7BstgZn 59VQ04tK8epIbErZFlrgePDvoZVVS/scyXCS285yGxIYbJqr60o9zhDuFtOSPTOuZ67x slWXh1AUVdG8OAGjxfys4K8/56dFM/a7WfI4TJYwPiQKetEm0VECFDgpzP43fz3UqC5V QBVQ== X-Forwarded-Encrypted: i=1; AJvYcCXYERKVfQr6dmdjmnT27cc4SPJFsGVjfTOrDE/B8cHraI0U3U8yTmY2AMxrLd7RqsBCG6PT0apfiA==@kvack.org X-Gm-Message-State: AOJu0YwqI9qMsj3I1164TDRedoWOumGkFkC51Jy4A3pGpgRrHxlgRH/1 VdqwlBe3kEV5GR3DhECryPDAmeDErEBN4R4X1BDVVfsiuaaqV0nR7yQi7m/8SmyOr7flyg== X-Google-Smtp-Source: AGHT+IEupuZBL681CzwqacE6l9j1whrVwEezyuvYbWICKt+dR8gVuQALet2uv2LXpK1TnJ3eAsBEWAai X-Received: from plxd18.prod.google.com ([2002:a17:902:ef12:b0:21d:dae1:77e8]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:ea02:b0:21c:fb6:7c50 with SMTP id d9443c01a7336-21dd7d8aa02mr76684975ad.31.1738190559165; Wed, 29 Jan 2025 14:42:39 -0800 (PST) Date: Wed, 29 Jan 2025 22:41:43 +0000 In-Reply-To: <20250129224157.2046079-1-fvdl@google.com> Mime-Version: 1.0 References: <20250129224157.2046079-1-fvdl@google.com> X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog Message-ID: <20250129224157.2046079-15-fvdl@google.com> Subject: [PATCH v2 14/28] mm/hugetlb: check bootmem pages for zone intersections From: Frank van der Linden To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 
Bootmem hugetlb pages are allocated using memblock, which isn't (and
mostly can't be) aware of zones. So they may end up crossing zone
boundaries. This would create confusion: a hugetlb page that spans
multiple zones is bad. Worse, HVO might then end up stealthily
re-assigning pages to a different zone when a hugetlb page is freed,
since the tail page structures beyond the first vmemmap page would
inherit the zone of the first page structures. While the chance of
this happening is low, you can definitely create a configuration where
this happens (especially using ZONE_MOVABLE).

To avoid this issue, check if bootmem hugetlb pages intersect with
multiple zones during the gather phase, and discard them, handing them
to the page allocator, if they do. Record the number of invalid bootmem
pages per node, and subtract them from the number of available pages at
the end; this makes it easier to do these checks in multiple places
later on.
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/internal.h |  2 ++
 mm/mm_init.c  | 25 +++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e5ca5cf2c6fd..a0a87d1a8569 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -63,6 +63,7 @@ static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
+__initdata unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE];
 
 /*
  * Due to ordering constraints across the init code for various
@@ -3309,6 +3310,44 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_zones_valid(int nid,
+						    struct huge_bootmem_page *m)
+{
+	unsigned long start_pfn;
+	bool valid;
+
+	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
+
+	valid = !pfn_range_intersects_zones(nid, start_pfn,
+			pages_per_huge_page(m->hstate));
+	if (!valid)
+		hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
+
+	return valid;
+}
+
+/*
+ * Free a bootmem page that was found to be invalid (intersecting with
+ * multiple zones).
+ *
+ * Since it intersects with multiple zones, we can't just do a free
+ * operation on all pages at once, but instead have to walk all
+ * pages, freeing them one by one.
+ */
+static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
+					     struct hstate *h)
+{
+	unsigned long npages = pages_per_huge_page(h);
+	unsigned long pfn;
+
+	while (npages--) {
+		pfn = page_to_pfn(page);
+		__init_reserved_page_zone(pfn, nid);
+		free_reserved_page(page);
+		page++;
+	}
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3316,14 +3355,25 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 static void __init gather_bootmem_prealloc_node(unsigned long nid)
 {
 	LIST_HEAD(folio_list);
-	struct huge_bootmem_page *m;
+	struct huge_bootmem_page *m, *tm;
 	struct hstate *h = NULL, *prev_h = NULL;
 
-	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = (void *)page;
 
 		h = m->hstate;
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Can't use this page. Initialize the
+			 * page structures if that hasn't already
+			 * been done, and give them to the page
+			 * allocator.
+			 */
+			hugetlb_bootmem_free_invalid_page(nid, page, h);
+			continue;
+		}
+
 		/*
 		 * It is possible to have multiple huge page sizes (hstates)
 		 * in this list. If so, process each size separately.
@@ -3595,13 +3645,20 @@ static void __init hugetlb_init_hstates(void)
 static void __init report_hugepages(void)
 {
 	struct hstate *h;
+	unsigned long nrinvalid;
 
 	for_each_hstate(h) {
 		char buf[32];
 
+		nrinvalid = hstate_boot_nrinvalid[hstate_index(h)];
+		h->max_huge_pages -= nrinvalid;
+
 		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
 		pr_info("HugeTLB: registered %s page size, pre-allocated %ld pages\n",
 			buf, h->free_huge_pages);
+		if (nrinvalid)
+			pr_info("HugeTLB: %s page size: %lu invalid page%s discarded\n",
+					buf, nrinvalid, nrinvalid > 1 ? "s" : "");
 		pr_info("HugeTLB: %d KiB vmemmap can be freed for a %s page\n",
 			hugetlb_vmemmap_optimizable_size(h) / SZ_1K, buf);
 	}
diff --git a/mm/internal.h b/mm/internal.h
index 57662141930e..63fda9bb9426 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -658,6 +658,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 }
 
 void set_zone_contiguous(struct zone *zone);
+bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
+			   unsigned long nr_pages);
 
 static inline void clear_zone_contiguous(struct zone *zone)
 {
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 925ed6564572..f7d5b4fe1ae9 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2287,6 +2287,31 @@ void set_zone_contiguous(struct zone *zone)
 	zone->contiguous = true;
 }
 
+/*
+ * Check if a PFN range intersects multiple zones on one or more
+ * NUMA nodes. Specify the @nid argument if it is known that this
+ * PFN range is on one node, NUMA_NO_NODE otherwise.
+ */
+bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
+			   unsigned long nr_pages)
+{
+	struct zone *zone, *izone = NULL;
+
+	for_each_zone(zone) {
+		if (nid != NUMA_NO_NODE && zone_to_nid(zone) != nid)
+			continue;
+
+		if (zone_intersects(zone, start_pfn, nr_pages)) {
+			if (izone != NULL)
+				return true;
+			izone = zone;
+		}
+	}
+
+	return false;
+}
+
 static void __init mem_init_print_info(void);
 void __init page_alloc_init_late(void)
 {
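To make the failure mode concrete, a sketch with made-up numbers (nothing
here is from the patch itself): suppose ZONE_MOVABLE is configured to start
at the 3.5 GB mark (PFN 0xe0000 with 4K base pages). A 1 GB bootmem page at
3 GB (PFN 0xc0000) then covers PFNs 0xc0000-0xfffff and crosses that zone
boundary, so the new check rejects it during the gather phase:

	/* Sketch: hypothetical PFNs, 4K base pages, node nid assumed. */
	unsigned long start_pfn = 0xc0000;			/* 3 GB */
	unsigned long nr_pages = 1UL << (30 - PAGE_SHIFT);	/* one 1 GB page */

	if (pfn_range_intersects_zones(nid, start_pfn, nr_pages)) {
		/* crosses the ZONE_NORMAL/ZONE_MOVABLE boundary at 3.5 GB */
		pr_warn("discarding bootmem page at pfn 0x%lx\n", start_pfn);
	}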
"s" : ""); pr_info("HugeTLB: %d KiB vmemmap can be freed for a %s page\n", hugetlb_vmemmap_optimizable_size(h) / SZ_1K, buf); } diff --git a/mm/internal.h b/mm/internal.h index 57662141930e..63fda9bb9426 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -658,6 +658,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn, } void set_zone_contiguous(struct zone *zone); +bool pfn_range_intersects_zones(int nid, unsigned long start_pfn, + unsigned long nr_pages); static inline void clear_zone_contiguous(struct zone *zone) { diff --git a/mm/mm_init.c b/mm/mm_init.c index 925ed6564572..f7d5b4fe1ae9 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -2287,6 +2287,31 @@ void set_zone_contiguous(struct zone *zone) zone->contiguous = true; } +/* + * Check if a PFN range intersects multiple zones on one or more + * NUMA nodes. Specify the @nid argument if it is known that this + * PFN range is on one node, NUMA_NO_NODE otherwise. + */ +bool pfn_range_intersects_zones(int nid, unsigned long start_pfn, + unsigned long nr_pages) +{ + struct zone *zone, *izone = NULL; + + for_each_zone(zone) { + if (nid != NUMA_NO_NODE && zone_to_nid(zone) != nid) + continue; + + if (zone_intersects(zone, start_pfn, nr_pages)) { + if (izone != NULL) + return true; + izone = zone; + } + + } + + return false; +} + static void __init mem_init_print_info(void); void __init page_alloc_init_late(void) { From patchwork Wed Jan 29 22:41:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13954212 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEDE3C02190 for ; Wed, 29 Jan 2025 22:43:10 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 987AA2800A3; Wed, 29 Jan 2025 17:42:44 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8E8BE28008C; Wed, 29 Jan 2025 17:42:44 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 73DE02800A3; Wed, 29 Jan 2025 17:42:44 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 4EC7128008C for ; Wed, 29 Jan 2025 17:42:44 -0500 (EST) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 043A01C7057 for ; Wed, 29 Jan 2025 22:42:43 +0000 (UTC) X-FDA: 83061965448.16.C33DE46 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) by imf13.hostedemail.com (Postfix) with ESMTP id 11DF020013 for ; Wed, 29 Jan 2025 22:42:41 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="bJD0ps/5"; spf=pass (imf13.hostedemail.com: domain of 34K6aZwQKCOsSiQYTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--fvdl.bounces.google.com designates 209.85.216.73 as permitted sender) smtp.mailfrom=34K6aZwQKCOsSiQYTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738190562; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: 
content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=csxUnMb8aBY+LA7oJhb6s5xyn3ur++3PPVPSPZ4uvnE=; b=amiabfC/3LtquULpl4ubXOFMmpPj8YZj2UWPEjcXq0k+DXAhSB/UQq2RIjud4IZKt2b9qz CTdcEnk4vkRACEIRzX5cLNqSCL+gLU0yd/PK76l0Lb9lA0/jvJIK2rg1p14g+w8TI2xMCs bOmNOQFnHmXDrmju7+j/loO+23TQ2I4= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738190562; a=rsa-sha256; cv=none; b=rMhV2SjofgVt2Q+CtgH5o5jIBFrSkIjdSiit4+LbhNnz3+to91SdRExmQa5vrIuKf80SuM eqEuGYaQf4BcFJ7xTRV7BNDuqzIsQZIBxKRKbU8qT4TrY81kfa0wuSmgUXmvwzfNzFkkBL 88Y7OZC0QIwvB1mCEsTyQKmfP7AEQIQ= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="bJD0ps/5"; spf=pass (imf13.hostedemail.com: domain of 34K6aZwQKCOsSiQYTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--fvdl.bounces.google.com designates 209.85.216.73 as permitted sender) smtp.mailfrom=34K6aZwQKCOsSiQYTbbTYR.PbZYVahk-ZZXiNPX.beT@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-2efa74481fdso244049a91.1 for ; Wed, 29 Jan 2025 14:42:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738190561; x=1738795361; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=csxUnMb8aBY+LA7oJhb6s5xyn3ur++3PPVPSPZ4uvnE=; b=bJD0ps/5tDjxyA0vYYAtzgOLz0xcWOdv9zQETbmbuvXwVCokha2cqFNi+96FPjf19U 1MN/d4UvvmhJWH3rZDPkzQesFtd07MXAl7U06ErIXJCT5W9eeHnKzEtpioiC5ymcFAfX 2Z/W6MX54aGkr2uJ5xX+2z/rat4quUFj7uh5969jbxUYGD04G3zoTeIgg3mfMn6vsOrm ib7Ti3v3AZGhjrjgnG++kuXx/4NMhmDIX3kLT8utWYmxejAeAzP7senmsF9864mal1dG +OQ5sU49MndqG6Yq+LsXtmiyFruZo1eix3xtfrExnr1X83LG0P0mHCAB317bo9Df8N2T Lmyw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738190561; x=1738795361; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=csxUnMb8aBY+LA7oJhb6s5xyn3ur++3PPVPSPZ4uvnE=; b=t5f0RglOPd+5V8NxVMpUBjZwLoBdfs/UDP6mJvV84QApmXP9v+sdhnpaPCswhcAXaO 5odtXXssU2DvydW0UFhJB8rAueq/kGIk0DwLjAQY9yAuPjsT7Hzr8OO+b8f24+FtKJ00 i3+nVApil6fVwYcjzNGh80hEat1jqApch0g28MuhkjqE1yB64ImjYdtd+edRMekB7BjT RN+AK3fRgA9AVRAQoftKlOjKo6A2+K8+22KaJQqCx1SZwwXA9OOXMGZKta7tADRMTixV e5cWz7msU6fDn8Gwuho2qoEiPHxC5YIC2RfdScoJSFXsECQrtAorROajLMHYfiqxXcEN fBWA== X-Forwarded-Encrypted: i=1; AJvYcCWvShhJGxDNYas5UvsBahcgacly7LDc7GulED4ap26g8OR2foyAw/IE/OIYj3axUFhnMJ7DMeHZiA==@kvack.org X-Gm-Message-State: AOJu0Yx7XO6Zq7jenv3SXyUa90SQo2aom/y52WOs5HoaozFFsv40HzfQ 3QVooRHuJ52imy64jSvQrEJaTfjgSsRJRsitmSZJkZhefXk63XVoig9S68vMgaAUbwCXaA== X-Google-Smtp-Source: AGHT+IE8Pv/vPlQg3Zsgd/miPk81S+cEO0cUOfh7UtusaUeUsumjXw+Yjo59fdOEUHr7Es0q7li1hf72 X-Received: from pfbbe5.prod.google.com ([2002:a05:6a00:1f05:b0:728:e3af:6bb0]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:3a0f:b0:71e:e4f:3e58 with SMTP id d2e1a72fcca58-72fd0c623ccmr6357836b3a.17.1738190560818; Wed, 29 Jan 2025 14:42:40 -0800 (PST) Date: Wed, 29 Jan 2025 22:41:44 +0000 In-Reply-To: <20250129224157.2046079-1-fvdl@google.com> Mime-Version: 1.0 References: <20250129224157.2046079-1-fvdl@google.com> X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog Message-ID: <20250129224157.2046079-16-fvdl@google.com> Subject: [PATCH v2 15/28] mm/sparse: add vmemmap_*_hvo functions From: Frank van der Linden To: akpm@linux-foundation.org, muchun.song@linux.dev, 
Add a few functions to enable early HVO:

	vmemmap_populate_hvo
	vmemmap_undo_hvo
	vmemmap_wrprotect_hvo

The populate and undo functions are expected to be used in early init,
from the sparse_init_nid_early() function. The wrprotect function is to
be used, potentially, later.

To implement these functions, mostly re-use the existing compound pages
vmemmap logic used by DAX. vmemmap_populate_address has its arguments
changed a bit in this commit: the page structure passed in to be reused
in the mapping is replaced by a PFN and a flag. The flag indicates
whether an extra ref should be taken on the vmemmap page containing the
head page structure. Taking the ref is appropriate for DAX /
ZONE_DEVICE, but not for HugeTLB HVO.

The HugeTLB vmemmap optimization maps tail page structure pages
read-only. The vmemmap_wrprotect_hvo function that does this is
implemented separately, because it cannot be guaranteed that reserved
page structures will not be write accessed during memory
initialization. Even with CONFIG_DEFERRED_STRUCT_PAGE_INIT, they might
still be written to (if they are at the bottom of a zone). So,
vmemmap_populate_hvo leaves the tail page structure pages RW initially,
and then later during initialization, after memmap init is fully done,
vmemmap_wrprotect_hvo must be called to finish the job.

Subsequent commits will use these functions for early HugeTLB HVO.
Signed-off-by: Frank van der Linden
---
 include/linux/mm.h  |   9 ++-
 mm/sparse-vmemmap.c | 141 +++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 135 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index df83653ed6e3..0463c062fd7a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3837,7 +3837,8 @@ p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
-			    struct vmem_altmap *altmap, struct page *reuse);
+			    struct vmem_altmap *altmap, unsigned long ptpfn,
+			    unsigned long flags);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node,
@@ -3853,6 +3854,12 @@ int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap);
+int vmemmap_populate_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
+int vmemmap_undo_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
+void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
 void vmemmap_populate_print_last(void);
 #ifdef CONFIG_MEMORY_HOTPLUG
 void vmemmap_free(unsigned long start, unsigned long end,
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 8751c46c35e4..bee22ca93654 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -30,6 +30,13 @@
 #include
 #include
 
+#include
+
+/*
+ * Flags for vmemmap_populate_range and friends.
+ */
+/* Get a ref on the head page struct page, for ZONE_DEVICE compound pages */
+#define VMEMMAP_POPULATE_PAGEREF	0x0001
 
 #include "internal.h"
 
@@ -144,17 +151,18 @@ void __meminit vmemmap_verify(pte_t *pte, int node,
 
 pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 				       struct vmem_altmap *altmap,
-				       struct page *reuse)
+				       unsigned long ptpfn, unsigned long flags)
 {
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(ptep_get(pte))) {
 		pte_t entry;
 		void *p;
 
-		if (!reuse) {
+		if (!ptpfn) {
 			p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
 			if (!p)
 				return NULL;
+			ptpfn = PHYS_PFN(__pa(p));
 		} else {
 			/*
 			 * When a PTE/PMD entry is freed from the init_mm
@@ -165,10 +173,10 @@ pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 			 * and through vmemmap_populate_compound_pages() when
 			 * slab is available.
 			 */
-			get_page(reuse);
-			p = page_to_virt(reuse);
+			if (flags & VMEMMAP_POPULATE_PAGEREF)
+				get_page(pfn_to_page(ptpfn));
 		}
-		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
+		entry = pfn_pte(ptpfn, PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 	return pte;
@@ -238,7 +246,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
 
 static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 					      struct vmem_altmap *altmap,
-					      struct page *reuse)
+					      unsigned long ptpfn,
+					      unsigned long flags)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -258,7 +267,7 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 	pmd = vmemmap_pmd_populate(pud, addr, node);
 	if (!pmd)
 		return NULL;
-	pte = vmemmap_pte_populate(pmd, addr, node, altmap, reuse);
+	pte = vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn, flags);
 	if (!pte)
 		return NULL;
 	vmemmap_verify(pte, node, addr, addr + PAGE_SIZE);
@@ -269,13 +278,15 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 static int __meminit vmemmap_populate_range(unsigned long start,
 					    unsigned long end, int node,
 					    struct vmem_altmap *altmap,
-					    struct page *reuse)
+					    unsigned long ptpfn,
+					    unsigned long flags)
 {
 	unsigned long addr = start;
 	pte_t *pte;
 
	for (; addr < end; addr += PAGE_SIZE) {
-		pte = vmemmap_populate_address(addr, node, altmap, reuse);
+		pte = vmemmap_populate_address(addr, node, altmap,
+					       ptpfn, flags);
 		if (!pte)
 			return -ENOMEM;
 	}
@@ -286,7 +297,107 @@ static int __meminit vmemmap_populate_range(unsigned long start,
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
 					 int node, struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_range(start, end, node, altmap, NULL);
+	return vmemmap_populate_range(start, end, node, altmap, 0, 0);
+}
+
+/*
+ * Undo populate_hvo, and replace it with a normal base page mapping.
+ * Used in memory init in case a HVO mapping needs to be undone.
+ *
+ * This can happen when it is discovered that a memblock allocated
+ * hugetlb page spans multiple zones, which can only be verified
+ * after zones have been initialized.
+ *
+ * We know that:
+ * 1) The first @headsize / PAGE_SIZE vmemmap pages were individually
+ *    allocated through memblock, and mapped.
+ *
+ * 2) The rest of the vmemmap pages are mirrors of the last head page.
+ */
+int __meminit vmemmap_undo_hvo(unsigned long addr, unsigned long end,
+				      int node, unsigned long headsize)
+{
+	unsigned long maddr, pfn;
+	pte_t *pte;
+	int headpages;
+
+	/*
+	 * Should only be called early in boot, so nothing will
+	 * be accessing these page structures.
+	 */
+	WARN_ON(!early_boot_irqs_disabled);
+
+	headpages = headsize >> PAGE_SHIFT;
+
+	/*
+	 * Clear mirrored mappings for tail page structs.
+	 */
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pte_clear(&init_mm, maddr, pte);
+	}
+
+	/*
+	 * Clear and free mappings for head page and first tail page
+	 * structs.
+	 */
+	for (maddr = addr; headpages-- > 0; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pfn = pte_pfn(ptep_get(pte));
+		pte_clear(&init_mm, maddr, pte);
+		memblock_phys_free(PFN_PHYS(pfn), PAGE_SIZE);
+	}
+
+	flush_tlb_kernel_range(addr, end);
+
+	return vmemmap_populate(addr, end, node, NULL);
+}
+
+/*
+ * Write protect the mirrored tail page structs for HVO. This will be
+ * called from the hugetlb code when gathering and initializing the
+ * memblock allocated gigantic pages. The write protect can't be
+ * done earlier, since it can't be guaranteed that the reserved
+ * page structures will not be written to during initialization,
+ * even if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled.
+ *
+ * The PTEs are known to exist, and nothing else should be touching
+ * these pages. The caller is responsible for any TLB flushing.
+ */
+void vmemmap_wrprotect_hvo(unsigned long addr, unsigned long end,
+				 int node, unsigned long headsize)
+{
+	unsigned long maddr;
+	pte_t *pte;
+
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		ptep_set_wrprotect(&init_mm, maddr, pte);
+	}
+}
+
+/*
+ * Populate vmemmap pages HVO-style. The first page contains the head
+ * page and needed tail pages, the other ones are mirrors of the first
+ * page.
+ */
+int __meminit vmemmap_populate_hvo(unsigned long addr, unsigned long end,
+				 int node, unsigned long headsize)
+{
+	pte_t *pte;
+	unsigned long maddr;
+
+	for (maddr = addr; maddr < addr + headsize; maddr += PAGE_SIZE) {
+		pte = vmemmap_populate_address(maddr, node, NULL, 0, 0);
+		if (!pte)
+			return -ENOMEM;
+	}
+
+	/*
+	 * Reuse the last page struct page mapped above for the rest.
+	 */
+	return vmemmap_populate_range(maddr, end, node, NULL,
+				      pte_pfn(ptep_get(pte)), 0);
 }
 
 void __weak __meminit vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
@@ -409,7 +520,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		 * with just tail struct pages.
 		 */
 		return vmemmap_populate_range(start, end, node, NULL,
-					      pte_page(ptep_get(pte)));
+					      pte_pfn(ptep_get(pte)),
+					      VMEMMAP_POPULATE_PAGEREF);
 	}
 
 	size = min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page));
@@ -417,13 +529,13 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		unsigned long next, last = addr + size;
 
 		/* Populate the head page vmemmap page */
-		pte = vmemmap_populate_address(addr, node, NULL, NULL);
+		pte = vmemmap_populate_address(addr, node, NULL, 0, 0);
 		if (!pte)
 			return -ENOMEM;
 
 		/* Populate the tail pages vmemmap page */
 		next = addr + PAGE_SIZE;
-		pte = vmemmap_populate_address(next, node, NULL, NULL);
+		pte = vmemmap_populate_address(next, node, NULL, 0, 0);
 		if (!pte)
 			return -ENOMEM;
 
@@ -433,7 +545,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		 */
 		next += PAGE_SIZE;
 		rc = vmemmap_populate_range(next, last, node, NULL,
-					    pte_page(ptep_get(pte)));
+					    pte_pfn(ptep_get(pte)),
+					    VMEMMAP_POPULATE_PAGEREF);
 		if (rc)
 			return -ENOMEM;
 	}
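To make the layout concrete, assuming 4 KB base pages and a 64-byte struct
page: a 1 GB gigantic page has 262144 struct pages, i.e. 16 MB (4096 pages)
of vmemmap. With headsize equal to one page (HUGETLB_VMEMMAP_RESERVE_SIZE in
the hugetlb code), vmemmap_populate_hvo backs the first vmemmap page with
real memory and maps the remaining 4095 virtual pages to that same physical
page. A rough sketch of how early init code might drive these functions;
the exact call sites and error handling belong to later patches in this
series, and vstart/vend here are hypothetical:

	/* Sketch: populate vmemmap for one 1 GB gigantic page, HVO-style. */
	unsigned long vstart = vmemmap_va_of_page;	/* assumed known */
	unsigned long vend = vstart + (16UL << 20);	/* 16 MB of vmemmap */

	if (vmemmap_populate_hvo(vstart, vend, nid, PAGE_SIZE) < 0)
		/* fall back to a full, un-optimized vmemmap */
		vmemmap_populate(vstart, vend, nid, NULL);

	/* ...much later, after memmap init is fully done: */
	vmemmap_wrprotect_hvo(vstart, vend, nid, PAGE_SIZE);
	flush_tlb_kernel_range(vstart, vend);	/* caller's responsibility */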
From patchwork Wed Jan 29 22:41:45 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954213
Date: Wed, 29 Jan 2025 22:41:45 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-17-fvdl@google.com>
Subject: [PATCH v2 16/28] mm/hugetlb: deal with multiple calls to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Architectures that want pre-HVO of hugetlb vmemmap pages will need to
call hugetlb_bootmem_alloc from an earlier spot in boot (before
sparse_init). To facilitate some architectures doing this, protect
hugetlb_bootmem_alloc against multiple calls.
Also provide a helper function to check if it's been called, so that
the early HVO code, to be added later, can see if there is anything to
do.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |  6 ++++++
 mm/hugetlb.c            | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9cd7c9dacb88..5061279e5f73 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -175,6 +175,7 @@ extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
 void hugetlb_bootmem_alloc(void);
+bool hugetlb_bootmem_allocated(void);
 
 /* arch callbacks */
 
@@ -1256,6 +1257,11 @@ static inline bool hugetlbfs_pagecache_present(
 static inline void hugetlb_bootmem_alloc(void)
 {
 }
+
+static inline bool hugetlb_bootmem_allocated(void)
+{
+	return false;
+}
 #endif	/* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a0a87d1a8569..0a27659d9290 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4911,16 +4911,28 @@ static int __init default_hugepagesz_setup(char *s)
 }
 hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
 
+static bool __hugetlb_bootmem_allocated __initdata;
+
+bool __init hugetlb_bootmem_allocated(void)
+{
+	return __hugetlb_bootmem_allocated;
+}
+
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;
 
+	if (__hugetlb_bootmem_allocated)
+		return;
+
 	hugetlb_parse_params();
 
 	for_each_hstate(h) {
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}
+
+	__hugetlb_bootmem_allocated = true;
 }
 
 static unsigned int allowed_mems_nr(struct hstate *h)
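For illustration, a minimal sketch (hypothetical call site, not part of this
patch) of the boot ordering this enables on an architecture that wants
pre-HVO; the later call from the generic init path then becomes a no-op:

	/* Sketch: arch setup code, run before generic mm init. */
	static void __init arch_early_hugetlb_sketch(void)
	{
		hugetlb_bootmem_alloc();	/* first call does the work */
		sparse_init();
	}

	/*
	 * When the generic init path calls hugetlb_bootmem_alloc() again,
	 * it returns immediately, and hugetlb_bootmem_allocated() lets
	 * the early HVO code check whether there is anything to do.
	 */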
From patchwork Wed Jan 29 22:41:46 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954214
Date: Wed, 29 Jan 2025 22:41:46 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-18-fvdl@google.com>
Subject: [PATCH v2 17/28] mm/hugetlb: move huge_boot_pages list init to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Instead of initializing the per-node hugetlb bootmem pages list from
the alloc function, we can now do it in a somewhat cleaner way, since
there is an explicit hugetlb_bootmem_alloc function. Initialize the
lists there.
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0a27659d9290..7879e772c0d9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3579,7 +3579,6 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long allocated;
-	static bool initialized __initdata;
 
 	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
 	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
@@ -3587,17 +3586,6 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 		return;
 	}
 
-	/* hugetlb_hstate_alloc_pages will be called many times, initialize huge_boot_pages once */
-	if (!initialized) {
-		int i = 0;
-
-		for (i = 0; i < MAX_NUMNODES; i++)
-			INIT_LIST_HEAD(&huge_boot_pages[i]);
-		h->next_nid_to_alloc = first_online_node;
-		h->next_nid_to_free = first_online_node;
-		initialized = true;
-	}
-
 	/* do node specific alloc */
 	if (hugetlb_hstate_alloc_pages_specific_nodes(h))
 		return;
@@ -4921,13 +4909,20 @@ bool __init hugetlb_bootmem_allocated(void)
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;
+	int i;
 
 	if (__hugetlb_bootmem_allocated)
 		return;
 
+	for (i = 0; i < MAX_NUMNODES; i++)
+		INIT_LIST_HEAD(&huge_boot_pages[i]);
+
 	hugetlb_parse_params();
 
 	for_each_hstate(h) {
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
+
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}
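One practical effect, shown as a sketch (hypothetical caller, not from the
patch): after hugetlb_bootmem_alloc() has run, the per-node lists are always
valid, so later boot code can walk them without a separate initialization
check:

	/* Sketch: count bootmem hugetlb pages gathered on a node. */
	static unsigned long __init count_boot_pages(int nid)
	{
		struct huge_bootmem_page *m;
		unsigned long n = 0;

		list_for_each_entry(m, &huge_boot_pages[nid], list)
			n++;
		return n;	/* 0 for an empty, but initialized, list */
	}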
From patchwork Wed Jan 29 22:41:47 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954218
Date: Wed, 29 Jan 2025 22:41:47 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-19-fvdl@google.com>
Subject: [PATCH v2 18/28] mm/hugetlb: add pre-HVO framework
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Define flags for pre-HVOed bootmem hugetlb pages, and act on them.

The most important flag is the HVO flag, signalling that a bootmem
allocated gigantic page has already been HVO-ed. If this flag is seen
by the hugetlb bootmem gather code, the page is marked as HVO
optimized. The HVO code will then not try to optimize it again.
Instead, it will just map the tail page mirror pages read-only,
completing the HVO steps.

No functional change, as nothing sets the flags yet.
Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/hugetlbpage.c |  1 +
 include/linux/hugetlb.h       |  4 +++
 mm/hugetlb.c                  | 24 ++++++++++++++++-
 mm/hugetlb_vmemmap.c          | 50 +++++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.h          | 15 +++++++++++
 5 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 6b043180220a..d3c1b749dcfc 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate)
 	gpage_freearray[nr_gpages] = 0;
 	list_add(&m->list, &huge_boot_pages[0]);
 	m->hstate = hstate;
+	m->flags = 0;
 	return 1;
 }
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5061279e5f73..10a7ce2b95e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -681,8 +681,12 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+	unsigned long flags;
 };
 
+#define HUGE_BOOTMEM_HVO		0x0001
+#define HUGE_BOOTMEM_ZONES_VALID	0x0002
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 7879e772c0d9..b48f8638c9af 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3220,6 +3220,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
+	m->flags = 0;
 	return 1;
 }
 
@@ -3287,7 +3288,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
@@ -3316,6 +3317,13 @@ static bool __init hugetlb_bootmem_page_zones_valid(int nid,
 	unsigned long start_pfn;
 	bool valid;
 
+	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
+		/*
+		 * Already validated, skip check.
+		 */
+		return true;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
@@ -3348,6 +3356,11 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3388,6 +3401,15 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		hugetlb_folio_init_vmemmap(folio, h,
 					   HUGETLB_VMEMMAP_RESERVE_PAGES);
 		init_new_hugetlb_folio(h, folio);
+
+		if (hugetlb_bootmem_page_prehvo(m))
+			/*
+			 * If pre-HVO was done, just set the
+			 * flag, the HVO code will then skip
+			 * this folio.
+			 */
+			folio_set_hugetlb_vmemmap_optimized(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 5b484758f813..be6b33ecbc8e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *fol
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
+					      struct list_head *folio_list,
+					      bool boot)
 {
 	struct folio *folio;
+	int nr_to_optimize;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
+	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split_folio(h, folio);
+		int ret;
+		unsigned long spfn, epfn;
+
+		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
+			/*
+			 * Already optimized by pre-HVO, just map the
+			 * mirrored tail page structs RO.
+			 */
+			spfn = (unsigned long)&folio->page;
+			epfn = spfn + pages_per_huge_page(h);
+			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
+					&folio->page,
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			static_branch_inc(&hugetlb_optimize_vmemmap_key);
+			continue;
+		}
+
+		nr_to_optimize++;
+
+		ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
+	if (!nr_to_optimize)
+		/*
+		 * All pre-HVO folios, nothing left to do. It's ok if
+		 * there is a mix of pre-HVO and not yet HVO-ed folios
+		 * here, as __hugetlb_vmemmap_optimize_folio() will
+		 * skip any folios that already have the optimized flag
+		 * set, see vmemmap_should_optimize_folio().
+		 */
+		goto out;
+
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
@@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		}
 	}
 
+out:
 	flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
+}
+
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
+}
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 2fcae92d3359..a6354a27e63f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -64,6 +66,19 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list
 {
 }
 
+static inline void hugetlb_vmemmap_init_early(int nid)
+{
+}
+
+static inline void hugetlb_vmemmap_init_late(int nid)
+{
+}
+
+static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
+						struct list_head *folio_list)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
+ */ + goto out; + flush_tlb_all(); list_for_each_entry(folio, folio_list, lru) { @@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l } } +out: flush_tlb_all(); free_vmemmap_page_list(&vmemmap_pages); } +void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list) +{ + __hugetlb_vmemmap_optimize_folios(h, folio_list, false); +} + +void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list) +{ + __hugetlb_vmemmap_optimize_folios(h, folio_list, true); +} + static const struct ctl_table hugetlb_vmemmap_sysctls[] = { { .procname = "hugetlb_optimize_vmemmap", diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h index 2fcae92d3359..a6354a27e63f 100644 --- a/mm/hugetlb_vmemmap.h +++ b/mm/hugetlb_vmemmap.h @@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h, struct list_head *non_hvo_folios); void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio); void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list); +void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list); + static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h) { @@ -64,6 +66,19 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list { } +static inline void hugetlb_vmemmap_init_early(int nid) +{ +} + +static inline void hugetlb_vmemmap_init_late(int nid) +{ +} + +static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, + struct list_head *folio_list) +{ +} + static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h) { return 0; From patchwork Wed Jan 29 22:41:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13954216 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBC74C02193 for ; Wed, 29 Jan 2025 22:43:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 849CB280263; Wed, 29 Jan 2025 17:42:54 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 7886B28008C; Wed, 29 Jan 2025 17:42:54 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3E443280269; Wed, 29 Jan 2025 17:42:54 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 5DA2D28026A for ; Wed, 29 Jan 2025 17:42:50 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 1AC6312049F for ; Wed, 29 Jan 2025 22:42:50 +0000 (UTC) X-FDA: 83061965700.15.B0AEF79 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) by imf26.hostedemail.com (Postfix) with ESMTP id 431CA14000D for ; Wed, 29 Jan 2025 22:42:48 +0000 (UTC) Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=ZIdLPL1i; spf=pass (imf26.hostedemail.com: domain of 35q6aZwQKCPEYoWeZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--fvdl.bounces.google.com designates 209.85.216.73 as permitted sender) smtp.mailfrom=35q6aZwQKCPEYoWeZhhZeX.Vhfebgnq-ffdoTVd.hkZ@flex--fvdl.bounces.google.com; 
Date: Wed, 29 Jan 2025 22:41:48 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-20-fvdl@google.com>
Subject: [PATCH v2 19/28] mm/hugetlb_vmemmap: fix hugetlb_vmemmap_restore_folios definition
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden

Make the hugetlb_vmemmap_restore_folios definition inline for the
!CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP case, so that including this
file in files other than hugetlb_vmemmap.c will work.
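The bug class being fixed is easy to reproduce outside the kernel; a
one-file sketch (hypothetical names, not taken from the patch) of why
a non-inline static definition does not belong in a shared header:

/* stub.h (hypothetical), included from more than one .c file. */

/*
 * Every includer gets its own private copy of this function; any
 * translation unit that includes the header without calling it
 * triggers -Wunused-function.
 */
static long restore_stub_plain(void) { return 0; }

/*
 * Declaring the stub 'static inline', as the patch does for
 * hugetlb_vmemmap_restore_folios, silences that warning and lets
 * the compiler discard unused copies.
 */
static inline long restore_stub_inline(void) { return 0; }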
Fixes: cfb8c75099db ("hugetlb: perform vmemmap restoration on a list of pages")
Signed-off-by: Frank van der Linden
---
 mm/hugetlb_vmemmap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index a6354a27e63f..926b8b27b5cb 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -50,7 +50,7 @@ static inline int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 	return 0;
 }
 
-static long hugetlb_vmemmap_restore_folios(const struct hstate *h,
+static inline long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *folio_list,
 					struct list_head *non_hvo_folios)
 {

From patchwork Wed Jan 29 22:41:49 2025
Date: Wed, 29 Jan 2025 22:41:49 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-21-fvdl@google.com>
Subject: [PATCH v2 20/28] mm/hugetlb: do pre-HVO for bootmem allocated pages
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
For large systems, the overhead of vmemmap pages for hugetlb is
substantial. It's about 1.5% of memory, which is about 45G for a
3T system. If you want to configure most of that system for hugetlb
(e.g. to use as backing memory for VMs), there is a chance of running
out of memory on boot, even though you know that the 45G will become
available later.

To avoid this scenario, and since it's a waste to first allocate and
then free that 45G during boot, do pre-HVO for hugetlb bootmem
allocated pages ('gigantic' pages).

pre-HVO is done by adding functions that are called from
sparse_init_nid_early and sparse_init_nid_late. The first is called
before memmap allocation, so it takes care of allocating memmap
HVO-style. The second verifies that all bootmem pages look good,
specifically it checks that they do not intersect with multiple zones.
This can only be done from the sparse_init_nid_late path, when zones
have been initialized.

The hugetlb page size must be aligned to the section size, and aligned
to the size of memory described by the number of page structures
contained in one PMD (since pre-HVO is not prepared to split PMDs).
This should be true for most 'gigantic' pages; it is for 1G pages on
x86, where both of these alignment requirements are 128M.

This will only have an effect if hugetlb_bootmem_alloc was called
early in boot. If not, it won't do anything, and HVO for bootmem
hugetlb pages works as before.
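To make the numbers and the alignment rules concrete, here is a small
stand-alone calculation. The constants are assumptions for illustration
(typical x86_64 values: 4K base pages, 64-byte struct page, 2M PMDs,
128M sections), not values taken from the patch:

#include <stdio.h>

int main(void)
{
	unsigned long page_size   = 4096;	/* assumed 4K base page */
	unsigned long page_struct = 64;		/* assumed sizeof(struct page) */
	unsigned long pmd_size    = 2UL << 20;	/* assumed 2M PMD */
	unsigned long section     = 1UL << 27;	/* assumed 128M section */

	/* One struct page per base page: the "about 1.5%" overhead. */
	printf("vmemmap overhead: %.2f%%\n",
	       100.0 * page_struct / page_size);

	/*
	 * One vmemmap PMD maps pmd_size / page_struct page structs,
	 * which in turn describe that many base pages of memory:
	 */
	unsigned long pmd_vmemmap = (pmd_size / page_struct) * page_size;

	printf("PMD vmemmap span: %luM\n", pmd_vmemmap >> 20);	/* 128M */
	printf("section size:     %luM\n", section >> 20);	/* 128M */

	/* A 1G hugetlb page is a multiple of both, so it qualifies. */
	return 0;
}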
Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   2 +
 mm/hugetlb.c            |   4 +-
 mm/hugetlb_vmemmap.c    | 143 ++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h    |   6 ++
 mm/sparse-vmemmap.c     |   4 ++
 5 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 10a7ce2b95e1..2512463bca49 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -687,6 +687,8 @@ struct huge_bootmem_page {
 #define HUGE_BOOTMEM_HVO		0x0001
 #define HUGE_BOOTMEM_ZONES_VALID	0x0002
 
+bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b48f8638c9af..5af544960052 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3311,8 +3311,8 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
-static bool __init hugetlb_bootmem_page_zones_valid(int nid,
-					struct huge_bootmem_page *m)
+bool __init hugetlb_bootmem_page_zones_valid(int nid,
+					struct huge_bootmem_page *m)
 {
 	unsigned long start_pfn;
 	bool valid;
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index be6b33ecbc8e..9a99dfa3c495 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -743,6 +743,149 @@ void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
 	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
 }
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+
+/* Return true if a bootmem allocated HugeTLB page should be pre-HVO-ed */
+static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
+{
+	unsigned long section_size, psize, pmd_vmemmap_size;
+	phys_addr_t paddr;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return false;
+
+	if (!hugetlb_vmemmap_optimizable(m->hstate))
+		return false;
+
+	psize = huge_page_size(m->hstate);
+	paddr = virt_to_phys(m);
+
+	/*
+	 * Pre-HVO only works if the bootmem huge page
+	 * is aligned to the section size.
+	 */
+	section_size = (1UL << PA_SECTION_SHIFT);
+	if (!IS_ALIGNED(paddr, section_size) ||
+	    !IS_ALIGNED(psize, section_size))
+		return false;
+
+	/*
+	 * The pre-HVO code does not deal with splitting PMDS,
+	 * so the bootmem page must be aligned to the number
+	 * of base pages that can be mapped with one vmemmap PMD.
+	 */
+	pmd_vmemmap_size = (PMD_SIZE / (sizeof(struct page))) << PAGE_SHIFT;
+	if (!IS_ALIGNED(paddr, pmd_vmemmap_size) ||
+	    !IS_ALIGNED(psize, pmd_vmemmap_size))
+		return false;
+
+	return true;
+}
+
+/*
+ * Initialize memmap section for a gigantic page, HVO-style.
+ */
+void __init hugetlb_vmemmap_init_early(int nid)
+{
+	unsigned long psize, paddr, section_size;
+	unsigned long ns, i, pnum, pfn, nr_pages;
+	unsigned long start, end;
+	struct huge_bootmem_page *m = NULL;
+	void *map;
+
+	/*
+	 * Nothing to do if bootmem pages were not allocated
+	 * early in boot, or if HVO wasn't enabled in the
+	 * first place.
+	 */
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	section_size = (1UL << PA_SECTION_SHIFT);
+
+	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+		if (!vmemmap_should_optimize_bootmem_page(m))
+			continue;
+
+		nr_pages = pages_per_huge_page(m->hstate);
+		psize = nr_pages << PAGE_SHIFT;
+		paddr = virt_to_phys(m);
+		pfn = PHYS_PFN(paddr);
+		map = pfn_to_page(pfn);
+		start = (unsigned long)map;
+		end = start + nr_pages * sizeof(struct page);
+
+		if (vmemmap_populate_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0)
+			continue;
+
+		memmap_boot_pages_add(HUGETLB_VMEMMAP_RESERVE_SIZE / PAGE_SIZE);
+
+		pnum = pfn_to_section_nr(pfn);
+		ns = psize / section_size;
+
+		for (i = 0; i < ns; i++) {
+			sparse_init_early_section(nid, map, pnum,
+						  SECTION_IS_VMEMMAP_PREINIT);
+			map += section_map_size();
+			pnum++;
+		}
+
+		m->flags |= HUGE_BOOTMEM_HVO;
+	}
+}
+
+void __init hugetlb_vmemmap_init_late(int nid)
+{
+	struct huge_bootmem_page *m, *tm;
+	unsigned long phys, nr_pages, start, end;
+	unsigned long pfn, nr_mmap;
+	struct hstate *h;
+	void *map;
+
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
+		if (!(m->flags & HUGE_BOOTMEM_HVO))
+			continue;
+
+		phys = virt_to_phys(m);
+		h = m->hstate;
+		pfn = PHYS_PFN(phys);
+		nr_pages = pages_per_huge_page(h);
+
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Oops, the hugetlb page spans multiple zones.
+			 * Remove it from the list, and undo HVO.
+			 */
+			list_del(&m->list);
+
+			map = pfn_to_page(pfn);
+
+			start = (unsigned long)map;
+			end = start + nr_pages * sizeof(struct page);
+
+			vmemmap_undo_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE);
+			nr_mmap = end - start - HUGETLB_VMEMMAP_RESERVE_SIZE;
+			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
+
+			memblock_phys_free(phys, huge_page_size(h));
+			continue;
+		} else
+			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
+	}
+}
+#endif
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname = "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 926b8b27b5cb..0031e49b12f7 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -9,6 +9,8 @@
 #ifndef _LINUX_HUGETLB_VMEMMAP_H
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include
+#include
+#include
 
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
@@ -25,6 +27,10 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
 void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+void hugetlb_vmemmap_init_early(int nid);
+void hugetlb_vmemmap_init_late(int nid);
+#endif
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index bee22ca93654..29647fd3d606 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -32,6 +32,8 @@
 #include
 #include
 
+#include "hugetlb_vmemmap.h"
+
 /*
  * Flags for vmemmap_populate_range and friends.
  */
@@ -594,6 +596,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
  */
 void __init sparse_vmemmap_init_nid_early(int nid)
 {
+	hugetlb_vmemmap_init_early(nid);
 }
 
 /*
@@ -604,5 +607,6 @@ void __init sparse_vmemmap_init_nid_early(int nid)
  */
 void __init sparse_vmemmap_init_nid_late(int nid)
 {
+	hugetlb_vmemmap_init_late(nid);
 }
 #endif

From patchwork Wed Jan 29 22:41:50 2025
Date: Wed, 29 Jan 2025 22:41:50 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-22-fvdl@google.com>
Subject: [PATCH v2 21/28] x86/setup: call hugetlb_bootmem_alloc early
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden, Dave Hansen,
 Andy Lutomirski, Peter Zijlstra

Call hugetlb_bootmem_alloc in an earlier spot in setup, after
hugetlb_cma_reserve. This will make vmemmap preinit of the sections
covered by the allocated hugetlb pages possible.
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Signed-off-by: Frank van der Linden
---
 arch/x86/kernel/setup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index cebee310e200..ff8604007b08 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1108,8 +1108,10 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
-	if (boot_cpu_has(X86_FEATURE_GBPAGES))
+	if (boot_cpu_has(X86_FEATURE_GBPAGES)) {
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+		hugetlb_bootmem_alloc();
+	}
 
 	/*
 	 * Reserve memory for crash kernel after SRAT is parsed so that it

From patchwork Wed Jan 29 22:41:51 2025
Date: Wed, 29 Jan 2025 22:41:51 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-23-fvdl@google.com>
Subject: [PATCH v2 22/28] x86/mm: set ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden

Now that hugetlb bootmem pages are allocated earlier, and available
for section preinit (HVO-style), set ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
for x86_64, so that it can be done. This enables pre-HVO on x86_64.
Signed-off-by: Frank van der Linden
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 87198d957e2f..ccef99c0a2ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -146,6 +146,7 @@ config X86
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP	if X86_64
+	select ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT if X86_64
 	select ARCH_WANTS_THP_SWAP	if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
 	select BUILDTIME_TABLE_SORT

From patchwork Wed Jan 29 22:41:52 2025
Date: Wed, 29 Jan 2025 22:41:52 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-24-fvdl@google.com>
Subject: [PATCH v2 23/28] mm/cma: simplify zone intersection check
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
cma_activate_area walks all pages in the area, checking their zone
individually to see if the area resides in more than one zone. Make
this a little more efficient by using the recently introduced
pfn_range_intersects_zones() function. Store the NUMA node id (if any)
in the cma structure to facilitate this.

Signed-off-by: Frank van der Linden
---
 mm/cma.c | 13 ++++++-------
 mm/cma.h |  2 ++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 1704d5be6a07..6ad631c9fdca 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -103,7 +103,6 @@ static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long pfn, base_pfn;
 	int allocrange, r;
-	struct zone *zone;
 	struct cma_memrange *cmr;
 
 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
@@ -124,12 +123,8 @@ static void __init cma_activate_area(struct cma *cma)
 		 * CMA resv range to be in the same zone.
 		 */
 		WARN_ON_ONCE(!pfn_valid(base_pfn));
-		zone = page_zone(pfn_to_page(base_pfn));
-		for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) {
-			WARN_ON_ONCE(!pfn_valid(pfn));
-			if (page_zone(pfn_to_page(pfn)) != zone)
-				goto cleanup;
-		}
+		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
+			goto cleanup;
 
 		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
@@ -261,6 +256,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	cma->ranges[0].base_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
+	cma->nid = NUMA_NO_NODE;
 
 	*res_cma = cma;
 
@@ -497,6 +493,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 	}
 
 	cma->nranges = nr;
+	cma->nid = nid;
 	*res_cma = cma;
 
 out:
@@ -684,6 +681,8 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t base,
 	if (ret)
 		memblock_phys_free(base, size);
 
+	(*res_cma)->nid = nid;
+
 	return ret;
 }
 
diff --git a/mm/cma.h b/mm/cma.h
index 5f39dd1aac91..ff79dba5508c 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -50,6 +50,8 @@ struct cma {
 	struct cma_kobject *cma_kobj;
 #endif
 	bool reserve_pages_on_error;
+	/* NUMA node (NUMA_NO_NODE if unspecified) */
+	int nid;
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];

From patchwork Wed Jan 29 22:41:53 2025
From patchwork Wed Jan 29 22:41:53 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954221
Date: Wed, 29 Jan 2025 22:41:53 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-25-fvdl@google.com>
Subject: [PATCH v2 24/28] mm/cma: introduce a cma validate function
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>
Define a function to check if a CMA area is valid, which means: its
ranges do not cross any zone boundaries. Store the result in the newly
created per-area flags, so that repeated calls do not redo the check.

This allows for checking the validity of a CMA area early, which is
needed later in order to be able to allocate hugetlb bootmem pages from
it with pre-HVO.

Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 include/linux/cma.h |  5 ++++
 mm/cma.c            | 60 ++++++++++++++++++++++++++++++++++++---------
 mm/cma.h            |  8 +++++-
 3 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 03d85c100dcc..62d9c1cf6326 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -60,6 +60,7 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
 #ifdef CONFIG_CMA
 struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 bool cma_free_folio(struct cma *cma, const struct folio *folio);
+bool cma_validate_zones(struct cma *cma);
 #else
 static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 {
@@ -70,6 +71,10 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
 {
 	return false;
 }
+static inline bool cma_validate_zones(struct cma *cma)
+{
+	return false;
+}
 #endif
 
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index 6ad631c9fdca..41248dee7197 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -99,6 +99,49 @@ static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
+/*
+ * Check if a CMA area contains no ranges that intersect with
+ * multiple zones. Store the result in the flags in case
+ * this gets called more than once.
+ */
+bool cma_validate_zones(struct cma *cma)
+{
+	int r;
+	unsigned long base_pfn;
+	struct cma_memrange *cmr;
+	bool valid_bit_set;
+
+	/*
+	 * If already validated, return result of previous check.
+	 * Either the valid or invalid bit will be set if this
+	 * check has already been done. If neither is set, the
+	 * check has not been performed yet.
+	 */
+	valid_bit_set = test_bit(CMA_ZONES_VALID, &cma->flags);
+	if (valid_bit_set || test_bit(CMA_ZONES_INVALID, &cma->flags))
+		return valid_bit_set;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		base_pfn = cmr->base_pfn;
+
+		/*
+		 * alloc_contig_range() requires the pfn range specified
+		 * to be in the same zone. Simplify by forcing the entire
+		 * CMA resv range to be in the same zone.
+		 */
+		WARN_ON_ONCE(!pfn_valid(base_pfn));
+		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count)) {
+			set_bit(CMA_ZONES_INVALID, &cma->flags);
+			return false;
+		}
+	}
+
+	set_bit(CMA_ZONES_VALID, &cma->flags);
+
+	return true;
+}
+
 static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long pfn, base_pfn;
@@ -113,19 +156,12 @@ static void __init cma_activate_area(struct cma *cma)
 			goto cleanup;
 	}
 
+	if (!cma_validate_zones(cma))
+		goto cleanup;
+
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
 		base_pfn = cmr->base_pfn;
-
-		/*
-		 * alloc_contig_range() requires the pfn range specified
-		 * to be in the same zone. Simplify by forcing the entire
-		 * CMA resv range to be in the same zone.
-		 */
-		WARN_ON_ONCE(!pfn_valid(base_pfn));
-		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
-			goto cleanup;
-
 		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
@@ -145,7 +181,7 @@ static void __init cma_activate_area(struct cma *cma)
 		bitmap_free(cma->ranges[r].bitmap);
 
 	/* Expose all pages to the buddy, they are useless for CMA. */
-	if (!cma->reserve_pages_on_error) {
+	if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
 		for (r = 0; r < allocrange; r++) {
 			cmr = &cma->ranges[r];
 			for (pfn = cmr->base_pfn;
@@ -172,7 +208,7 @@ core_initcall(cma_init_reserved_areas);
 
 void __init cma_reserve_pages_on_error(struct cma *cma)
 {
-	cma->reserve_pages_on_error = true;
+	set_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags);
 }
 
 static int __init cma_new_area(const char *name, phys_addr_t size,
diff --git a/mm/cma.h b/mm/cma.h
index ff79dba5508c..bddc84b3cd96 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -49,11 +49,17 @@ struct cma {
 	/* kobject requires dynamic object */
 	struct cma_kobject *cma_kobj;
 #endif
-	bool reserve_pages_on_error;
+	unsigned long flags;
 	/* NUMA node (NUMA_NO_NODE if unspecified) */
 	int nid;
 };
 
+enum cma_flags {
+	CMA_RESERVE_PAGES_ON_ERROR,
+	CMA_ZONES_VALID,
+	CMA_ZONES_INVALID,
+};
+
 extern struct cma cma_areas[MAX_CMA_AREAS];
 extern unsigned int cma_area_count;
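[Editorial note: the CMA_ZONES_VALID/CMA_ZONES_INVALID pair encodes a
three-state memo (unknown, valid, invalid) in two bits, so the zone scan
runs at most once per area. The same pattern in a self-contained sketch,
with invented names (CHECK_VALID, expensive_check) that are not kernel API:]

#include <stdbool.h>

/* Bit numbers, mirroring CMA_ZONES_VALID / CMA_ZONES_INVALID. */
enum { CHECK_VALID, CHECK_INVALID };

/* Stand-in for the zone scan; trivial body keeps the sketch buildable. */
static bool expensive_check(void)
{
	return true;
}

/* Return the cached verdict, computing it on the first call only. */
static bool check_once(unsigned long *flags)
{
	bool valid = *flags & (1UL << CHECK_VALID);

	/* Either bit set means the check already ran; reuse its result. */
	if (valid || (*flags & (1UL << CHECK_INVALID)))
		return valid;

	/* Neither bit set: run the check once and cache the verdict. */
	if (expensive_check())
		*flags |= 1UL << CHECK_VALID;
	else
		*flags |= 1UL << CHECK_INVALID;

	return *flags & (1UL << CHECK_VALID);
}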
From patchwork Wed Jan 29 22:41:54 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954222
Date: Wed, 29 Jan 2025 22:41:54 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-26-fvdl@google.com>
Subject: [PATCH v2 25/28] mm/cma: introduce interface for early reservations
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>
It can be desirable to reserve memory in a CMA area before it is
activated, early in boot. Such reservations would effectively be
memblock allocations, but they can be returned to the CMA area later.
This functionality can be used to allow hugetlb bootmem allocations
from a hugetlb CMA area.

A new interface, cma_reserve_early(), is introduced. It allows for
pageblock-aligned reservations. These reservations are skipped during
the initial handoff of pages in a CMA area to the buddy allocator. The
caller is responsible for making sure that the page structures are set
up, and that the migrate type is set correctly, as with other memblock
allocations that stick around.

If the CMA area fails to activate (because it intersects with multiple
zones), the reserved memory is not given to the buddy allocator; the
caller needs to take care of that.
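[Editorial note: to make the intended use concrete, a rough, hypothetical
sketch of an early-boot caller follows. The function name and fallback
comment are invented for illustration; the real caller is the hugetlb
bootmem code later in this series.]

/*
 * Hypothetical early-boot user of cma_reserve_early(); illustration
 * only, not part of this patch.
 */
static void __init example_early_cma_reservation(struct cma *cma,
						 unsigned long size)
{
	void *mem;

	/* The interface only accepts pageblock-aligned sizes. */
	size = ALIGN(size, CMA_MIN_ALIGNMENT_BYTES);

	mem = cma_reserve_early(cma, size);
	if (!mem)
		return;		/* e.g. fall back to plain memblock */

	/*
	 * 'mem' behaves like a memblock allocation from here on: the
	 * caller must set up the page structures, and set the migrate
	 * type (init_cma_pageblock()) once mem_map is up.
	 */
}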
Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 mm/cma.c      | 83 ++++++++++++++++++++++++++++++++++++++++++-------
 mm/cma.h      |  8 +++++
 mm/internal.h | 16 ++++++++++
 mm/mm_init.c  |  9 ++++++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 41248dee7197..2b1e264e4e99 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -144,9 +144,10 @@ bool cma_validate_zones(struct cma *cma)
 
 static void __init cma_activate_area(struct cma *cma)
 {
-	unsigned long pfn, base_pfn;
+	unsigned long pfn, end_pfn;
 	int allocrange, r;
 	struct cma_memrange *cmr;
+	unsigned long bitmap_count, count;
 
 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
 		cmr = &cma->ranges[allocrange];
@@ -161,8 +162,13 @@ static void __init cma_activate_area(struct cma *cma)
 
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		base_pfn = cmr->base_pfn;
-		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+		if (cmr->early_pfn != cmr->base_pfn) {
+			count = cmr->early_pfn - cmr->base_pfn;
+			bitmap_count = cma_bitmap_pages_to_bits(cma, count);
+			bitmap_set(cmr->bitmap, 0, bitmap_count);
+		}
+
+		for (pfn = cmr->early_pfn; pfn < cmr->base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
 			init_cma_reserved_pageblock(pfn_to_page(pfn));
 	}
@@ -173,6 +179,7 @@ static void __init cma_activate_area(struct cma *cma)
 	INIT_HLIST_HEAD(&cma->mem_head);
 	spin_lock_init(&cma->mem_head_lock);
 #endif
+	set_bit(CMA_ACTIVATED, &cma->flags);
 
 	return;
 
@@ -184,9 +191,8 @@ static void __init cma_activate_area(struct cma *cma)
 	if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
 		for (r = 0; r < allocrange; r++) {
 			cmr = &cma->ranges[r];
-			for (pfn = cmr->base_pfn;
-			     pfn < cmr->base_pfn + cmr->count;
-			     pfn++)
+			end_pfn = cmr->base_pfn + cmr->count;
+			for (pfn = cmr->early_pfn; pfn < end_pfn; pfn++)
 				free_reserved_page(pfn_to_page(pfn));
 		}
 	}
@@ -290,6 +296,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		return ret;
 
 	cma->ranges[0].base_pfn = PFN_DOWN(base);
+	cma->ranges[0].early_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
 	cma->nid = NUMA_NO_NODE;
@@ -509,6 +516,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 			 nr, (u64)mlp->base, (u64)mlp->base + size);
 		cmrp = &cma->ranges[nr++];
 		cmrp->base_pfn = PHYS_PFN(mlp->base);
+		cmrp->early_pfn = cmrp->base_pfn;
 		cmrp->count = size >> PAGE_SHIFT;
 
 		sizeleft -= size;
@@ -540,7 +548,6 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 	pr_info("Reserved %lu MiB in %d range%s\n",
 		(unsigned long)total_size / SZ_1M, nr, nr > 1 ? "s" : "");
-
 	return ret;
 }
@@ -1044,3 +1051,65 @@ bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)
 
 	return false;
 }
+
+/*
+ * Very basic function to reserve memory from a CMA area that has not
+ * yet been activated. This is expected to be called early, when the
+ * system is single-threaded, so there is no locking. The alignment
+ * checking is restrictive - only pageblock-aligned areas
+ * (CMA_MIN_ALIGNMENT_BYTES) may be reserved through this function.
+ * This keeps things simple, and is enough for the current use case.
+ *
+ * The CMA bitmaps have not yet been allocated, so just start
+ * reserving from the bottom up, using a PFN to keep track
+ * of what has been reserved. Unreserving is not possible.
+ *
+ * The caller is responsible for initializing the page structures
+ * in the area properly, since this just points to memblock-allocated
+ * memory.
+ * The caller should subsequently use init_cma_pageblock to set the
+ * migrate type and CMA stats for the pageblocks that were reserved.
+ *
+ * If the CMA area fails to activate later, memory obtained through
+ * this interface is not handed to the page allocator; this is
+ * the responsibility of the caller (e.g. like normal memblock-allocated
+ * memory).
+ */
+void __init *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	int r;
+	struct cma_memrange *cmr;
+	unsigned long available;
+	void *ret = NULL;
+
+	if (!cma || !cma->count)
+		return NULL;
+	/*
+	 * Can only be called early in init.
+	 */
+	if (test_bit(CMA_ACTIVATED, &cma->flags))
+		return NULL;
+
+	if (!IS_ALIGNED(size, CMA_MIN_ALIGNMENT_BYTES))
+		return NULL;
+
+	if (!IS_ALIGNED(size, (PAGE_SIZE << cma->order_per_bit)))
+		return NULL;
+
+	size >>= PAGE_SHIFT;
+
+	if (size > cma->available_count)
+		return NULL;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		available = cmr->count - (cmr->early_pfn - cmr->base_pfn);
+		if (size <= available) {
+			ret = phys_to_virt(PFN_PHYS(cmr->early_pfn));
+			cmr->early_pfn += size;
+			cma->available_count -= size;
+			return ret;
+		}
+	}
+
+	return ret;
+}
diff --git a/mm/cma.h b/mm/cma.h
index bddc84b3cd96..df7fc623b7a6 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -16,9 +16,16 @@ struct cma_kobject {
  * and the total amount of memory requested, while smaller than the total
  * amount of memory available, is large enough that it doesn't fit in a
  * single physical memory range because of memory holes.
+ *
+ * Fields:
+ * @base_pfn:	physical address of range
+ * @early_pfn:	first PFN not reserved through cma_reserve_early
+ * @count:	size of range
+ * @bitmap:	bitmap of allocated (1 << order_per_bit)-sized chunks.
  */
 struct cma_memrange {
 	unsigned long base_pfn;
+	unsigned long early_pfn;
 	unsigned long count;
 	unsigned long *bitmap;
 #ifdef CONFIG_CMA_DEBUGFS
@@ -58,6 +65,7 @@ enum cma_flags {
 	CMA_RESERVE_PAGES_ON_ERROR,
 	CMA_ZONES_VALID,
 	CMA_ZONES_INVALID,
+	CMA_ACTIVATED,
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];
diff --git a/mm/internal.h b/mm/internal.h
index 63fda9bb9426..8318c8e6e589 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -848,6 +848,22 @@ void init_cma_reserved_pageblock(struct page *page);
 
 #endif /* CONFIG_COMPACTION || CONFIG_CMA */
 
+struct cma;
+
+#ifdef CONFIG_CMA
+void *cma_reserve_early(struct cma *cma, unsigned long size);
+void init_cma_pageblock(struct page *page);
+#else
+static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	return NULL;
+}
+static inline void init_cma_pageblock(struct page *page)
+{
+}
+#endif
+
+
 int find_suitable_fallback(struct free_area *area, unsigned int order,
 			   int migratetype, bool only_stealable, bool *can_steal);
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index f7d5b4fe1ae9..f31260fd393e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,6 +2263,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
 	adjust_managed_page_count(page, pageblock_nr_pages);
 	page_zone(page)->cma_pages += pageblock_nr_pages;
 }
+/*
+ * Similar to above, but only set the migrate type and stats.
+ */
+void __init init_cma_pageblock(struct page *page)
+{
+	set_pageblock_migratetype(page, MIGRATE_CMA);
+	adjust_managed_page_count(page, pageblock_nr_pages);
+	page_zone(page)->cma_pages += pageblock_nr_pages;
+}
 #endif
 
 void set_zone_contiguous(struct zone *zone)
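[Editorial note: tying the two halves of this patch together, a caller that
took memory via cma_reserve_early() would, once mem_map is initialized, walk
its reservation pageblock by pageblock and apply init_cma_pageblock(). A
sketch only; the function name and parameters are invented:]

static void __init example_mark_early_reservation(void *virt,
						  unsigned long size)
{
	unsigned long pfn = PHYS_PFN(virt_to_phys(virt));
	unsigned long end_pfn = pfn + (size >> PAGE_SHIFT);

	/* cma_reserve_early() guarantees pageblock alignment. */
	for (; pfn < end_pfn; pfn += pageblock_nr_pages)
		init_cma_pageblock(pfn_to_page(pfn));
}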
From patchwork Wed Jan 29 22:41:55 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954223
Date: Wed, 29 Jan 2025 22:41:55 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-27-fvdl@google.com>
Subject: [PATCH v2 26/28] mm/hugetlb: add hugetlb_cma_only cmdline option
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>
Add an option to force hugetlb gigantic pages to be allocated using CMA
only (if hugetlb_cma is enabled). This avoids a fallback to allocation
from the rest of system memory if the CMA allocation fails. This makes
the size of hugetlb_cma a hard upper boundary for gigantic hugetlb page
allocations.

This is useful because, with a large CMA area, the kernel's unmovable
allocations will have less room to work with, and it is undesirable for
new hugetlb gigantic page allocations to be done from that remaining
area: it will eat into the space available for unmovable allocations,
leading to unwanted system behavior (OOMs because the kernel fails to
do unmovable allocations).

So, with this enabled, an administrator can force a hard upper bound
for runtime gigantic page allocations, and have more predictable system
behavior.

Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  7 +++++++
 mm/hugetlb.c                                    | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..eb56b251ce10 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1892,6 +1892,13 @@
 			hugepages using the CMA allocator. If enabled, the
 			boot-time allocation of gigantic hugepages is skipped.
 
+	hugetlb_cma_only=
+			[HW,CMA,EARLY] When allocating new HugeTLB pages, only
+			try to allocate from the CMA areas.
+
+			This option does nothing if hugetlb_cma= is not also
+			specified.
+
 	hugetlb_free_vmemmap=
 			[KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 			enabled.
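[Editorial note: for illustration only (the size is arbitrary, not a
recommendation), the new option would typically be combined with an existing
hugetlb_cma= reservation on the kernel command line:]

    hugetlb_cma=4G hugetlb_cma_only=on

[Since the value is parsed with kstrtobool() in the diff below, "on", "yes",
and "1" all enable it. With this configuration, runtime gigantic page
allocations that cannot be satisfied from the 4G CMA area fail instead of
falling back to the rest of system memory.]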
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5af544960052..c227d0b9cf1e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -60,6 +60,7 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 static struct cma *hugetlb_cma[MAX_NUMNODES];
 static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 #endif
+static bool hugetlb_cma_only;
 static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
@@ -1511,6 +1512,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 	}
 #endif
 	if (!folio) {
+		if (hugetlb_cma_only)
+			return NULL;
+
 		folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
 		if (!folio)
 			return NULL;
@@ -4732,6 +4736,9 @@ static __init void hugetlb_parse_params(void)
 
 		hcp->setup(hcp->val);
 	}
+
+	if (!hugetlb_cma_size)
+		hugetlb_cma_only = false;
 }
 
 /*
@@ -7844,6 +7851,13 @@ static int __init cmdline_parse_hugetlb_cma(char *p)
 
 early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
 
+static int __init cmdline_parse_hugetlb_cma_only(char *p)
+{
+	return kstrtobool(p, &hugetlb_cma_only);
+}
+
+early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
+
 void __init hugetlb_cma_reserve(int order)
 {
 	unsigned long size, reserved, per_node;
From patchwork Wed Jan 29 22:41:56 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954224
Date: Wed, 29 Jan 2025 22:41:56 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-28-fvdl@google.com>
Subject: [PATCH v2 27/28] mm/hugetlb: enable bootmem allocation from CMA areas
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>,
 Madhavan Srinivasan, Michael Ellerman, linuxppc-dev@lists.ozlabs.org
If hugetlb_cma_only is enabled, we know that hugetlb pages can only be
allocated from CMA. Now that there is an interface to do early
reservations from a CMA area (returning memblock memory), it can be
used to allocate hugetlb pages from CMA.

This also allows for doing pre-HVO on these pages (if enabled). Make
sure to initialize the page structures and associated data correctly.
Create a flag to signal that a hugetlb page has been allocated from CMA
to make things a little easier.

Some configurations of powerpc have a special hugetlb bootmem
allocator, so introduce a boolean arch_has_huge_bootmem_alloc that
returns true if such an allocator is present. In that case, CMA bootmem
allocations can't be used, so check that function before trying.
Cc: Madhavan Srinivasan
Cc: Michael Ellerman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h |   6 +
 include/linux/hugetlb.h                      |  17 +++
 mm/hugetlb.c                                 | 121 ++++++++++++++-----
 3 files changed, 113 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index f0bba9c5f9c3..bb786694dd26 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -94,4 +94,10 @@ static inline int check_and_get_huge_psize(int shift)
 	return mmu_psize;
 }
 
+#define arch_has_huge_bootmem_alloc arch_has_huge_bootmem_alloc
+
+static inline bool arch_has_huge_bootmem_alloc(void)
+{
+	return (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled());
+}
 #endif
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2512463bca49..6c6546b54934 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -591,6 +591,7 @@ enum hugetlb_page_flags {
 	HPG_freed,
 	HPG_vmemmap_optimized,
 	HPG_raw_hwp_unreliable,
+	HPG_cma,
 	__NR_HPAGEFLAGS,
 };
 
@@ -650,6 +651,7 @@ HPAGEFLAG(Temporary, temporary)
 HPAGEFLAG(Freed, freed)
 HPAGEFLAG(VmemmapOptimized, vmemmap_optimized)
 HPAGEFLAG(RawHwpUnreliable, raw_hwp_unreliable)
+HPAGEFLAG(Cma, cma)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
@@ -678,14 +680,18 @@ struct hstate {
 	char name[HSTATE_NAME_LEN];
 };
 
+struct cma;
+
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
 	unsigned long flags;
+	struct cma *cma;
 };
 
 #define HUGE_BOOTMEM_HVO		0x0001
 #define HUGE_BOOTMEM_ZONES_VALID	0x0002
+#define HUGE_BOOTMEM_CMA		0x0004
 
 bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 
@@ -823,6 +829,17 @@ static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
 }
 #endif
 
+#ifndef arch_has_huge_bootmem_alloc
+/*
+ * Some architectures do their own bootmem allocation, so they can't use
+ * early CMA allocation.
+ */
+static inline bool arch_has_huge_bootmem_alloc(void)
+{
+	return false;
+}
+#endif
+
 static inline struct hstate *folio_hstate(struct folio *folio)
 {
 	VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index c227d0b9cf1e..5a3e9f7deaba 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -132,8 +132,10 @@ static void hugetlb_free_folio(struct folio *folio)
 #ifdef CONFIG_CMA
 	int nid = folio_nid(folio);
 
-	if (cma_free_folio(hugetlb_cma[nid], folio))
+	if (folio_test_hugetlb_cma(folio)) {
+		WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
 		return;
+	}
 #endif
 	folio_put(folio);
 }
@@ -1509,6 +1511,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 				break;
 			}
 		}
+
+		if (folio)
+			folio_set_hugetlb_cma(folio);
 	}
 #endif
 	if (!folio) {
@@ -3175,6 +3180,53 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	return ERR_PTR(-ENOSPC);
 }
 
+static bool __init hugetlb_early_cma(struct hstate *h)
+{
+	if (arch_has_huge_bootmem_alloc())
+		return false;
+
+	return (hstate_is_gigantic(h) && hugetlb_cma_only);
+}
+
+static __init void *alloc_bootmem(struct hstate *h, int nid)
+{
+	struct huge_bootmem_page *m;
+	unsigned long flags;
+	struct cma *cma;
+
+#ifdef CONFIG_CMA
+	if (hugetlb_early_cma(h)) {
+		flags = HUGE_BOOTMEM_CMA;
+		cma = hugetlb_cma[nid];
+		m = cma_reserve_early(cma, huge_page_size(h));
+	} else
+#endif
+	{
+		flags = 0;
+		cma = NULL;
+		m = memblock_alloc_try_nid_raw(huge_page_size(h),
+			huge_page_size(h), 0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+	}
+
+	if (m) {
+		/*
+		 * Use the beginning of the huge page to store the
+		 * huge_bootmem_page struct (until gather_bootmem
+		 * puts them into the mem_map).
+		 *
+		 * Put them into a private list first because mem_map
+		 * is not up yet.
+		 */
+		INIT_LIST_HEAD(&m->list);
+		list_add(&m->list, &huge_boot_pages[nid]);
+		m->hstate = h;
+		m->flags = flags;
+		m->cma = cma;
+	}
+
+	return m;
+}
+
 int alloc_bootmem_huge_page(struct hstate *h, int nid)
 	__attribute__ ((weak, alias("__alloc_bootmem_huge_page")));
 int __alloc_bootmem_huge_page(struct hstate *h, int nid)
@@ -3184,17 +3236,14 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 
 	/* do node specific alloc */
 	if (nid != NUMA_NO_NODE) {
-		m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+		m = alloc_bootmem(h, node);
 		if (!m)
 			return 0;
 		goto found;
 	}
 
 	/* allocate from next node when distributing huge pages */
 	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node,
				    &node_states[N_ONLINE]) {
-		m = memblock_alloc_try_nid_raw(
-				huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
+		m = alloc_bootmem(h, node);
 		if (m)
 			break;
 	}
@@ -3203,7 +3252,6 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		return 0;
 
 found:
-
 	/*
 	 * Only initialize the head struct page in memmap_init_reserved_pages,
 	 * rest of the struct pages will be initialized by the HugeTLB
 	 * subsystem itself.
	 */
 	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
		huge_page_size(h) - PAGE_SIZE);
-	/*
-	 * Use the beginning of the huge page to store the
-	 * huge_bootmem_page struct (until gather_bootmem
-	 * puts them into the mem_map).
-	 *
-	 * Put them into a private list first because mem_map
-	 * is not up yet.
-	 */
-	INIT_LIST_HEAD(&m->list);
-	list_add(&m->list, &huge_boot_pages[node]);
-	m->hstate = h;
-	m->flags = 0;
 
 	return 1;
 }
@@ -3265,13 +3301,25 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return m->flags & HUGE_BOOTMEM_HVO;
+}
+
+static bool __init hugetlb_bootmem_page_earlycma(struct huge_bootmem_page *m)
+{
+	return m->flags & HUGE_BOOTMEM_CMA;
+}
+
 /*
  * memblock-allocated pageblocks might not have the migrate type set
  * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABLE)
- * here.
+ * here, or MIGRATE_CMA if this was a page allocated through an early CMA
+ * reservation.
  *
- * Note that this will not write the page struct, it is ok (and necessary)
- * to do this on vmemmap optimized folios.
+ * In case of vmemmap optimized folios, the tail vmemmap pages are mapped
+ * read-only, but that's ok - for sparse vmemmap this does not write to
+ * the page structure.
  */
 static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
 						    struct hstate *h)
@@ -3280,9 +3328,13 @@ static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
 
 	WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
 
-	for (i = 0; i < nr_pages; i += pageblock_nr_pages)
-		set_pageblock_migratetype(folio_page(folio, i),
+	for (i = 0; i < nr_pages; i += pageblock_nr_pages) {
+		if (folio_test_hugetlb_cma(folio))
+			init_cma_pageblock(folio_page(folio, i));
+		else
+			set_pageblock_migratetype(folio_page(folio, i),
 					MIGRATE_MOVABLE);
+	}
 }
 
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
@@ -3328,10 +3380,16 @@ bool __init hugetlb_bootmem_page_zones_valid(int nid,
 		return true;
 	}
 
+	if (hugetlb_bootmem_page_earlycma(m)) {
+		valid = cma_validate_zones(m->cma);
+		goto out;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
 			pages_per_huge_page(m->hstate));
+out:
 	if (!valid)
 		hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
 
@@ -3360,11 +3418,6 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
-static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
-{
-	return (m->flags & HUGE_BOOTMEM_HVO);
-}
-
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3414,6 +3467,9 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		 */
 		folio_set_hugetlb_vmemmap_optimized(folio);
 
+		if (hugetlb_bootmem_page_earlycma(m))
+			folio_set_hugetlb_cma(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
@@ -3606,8 +3662,11 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long allocated;
 
-	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
-	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
+	/*
+	 * Skip gigantic hugepages allocation if early CMA
+	 * reservations are not available.
+	 */
+	if (hstate_is_gigantic(h) && hugetlb_cma_size && !hugetlb_early_cma(h)) {
 		pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 		return;
 	}
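[Editorial note: the arch_has_huge_bootmem_alloc() hook uses the common
kernel pattern of a same-named macro guarding a generic fallback. A
hypothetical architecture with its own hugetlb bootmem allocator would opt
out of early CMA allocation roughly like this; the arch name and predicate
are invented for illustration:]

/* In a hypothetical arch/foo/include/asm/hugetlb.h: */
#define arch_has_huge_bootmem_alloc arch_has_huge_bootmem_alloc

static inline bool arch_has_huge_bootmem_alloc(void)
{
	/* Invented predicate: true when the firmware allocator is in use. */
	return foo_firmware_hugepage_allocator_enabled();
}

[When this returns true, hugetlb_early_cma() above returns false, so
alloc_bootmem() stays on the memblock path.]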
From patchwork Wed Jan 29 22:41:57 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13954225
Date: Wed, 29 Jan 2025 22:41:57 +0000
In-Reply-To: <20250129224157.2046079-1-fvdl@google.com>
References: <20250129224157.2046079-1-fvdl@google.com>
Message-ID: <20250129224157.2046079-29-fvdl@google.com>
Subject: [PATCH v2 28/28] mm/hugetlb: move hugetlb CMA code in to its own file
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>
hugetlb.c contained a number of CONFIG_CMA ifdefs, and the code inside
them was large enough to merit being in its own file, so move it,
cleaning things up a bit.

Hide some direct variable access behind functions to accommodate the
move.

No functional change intended.
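To make the split concrete, here is the shape of the pattern this
patch applies, lifted from the hugetlb_free_folio() hunk below (a
sketch only; the full context is in the diff). CMA state is no longer
touched directly from hugetlb.c; callers go through the
hugetlb_cma_*() accessors, which compile to empty stubs in
hugetlb_cma.h when CONFIG_CMA is off:

	static void hugetlb_free_folio(struct folio *folio)
	{
		if (folio_test_hugetlb_cma(folio)) {
			/* CMA bookkeeping now lives in mm/hugetlb_cma.c */
			hugetlb_cma_free_folio(folio);
			return;
		}

		folio_put(folio);
	}

This keeps the #ifdef CONFIG_CMA blocks out of hugetlb.c entirely.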
Signed-off-by: Frank van der Linden
---
 mm/Makefile      |   3 +
 mm/hugetlb.c     | 252 +++------------------------------------------
 mm/hugetlb_cma.c | 258 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_cma.h |  55 ++++++++++
 4 files changed, 332 insertions(+), 236 deletions(-)
 create mode 100644 mm/hugetlb_cma.c
 create mode 100644 mm/hugetlb_cma.h

diff --git a/mm/Makefile b/mm/Makefile
index 850386a67b3e..810ccd45d270 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -79,6 +79,9 @@ obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o swap_slots.o
 obj-$(CONFIG_ZSWAP)	+= zswap.o
 obj-$(CONFIG_HAS_DMA)	+= dmapool.o
 obj-$(CONFIG_HUGETLBFS)	+= hugetlb.o
+ifdef CONFIG_CMA
+obj-$(CONFIG_HUGETLBFS)	+= hugetlb_cma.o
+endif
 obj-$(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP)	+= hugetlb_vmemmap.o
 obj-$(CONFIG_NUMA)	+= mempolicy.o
 obj-$(CONFIG_SPARSEMEM)	+= sparse.o
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5a3e9f7deaba..6e296f16116d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -50,19 +50,13 @@
 #include
 #include "internal.h"
 #include "hugetlb_vmemmap.h"
+#include "hugetlb_cma.h"
 #include
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 
-#ifdef CONFIG_CMA
-static struct cma *hugetlb_cma[MAX_NUMNODES];
-static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
-#endif
-static bool hugetlb_cma_only;
-static unsigned long hugetlb_cma_size __initdata;
-
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 __initdata unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE];
 
@@ -129,14 +123,11 @@ static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
 
 static void hugetlb_free_folio(struct folio *folio)
 {
-#ifdef CONFIG_CMA
-	int nid = folio_nid(folio);
-
 	if (folio_test_hugetlb_cma(folio)) {
-		WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+		hugetlb_cma_free_folio(folio);
 		return;
 	}
-#endif
+
 	folio_put(folio);
 }
 
@@ -1493,31 +1484,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 	if (nid == NUMA_NO_NODE)
 		nid = numa_mem_id();
 retry:
-	folio = NULL;
-#ifdef CONFIG_CMA
-	{
-		int node;
-
-		if (hugetlb_cma[nid])
-			folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
-
-		if (!folio && !(gfp_mask & __GFP_THISNODE)) {
-			for_each_node_mask(node, *nodemask) {
-				if (node == nid || !hugetlb_cma[node])
-					continue;
-
-				folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
-				if (folio)
-					break;
-			}
-		}
-
-		if (folio)
-			folio_set_hugetlb_cma(folio);
-	}
-#endif
+	folio = hugetlb_cma_alloc_folio(h, gfp_mask, nid, nodemask);
 	if (!folio) {
-		if (hugetlb_cma_only)
+		if (hugetlb_cma_exclusive_alloc())
 			return NULL;
 
 		folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
@@ -3180,32 +3149,19 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	return ERR_PTR(-ENOSPC);
 }
 
-static bool __init hugetlb_early_cma(struct hstate *h)
-{
-	if (arch_has_huge_bootmem_alloc())
-		return false;
-
-	return (hstate_is_gigantic(h) && hugetlb_cma_only);
-}
-
 static __init void *alloc_bootmem(struct hstate *h, int nid)
 {
 	struct huge_bootmem_page *m;
-	unsigned long flags;
-	struct cma *cma;
 
-#ifdef CONFIG_CMA
-	if (hugetlb_early_cma(h)) {
-		flags = HUGE_BOOTMEM_CMA;
-		cma = hugetlb_cma[nid];
-		m = cma_reserve_early(cma, huge_page_size(h));
-	} else
-#endif
-	{
-		flags = 0;
-		cma = NULL;
+	if (hugetlb_early_cma(h))
+		m = hugetlb_cma_alloc_bootmem(h, nid);
+	else {
 		m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
 				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+		if (m) {
+			m->flags = 0;
+			m->cma = NULL;
+		}
 	}
 
 	if (m) {
@@ -3220,8 +3176,6 @@ static __init void *alloc_bootmem(struct hstate *h, int nid)
 		INIT_LIST_HEAD(&m->list);
 		list_add(&m->list, &huge_boot_pages[nid]);
 		m->hstate = h;
-		m->flags = flags;
-		m->cma = cma;
 	}
 
 	return m;
 }
@@ -3666,7 +3620,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 	 * Skip gigantic hugepages allocation if early CMA
 	 * reservations are not available.
 	 */
-	if (hstate_is_gigantic(h) && hugetlb_cma_size && !hugetlb_early_cma(h)) {
+	if (hstate_is_gigantic(h) && hugetlb_cma_total_size() &&
+	    !hugetlb_early_cma(h)) {
 		pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 		return;
 	}
@@ -3703,7 +3658,7 @@ static void __init hugetlb_init_hstates(void)
 		 */
 		if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 			continue;
-		if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER)
+		if (hugetlb_cma_total_size() && h->order <= HUGETLB_PAGE_ORDER)
 			continue;
 		for_each_hstate(h2) {
 			if (h2 == h)
@@ -4605,14 +4560,6 @@ static void hugetlb_register_all_nodes(void) { }
 
 #endif
 
-#ifdef CONFIG_CMA
-static void __init hugetlb_cma_check(void);
-#else
-static inline __init void hugetlb_cma_check(void)
-{
-}
-#endif
-
 static void __init hugetlb_sysfs_init(void)
 {
 	struct hstate *h;
@@ -4796,8 +4743,7 @@ static __init void hugetlb_parse_params(void)
 		hcp->setup(hcp->val);
 	}
 
-	if (!hugetlb_cma_size)
-		hugetlb_cma_only = false;
+	hugetlb_cma_validate_params();
 }
 
 /*
@@ -7867,169 +7813,3 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
 	hugetlb_unshare_pmds(vma, ALIGN(vma->vm_start, PUD_SIZE),
 			ALIGN_DOWN(vma->vm_end, PUD_SIZE));
 }
-
-#ifdef CONFIG_CMA
-static bool cma_reserve_called __initdata;
-
-static int __init cmdline_parse_hugetlb_cma(char *p)
-{
-	int nid, count = 0;
-	unsigned long tmp;
-	char *s = p;
-
-	while (*s) {
-		if (sscanf(s, "%lu%n", &tmp, &count) != 1)
-			break;
-
-		if (s[count] == ':') {
-			if (tmp >= MAX_NUMNODES)
-				break;
-			nid = array_index_nospec(tmp, MAX_NUMNODES);
-
-			s += count + 1;
-			tmp = memparse(s, &s);
-			hugetlb_cma_size_in_node[nid] = tmp;
-			hugetlb_cma_size += tmp;
-
-			/*
-			 * Skip the separator if have one, otherwise
-			 * break the parsing.
-			 */
-			if (*s == ',')
-				s++;
-			else
-				break;
-		} else {
-			hugetlb_cma_size = memparse(p, &p);
-			break;
-		}
-	}
-
-	return 0;
-}
-
-early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
-
-static int __init cmdline_parse_hugetlb_cma_only(char *p)
-{
-	return kstrtobool(p, &hugetlb_cma_only);
-}
-
-early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
-
-void __init hugetlb_cma_reserve(int order)
-{
-	unsigned long size, reserved, per_node;
-	bool node_specific_cma_alloc = false;
-	int nid;
-
-	/*
-	 * HugeTLB CMA reservation is required for gigantic
-	 * huge pages which could not be allocated via the
-	 * page allocator. Just warn if there is any change
-	 * breaking this assumption.
-	 */
-	VM_WARN_ON(order <= MAX_PAGE_ORDER);
-	cma_reserve_called = true;
-
-	if (!hugetlb_cma_size)
-		return;
-
-	for (nid = 0; nid < MAX_NUMNODES; nid++) {
-		if (hugetlb_cma_size_in_node[nid] == 0)
-			continue;
-
-		if (!node_online(nid)) {
-			pr_warn("hugetlb_cma: invalid node %d specified\n", nid);
-			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
-			hugetlb_cma_size_in_node[nid] = 0;
-			continue;
-		}
-
-		if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) {
-			pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n",
-				nid, (PAGE_SIZE << order) / SZ_1M);
-			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
-			hugetlb_cma_size_in_node[nid] = 0;
-		} else {
-			node_specific_cma_alloc = true;
-		}
-	}
-
-	/* Validate the CMA size again in case some invalid nodes specified. */
-	if (!hugetlb_cma_size)
-		return;
-
-	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
-		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
-			(PAGE_SIZE << order) / SZ_1M);
-		hugetlb_cma_size = 0;
-		return;
-	}
-
-	if (!node_specific_cma_alloc) {
-		/*
-		 * If 3 GB area is requested on a machine with 4 numa nodes,
-		 * let's allocate 1 GB on first three nodes and ignore the last one.
-		 */
-		per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
-		pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
-			hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
-	}
-
-	reserved = 0;
-	for_each_online_node(nid) {
-		int res;
-		char name[CMA_MAX_NAME];
-
-		if (node_specific_cma_alloc) {
-			if (hugetlb_cma_size_in_node[nid] == 0)
-				continue;
-
-			size = hugetlb_cma_size_in_node[nid];
-		} else {
-			size = min(per_node, hugetlb_cma_size - reserved);
-		}
-
-		size = round_up(size, PAGE_SIZE << order);
-
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
-		/*
-		 * Note that 'order per bit' is based on smallest size that
-		 * may be returned to CMA allocator in the case of
-		 * huge page demotion.
-		 */
-		res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
-					HUGETLB_PAGE_ORDER, name,
-					&hugetlb_cma[nid], nid);
-		if (res) {
-			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
-				res, nid);
-			continue;
-		}
-
-		reserved += size;
-		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
-			size / SZ_1M, nid);
-
-		if (reserved >= hugetlb_cma_size)
-			break;
-	}
-
-	if (!reserved)
-		/*
-		 * hugetlb_cma_size is used to determine if allocations from
-		 * cma are possible. Set to zero if no cma regions are set up.
-		 */
-		hugetlb_cma_size = 0;
-}
-
-static void __init hugetlb_cma_check(void)
-{
-	if (!hugetlb_cma_size || cma_reserve_called)
-		return;
-
-	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
-}
-
-#endif /* CONFIG_CMA */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
new file mode 100644
index 000000000000..3ea9cd0f6b9f
--- /dev/null
+++ b/mm/hugetlb_cma.c
@@ -0,0 +1,258 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/mm.h>
+#include <linux/cma.h>
+#include <linux/compiler.h>
+#include <linux/mm_inline.h>
+
+#include <asm/page.h>
+#include <asm/setup.h>
+
+#include <linux/hugetlb.h>
+#include "internal.h"
+#include "hugetlb_cma.h"
+
+
+static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
+static bool hugetlb_cma_only;
+static unsigned long hugetlb_cma_size __initdata;
+
+void hugetlb_cma_free_folio(struct folio *folio)
+{
+	int nid = folio_nid(folio);
+
+	WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+}
+
+
+struct folio *hugetlb_cma_alloc_folio(struct hstate *h, gfp_t gfp_mask,
+				      int nid, nodemask_t *nodemask)
+{
+	int node;
+	int order = huge_page_order(h);
+	struct folio *folio = NULL;
+
+	if (hugetlb_cma[nid])
+		folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
+
+	if (!folio && !(gfp_mask & __GFP_THISNODE)) {
+		for_each_node_mask(node, *nodemask) {
+			if (node == nid || !hugetlb_cma[node])
+				continue;
+
+			folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
+			if (folio)
+				break;
+		}
+	}
+
+	if (folio)
+		folio_set_hugetlb_cma(folio);
+
+	return folio;
+}
+
+struct huge_bootmem_page * __init
+hugetlb_cma_alloc_bootmem(struct hstate *h, int nid)
+{
+	struct cma *cma;
+	struct huge_bootmem_page *m;
+
+	cma = hugetlb_cma[nid];
+	m = cma_reserve_early(cma, huge_page_size(h));
+	if (m) {
+		m->flags = HUGE_BOOTMEM_CMA;
+		m->cma = cma;
+	}
+
+	return m;
+}
+
+
+static bool cma_reserve_called __initdata;
+
+static int __init cmdline_parse_hugetlb_cma(char *p)
+{
+	int nid, count = 0;
+	unsigned long tmp;
+	char *s = p;
+
+	while (*s) {
+		if (sscanf(s, "%lu%n", &tmp, &count) != 1)
+			break;
+
+		if (s[count] == ':') {
+			if (tmp >= MAX_NUMNODES)
+				break;
+			nid = array_index_nospec(tmp, MAX_NUMNODES);
+
+			s += count + 1;
+			tmp = memparse(s, &s);
+			hugetlb_cma_size_in_node[nid] = tmp;
+			hugetlb_cma_size += tmp;
+
+			/*
+			 * Skip the separator if have one, otherwise
+			 * break the parsing.
+			 */
+			if (*s == ',')
+				s++;
+			else
+				break;
+		} else {
+			hugetlb_cma_size = memparse(p, &p);
+			break;
+		}
+	}
+
+	return 0;
+}
+
+early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
+
+static int __init cmdline_parse_hugetlb_cma_only(char *p)
+{
+	return kstrtobool(p, &hugetlb_cma_only);
+}
+
+early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
+
+void __init hugetlb_cma_reserve(int order)
+{
+	unsigned long size, reserved, per_node;
+	bool node_specific_cma_alloc = false;
+	int nid;
+
+	/*
+	 * HugeTLB CMA reservation is required for gigantic
+	 * huge pages which could not be allocated via the
+	 * page allocator. Just warn if there is any change
+	 * breaking this assumption.
+	 */
+	VM_WARN_ON(order <= MAX_PAGE_ORDER);
+	cma_reserve_called = true;
+
+	if (!hugetlb_cma_size)
+		return;
+
+	for (nid = 0; nid < MAX_NUMNODES; nid++) {
+		if (hugetlb_cma_size_in_node[nid] == 0)
+			continue;
+
+		if (!node_online(nid)) {
+			pr_warn("hugetlb_cma: invalid node %d specified\n", nid);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+			continue;
+		}
+
+		if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) {
+			pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n",
+				nid, (PAGE_SIZE << order) / SZ_1M);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+		} else {
+			node_specific_cma_alloc = true;
+		}
+	}
+
+	/* Validate the CMA size again in case some invalid nodes specified. */
+	if (!hugetlb_cma_size)
+		return;
+
+	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
+		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
+			(PAGE_SIZE << order) / SZ_1M);
+		hugetlb_cma_size = 0;
+		return;
+	}
+
+	if (!node_specific_cma_alloc) {
+		/*
+		 * If 3 GB area is requested on a machine with 4 numa nodes,
+		 * let's allocate 1 GB on first three nodes and ignore the last one.
+		 */
+		per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
+		pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
+			hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
+	}
+
+	reserved = 0;
+	for_each_online_node(nid) {
+		int res;
+		char name[CMA_MAX_NAME];
+
+		if (node_specific_cma_alloc) {
+			if (hugetlb_cma_size_in_node[nid] == 0)
+				continue;
+
+			size = hugetlb_cma_size_in_node[nid];
+		} else {
+			size = min(per_node, hugetlb_cma_size - reserved);
+		}
+
+		size = round_up(size, PAGE_SIZE << order);
+
+		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		/*
+		 * Note that 'order per bit' is based on smallest size that
+		 * may be returned to CMA allocator in the case of
+		 * huge page demotion.
+		 */
+		res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
+					HUGETLB_PAGE_ORDER, name,
+					&hugetlb_cma[nid], nid);
+		if (res) {
+			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
+				res, nid);
+			continue;
+		}
+
+		reserved += size;
+		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
+			size / SZ_1M, nid);
+
+		if (reserved >= hugetlb_cma_size)
+			break;
+	}
+
+	if (!reserved)
+		/*
+		 * hugetlb_cma_size is used to determine if allocations from
+		 * cma are possible. Set to zero if no cma regions are set up.
+		 */
+		hugetlb_cma_size = 0;
+}
+
+void __init hugetlb_cma_check(void)
+{
+	if (!hugetlb_cma_size || cma_reserve_called)
+		return;
+
+	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
+}
+
+bool hugetlb_cma_exclusive_alloc(void)
+{
+	return hugetlb_cma_only;
+}
+
+unsigned long __init hugetlb_cma_total_size(void)
+{
+	return hugetlb_cma_size;
+}
+
+void __init hugetlb_cma_validate_params(void)
+{
+	if (!hugetlb_cma_size)
+		hugetlb_cma_only = false;
+}
+
+bool __init hugetlb_early_cma(struct hstate *h)
+{
+	if (arch_has_huge_bootmem_alloc())
+		return false;
+
+	return hstate_is_gigantic(h) && hugetlb_cma_only;
+}
diff --git a/mm/hugetlb_cma.h b/mm/hugetlb_cma.h
new file mode 100644
index 000000000000..92eb7530fe9e
--- /dev/null
+++ b/mm/hugetlb_cma.h
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+#ifndef _LINUX_HUGETLB_CMA_H
+#define _LINUX_HUGETLB_CMA_H
+
+#ifdef CONFIG_CMA
+void hugetlb_cma_free_folio(struct folio *folio);
+struct folio *hugetlb_cma_alloc_folio(struct hstate *h, gfp_t gfp_mask,
+				      int nid, nodemask_t *nodemask);
+struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int nid);
+void hugetlb_cma_check(void);
+bool hugetlb_cma_exclusive_alloc(void);
+unsigned long hugetlb_cma_total_size(void);
+void hugetlb_cma_validate_params(void);
+bool hugetlb_early_cma(struct hstate *h);
+#else
+static inline void hugetlb_cma_free_folio(struct folio *folio)
+{
+}
+
+static inline struct folio *hugetlb_cma_alloc_folio(struct hstate *h,
+		gfp_t gfp_mask, int nid, nodemask_t *nodemask)
+{
+	return NULL;
+}
+
+static inline
+struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int nid)
+{
+	return NULL;
+}
+
+static inline void hugetlb_cma_check(void)
+{
+}
+
+static inline bool hugetlb_cma_exclusive_alloc(void)
+{
+	return false;
+}
+
+static inline unsigned long hugetlb_cma_total_size(void)
+{
+	return 0;
+}
+
+static inline void hugetlb_cma_validate_params(void)
+{
+}
+
+static inline bool hugetlb_early_cma(struct hstate *h)
+{
+	return false;
+}
+#endif
+#endif
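
As a usage illustration (the boot parameters themselves are unchanged
by this patch, only moved): cmdline_parse_hugetlb_cma() accepts either
a single global size or per-node node:size pairs, and
cmdline_parse_hugetlb_cma_only() takes a boolean, e.g. on the kernel
command line:

	hugetlb_cma=4G                        # one pool, spread across online nodes
	hugetlb_cma=0:1G,2:2G                 # 1 GiB on node 0, 2 GiB on node 2
	hugetlb_cma=4G hugetlb_cma_only=on    # gigantic pages only from CMA

The node-specific form is handled by the sscanf()/memparse() loop
above; hugetlb_cma_only is read with kstrtobool(), so "on"/"off" and
"1"/"0" all work, and in effect it makes hugetlb_cma_exclusive_alloc()
refuse gigantic-page fallback to the buddy allocator.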