From patchwork Mon Jan 27 23:21:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951842
Date: Mon, 27 Jan 2025 23:21:41 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog
Message-ID: <20250127232207.3888640-2-fvdl@google.com>
Subject: [PATCH 01/27] mm/cma: export total and free number of pages for CMA areas
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
	roman.gushchin@linux.dev, Frank van der Linden

In addition to the number of allocations and releases, system management
software may like to be aware of the size of CMA areas, and how many
pages are
available in them. This information is currently not available, so
export it in total_pages and available_pages, respectively. The name
'available_pages' was picked over 'free_pages' because 'free' implies
that the pages are unused. But they might not be; they just haven't
been used by cma_alloc.

The number of available pages is tracked regardless of CONFIG_CMA_SYSFS,
allowing for a few minor shortcuts in the code, avoiding bitmap
operations.

Signed-off-by: Frank van der Linden
---
 Documentation/ABI/testing/sysfs-kernel-mm-cma | 13 +++++++++++
 mm/cma.c                                      | 22 ++++++++++++++-----
 mm/cma.h                                      |  1 +
 mm/cma_debug.c                                |  5 +----
 mm/cma_sysfs.c                                | 20 +++++++++++++++++
 5 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cma b/Documentation/ABI/testing/sysfs-kernel-mm-cma
index dfd755201142..aaf2a5d8b13b 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-cma
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cma
@@ -29,3 +29,16 @@ Date: Feb 2024
 Contact: Anshuman Khandual
 Description:
 		the number of pages CMA API succeeded to release
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/total_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden
+Description:
+		The size of the CMA area in pages.
+
+What:		/sys/kernel/mm/cma/<cma-heap-name>/available_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden
+Description:
+		The number of pages in the CMA area that are still
+		available for CMA allocation.
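The accounting change in this patch is small: available_count starts out equal to count, is decremented under the area lock when a range is allocated, and incremented when one is released, which is also what lets the debugfs "used" query become a subtraction instead of a bitmap_weight() scan. A minimal userspace sketch of the same bookkeeping (all names here are illustrative, not the kernel's):

```c
#include <assert.h>

/* Toy model of the patch's accounting: a CMA-like area keeps an
 * available_count next to its allocation bitmap, updated on every
 * set/clear, so a "pages used" query never needs a bitmap scan. */
#define AREA_PAGES 64

struct toy_cma {
	unsigned long long bitmap;     /* one bit per page, 1 = allocated */
	unsigned long count;           /* total pages */
	unsigned long available_count; /* pages not handed out */
};

static void toy_init(struct toy_cma *c)
{
	c->bitmap = 0;
	c->available_count = c->count = AREA_PAGES;
}

static int toy_alloc(struct toy_cma *c, unsigned start, unsigned n)
{
	unsigned long long mask =
		((n < AREA_PAGES ? (1ULL << n) : 0ULL) - 1) << start;

	/* Cheap early exit when the request exceeds what is left,
	 * analogous to the new check in __cma_alloc(). */
	if (n > c->available_count || (c->bitmap & mask))
		return -1;
	c->bitmap |= mask;
	c->available_count -= n;
	return 0;
}

static void toy_release(struct toy_cma *c, unsigned start, unsigned n)
{
	unsigned long long mask =
		((n < AREA_PAGES ? (1ULL << n) : 0ULL) - 1) << start;

	c->bitmap &= ~mask;
	c->available_count += n;
}

/* O(1) equivalent of the reworked cma_used_get() */
static unsigned long toy_used(const struct toy_cma *c)
{
	return c->count - c->available_count;
}
```

The benefit shows up in toy_used(): with the counter maintained on every set/clear, the used-page count is a subtraction rather than a walk over the allocation bitmap.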
diff --git a/mm/cma.c b/mm/cma.c
index de5bc0c81fc2..95a8788e54d3 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -86,6 +86,7 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
 
 	spin_lock_irqsave(&cma->lock, flags);
 	bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
+	cma->available_count += count;
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
@@ -133,7 +134,7 @@ static void __init cma_activate_area(struct cma *cma)
 			free_reserved_page(pfn_to_page(pfn));
 	}
 	totalcma_pages -= cma->count;
-	cma->count = 0;
+	cma->available_count = cma->count = 0;
 	pr_err("CMA area %s could not be activated\n", cma->name);
 }
 
@@ -206,7 +207,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
 
 	cma->base_pfn = PFN_DOWN(base);
-	cma->count = size >> PAGE_SHIFT;
+	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
 	*res_cma = cma;
 	cma_area_count++;
@@ -390,7 +391,7 @@ static void cma_debug_show_areas(struct cma *cma)
 {
 	unsigned long next_zero_bit, next_set_bit, nr_zero;
 	unsigned long start = 0;
-	unsigned long nr_part, nr_total = 0;
+	unsigned long nr_part;
 	unsigned long nbits = cma_bitmap_maxno(cma);
 
 	spin_lock_irq(&cma->lock);
@@ -402,12 +403,12 @@ static void cma_debug_show_areas(struct cma *cma)
 		next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit);
 		nr_zero = next_set_bit - next_zero_bit;
 		nr_part = nr_zero << cma->order_per_bit;
-		pr_cont("%s%lu@%lu", nr_total ? "+" : "", nr_part,
+		pr_cont("%s%lu@%lu", start ? "+" : "", nr_part,
 			next_zero_bit);
-		nr_total += nr_part;
 		start = next_zero_bit + nr_zero;
 	}
-	pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count);
+	pr_cont("=> %lu free of %lu total pages\n", cma->available_count,
+		cma->count);
 	spin_unlock_irq(&cma->lock);
 }
 
@@ -444,6 +445,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	for (;;) {
 		spin_lock_irq(&cma->lock);
+		/*
+		 * If the request is larger than the available number
+		 * of pages, stop right away.
+		 */
+		if (count > cma->available_count) {
+			spin_unlock_irq(&cma->lock);
+			break;
+		}
 		bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
 				bitmap_maxno, start, bitmap_count, mask,
 				offset);
@@ -452,6 +461,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 			break;
 		}
 		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
+		cma->available_count -= count;
 		/*
 		 * It's safe to drop the lock here. We've marked this region for
 		 * our exclusive use. If the migration fails we will take the
diff --git a/mm/cma.h b/mm/cma.h
index 8485ef893e99..3dd3376ae980 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -13,6 +13,7 @@ struct cma_kobject {
 struct cma {
 	unsigned long base_pfn;
 	unsigned long count;
+	unsigned long available_count;
 	unsigned long *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t lock;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index 602fff89b15f..89236f22230a 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -34,13 +34,10 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_debugfs_fops, cma_debugfs_get, NULL, "%llu\n");
 static int cma_used_get(void *data, u64 *val)
 {
 	struct cma *cma = data;
-	unsigned long used;
 
 	spin_lock_irq(&cma->lock);
-	/* pages counter is smaller than sizeof(int) */
-	used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma));
+	*val = cma->count - cma->available_count;
 	spin_unlock_irq(&cma->lock);
-	*val = (u64)used << cma->order_per_bit;
 
 	return 0;
 }
diff --git a/mm/cma_sysfs.c b/mm/cma_sysfs.c index
f50db3973171..97acd3e5a6a5 100644
--- a/mm/cma_sysfs.c
+++ b/mm/cma_sysfs.c
@@ -62,6 +62,24 @@ static ssize_t release_pages_success_show(struct kobject *kobj,
 }
 CMA_ATTR_RO(release_pages_success);
 
+static ssize_t total_pages_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->count);
+}
+CMA_ATTR_RO(total_pages);
+
+static ssize_t available_pages_show(struct kobject *kobj,
+				    struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->available_count);
+}
+CMA_ATTR_RO(available_pages);
+
 static void cma_kobj_release(struct kobject *kobj)
 {
 	struct cma *cma = cma_from_kobj(kobj);
@@ -75,6 +93,8 @@ static struct attribute *cma_attrs[] = {
 	&alloc_pages_success_attr.attr,
 	&alloc_pages_fail_attr.attr,
 	&release_pages_success_attr.attr,
+	&total_pages_attr.attr,
+	&available_pages_attr.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(cma);

From patchwork Mon Jan 27 23:21:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951843
Date: Mon, 27 Jan 2025 23:21:42 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog
Message-ID: <20250127232207.3888640-3-fvdl@google.com>
Subject: [PATCH 02/27] mm, cma: support multiple contiguous ranges, if requested
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
	roman.gushchin@linux.dev, Frank van der Linden

Currently, CMA manages one range of physically contiguous memory.
Creation of larger CMA areas with hugetlb_cma may run into gaps in
physical memory, so that the contiguous physical range cannot be
allocated from memblock when creating the CMA area. This can happen,
for example, on an AMD system with > 1TB of memory, where there will
be a gap just below the 1TB (40bit DMA) line. If you have set aside
most of the memory for potential hugetlb CMA allocation,
cma_declare_contiguous_nid will fail.

hugetlb_cma doesn't need the entire area to be one physically
contiguous range. It just cares about being able to get physically
contiguous chunks of a certain size (e.g. 1G), and it is fine to have
the CMA area backed by multiple physical ranges, as long as it gets
1G contiguous allocations.

Multi-range support is implemented by introducing an array of ranges,
instead of just one big one. Each range has its own bitmap.
Effectively, the allocate and release operations work as before, just
per-range. So, instead of going through one large bitmap, they now go
through a number of smaller ones. The maximum number of supported
ranges is 8, as defined in CMA_MAX_RANGES.

Since some current users of CMA expect a CMA area to use just one
physically contiguous range, only allow multiple ranges if a new
interface, cma_declare_contiguous_multi, is used. The other
interfaces will work as before, creating only CMA areas with one
range.

cma_declare_contiguous_multi works as follows, mimicking the default
"bottom-up, above 4G" reservation approach:

0) Try cma_declare_contiguous_nid, which will use only one region.
   If this succeeds, return. This makes sure that for all the cases
   that currently work, the behavior remains unchanged even if the
   caller switches from cma_declare_contiguous_nid to
   cma_declare_contiguous_multi.

1) Select the largest free memblock ranges above 4G, with a maximum
   number of CMA_MAX_RANGES.

2) If the selected ranges do not add up to the total size requested,
   return -ENOMEM.
3) Sort the selected ranges by base address.

4) Reserve them bottom-up until we get what we wanted.

Signed-off-by: Frank van der Linden
---
 include/linux/cma.h |   3 +
 mm/cma.c            | 604 +++++++++++++++++++++++++++++++++++---------
 mm/cma.h            |  23 +-
 3 files changed, 508 insertions(+), 122 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index d15b64f51336..863427c27dc2 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -40,6 +40,9 @@ static inline int __init cma_declare_contiguous(phys_addr_t base,
 	return cma_declare_contiguous_nid(base, size, limit, alignment,
 			order_per_bit, fixed, name, res_cma, NUMA_NO_NODE);
 }
+extern int __init cma_declare_contiguous_multi(phys_addr_t size,
+			phys_addr_t align, unsigned int order_per_bit,
+			const char *name, struct cma **res_cma, int nid);
 
 extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 					unsigned int order_per_bit,
 					const char *name,
diff --git a/mm/cma.c b/mm/cma.c
index 95a8788e54d3..c20255161642 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -18,6 +18,7 @@
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -35,9 +36,16 @@ struct cma cma_areas[MAX_CMA_AREAS];
 unsigned int cma_area_count;
 static DEFINE_MUTEX(cma_mutex);
 
+static int __init __cma_declare_contiguous_nid(phys_addr_t base,
+			phys_addr_t size, phys_addr_t limit,
+			phys_addr_t alignment, unsigned int order_per_bit,
+			bool fixed, const char *name, struct cma **res_cma,
+			int nid);
+
 phys_addr_t cma_get_base(const struct cma *cma)
 {
-	return PFN_PHYS(cma->base_pfn);
+	WARN_ON_ONCE(cma->nranges != 1);
+	return PFN_PHYS(cma->ranges[0].base_pfn);
 }
 
 unsigned long cma_get_size(const struct cma *cma)
@@ -63,9 +71,10 @@ static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
  * The value returned is represented in order_per_bits.
 */
 static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
+					       const struct cma_memrange *cmr,
 					       unsigned int align_order)
 {
-	return (cma->base_pfn & ((1UL << align_order) - 1))
+	return (cmr->base_pfn & ((1UL << align_order) - 1))
 		>> cma->order_per_bit;
 }
 
@@ -75,46 +84,57 @@ static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma,
 	return ALIGN(pages, 1UL << cma->order_per_bit) >> cma->order_per_bit;
 }
 
-static void cma_clear_bitmap(struct cma *cma, unsigned long pfn,
-			     unsigned long count)
+static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
+			     unsigned long pfn, unsigned long count)
 {
 	unsigned long bitmap_no, bitmap_count;
 	unsigned long flags;
 
-	bitmap_no = (pfn - cma->base_pfn) >> cma->order_per_bit;
+	bitmap_no = (pfn - cmr->base_pfn) >> cma->order_per_bit;
 	bitmap_count = cma_bitmap_pages_to_bits(cma, count);
 
 	spin_lock_irqsave(&cma->lock, flags);
-	bitmap_clear(cma->bitmap, bitmap_no, bitmap_count);
+	bitmap_clear(cmr->bitmap, bitmap_no, bitmap_count);
 	cma->available_count += count;
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
 static void __init cma_activate_area(struct cma *cma)
 {
-	unsigned long base_pfn = cma->base_pfn, pfn;
+	unsigned long pfn, base_pfn;
+	int allocrange, r;
 	struct zone *zone;
+	struct cma_memrange *cmr;
+
+	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
+		cmr = &cma->ranges[allocrange];
+		cmr->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma, cmr),
+					    GFP_KERNEL);
+		if (!cmr->bitmap)
+			goto cleanup;
+	}
 
-	cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL);
-	if (!cma->bitmap)
-		goto out_error;
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		base_pfn = cmr->base_pfn;
 
-	/*
-	 * alloc_contig_range() requires the pfn range specified to be in the
-	 * same zone. Simplify by forcing the entire CMA resv range to be in the
-	 * same zone.
-	 */
-	WARN_ON_ONCE(!pfn_valid(base_pfn));
-	zone = page_zone(pfn_to_page(base_pfn));
-	for (pfn = base_pfn + 1; pfn < base_pfn + cma->count; pfn++) {
-		WARN_ON_ONCE(!pfn_valid(pfn));
-		if (page_zone(pfn_to_page(pfn)) != zone)
-			goto not_in_zone;
-	}
+		/*
+		 * alloc_contig_range() requires the pfn range specified
+		 * to be in the same zone. Simplify by forcing the entire
+		 * CMA resv range to be in the same zone.
+		 */
+		WARN_ON_ONCE(!pfn_valid(base_pfn));
+		zone = page_zone(pfn_to_page(base_pfn));
+		for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) {
+			WARN_ON_ONCE(!pfn_valid(pfn));
+			if (page_zone(pfn_to_page(pfn)) != zone)
+				goto cleanup;
+		}
 
-	for (pfn = base_pfn; pfn < base_pfn + cma->count;
-	     pfn += pageblock_nr_pages)
-		init_cma_reserved_pageblock(pfn_to_page(pfn));
+		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+		     pfn += pageblock_nr_pages)
+			init_cma_reserved_pageblock(pfn_to_page(pfn));
+	}
 
 	spin_lock_init(&cma->lock);
 
@@ -125,13 +145,19 @@ static void __init cma_activate_area(struct cma *cma)
 
 	return;
 
-not_in_zone:
-	bitmap_free(cma->bitmap);
-out_error:
+cleanup:
+	for (r = 0; r < allocrange; r++)
+		bitmap_free(cma->ranges[r].bitmap);
+
 	/* Expose all pages to the buddy, they are useless for CMA.
	 */
 	if (!cma->reserve_pages_on_error) {
-		for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++)
-			free_reserved_page(pfn_to_page(pfn));
+		for (r = 0; r < allocrange; r++) {
+			cmr = &cma->ranges[r];
+			for (pfn = cmr->base_pfn;
+			     pfn < cmr->base_pfn + cmr->count;
+			     pfn++)
+				free_reserved_page(pfn_to_page(pfn));
+		}
 	}
 	totalcma_pages -= cma->count;
 	cma->available_count = cma->count = 0;
@@ -154,6 +180,43 @@ void __init cma_reserve_pages_on_error(struct cma *cma)
 	cma->reserve_pages_on_error = true;
 }
 
+static int __init cma_new_area(const char *name, phys_addr_t size,
+			       unsigned int order_per_bit,
+			       struct cma **res_cma)
+{
+	struct cma *cma;
+
+	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
+		pr_err("Not enough slots for CMA reserved regions!\n");
+		return -ENOSPC;
+	}
+
+	/*
+	 * Each reserved area must be initialised later, when more kernel
+	 * subsystems (like slab allocator) are available.
+	 */
+	cma = &cma_areas[cma_area_count];
+	cma_area_count++;
+
+	if (name)
+		snprintf(cma->name, CMA_MAX_NAME, name);
+	else
+		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+
+	cma->available_count = cma->count = size >> PAGE_SHIFT;
+	cma->order_per_bit = order_per_bit;
+	*res_cma = cma;
+	totalcma_pages += cma->count;
+
+	return 0;
+}
+
+static void __init cma_drop_area(struct cma *cma)
+{
+	totalcma_pages -= cma->count;
+	cma_area_count--;
+}
+
 /**
  * cma_init_reserved_mem() - create custom contiguous area from reserved memory
  * @base: Base address of the reserved area
@@ -172,13 +235,9 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 				 struct cma **res_cma)
 {
 	struct cma *cma;
+	int ret;
 
 	/* Sanity checks */
-	if (cma_area_count == ARRAY_SIZE(cma_areas)) {
-		pr_err("Not enough slots for CMA reserved regions!\n");
-		return -ENOSPC;
-	}
-
 	if (!size || !memblock_is_region_reserved(base, size))
 		return -EINVAL;
 
@@ -195,25 +254,261 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	if (!IS_ALIGNED(base | size,
			CMA_MIN_ALIGNMENT_BYTES))
 		return -EINVAL;
 
+	ret = cma_new_area(name, size, order_per_bit, &cma);
+	if (ret != 0)
+		return ret;
+
+	cma->ranges[0].base_pfn = PFN_DOWN(base);
+	cma->ranges[0].count = cma->count;
+	cma->nranges = 1;
+
+	*res_cma = cma;
+
+	return 0;
+}
+
+/*
+ * Structure used while walking physical memory ranges and finding out
+ * which one(s) to use for a CMA area.
+ */
+struct cma_init_memrange {
+	phys_addr_t base;
+	phys_addr_t size;
+	struct list_head list;
+};
+
+/*
+ * Work array used during CMA initialization.
+ */
+static struct cma_init_memrange memranges[CMA_MAX_RANGES] __initdata;
+
+static bool __init revsizecmp(struct cma_init_memrange *mlp,
+			      struct cma_init_memrange *mrp)
+{
+	return mlp->size > mrp->size;
+}
+
+static bool __init basecmp(struct cma_init_memrange *mlp,
+			   struct cma_init_memrange *mrp)
+{
+	return mlp->base < mrp->base;
+}
+
+/*
+ * Helper function to create sorted lists.
+ */
+static void __init list_insert_sorted(
+	struct list_head *ranges,
+	struct cma_init_memrange *mrp,
+	bool (*cmp)(struct cma_init_memrange *lh, struct cma_init_memrange *rh))
+{
+	struct list_head *mp;
+	struct cma_init_memrange *mlp;
+
+	if (list_empty(ranges))
+		list_add(&mrp->list, ranges);
+	else {
+		list_for_each(mp, ranges) {
+			mlp = list_entry(mp, struct cma_init_memrange, list);
+			if (cmp(mlp, mrp))
+				break;
+		}
+		__list_add(&mrp->list, mlp->list.prev, &mlp->list);
+	}
+}
+
+/*
+ * Create CMA areas with a total size of @total_size. A normal allocation
+ * for one area is tried first. If that fails, the biggest memblock
+ * ranges above 4G are selected, and allocated bottom up.
+ *
+ * The complexity here is not great, but this function will only be
+ * called during boot, and the lists operated on have fewer than
+ * CMA_MAX_RANGES elements (default value: 8).
+ */
+int __init cma_declare_contiguous_multi(phys_addr_t total_size,
+			phys_addr_t align, unsigned int order_per_bit,
+			const char *name, struct cma **res_cma, int nid)
+{
+	phys_addr_t start, end;
+	phys_addr_t size, sizesum, sizeleft;
+	struct cma_init_memrange *mrp, *mlp, *failed;
+	struct cma_memrange *cmrp;
+	LIST_HEAD(ranges);
+	LIST_HEAD(final_ranges);
+	struct list_head *mp, *next;
+	int ret, nr = 1;
+	u64 i;
+	struct cma *cma;
+
 	/*
-	 * Each reserved area must be initialised later, when more kernel
-	 * subsystems (like slab allocator) are available.
+	 * First, try it the normal way, producing just one range.
 	 */
-	cma = &cma_areas[cma_area_count];
+	ret = __cma_declare_contiguous_nid(0, total_size, 0, align,
+			order_per_bit, false, name, res_cma, nid);
+	if (ret != -ENOMEM)
+		goto out;
 
-	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, name);
-	else
-		snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+	/*
+	 * Couldn't find one range that fits our needs, so try multiple
+	 * ranges.
+	 *
+	 * No need to do the alignment checks here, the call to
+	 * __cma_declare_contiguous_nid above would have caught
+	 * any issues. With the checks, we know that:
+	 *
+	 * - @align is a power of 2
+	 * - @align is >= pageblock alignment
+	 * - @size is aligned to @align and to @order_per_bit
+	 *
+	 * So, as long as we create ranges that have a base
+	 * aligned to @align, and a size that is aligned to
+	 * both @align and @order_per_bit, things will work out.
+	 */
+	nr = 0;
+	sizesum = 0;
+	failed = NULL;
 
-	cma->base_pfn = PFN_DOWN(base);
-	cma->available_count = cma->count = size >> PAGE_SHIFT;
-	cma->order_per_bit = order_per_bit;
+	ret = cma_new_area(name, total_size, order_per_bit, &cma);
+	if (ret != 0)
+		goto out;
+
+	align = max_t(phys_addr_t, align, CMA_MIN_ALIGNMENT_BYTES);
+	/*
+	 * Create a list of ranges above 4G, largest range first.
+	 */
+	for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &start, &end, NULL) {
+		if (start < SZ_4G)
+			continue;
+
+		start = ALIGN(start, align);
+		if (start >= end)
+			continue;
+
+		end = ALIGN_DOWN(end, align);
+		if (end <= start)
+			continue;
+
+		size = end - start;
+		size = ALIGN_DOWN(size, (PAGE_SIZE << order_per_bit));
+		if (!size)
+			continue;
+		sizesum += size;
+
+		pr_debug("consider %016llx - %016llx\n", (u64)start, (u64)end);
+
+		/*
+		 * If we haven't yet used the maximum number of
+		 * ranges, grab a new one.
+		 *
+		 * If we can't use any more, see if this range is at
+		 * least as large as the smallest one recorded so far.
+		 * If it is, replace the smallest element with it.
+		 */
+		if (nr < CMA_MAX_RANGES)
+			mrp = &memranges[nr++];
+		else {
+			mrp = list_last_entry(&ranges,
+					      struct cma_init_memrange, list);
+			if (size < mrp->size)
+				continue;
+			list_del(&mrp->list);
+			sizesum -= mrp->size;
+			pr_debug("deleted %016llx - %016llx from the list\n",
+				 (u64)mrp->base, (u64)mrp->base + mrp->size);
+		}
+		mrp->base = start;
+		mrp->size = size;
+
+		/*
+		 * Now do a sorted insert.
+		 */
+		list_insert_sorted(&ranges, mrp, revsizecmp);
+		pr_debug("added %016llx - %016llx to the list\n",
+			 (u64)mrp->base, (u64)mrp->base + size);
+		pr_debug("total size now %llu\n", (u64)sizesum);
+	}
+
+	/*
+	 * There is not enough room in the CMA_MAX_RANGES largest
+	 * ranges, so bail out.
+	 */
+	if (sizesum < total_size) {
+		cma_drop_area(cma);
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/*
+	 * Found ranges that provide enough combined space.
+	 * Now, sort them by address, smallest first, because we
+	 * want to mimic a bottom-up memblock allocation.
+	 */
+	sizesum = 0;
+	list_for_each_safe(mp, next, &ranges) {
+		mlp = list_entry(mp, struct cma_init_memrange, list);
+		list_del(mp);
+		list_insert_sorted(&final_ranges, mlp, basecmp);
+		sizesum += mlp->size;
+		if (sizesum >= total_size)
+			break;
+	}
+
+	/*
+	 * Walk the final list, and add a CMA range for
+	 * each range, possibly not using the last one fully.
+ */ + nr = 0; + sizeleft = total_size; + list_for_each(mp, &final_ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + size = min(sizeleft, mlp->size); + if (memblock_reserve(mlp->base, size)) { + /* + * Unexpected error. Could go on to + * the next one, but just abort to + * be safe. + */ + failed = mlp; + break; + } + + pr_debug("created region %d: %016llx - %016llx\n", + nr, (u64)mlp->base, (u64)mlp->base + size); + cmrp = &cma->ranges[nr++]; + cmrp->base_pfn = PHYS_PFN(mlp->base); + cmrp->count = size >> PAGE_SHIFT; + + sizeleft -= size; + if (sizeleft == 0) + break; + } + + if (failed) { + list_for_each(mp, &final_ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + if (mlp == failed) + break; + memblock_phys_free(mlp->base, mlp->size); + } + cma_drop_area(cma); + ret = -ENOMEM; + goto out; + } + + cma->nranges = nr; *res_cma = cma; - cma_area_count++; - totalcma_pages += cma->count; - return 0; +out: + if (ret != 0) + pr_err("Failed to reserve %lu MiB\n", + (unsigned long)total_size / SZ_1M); + else + pr_info("Reserved %lu MiB in %d range%s\n", + (unsigned long)total_size / SZ_1M, nr, + nr > 1 ? 
"s" : ""); + + return ret; } /** @@ -241,6 +536,26 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, phys_addr_t alignment, unsigned int order_per_bit, bool fixed, const char *name, struct cma **res_cma, int nid) +{ + int ret; + + ret = __cma_declare_contiguous_nid(base, size, limit, alignment, + order_per_bit, fixed, name, res_cma, nid); + if (ret != 0) + pr_err("Failed to reserve %ld MiB\n", + (unsigned long)size / SZ_1M); + else + pr_info("Reserved %ld MiB at %pa\n", + (unsigned long)size / SZ_1M, &base); + + return ret; +} + +static int __init __cma_declare_contiguous_nid(phys_addr_t base, + phys_addr_t size, phys_addr_t limit, + phys_addr_t alignment, unsigned int order_per_bit, + bool fixed, const char *name, struct cma **res_cma, + int nid) { phys_addr_t memblock_end = memblock_end_of_DRAM(); phys_addr_t highmem_start; @@ -273,10 +588,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, /* Sanitise input arguments. */ alignment = max_t(phys_addr_t, alignment, CMA_MIN_ALIGNMENT_BYTES); if (fixed && base & (alignment - 1)) { - ret = -EINVAL; pr_err("Region at %pa must be aligned to %pa bytes\n", &base, &alignment); - goto err; + return -EINVAL; } base = ALIGN(base, alignment); size = ALIGN(size, alignment); @@ -294,10 +608,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, * low/high memory boundary. 
*/ if (fixed && base < highmem_start && base + size > highmem_start) { - ret = -EINVAL; pr_err("Region at %pa defined on low/high memory boundary (%pa)\n", &base, &highmem_start); - goto err; + return -EINVAL; } /* @@ -309,18 +622,16 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, limit = memblock_end; if (base + size > limit) { - ret = -EINVAL; pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n", &size, &base, &limit); - goto err; + return -EINVAL; } /* Reserve memory */ if (fixed) { if (memblock_is_region_reserved(base, size) || memblock_reserve(base, size) < 0) { - ret = -EBUSY; - goto err; + return -EBUSY; } } else { phys_addr_t addr = 0; @@ -357,10 +668,8 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, if (!addr) { addr = memblock_alloc_range_nid(size, alignment, base, limit, nid, true); - if (!addr) { - ret = -ENOMEM; - goto err; - } + if (!addr) + return -ENOMEM; } /* @@ -373,75 +682,67 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, ret = cma_init_reserved_mem(base, size, order_per_bit, name, res_cma); if (ret) - goto free_mem; - - pr_info("Reserved %ld MiB at %pa on node %d\n", (unsigned long)size / SZ_1M, - &base, nid); - return 0; + memblock_phys_free(base, size); -free_mem: - memblock_phys_free(base, size); -err: - pr_err("Failed to reserve %ld MiB on node %d\n", (unsigned long)size / SZ_1M, - nid); return ret; } static void cma_debug_show_areas(struct cma *cma) { unsigned long next_zero_bit, next_set_bit, nr_zero; - unsigned long start = 0; + unsigned long start; unsigned long nr_part; - unsigned long nbits = cma_bitmap_maxno(cma); + unsigned long nbits; + int r; + struct cma_memrange *cmr; spin_lock_irq(&cma->lock); pr_info("number of available pages: "); - for (;;) { - next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start); - if (next_zero_bit >= nbits) - break; - next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit); - nr_zero = next_set_bit - next_zero_bit; - nr_part = nr_zero << 
cma->order_per_bit; - pr_cont("%s%lu@%lu", start ? "+" : "", nr_part, - next_zero_bit); - start = next_zero_bit + nr_zero; + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + + start = 0; + nbits = cma_bitmap_maxno(cma, cmr); + + pr_info("range %d: ", r); + for (;;) { + next_zero_bit = find_next_zero_bit(cmr->bitmap, + nbits, start); + if (next_zero_bit >= nbits) + break; + next_set_bit = find_next_bit(cmr->bitmap, nbits, + next_zero_bit); + nr_zero = next_set_bit - next_zero_bit; + nr_part = nr_zero << cma->order_per_bit; + pr_cont("%s%lu@%lu", start ? "+" : "", nr_part, + next_zero_bit); + start = next_zero_bit + nr_zero; + } + pr_info("\n"); } pr_cont("=> %lu free of %lu total pages\n", cma->available_count, cma->count); spin_unlock_irq(&cma->lock); } -static struct page *__cma_alloc(struct cma *cma, unsigned long count, - unsigned int align, gfp_t gfp) +static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, + unsigned long count, unsigned int align, + struct page **pagep, gfp_t gfp) { unsigned long mask, offset; unsigned long pfn = -1; unsigned long start = 0; unsigned long bitmap_maxno, bitmap_no, bitmap_count; - unsigned long i; + int ret = -EBUSY; struct page *page = NULL; - int ret = -ENOMEM; - const char *name = cma ? 
cma->name : NULL; - - trace_cma_alloc_start(name, count, align); - - if (!cma || !cma->count || !cma->bitmap) - return page; - - pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__, - (void *)cma, cma->name, count, align); - - if (!count) - return page; mask = cma_bitmap_aligned_mask(cma, align); - offset = cma_bitmap_aligned_offset(cma, align); - bitmap_maxno = cma_bitmap_maxno(cma); + offset = cma_bitmap_aligned_offset(cma, cmr, align); + bitmap_maxno = cma_bitmap_maxno(cma, cmr); bitmap_count = cma_bitmap_pages_to_bits(cma, count); if (bitmap_count > bitmap_maxno) - return page; + goto out; for (;;) { spin_lock_irq(&cma->lock); @@ -453,14 +754,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, spin_unlock_irq(&cma->lock); break; } - bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, + bitmap_no = bitmap_find_next_zero_area_off(cmr->bitmap, bitmap_maxno, start, bitmap_count, mask, offset); if (bitmap_no >= bitmap_maxno) { spin_unlock_irq(&cma->lock); break; } - bitmap_set(cma->bitmap, bitmap_no, bitmap_count); + bitmap_set(cmr->bitmap, bitmap_no, bitmap_count); cma->available_count -= count; /* * It's safe to drop the lock here. 
We've marked this region for @@ -469,7 +770,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, */ spin_unlock_irq(&cma->lock); - pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit); + pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit); mutex_lock(&cma_mutex); ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp); mutex_unlock(&cma_mutex); @@ -478,7 +779,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, break; } - cma_clear_bitmap(cma, pfn, count); + cma_clear_bitmap(cma, cmr, pfn, count); if (ret != -EBUSY) break; @@ -490,6 +791,48 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, /* try again with a bit different memory target */ start = bitmap_no + mask + 1; } +out: + *pagep = page; + return ret; +} + +/** + * cma_alloc() - allocate pages from contiguous area + * @cma: Contiguous memory region for which the allocation is performed. + * @count: Requested number of pages. + * @align: Requested alignment of pages (in PAGE_SIZE order). + * @no_warn: Avoid printing message about failed allocation + * + * This function allocates part of contiguous memory on specific + * contiguous memory area. + */ +static struct page *__cma_alloc(struct cma *cma, unsigned long count, + unsigned int align, gfp_t gfp) +{ + struct page *page = NULL; + int ret = -ENOMEM, r; + unsigned long i; + const char *name = cma ? 
cma->name : NULL; + + trace_cma_alloc_start(name, count, align); + + if (!cma || !cma->count) + return page; + + pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__, + (void *)cma, cma->name, count, align); + + if (!count) + return page; + + for (r = 0; r < cma->nranges; r++) { + page = NULL; + + ret = cma_range_alloc(cma, &cma->ranges[r], count, align, + &page, gfp); + if (ret != -EBUSY || page) + break; + } /* * CMA can allocate multiple page blocks, which results in different @@ -508,7 +851,8 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, } pr_debug("%s(): returned %p\n", __func__, page); - trace_cma_alloc_finish(name, pfn, page, count, align, ret); + trace_cma_alloc_finish(name, page ? page_to_pfn(page) : 0, + page, count, align, ret); if (page) { count_vm_event(CMA_ALLOC_SUCCESS); cma_sysfs_account_success_pages(cma, count); @@ -551,20 +895,31 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp) bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count) { - unsigned long pfn; + unsigned long pfn, end; + int r; + struct cma_memrange *cmr; + bool ret; - if (!cma || !pages) + if (!cma || !pages || count > cma->count) return false; pfn = page_to_pfn(pages); + ret = false; - if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) { - pr_debug("%s(page %p, count %lu)\n", __func__, - (void *)pages, count); - return false; + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + end = cmr->base_pfn + cmr->count; + if (pfn >= cmr->base_pfn && pfn < end) { + ret = pfn + count <= end; + break; + } } - return true; + if (!ret) + pr_debug("%s(page %p, count %lu)\n", + __func__, (void *)pages, count); + + return ret; } /** @@ -580,19 +935,32 @@ bool cma_pages_valid(struct cma *cma, const struct page *pages, bool cma_release(struct cma *cma, const struct page *pages, unsigned long count) { - unsigned long pfn; + struct cma_memrange *cmr; + unsigned long pfn, end_pfn; + int r; + 
+ pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); if (!cma_pages_valid(cma, pages, count)) return false; - pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); - pfn = page_to_pfn(pages); + end_pfn = pfn + count; + + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + if (pfn >= cmr->base_pfn && + pfn < (cmr->base_pfn + cmr->count)) { + VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count); + break; + } + } - VM_BUG_ON(pfn + count > cma->base_pfn + cma->count); + if (r == cma->nranges) + return false; free_contig_range(pfn, count); - cma_clear_bitmap(cma, pfn, count); + cma_clear_bitmap(cma, cmr, pfn, count); cma_sysfs_account_release_pages(cma, count); trace_cma_release(cma->name, pfn, pages, count); diff --git a/mm/cma.h b/mm/cma.h index 3dd3376ae980..601af7cdb495 100644 --- a/mm/cma.h +++ b/mm/cma.h @@ -10,11 +10,23 @@ struct cma_kobject { struct cma *cma; }; +/* + * Multi-range support. This can be useful if the size of the allocation + * is not expected to be larger than the alignment (like with hugetlb_cma), + * and the total amount of memory requested, while smaller than the total + * amount of memory available, is large enough that it doesn't fit in a + * single physical memory range because of memory holes. 
+ */
+struct cma_memrange {
+	unsigned long base_pfn;
+	unsigned long count;
+	unsigned long *bitmap;
+};
+#define CMA_MAX_RANGES 8
+
 struct cma {
-	unsigned long base_pfn;
 	unsigned long count;
 	unsigned long available_count;
-	unsigned long *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t lock;
 #ifdef CONFIG_CMA_DEBUGFS
@@ -23,6 +35,8 @@ struct cma {
 	struct debugfs_u32_array dfs_bitmap;
 #endif
 	char name[CMA_MAX_NAME];
+	int nranges;
+	struct cma_memrange ranges[CMA_MAX_RANGES];
 #ifdef CONFIG_CMA_SYSFS
 	/* the number of CMA page successful allocations */
 	atomic64_t nr_pages_succeeded;
@@ -39,9 +53,10 @@ struct cma {
 extern struct cma cma_areas[MAX_CMA_AREAS];
 extern unsigned int cma_area_count;
 
-static inline unsigned long cma_bitmap_maxno(struct cma *cma)
+static inline unsigned long cma_bitmap_maxno(struct cma *cma,
+					     struct cma_memrange *cmr)
 {
-	return cma->count >> cma->order_per_bit;
+	return cmr->count >> cma->order_per_bit;
 }
 
 #ifdef CONFIG_CMA_SYSFS

From patchwork Mon Jan 27 23:21:43 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951844
Date: Mon, 27 Jan 2025 23:21:43 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-4-fvdl@google.com>
Subject: [PATCH 03/27] mm/cma: introduce cma_intersects function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
	roman.gushchin@linux.dev, Frank van der Linden, Heiko Carstens,
	Vasily Gorbik, Alexander Gordeev, linux-s390@vger.kernel.org

Now that CMA areas can have multiple physical ranges, code can't assume
a CMA struct represents a base_pfn plus a size, as returned from
cma_get_base.
Most cases are fine, though, since they all explicitly refer to CMA
areas that were created using the existing interfaces
(cma_declare_contiguous_nid or cma_init_reserved_mem), which guarantees
they have just one physical range.

An exception is the s390 code, which walks all CMA ranges to see if they
intersect with a range of memory that is about to be hotremoved. So, in
the future, it might run into multi-range areas. To keep this check
working, define a cma_intersects function. This just checks if a
physaddr range intersects any of the ranges. Use it in the s390 check.

Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Alexander Gordeev
Cc: linux-s390@vger.kernel.org
Signed-off-by: Frank van der Linden
---
 arch/s390/mm/init.c | 13 +++++--------
 include/linux/cma.h |  1 +
 mm/cma.c            | 21 +++++++++++++++++++++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index f2298f7a3f21..d88cb1c13f7d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -239,16 +239,13 @@ struct s390_cma_mem_data {
 static int s390_cma_check_range(struct cma *cma, void *data)
 {
 	struct s390_cma_mem_data *mem_data;
-	unsigned long start, end;
 
 	mem_data = data;
-	start = cma_get_base(cma);
-	end = start + cma_get_size(cma);
-	if (end < mem_data->start)
-		return 0;
-	if (start >= mem_data->end)
-		return 0;
-	return -EBUSY;
+
+	if (cma_intersects(cma, mem_data->start, mem_data->end))
+		return -EBUSY;
+
+	return 0;
 }
 
 static int s390_cma_mem_notifier(struct notifier_block *nb,
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 863427c27dc2..03d85c100dcc 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -53,6 +53,7 @@ extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
+extern bool cma_intersects(struct cma *cma, unsigned long start,
unsigned long end); extern void cma_reserve_pages_on_error(struct cma *cma); diff --git a/mm/cma.c b/mm/cma.c index c20255161642..1704d5be6a07 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -988,3 +988,24 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data) return 0; } + +bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end) +{ + int r; + struct cma_memrange *cmr; + unsigned long rstart, rend; + + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + + rstart = PFN_PHYS(cmr->base_pfn); + rend = PFN_PHYS(cmr->base_pfn + cmr->count); + if (end < rstart) + continue; + if (start >= rend) + continue; + return true; + } + + return false; +} From patchwork Mon Jan 27 23:21:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13951845 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6605EC02188 for ; Mon, 27 Jan 2025 23:22:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 21ACD2801C3; Mon, 27 Jan 2025 18:22:34 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 1A4332801C2; Mon, 27 Jan 2025 18:22:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F10A62801C3; Mon, 27 Jan 2025 18:22:33 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id CAFD32801C2 for ; Mon, 27 Jan 2025 18:22:33 -0500 (EST) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 76A00471DD for ; Mon, 27 Jan 2025 23:22:33 +0000 (UTC) X-FDA: 83054808186.16.DA802A4 Received: from 
mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) by imf16.hostedemail.com (Postfix) with ESMTP id A860A18000C for ; Mon, 27 Jan 2025 23:22:31 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="23/BOa1U"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf16.hostedemail.com: domain of 3NhWYZwQKCAUk0iqlttlqj.htrqnsz2-rrp0fhp.twl@flex--fvdl.bounces.google.com designates 209.85.214.201 as permitted sender) smtp.mailfrom=3NhWYZwQKCAUk0iqlttlqj.htrqnsz2-rrp0fhp.twl@flex--fvdl.bounces.google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738020151; a=rsa-sha256; cv=none; b=fJiVVMvG7PoXaOi+TaWtgyCG24Z/ZSTdZStLOAZU+n/FKjSZziQvpmrsScoBpUvh1ZJfC0 /yNeNAsm4QCD/Ct9VTug1vcKo7U9mskHyzoPgF95E5ZOb1HpBAe77+KQuRVu3gaRBcNlHD c16a/Ug1IOUFkA1UDEHcmzcIDOTp5po= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b="23/BOa1U"; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf16.hostedemail.com: domain of 3NhWYZwQKCAUk0iqlttlqj.htrqnsz2-rrp0fhp.twl@flex--fvdl.bounces.google.com designates 209.85.214.201 as permitted sender) smtp.mailfrom=3NhWYZwQKCAUk0iqlttlqj.htrqnsz2-rrp0fhp.twl@flex--fvdl.bounces.google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738020151; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=nbfh28HcFIXQ9mLhqkAthugup7PcB5AKxUJWZSh0wXU=; b=XQQPYgS4xeV3FdQ4z22ThZMvX4NMM9HPQFvDVA0EFb98B8gcjy1ykUwWZ3HwsLwDqKxI/o 1hRiDU9/jkdLRjcy7f6SgUw2e7hx6lN+AbFEnewlydLjudYzojzNpV/YBHOZAij5v/bQ2F DD46LxuaWVEPsFIKxpnFV4Sn0lbEdbs= Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-216717543b7so131014145ad.0 for ; Mon, 27 Jan 2025 15:22:31 -0800 (PST) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738020150; x=1738624950; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=nbfh28HcFIXQ9mLhqkAthugup7PcB5AKxUJWZSh0wXU=; b=23/BOa1UzDPSMnmXJRIK5u4jO1TgvjgvZCw0Bmk92/7RnzjXaGMB+YHiqDPdqULUnN 5HD4zfYgH6Xy3dIKebXvWID0ZHj2HPsLNB/zl34ADp4ywzORZ0bupgw21643EYhCzgcU CjvC4iwaePdcpNsJnv4eIZwd5vW/vDQT+uUqq4jl+jGktg1+mMISYf6VUV1Ob3UtTcNv kI2eQRTI9pg1zKExakFoR1oW1HKvoK9YDfOihxDH3sPS9K//cE2Wfl7zQi3quGAiX5VC N6FdILuCJSeX5nS7gCQlfV2OQAdruqsgx+KKNbCnNFPfWSsLnnmHjZGo6zZ4RExKnF63 jXLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738020150; x=1738624950; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=nbfh28HcFIXQ9mLhqkAthugup7PcB5AKxUJWZSh0wXU=; b=vopr9dDuNOls6XOixjwLsmy+IZSTpIwT5UFsd5UNF5STnzW7BE+vrRiFUVuMiOaYyF YVI2Em4IKDpLVVbcJbZo+kENSs7YD9sH2ozUqOQU79oN6QoKd5F1XZ0trtf6mg0ph5Qj i89FZLbq7POZS7c7isojerBXtR20VoBxTVtqJag94pTBmihK24hBO3ahrZxPa4g6+12L sF5vX173ZTSbi/LLZvjGR5alAgL3fIx416sakSVsCRn8fz19dnlUdzbUXV5V9RKFbghs DDcriqm34uyF0kssyz6cf8x2pOzN3QUCeB08E/G32RpjLoEBuJ9QFbtZ+S9A7dGuhheg rqhw== X-Forwarded-Encrypted: i=1; AJvYcCV1PPGg1Txi/FiIof0YGG9nYOcHI4rfQleMaIVNW4B25+NU42HxSezi0AaDX+N/Yb0qzPyFdC123A==@kvack.org X-Gm-Message-State: AOJu0Yxih775slYqGPiMQND1B4XdivoNydVaaq0juyFXQP67C/e/iYtr UWtJQywdoWtFI5UHpfIMBP9p+oil9Jy7lenh1g1E+QHmaEwHPpxjNBQaH6CAYdRYWSNKhA== X-Google-Smtp-Source: AGHT+IH+wafy7y4Yrx3/vKft4ZwmgZZYhHwu65H+dX5PMxOWLMh1lXIrwPEg8dgPsWbnMC2GdrQhOOq7 X-Received: from pfbdc11.prod.google.com ([2002:a05:6a00:35cb:b0:728:e3af:6bb0]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:430d:b0:1e1:aef4:9ce7 with SMTP id adf61e73a8af0-1eb214a08b6mr61157310637.17.1738020150464; Mon, 27 Jan 2025 15:22:30 -0800 (PST) Date: Mon, 27 Jan 2025 23:21:44 +0000 
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-5-fvdl@google.com>
Subject: [PATCH 04/27] mm, hugetlb: use cma_declare_contiguous_multi
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
hugetlb_cma is fine with using multiple CMA ranges, as long as it can
get its gigantic pages allocated from them. So, use
cma_declare_contiguous_multi to allow for multiple ranges, increasing
the chances of getting what we want on systems with gaps in physical
memory.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 87761b042ed0..b187843e38fe 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7738,9 +7738,8 @@ void __init hugetlb_cma_reserve(int order)
		 * may be returned to CMA allocator in the case of
		 * huge page demotion.
		 */
-		res = cma_declare_contiguous_nid(0, size, 0,
-					PAGE_SIZE << order,
-					HUGETLB_PAGE_ORDER, false, name,
+		res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
+					HUGETLB_PAGE_ORDER, name,
					&hugetlb_cma[nid], nid);
		if (res) {
			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
From patchwork Mon Jan 27 23:21:45 2025
Date: Mon, 27 Jan 2025 23:21:45 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-6-fvdl@google.com>
Subject: [PATCH 05/27] mm/hugetlb: fix round-robin bootmem allocation
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden, Zhenguo Yao
Commit b5389086ad7b ("hugetlbfs: extend the definition of hugepages
parameter to support node allocation") changed the NUMA_NO_NODE
round-robin allocation behavior in case of a failure to allocate from
one NUMA node. The code originally moved on to the next node to try
again, but now it immediately breaks out of the loop.

Restore the original behavior.

Fixes: b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Cc: Zhenguo Yao
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b187843e38fe..1441a3916b32 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3156,16 +3156,13 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
		m = memblock_alloc_try_nid_raw(
				huge_page_size(h), huge_page_size(h),
				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
-		/*
-		 * Use the beginning of the huge page to store the
-		 * huge_bootmem_page struct (until gather_bootmem
-		 * puts them into the mem_map).
-		 */
-		if (!m)
-			return 0;
-		goto found;
+		if (m)
+			break;
	}
+	if (!m)
+		return 0;
+
 found:

	/*
@@ -3177,7 +3174,14 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
	 */
	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
			huge_page_size(h) - PAGE_SIZE);
-	/* Put them into a private list first because mem_map is not up yet */
+	/*
+	 * Use the beginning of the huge page to store the
+	 * huge_bootmem_page struct (until gather_bootmem
+	 * puts them into the mem_map).
+	 *
+	 * Put them into a private list first because mem_map
+	 * is not up yet.
+	 */
	INIT_LIST_HEAD(&m->list);
	list_add(&m->list, &huge_boot_pages[node]);
	m->hstate = h;
From patchwork Mon Jan 27 23:21:46 2025
Date: Mon, 27 Jan 2025 23:21:46 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-7-fvdl@google.com>
Subject: [PATCH 06/27] mm/hugetlb: remove redundant __ClearPageReserved
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
In hugetlb_folio_init_tail_vmemmap, the reserved flag is cleared for
the tail page just before it is zeroed out, which is redundant.
Remove the __ClearPageReserved call.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1441a3916b32..42d8334d13bb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3202,7 +3202,6 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
		struct page *page = pfn_to_page(pfn);

-		__ClearPageReserved(folio_page(folio, pfn - head_pfn));
		__init_single_page(page, pfn, zone, nid);
		prep_compound_tail((struct page *)folio, pfn - head_pfn);
		ret = page_ref_freeze(page, 1);
From patchwork Mon Jan 27 23:21:47 2025
Date: Mon, 27 Jan 2025 23:21:47 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-8-fvdl@google.com>
Subject: [PATCH 07/27] mm/hugetlb: use online nodes for bootmem allocation
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Later commits will move hugetlb bootmem allocation to earlier in init,
when N_MEMORY has not yet been set on nodes. Use online nodes instead.
At most, this wastes just a few cycles once during boot (and most
likely none).

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 42d8334d13bb..a67339ca65b4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3152,7 +3152,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
		goto found;
	}
	/* allocate from next node when distributing huge pages */
-	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_MEMORY]) {
+	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_ONLINE]) {
		m = memblock_alloc_try_nid_raw(
				huge_page_size(h), huge_page_size(h),
				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
@@ -4550,8 +4550,8 @@ void __init hugetlb_add_hstate(unsigned int order)
	for (i = 0; i < MAX_NUMNODES; ++i)
		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_memory_node;
-	h->next_nid_to_free = first_memory_node;
+	h->next_nid_to_alloc = first_online_node;
+	h->next_nid_to_free = first_online_node;
	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
					huge_page_size(h)/SZ_1K);
Date: Mon, 27 Jan 2025 23:21:48 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-9-fvdl@google.com>
Subject: [PATCH 08/27] mm/hugetlb: convert cmdline parameters from setup to early
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Convert the cmdline parameters (hugepagesz, hugepages, default_hugepagesz
and hugetlb_free_vmemmap) to early parameters.

Since parse_early_param might run before the MMU is set up on some
platforms (powerpc), validation of huge page sizes as specified in
command line parameters would fail. So instead, for the hstate-related
values, just record them and parse them on demand, from
hugetlb_bootmem_alloc.

The allocation of hugetlb bootmem pages is now done in
hugetlb_bootmem_alloc, which is called explicitly at the start of
mm_core_init(). core_initcall would be too late, as that happens with
memblock already torn down.

This change will allow earlier allocation and initialization of bootmem
hugetlb pages later on.

No functional change intended.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   6 ++
 mm/hugetlb.c            | 133 +++++++++++++++++++++++++++++++---------
 mm/hugetlb_vmemmap.c    |   6 +-
 mm/mm_init.c            |   3 +
 4 files changed, 119 insertions(+), 29 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..9cd7c9dacb88 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -174,6 +174,8 @@ struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
 extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
+void hugetlb_bootmem_alloc(void);
+
 /* arch callbacks */
 
 #ifndef CONFIG_HIGHPTE
@@ -1250,6 +1252,10 @@ static inline bool hugetlbfs_pagecache_present(
 {
 	return false;
 }
+
+static inline void hugetlb_bootmem_alloc(void)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a67339ca65b4..a95ab44d5545 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -62,6 +63,24 @@ static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 
+/*
+ * Due to ordering constraints across the init code for various
+ * architectures, hugetlb hstate cmdline parameters can't simply
+ * be early_param. early_param might call the setup function
+ * before valid hugetlb page sizes are determined, leading to
+ * incorrect rejection of valid hugepagesz= options.
+ *
+ * So, record the parameters early and consume them whenever the
+ * init code is ready for them, by calling hugetlb_parse_params().
+ */
+
+/* one (hugepagesz=,hugepages=) pair per hstate, one default_hugepagesz */
+#define HUGE_MAX_CMDLINE_ARGS	(2 * HUGE_MAX_HSTATE + 1)
+struct hugetlb_cmdline {
+	char *val;
+	int (*setup)(char *val);
+};
+
 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
 static unsigned long __initdata default_hstate_max_huge_pages;
@@ -69,6 +88,20 @@ static bool __initdata parsed_valid_hugepagesz = true;
 static bool __initdata parsed_default_hugepagesz;
 static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
 
+static char hstate_cmdline_buf[COMMAND_LINE_SIZE] __initdata;
+static int hstate_cmdline_index __initdata;
+static struct hugetlb_cmdline hugetlb_params[HUGE_MAX_CMDLINE_ARGS] __initdata;
+static int hugetlb_param_index __initdata;
+static __init int hugetlb_add_param(char *s, int (*setup)(char *val));
+static __init void hugetlb_parse_params(void);
+
+#define hugetlb_early_param(str, func) \
+static __init int func##args(char *s) \
+{ \
+	return hugetlb_add_param(s, func); \
+} \
+early_param(str, func##args)
+
 /*
  * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
  * free_huge_pages, and surplus_huge_pages.
@@ -3488,6 +3521,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 		for (i = 0; i < MAX_NUMNODES; i++)
 			INIT_LIST_HEAD(&huge_boot_pages[i]);
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
 		initialized = true;
 	}
 
@@ -4550,8 +4585,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_online_node;
-	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);
@@ -4576,6 +4609,42 @@ static void __init hugepages_clear_pages_in_node(void)
 	}
 }
 
+static __init int hugetlb_add_param(char *s, int (*setup)(char *))
+{
+	size_t len;
+	char *p;
+
+	if (hugetlb_param_index >= HUGE_MAX_CMDLINE_ARGS)
+		return -EINVAL;
+
+	len = strlen(s) + 1;
+	if (len + hstate_cmdline_index > sizeof(hstate_cmdline_buf))
+		return -EINVAL;
+
+	p = &hstate_cmdline_buf[hstate_cmdline_index];
+	memcpy(p, s, len);
+	hstate_cmdline_index += len;
+
+	hugetlb_params[hugetlb_param_index].val = p;
+	hugetlb_params[hugetlb_param_index].setup = setup;
+
+	hugetlb_param_index++;
+
+	return 0;
+}
+
+static __init void hugetlb_parse_params(void)
+{
+	int i;
+	struct hugetlb_cmdline *hcp;
+
+	for (i = 0; i < hugetlb_param_index; i++) {
+		hcp = &hugetlb_params[i];
+
+		hcp->setup(hcp->val);
+	}
+}
+
 /*
  * hugepages command line processing
  * hugepages normally follows a valid hugepagsz or default_hugepagsz
@@ -4595,7 +4664,7 @@ static int __init hugepages_setup(char *s)
 	if (!parsed_valid_hugepagesz) {
 		pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return -EINVAL;
 	}
 
 	/*
@@ -4649,24 +4718,16 @@ static int __init hugepages_setup(char *s)
 		}
 	}
 
-	/*
-	 * Global state is always initialized later in hugetlb_init.
-	 * But we need to allocate gigantic hstates here early to still
-	 * use the bootmem allocator.
-	 */
-	if (hugetlb_max_hstate && hstate_is_gigantic(parsed_hstate))
-		hugetlb_hstate_alloc_pages(parsed_hstate);
-
 	last_mhp = mhp;
 
-	return 1;
+	return 0;
 
 invalid:
 	pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
 	hugepages_clear_pages_in_node();
-	return 1;
+	return -EINVAL;
 }
-__setup("hugepages=", hugepages_setup);
+hugetlb_early_param("hugepages", hugepages_setup);
 
 /*
  * hugepagesz command line processing
@@ -4685,7 +4746,7 @@ static int __init hugepagesz_setup(char *s)
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	h = size_to_hstate(size);
@@ -4700,7 +4761,7 @@ static int __init hugepagesz_setup(char *s)
 		if (!parsed_default_hugepagesz ||  h != &default_hstate ||
 		    default_hstate.max_huge_pages) {
 			pr_warn("HugeTLB: hugepagesz=%s specified twice, ignoring\n", s);
-			return 1;
+			return -EINVAL;
 		}
 
 		/*
@@ -4710,14 +4771,14 @@ static int __init hugepagesz_setup(char *s)
 		 */
 		parsed_hstate = h;
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return 0;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	parsed_valid_hugepagesz = true;
-	return 1;
+	return 0;
 }
-__setup("hugepagesz=", hugepagesz_setup);
+hugetlb_early_param("hugepagesz", hugepagesz_setup);
 
 /*
  * default_hugepagesz command line input
@@ -4731,14 +4792,14 @@ static int __init default_hugepagesz_setup(char *s)
 	parsed_valid_hugepagesz = false;
 	if (parsed_default_hugepagesz) {
 		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	size = (unsigned long)memparse(s, NULL);
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported default_hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
@@ -4755,17 +4816,33 @@ static int __init default_hugepagesz_setup(char *s)
 	 */
 	if (default_hstate_max_huge_pages) {
 		default_hstate.max_huge_pages = default_hstate_max_huge_pages;
-		for_each_online_node(i)
-			default_hstate.max_huge_pages_node[i] =
-				default_hugepages_in_node[i];
-		if (hstate_is_gigantic(&default_hstate))
-			hugetlb_hstate_alloc_pages(&default_hstate);
+		/*
+		 * Since this is an early parameter, we can't check
+		 * NUMA node state yet, so loop through MAX_NUMNODES.
+		 */
+		for (i = 0; i < MAX_NUMNODES; i++) {
+			if (default_hugepages_in_node[i] != 0)
+				default_hstate.max_huge_pages_node[i] =
+					default_hugepages_in_node[i];
+		}
 		default_hstate_max_huge_pages = 0;
 	}
 
-	return 1;
+	return 0;
+}
+hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
+
+void __init hugetlb_bootmem_alloc(void)
+{
+	struct hstate *h;
+
+	hugetlb_parse_params();
+
+	for_each_hstate(h) {
+		if (hstate_is_gigantic(h))
+			hugetlb_hstate_alloc_pages(h);
+	}
 }
-__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int allowed_mems_nr(struct hstate *h)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 57b7f591eee8..326cdf94192e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -444,7 +444,11 @@ DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
+static int __init hugetlb_vmemmap_optimize_param(char *buf)
+{
+	return kstrtobool(buf, &vmemmap_optimize_enabled);
+}
+early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_optimize_param);
 
 static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
 					   struct folio *folio, unsigned long flags)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2630cc30147e..d2dee53e95dd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2641,6 +2642,8 @@ static void __init mem_init_print_info(void)
  */
 void __init mm_core_init(void)
 {
+	hugetlb_bootmem_alloc();
+
 	/* Initializations relying on SMP setup */
 	BUILD_BUG_ON(MAX_ZONELISTS > 2);
 	build_all_zonelists(NULL);

From patchwork Mon Jan 27 23:21:49 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951849
Date: Mon, 27 Jan 2025 23:21:49 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-10-fvdl@google.com>
Subject: [PATCH 09/27] x86/mm: make register_page_bootmem_memmap handle PTE mappings
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden, Dave Hansen, Andy Lutomirski, Peter Zijlstra
register_page_bootmem_memmap expects that vmemmap pages handed to it
are PMD-mapped, and that the number of pages to call get_page_bootmem
on is PMD-aligned.

This is currently a correct assumption, but will no longer be true
once pre-HVO of hugetlb pages is implemented.

Make it handle PTE-mapped vmemmap pages and a nr_pages argument that
is not necessarily PAGES_PER_SECTION.

Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Signed-off-by: Frank van der Linden
---
 arch/x86/mm/init_64.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 01ea7c6df303..e7572af639a4 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1599,11 +1599,12 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 		}
 		get_page_bootmem(section_nr, pud_page(*pud), MIX_SECTION_INFO);
 
-		if (!boot_cpu_has(X86_FEATURE_PSE)) {
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd))
+			continue;
+
+		if (!boot_cpu_has(X86_FEATURE_PSE) || !pmd_leaf(*pmd)) {
 			next = (addr + PAGE_SIZE) & PAGE_MASK;
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;
 
 			get_page_bootmem(section_nr, pmd_page(*pmd),
 					 MIX_SECTION_INFO);
@@ -1614,12 +1615,7 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 					 SECTION_INFO);
 		} else {
 			next = pmd_addr_end(addr, end);
-
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;
-
-			nr_pmd_pages = 1 << get_order(PMD_SIZE);
+			nr_pmd_pages = (next - addr) >> PAGE_SHIFT;
 			page = pmd_page(*pmd);
 			while (nr_pmd_pages--)
 				get_page_bootmem(section_nr, page++,

From patchwork Mon Jan 27 23:21:50 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951851
Date: Mon, 27 Jan 2025 23:21:50 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-11-fvdl@google.com>
Subject: [PATCH 10/27] mm/bootmem_info: export register_page_bootmem_memmap
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
U2FsdGVkX1+utsvAZMM+9R0PtrlJgeZTBfwYq4KljLwKKSH1e9ZxbKEomTkALkb4L3UPhDv3kHscDFPErdlLvjDlfXtjemjmwwPvtjuk4+GliNvQYUxFnVdRhjfTbk0WrrHCmk4qjuK4Ekb93f4KpSUOf/nvve2RIvMEQxZsBVbaAducsm2fJ/C1O0yLvX83edRxvfxJCPY6DyUX8ASuFBtMjabEfuMvgVAS8Vw5NstVgSdN0ItN+75ImFPQnQTTuQO5JJO39Htdm/EQx35yUmLTS9esQPRdQx54JbYhy/Gsz7fKFIoicFyqukn1cAZ61XZUWYbvxZEFKkteL/yJVcFp6dw4B/sES3hV4s9HRMNkwAWn9sB6UDp+XcinYuKTT1P4HjWvm19Esp3kIBfTfJKKugpWqklk80jfpbptZR5SMQVJVpFAObW2wcjviwzeMdKo2Ezc5lTkembtDHsPt/Wq1GZpFJ38OUKuoUpeCa+01Hgw1ZDKFIRHqXH9mrmRoF8Pnyd+H9QP6I+Gculd+qqLWRBhNEvKx2hC3YCO28C0JH0GLYfkwpJRL3i9p6Me3h9nKV+2h/JkRT0kmmnyYGGNmLfmFjYrqFQYSPja1/Kx8P+KojL7dQXC3fnM7JUEX4rHxc99ovXKO1PZoiHPH3kARSZ5l2ZAand7W/tzinLaev9boBg1X0wvlI+uFvNc7mpdQMpc0SfBia94MYxcx6O61Yzb7LfEUs0SsTmKaQYxYxk1aDL2YyoiwYJQuDXFiOAcrDoqBbdtbq77t3OZTLrJSHSh4cVKkwkvz7oZaqbEG/XcCWqjAs5lFH+fenkaHvK0gp7Eo/kzIg+kuDrR512o7eM6cJUJTBqCFSx1I52s72wbqlmSrxqkT90oqSP0MxZYx/fipL4I74bTX4KgAN0iW+dPpNB0DgTCospLaoXKwSXyp3IhhbbugSfWlR3cJ6tVKIx7neP6i9JAnB2 FmUW4egh yYaQWg+SpjGQAfO0kAN30uqqfUE+l++DgrqRZ3KtS97FCRAdW/bS3wxUad1CMYm5ZCOtM4mAb/AAcZAmxuWRKLxzxnbQLe3PLyg7BHPlkSzfA1EoLFcfK5MNfXEfkOqhb9vu8CQnf8hdRranFdClRM3ZZvmIkHPXV/bAWY9dWBc/sHvEAaQEFPbEKO00oPVNQUMOzrb8elOMgmuSFa6vM2QWN8WksarwKMHgwi8u9pUx1Jue2vuXR/z8QOPQOQMIEGF74ukMSCkBYxKKOOUekDHlgU2oxLMJnrIO6I8+T3QbyRrJEWz94CRPo1gKNyJDcpkBy6ucaKPcHRG9zAOgK4rXiTS/2Syk8KdpoGfAB1LPo7o7pkymzn1PQ3zSRmP6PsKIzXja9h/TjEudK1dE2z7PLwNp9rLAVFb7nGZwMESE4YGPqz14l7YwygoNDGq3jiv29ViWqshIKyUSufUoWp8PJcZIto447lEFh7Xu54BReuiEYGuehYpIgBxIFqV9FisBityPsb5KcDkKwsmZgbnFhuxX4mLvYOfzn4YS3EGCTIe0EX2ivrTmxTP71GBOHK1CNKdlTWDEq8fo= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: If other mm code wants to use this function for early memmap inialization (on the platforms that have it), it should be made available properly, not just unconditionally in mm.h Make this function available for such 
cases.

Signed-off-by: Frank van der Linden 
---
 arch/powerpc/mm/init_64.c    | 1 +
 include/linux/bootmem_info.h | 7 +++++++
 include/linux/mm.h           | 3 ---
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d96bbc001e73..c2d99d68d40e 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -41,6 +41,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
index d8a8d245824a..4c506e76a808 100644
--- a/include/linux/bootmem_info.h
+++ b/include/linux/bootmem_info.h
@@ -18,6 +18,8 @@ enum bootmem_type {
 #ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE
 void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
+void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
+				  unsigned long nr_pages);
 void get_page_bootmem(unsigned long info, struct page *page,
 		      enum bootmem_type type);
@@ -58,6 +60,11 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
 }
 
+static inline void register_page_bootmem_memmap(unsigned long section_nr,
+				struct page *map, unsigned long nr_pages)
+{
+}
+
 static inline void put_page_bootmem(struct page *page)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068ddcbb7..6dfc41b461af 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3918,9 +3918,6 @@ static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
 }
 #endif
 
-void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
-				  unsigned long nr_pages);
-
 enum mf_flags {
 	MF_COUNT_INCREASED = 1 << 0,
 	MF_ACTION_REQUIRED = 1 << 1,

From patchwork Mon Jan 27 23:21:51 2025
X-Patchwork-Submitter: Frank van der Linden 
X-Patchwork-Id: 13951852
Date: Mon, 27 Jan 2025 23:21:51 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-12-fvdl@google.com>
Subject: [PATCH 11/27] mm/sparse: allow for alternate vmemmap section init at boot
From: Frank van der Linden 
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Add functions that are called just before the per-section memmap is
initialized and just before the memmap page structures are initialized.
They are called sparse_vmemmap_init_nid_early and
sparse_vmemmap_init_nid_late, respectively.
This allows for mm subsystems to add calls to initialize memmap and page
structures in a specific way, if using SPARSEMEM_VMEMMAP. Specifically,
hugetlb can pre-HVO bootmem-allocated pages that way, so that no time and
resources are wasted on allocating vmemmap pages, only to free them later
(and possibly unnecessarily running the system out of memory in the
process).

Refactor some code and export a few convenience functions for external
use. In sparse_init_nid, skip any sections that are already initialized,
e.g. because they have been initialized by sparse_vmemmap_init_nid_early
already. The hugetlb code to use these functions will be added in a later
commit.

Export section_map_size, as any alternate memmap init code will want to
use it.

The config option to enable this is SPARSEMEM_VMEMMAP_PREINIT, which
depends on an architecture-specific option,
ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT. This is done because a section flag
is used, and the number of flags available is architecture-dependent (see
mmzone.h). Architectures can decide if there is room for the flag and
enable the option. Fortunately, as of right now, all sparse vmemmap using
architectures do have room.
Signed-off-by: Frank van der Linden 
---
 include/linux/mm.h     |  1 +
 include/linux/mmzone.h | 35 +++++++++++++++++
 mm/Kconfig             |  8 ++++
 mm/bootmem_info.c      |  4 +-
 mm/mm_init.c           |  3 ++
 mm/sparse-vmemmap.c    | 23 +++++++++++
 mm/sparse.c            | 87 ++++++++++++++++++++++++++++++++----------
 7 files changed, 139 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6dfc41b461af..df83653ed6e3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3828,6 +3828,7 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 #endif
 
 void *sparse_buffer_alloc(unsigned long size);
+unsigned long section_map_size(void);
 struct page * __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9540b41894da..44ecb2f90db4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1933,6 +1933,9 @@ enum {
 	SECTION_IS_EARLY_BIT,
 #ifdef CONFIG_ZONE_DEVICE
 	SECTION_TAINT_ZONE_DEVICE_BIT,
+#endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+	SECTION_IS_VMEMMAP_PREINIT_BIT,
 #endif
 	SECTION_MAP_LAST_BIT,
 };
@@ -1944,6 +1947,9 @@ enum {
 #ifdef CONFIG_ZONE_DEVICE
 #define SECTION_TAINT_ZONE_DEVICE	BIT(SECTION_TAINT_ZONE_DEVICE_BIT)
 #endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+#define SECTION_IS_VMEMMAP_PREINIT	BIT(SECTION_IS_VMEMMAP_PREINIT_BIT)
+#endif
 #define SECTION_MAP_MASK		(~(BIT(SECTION_MAP_LAST_BIT) - 1))
 #define SECTION_NID_SHIFT		SECTION_MAP_LAST_BIT
 
@@ -1998,6 +2004,30 @@ static inline int online_device_section(struct mem_section *section)
 }
 #endif
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return (section &&
+		(section->section_mem_map & SECTION_IS_VMEMMAP_PREINIT));
+}
+
+void sparse_vmemmap_init_nid_early(int nid);
+void sparse_vmemmap_init_nid_late(int nid);
+
+#else
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return 0;
+}
+static inline void sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+static inline void sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
+
 static inline int online_section_nr(unsigned long nr)
 {
 	return online_section(__nr_to_section(nr));
@@ -2035,6 +2065,9 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 }
 #endif
 
+void sparse_init_early_section(int nid, struct page *map, unsigned long pnum,
+			       unsigned long flags);
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 /**
  * pfn_valid - check if there is a valid memory map entry for a PFN
@@ -2116,6 +2149,8 @@ void sparse_init(void);
 #else
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)  do {} while (0)
+#define sparse_vmemmap_init_nid_early(_nid, _use)  do {} while (0)
+#define sparse_vmemmap_init_nid_late(_nid)  do {} while (0)
 #define pfn_in_present_section pfn_valid
 #define subsection_map_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..f984dd928ce7 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -489,6 +489,14 @@ config SPARSEMEM_VMEMMAP
 	  SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
 	  pfn_to_page and page_to_pfn operations.  This is the most
 	  efficient option when sufficient kernel resources are available.
+
+config ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
+	bool
+
+config SPARSEMEM_VMEMMAP_PREINIT
+	bool "Early init of sparse memory virtual memmap"
+	depends on SPARSEMEM_VMEMMAP && ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
+	default y
 #
 # Select this config option from the architecture Kconfig, if it is preferred
 # to enable the feature of HugeTLB/dev_dax vmemmap optimization.
diff --git a/mm/bootmem_info.c b/mm/bootmem_info.c
index 95f288169a38..b0e2a9fa641f 100644
--- a/mm/bootmem_info.c
+++ b/mm/bootmem_info.c
@@ -88,7 +88,9 @@ static void __init register_page_bootmem_info_section(unsigned long start_pfn)
 
 	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
 
-	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
+	if (!preinited_vmemmap_section(ms))
+		register_page_bootmem_memmap(section_nr, memmap,
+					     PAGES_PER_SECTION);
 
 	usage = ms->usage;
 	page = virt_to_page(usage);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index d2dee53e95dd..9f1e41c3dde6 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1862,6 +1862,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 		}
 	}
 
+	for_each_node_state(nid, N_MEMORY)
+		sparse_vmemmap_init_nid_late(nid);
+
 	calc_nr_kernel_pages();
 	memmap_init();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 3287ebadd167..8751c46c35e4 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -470,3 +470,26 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
 
 	return pfn_to_page(pfn);
 }
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+/*
+ * This is called just before initializing sections for a NUMA node.
+ * Any special initialization that needs to be done before the
+ * generic initialization can be done from here. Sections that
+ * are initialized in hooks called from here will be skipped by
+ * the generic initialization.
+ */
+void __init sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+/*
+ * This is called just before the initialization of page structures
+ * through memmap_init. Zones are now initialized, so any work that
+ * needs to be done that needs zone information can be done from
+ * here.
+ */
+void __init sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
diff --git a/mm/sparse.c b/mm/sparse.c
index 133b033d0cba..ee0234a77c7f 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -408,13 +408,13 @@ static void __init check_usemap_section_nr(int nid,
 #endif /* CONFIG_MEMORY_HOTREMOVE */
 
 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return ALIGN(sizeof(struct page) * PAGES_PER_SECTION, PMD_SIZE);
 }
 
 #else
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
 }
@@ -495,6 +495,44 @@ void __weak __meminit vmemmap_populate_print_last(void)
 {
 }
 
+static void *sparse_usagebuf __meminitdata;
+static void *sparse_usagebuf_end __meminitdata;
+
+/*
+ * Helper function that is used for generic section initialization, and
+ * can also be used by any hooks added above.
+ */
+void __init sparse_init_early_section(int nid, struct page *map,
+				      unsigned long pnum, unsigned long flags)
+{
+	BUG_ON(!sparse_usagebuf || sparse_usagebuf >= sparse_usagebuf_end);
+	check_usemap_section_nr(nid, sparse_usagebuf);
+	sparse_init_one_section(__nr_to_section(pnum), pnum, map,
+				sparse_usagebuf, SECTION_IS_EARLY | flags);
+	sparse_usagebuf = (void *)sparse_usagebuf + mem_section_usage_size();
+}
+
+static int __init sparse_usage_init(int nid, unsigned long map_count)
+{
+	unsigned long size;
+
+	size = mem_section_usage_size() * map_count;
+	sparse_usagebuf = sparse_early_usemaps_alloc_pgdat_section(
+			NODE_DATA(nid), size);
+	if (!sparse_usagebuf) {
+		sparse_usagebuf_end = NULL;
+		return -ENOMEM;
+	}
+
+	sparse_usagebuf_end = sparse_usagebuf + size;
+	return 0;
+}
+
+static void __init sparse_usage_fini(void)
+{
+	sparse_usagebuf = sparse_usagebuf_end = NULL;
+}
+
 /*
  * Initialize sparse on a specific node. The node spans [pnum_begin, pnum_end)
  * And number of present sections in this node is map_count.
@@ -503,47 +541,54 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	struct mem_section_usage *usage;
 	unsigned long pnum;
 	struct page *map;
+	struct mem_section *ms;
 
-	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-			mem_section_usage_size() * map_count);
-	if (!usage) {
+	if (sparse_usage_init(nid, map_count)) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
+
 	sparse_buffer_init(map_count * section_map_size(), nid);
+
+	sparse_vmemmap_init_nid_early(nid);
+
 	for_each_present_section_nr(pnum_begin, pnum) {
 		unsigned long pfn = section_nr_to_pfn(pnum);
 
 		if (pnum >= pnum_end)
 			break;
 
-		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
-				nid, NULL, NULL);
-		if (!map) {
-			pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
-			       __func__, nid);
-			pnum_begin = pnum;
-			sparse_buffer_fini();
-			goto failed;
+		ms = __nr_to_section(pnum);
+		if (!preinited_vmemmap_section(ms)) {
+			map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+					nid, NULL, NULL);
+			if (!map) {
+				pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
+				       __func__, nid);
+				pnum_begin = pnum;
+				sparse_usage_fini();
+				sparse_buffer_fini();
+				goto failed;
+			}
+			sparse_init_early_section(nid, map, pnum, 0);
 		}
-		check_usemap_section_nr(nid, usage);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage,
-				SECTION_IS_EARLY);
-		usage = (void *) usage + mem_section_usage_size();
 	}
+
+	sparse_usage_fini();
 	sparse_buffer_fini();
 	return;
 failed:
-	/* We failed to allocate, mark all the following pnums as not present */
+	/*
+	 * We failed to allocate, mark all the following pnums as not present,
+	 * except the ones already initialized earlier.
+	 */
 	for_each_present_section_nr(pnum_begin, pnum) {
-		struct mem_section *ms;
-
 		if (pnum >= pnum_end)
 			break;
 		ms = __nr_to_section(pnum);
-		ms->section_mem_map = 0;
+		if (!preinited_vmemmap_section(ms))
+			ms->section_mem_map = 0;
 	}
 }

From patchwork Mon Jan 27 23:21:52 2025
X-Patchwork-Submitter: Frank van der Linden 
X-Patchwork-Id: 13951853
Date: Mon, 27 Jan 2025 23:21:52 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-13-fvdl@google.com>
Subject: [PATCH 12/27] mm/hugetlb: set migratetype for bootmem folios
From: Frank van der Linden 
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
The pageblocks that back memblock-allocated hugetlb folios might not have
the migrate type set, in the CONFIG_DEFERRED_STRUCT_PAGE_INIT case.

memblock-allocated hugetlb folios might be given to the buddy allocator
eventually (if nr_hugepages is lowered), so make sure that the migrate
type for the pageblocks contained in them is set when initializing them.
Set it to the default that memmap init also uses (MIGRATE_MOVABLE).

Signed-off-by: Frank van der Linden 
---
 mm/hugetlb.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a95ab44d5545..9969717b7dd8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -3258,6 +3259,26 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }
 
+/*
+ * memblock-allocated pageblocks might not have the migrate type set
+ * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABLE)
+ * here.
+ *
+ * Note that this will not write the page struct, it is ok (and necessary)
+ * to do this on vmemmap optimized folios.
+ */
+static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
+						    struct hstate *h)
+{
+	unsigned long nr_pages = pages_per_huge_page(h), i;
+
+	WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
+
+	for (i = 0; i < nr_pages; i += pageblock_nr_pages)
+		set_pageblock_migratetype(folio_page(folio, i),
+					  MIGRATE_MOVABLE);
+}
+
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					       struct list_head *folio_list)
 {
@@ -3279,6 +3300,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					HUGETLB_VMEMMAP_RESERVE_PAGES,
 					pages_per_huge_page(h));
 		}
+		hugetlb_bootmem_init_migratetype(folio, h);
 		/* Subdivide locks to achieve better parallel performance */
 		spin_lock_irqsave(&hugetlb_lock, flags);
 		__prep_account_new_huge_page(h, folio_nid(folio));

From patchwork Mon Jan 27 23:21:53 2025
X-Patchwork-Submitter: Frank van der Linden 
X-Patchwork-Id: 13951854
d9443c01a7336-2166e907b5eso89141995ad.3 for ; Mon, 27 Jan 2025 15:22:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738020165; x=1738624965; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=xLt9HBc0qt/kG8opRR3xwjjIGa13UKxYV2/Ut2PpccA=; b=Nd5oV5Ug/DMDySpKK4Q804lcxelUJyV+uAuEauP5KVWya8GggFpz0fUnmxTiq3bMmK fid8yfrYw5/RGjW6p1uYPoWcs9wnna2upaf8doLKoxbbiILZNwQ0l3xen7Ba5uMY5dKJ bPUwsGCnB2VEytx0GnWs1reYRtbD9xVMrm/+Q0SJhLd6+6vbpDNBtpSQxL2ThCawqkUe V+sAqqhd2TKHZv0sLzvLAOxeXf/bpv9hQdWKtxgrx1EzO8h3WzPsSJwS8VFnJaQ992s/ KInv5u/pBY1slxpV6cMgMU/SpWIa8sYZ8qSCPl1bbBNBNQ4wjCoW6eyQrBRadv+R9sN0 F9zA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738020165; x=1738624965; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=xLt9HBc0qt/kG8opRR3xwjjIGa13UKxYV2/Ut2PpccA=; b=a+X1Dkn1QQZciEOnYPrq8qWrJe9J6uIOPbmOHZtvXfD99xVoesRKuVxG1SwGdqQgpF UB9AnXPz69ZzTw/GCKHmE38PLMaS4twDxoflFMDLNyEsRY5eZ9wDK61s4d/8BoY05t/v 6rlY8DMlsnW6ov2cXZ3AYXKTyst+Aov2n6ZxXlyB9d4OQP8NylaC7ciV/LJO7e4rWXRd 19QyXL0QXAN+DeTUspazbaT/QQzU0wb7zXN3C28gwBsggU7N/BaBwDRhR4XIQ7rvkQvm gGuYXcZxli5PyGmBMdSsfPHkpgwkzoYAOQ0unflH0TZauYR96UwHG5msT3tFLB4/0WPf G45A== X-Forwarded-Encrypted: i=1; AJvYcCXCIe6jGfzgyfM2qNKqFfz7jlxP66aNve4BdGkl6HBW2AFqUYYJ6l69jKLmyuvJPp5IrhlCL/b9+w==@kvack.org X-Gm-Message-State: AOJu0YwJYJ/1XFde1fj+kvWXq/RrADE7ohbQDqwouW3TLJGC2/rI9PwB Qb3Q1YkEnD0DfMzP/Oia1Dzzlcpq2ETWTxmCMk0G1a+rvn9rdXb6AQT3ZgOzfyIFMVATXA== X-Google-Smtp-Source: AGHT+IEy4AliGB/BJZg6HxqjwcmwX1utwC6DXZQSuAJ7MG3sxG6Hj9E074OSH8KEZyMN9lWrc3wMyVEV X-Received: from pgqw8.prod.google.com ([2002:a65:6948:0:b0:ac2:39d0:bdd7]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:7896:b0:1cf:27bf:8e03 with SMTP id adf61e73a8af0-1eb21585e4cmr64832659637.26.1738020164830; 
Date: Mon, 27 Jan 2025 23:21:53 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-14-fvdl@google.com>
Subject: [PATCH 13/27] mm: define __init_reserved_page_zone function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Sometimes page structs must be unconditionally initialized as reserved,
regardless of DEFERRED_STRUCT_PAGE_INIT.

Define a new function, __init_reserved_page_zone, containing the code
that previously did all of this work in init_reserved_page, and make it
available for use elsewhere.
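The factored-out helper is a linear scan: walk the node's zones in index order and stop at the first zone that spans the pfn. A minimal userspace sketch of that scan follows; `struct zone_model`, `find_zone_id`, and the example pfn ranges are made-up illustrations of the idea, not kernel code or kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for struct zone: a [start_pfn, end_pfn) span. */
struct zone_model {
	unsigned long start_pfn;	/* like zone_start_pfn */
	unsigned long end_pfn;		/* like zone_end_pfn, exclusive */
};

/* Same containment test as the kernel's zone_spans_pfn(). */
static int zone_model_spans_pfn(const struct zone_model *z, unsigned long pfn)
{
	return pfn >= z->start_pfn && pfn < z->end_pfn;
}

/*
 * Mirrors the lookup loop in __init_reserved_page_zone(): scan the
 * node's zones in index order and stop at the first one spanning the
 * pfn. As in the kernel loop, the returned index equals nr_zones when
 * no zone spans the pfn.
 */
static size_t find_zone_id(const struct zone_model *zones, size_t nr_zones,
			   unsigned long pfn)
{
	size_t zid;

	for (zid = 0; zid < nr_zones; zid++) {
		if (zone_model_spans_pfn(&zones[zid], pfn))
			break;
	}
	return zid;
}
```

Since MAX_NR_ZONES is a small constant, a linear scan is as cheap as anything here; callers are expected to pass a pfn that actually lies on the node.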
Signed-off-by: Frank van der Linden
---
 mm/internal.h |  1 +
 mm/mm_init.c  | 38 +++++++++++++++++++++++---------------
 2 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 109ef30fee11..57662141930e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1448,6 +1448,7 @@ static inline bool pte_needs_soft_dirty_wp(struct vm_area_struct *vma, pte_t pte
 
 void __meminit __init_single_page(struct page *page, unsigned long pfn,
 				unsigned long zone, int nid);
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid);
 
 /* shrinker related functions */
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 9f1e41c3dde6..925ed6564572 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -650,6 +650,28 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */
 
+/*
+ * Initialize a reserved page unconditionally, finding its zone first.
+ */
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid)
+{
+	pg_data_t *pgdat;
+	int zid;
+
+	pgdat = NODE_DATA(nid);
+
+	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+		struct zone *zone = &pgdat->node_zones[zid];
+
+		if (zone_spans_pfn(zone, pfn))
+			break;
+	}
+	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
+
+	if (pageblock_aligned(pfn))
+		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
 {
@@ -708,24 +730,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)
 
 static void __meminit init_reserved_page(unsigned long pfn, int nid)
 {
-	pg_data_t *pgdat;
-	int zid;
-
 	if (early_page_initialised(pfn, nid))
 		return;
 
-	pgdat = NODE_DATA(nid);
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		struct zone *zone = &pgdat->node_zones[zid];
-
-		if (zone_spans_pfn(zone, pfn))
-			break;
-	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
-
-	if (pageblock_aligned(pfn))
-		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+	__init_reserved_page_zone(pfn, nid);
 }
 #else
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}

From patchwork Mon Jan 27 23:21:54 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951855
Date: Mon, 27 Jan 2025 23:21:54 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-15-fvdl@google.com>
Subject: [PATCH 14/27] mm/hugetlb: check bootmem pages for zone intersections
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
Bootmem hugetlb pages are allocated using memblock, which isn't (and
mostly can't be) aware of zones, so they may end up crossing zone
boundaries. This would create confusion: a hugetlb page that is part
of multiple zones is bad. Worse, HVO might then end up stealthily
re-assigning pages to a different zone when a hugetlb page is freed,
since the tail page structures beyond the first vmemmap page would
inherit the zone of the first page structures. While the chance of
this happening is low, you can definitely create a configuration where
it happens (especially using ZONE_MOVABLE).

To avoid this issue, check whether bootmem hugetlb pages intersect with
multiple zones during the gather phase, and if they do, discard them,
handing them to the page allocator. Record the number of invalid bootmem
pages per node and subtract them from the number of available pages at
the end, making it easier to do these checks in multiple places later on.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/internal.h |  2 ++
 mm/mm_init.c  | 25 +++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9969717b7dd8..a4d29a4f3efe 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -63,6 +63,7 @@ static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
+__initdata unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE];
 
 /*
  * Due to ordering constraints across the init code for various
@@ -3309,6 +3310,44 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_zones_valid(int nid,
+					struct huge_bootmem_page *m)
+{
+	unsigned long start_pfn;
+	bool valid;
+
+	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
+
+	valid = !pfn_range_intersects_zones(nid, start_pfn,
+			pages_per_huge_page(m->hstate));
+	if (!valid)
+		hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
+
+	return valid;
+}
+
+/*
+ * Free a bootmem page that was found to be invalid (intersecting with
+ * multiple zones).
+ *
+ * Since it intersects with multiple zones, we can't just do a free
+ * operation on all pages at once, but instead have to walk all
+ * pages, freeing them one by one.
+ */
+static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
+					struct hstate *h)
+{
+	unsigned long npages = pages_per_huge_page(h);
+	unsigned long pfn;
+
+	while (npages--) {
+		pfn = page_to_pfn(page);
+		__init_reserved_page_zone(pfn, nid);
+		free_reserved_page(page);
+		page++;
+	}
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3316,14 +3355,25 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 static void __init gather_bootmem_prealloc_node(unsigned long nid)
 {
 	LIST_HEAD(folio_list);
-	struct huge_bootmem_page *m;
+	struct huge_bootmem_page *m, *tm;
 	struct hstate *h = NULL, *prev_h = NULL;
 
-	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = (void *)page;
 
 		h = m->hstate;
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Can't use this page. Initialize the
+			 * page structures if that hasn't already
+			 * been done, and give them to the page
+			 * allocator.
+			 */
+			hugetlb_bootmem_free_invalid_page(nid, page, h);
+			continue;
+		}
+
 		/*
 		 * It is possible to have multiple huge page sizes (hstates)
 		 * in this list. If so, process each size separately.
@@ -3595,13 +3645,20 @@ static void __init hugetlb_init_hstates(void)
 static void __init report_hugepages(void)
 {
 	struct hstate *h;
+	unsigned long nrinvalid;
 
 	for_each_hstate(h) {
 		char buf[32];
 
+		nrinvalid = hstate_boot_nrinvalid[hstate_index(h)];
+		h->max_huge_pages -= nrinvalid;
+
 		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
 		pr_info("HugeTLB: registered %s page size, pre-allocated %ld pages\n",
 			buf, h->free_huge_pages);
+		if (nrinvalid)
+			pr_info("HugeTLB: %s page size: %lu invalid page%s discarded\n",
+					buf, nrinvalid, nrinvalid > 1 ? "s" : "");
 		pr_info("HugeTLB: %d KiB vmemmap can be freed for a %s page\n",
 			hugetlb_vmemmap_optimizable_size(h) / SZ_1K, buf);
 	}
diff --git a/mm/internal.h b/mm/internal.h
index 57662141930e..63fda9bb9426 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -658,6 +658,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn,
 }
 
 void set_zone_contiguous(struct zone *zone);
+bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
+			unsigned long nr_pages);
 
 static inline void clear_zone_contiguous(struct zone *zone)
 {
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 925ed6564572..f7d5b4fe1ae9 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2287,6 +2287,31 @@ void set_zone_contiguous(struct zone *zone)
 	zone->contiguous = true;
 }
 
+/*
+ * Check if a PFN range intersects multiple zones on one or more
+ * NUMA nodes. Specify the @nid argument if it is known that this
+ * PFN range is on one node, NUMA_NO_NODE otherwise.
+ */
+bool pfn_range_intersects_zones(int nid, unsigned long start_pfn,
+			unsigned long nr_pages)
+{
+	struct zone *zone, *izone = NULL;
+
+	for_each_zone(zone) {
+		if (nid != NUMA_NO_NODE && zone_to_nid(zone) != nid)
+			continue;
+
+		if (zone_intersects(zone, start_pfn, nr_pages)) {
+			if (izone != NULL)
+				return true;
+			izone = zone;
+		}
+
+	}
+
+	return false;
+}
+
 static void __init mem_init_print_info(void);
 void __init page_alloc_init_late(void)
 {

From patchwork Mon Jan 27 23:21:55 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951856
Date: Mon, 27 Jan 2025 23:21:55 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-16-fvdl@google.com>
Subject:
 [PATCH 15/27] mm/sparse: add vmemmap_*_hvo functions
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
Add a few functions to enable early HVO:

	vmemmap_populate_hvo
	vmemmap_undo_hvo
	vmemmap_wrprotect_hvo

The populate and undo functions are expected to be used in early init,
from the sparse_init_nid_early() function. The wrprotect function is to
be used, potentially, later.

To implement these functions, mostly re-use the existing compound pages
vmemmap logic used by DAX. vmemmap_populate_address has its argument
changed a bit in this commit: the page structure passed in to be reused
in the mapping is replaced by a PFN and a flag. The flag indicates whether
an extra ref should be taken on the vmemmap page containing the head page
structure. Taking the ref is appropriate for DAX / ZONE_DEVICE, but not
for HugeTLB HVO.

The HugeTLB vmemmap optimization maps tail page structure pages
read-only. The vmemmap_wrprotect_hvo function that does this is
implemented separately, because it cannot be guaranteed that reserved
page structures will not be write accessed during memory initialization.
Even with CONFIG_DEFERRED_STRUCT_PAGE_INIT, they might still be written
to (if they are at the bottom of a zone).
So, vmemmap_populate_hvo leaves the tail page structure pages RW
initially, and then later during initialization, after memmap init is
fully done, vmemmap_wrprotect_hvo must be called to finish the job.

Subsequent commits will use these functions for early HugeTLB HVO.

Signed-off-by: Frank van der Linden
---
 include/linux/mm.h  |   9 ++-
 mm/sparse-vmemmap.c | 141 +++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 135 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index df83653ed6e3..0463c062fd7a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3837,7 +3837,8 @@ p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
-			    struct vmem_altmap *altmap, struct page *reuse);
+			    struct vmem_altmap *altmap, unsigned long ptpfn,
+			    unsigned long flags);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node,
@@ -3853,6 +3854,12 @@ int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		struct vmem_altmap *altmap);
+int vmemmap_populate_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
+int vmemmap_undo_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
+void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int node,
+		unsigned long headsize);
 void vmemmap_populate_print_last(void);
 #ifdef CONFIG_MEMORY_HOTPLUG
 void vmemmap_free(unsigned long start, unsigned long end,
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 8751c46c35e4..bee22ca93654 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -30,6 +30,13 @@
 #include
 #include
+#include
+
+/*
+ * Flags for vmemmap_populate_range and friends.
+ */
+/* Get a ref on the head page struct page, for ZONE_DEVICE compound pages */
+#define VMEMMAP_POPULATE_PAGEREF	0x0001
 
 #include "internal.h"
 
@@ -144,17 +151,18 @@ void __meminit vmemmap_verify(pte_t *pte, int node,
 
 pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 				       struct vmem_altmap *altmap,
-				       struct page *reuse)
+				       unsigned long ptpfn, unsigned long flags)
 {
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 	if (pte_none(ptep_get(pte))) {
 		pte_t entry;
 		void *p;
 
-		if (!reuse) {
+		if (!ptpfn) {
 			p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
 			if (!p)
 				return NULL;
+			ptpfn = PHYS_PFN(__pa(p));
 		} else {
 			/*
 			 * When a PTE/PMD entry is freed from the init_mm
@@ -165,10 +173,10 @@ pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 			 * and through vmemmap_populate_compound_pages() when
 			 * slab is available.
 			 */
-			get_page(reuse);
-			p = page_to_virt(reuse);
+			if (flags & VMEMMAP_POPULATE_PAGEREF)
+				get_page(pfn_to_page(ptpfn));
 		}
-		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
+		entry = pfn_pte(ptpfn, PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 	return pte;
@@ -238,7 +246,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
 
 static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 						  struct vmem_altmap *altmap,
-						  struct page *reuse)
+						  unsigned long ptpfn,
+						  unsigned long flags)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -258,7 +267,7 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 	pmd = vmemmap_pmd_populate(pud, addr, node);
 	if (!pmd)
 		return NULL;
-	pte = vmemmap_pte_populate(pmd, addr, node, altmap, reuse);
+	pte = vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn, flags);
 	if (!pte)
 		return NULL;
 	vmemmap_verify(pte, node, addr, addr + PAGE_SIZE);
@@ -269,13 +278,15 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 static int __meminit vmemmap_populate_range(unsigned long start,
 					    unsigned long end, int node,
 					    struct vmem_altmap *altmap,
-					    struct page *reuse)
+					    unsigned long ptpfn,
+					    unsigned long flags)
 {
 	unsigned long addr = start;
 	pte_t *pte;
 
 	for (; addr < end; addr += PAGE_SIZE) {
-		pte = vmemmap_populate_address(addr, node, altmap, reuse);
+		pte = vmemmap_populate_address(addr, node, altmap,
+					       ptpfn, flags);
 		if (!pte)
 			return -ENOMEM;
 	}
@@ -286,7 +297,107 @@ static int __meminit vmemmap_populate_range(unsigned long start,
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
 					 int node, struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_range(start, end, node, altmap, NULL);
+	return vmemmap_populate_range(start, end, node, altmap, 0, 0);
+}
+
+/*
+ * Undo populate_hvo, and replace it with a normal base page mapping.
+ * Used in memory init in case a HVO mapping needs to be undone.
+ *
+ * This can happen when it is discovered that a memblock allocated
+ * hugetlb page spans multiple zones, which can only be verified
+ * after zones have been initialized.
+ *
+ * We know that:
+ *    1) The first @headsize / PAGE_SIZE vmemmap pages were individually
+ *       allocated through memblock, and mapped.
+ *
+ *    2) The rest of the vmemmap pages are mirrors of the last head page.
+ */
+int __meminit vmemmap_undo_hvo(unsigned long addr, unsigned long end,
+				int node, unsigned long headsize)
+{
+	unsigned long maddr, pfn;
+	pte_t *pte;
+	int headpages;
+
+	/*
+	 * Should only be called early in boot, so nothing will
+	 * be accessing these page structures.
+	 */
+	WARN_ON(!early_boot_irqs_disabled);
+
+	headpages = headsize >> PAGE_SHIFT;
+
+	/*
+	 * Clear mirrored mappings for tail page structs.
+	 */
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pte_clear(&init_mm, maddr, pte);
+	}
+
+	/*
+	 * Clear and free mappings for head page and first tail page
+	 * structs.
+	 */
+	for (maddr = addr; headpages-- > 0; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pfn = pte_pfn(ptep_get(pte));
+		pte_clear(&init_mm, maddr, pte);
+		memblock_phys_free(PFN_PHYS(pfn), PAGE_SIZE);
+	}
+
+	flush_tlb_kernel_range(addr, end);
+
+	return vmemmap_populate(addr, end, node, NULL);
+}
+
+/*
+ * Write protect the mirrored tail page structs for HVO. This will be
+ * called from the hugetlb code when gathering and initializing the
+ * memblock allocated gigantic pages. The write protect can't be
+ * done earlier, since it can't be guaranteed that the reserved
+ * page structures will not be written to during initialization,
+ * even if CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled.
+ *
+ * The PTEs are known to exist, and nothing else should be touching
+ * these pages. The caller is responsible for any TLB flushing.
+ */
+void vmemmap_wrprotect_hvo(unsigned long addr, unsigned long end,
+				int node, unsigned long headsize)
+{
+	unsigned long maddr;
+	pte_t *pte;
+
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		ptep_set_wrprotect(&init_mm, maddr, pte);
+	}
+}
+
+/*
+ * Populate vmemmap pages HVO-style. The first page contains the head
+ * page and needed tail pages, the other ones are mirrors of the first
+ * page.
+ */
+int __meminit vmemmap_populate_hvo(unsigned long addr, unsigned long end,
+				int node, unsigned long headsize)
+{
+	pte_t *pte;
+	unsigned long maddr;
+
+	for (maddr = addr; maddr < addr + headsize; maddr += PAGE_SIZE) {
+		pte = vmemmap_populate_address(maddr, node, NULL, 0, 0);
+		if (!pte)
+			return -ENOMEM;
+	}
+
+	/*
+	 * Reuse the last page struct page mapped above for the rest.
+	 */
+	return vmemmap_populate_range(maddr, end, node, NULL,
+				pte_pfn(ptep_get(pte)), 0);
 }
 
 void __weak __meminit vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
@@ -409,7 +520,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		 * with just tail struct pages.
*/ return vmemmap_populate_range(start, end, node, NULL, - pte_page(ptep_get(pte))); + pte_pfn(ptep_get(pte)), + VMEMMAP_POPULATE_PAGEREF); } size = min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page)); @@ -417,13 +529,13 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn, unsigned long next, last = addr + size; /* Populate the head page vmemmap page */ - pte = vmemmap_populate_address(addr, node, NULL, NULL); + pte = vmemmap_populate_address(addr, node, NULL, 0, 0); if (!pte) return -ENOMEM; /* Populate the tail pages vmemmap page */ next = addr + PAGE_SIZE; - pte = vmemmap_populate_address(next, node, NULL, NULL); + pte = vmemmap_populate_address(next, node, NULL, 0, 0); if (!pte) return -ENOMEM; @@ -433,7 +545,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn, */ next += PAGE_SIZE; rc = vmemmap_populate_range(next, last, node, NULL, - pte_page(ptep_get(pte))); + pte_pfn(ptep_get(pte)), + VMEMMAP_POPULATE_PAGEREF); if (rc) return -ENOMEM; } From patchwork Mon Jan 27 23:21:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13951857 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A019CC02188 for ; Mon, 27 Jan 2025 23:23:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 464722801C6; Mon, 27 Jan 2025 18:22:53 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4155728013A; Mon, 27 Jan 2025 18:22:53 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 28EF22801C6; Mon, 27 Jan 2025 18:22:53 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by 
Date: Mon, 27 Jan 2025 23:21:56 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-17-fvdl@google.com>
Subject: [PATCH 16/27] mm/hugetlb: deal with multiple calls to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Architectures that want pre-HVO of hugetlb vmemmap pages will need to call hugetlb_bootmem_alloc from an earlier spot in boot (before sparse_init).
To facilitate some architectures doing this, protect hugetlb_bootmem_alloc against multiple calls.

Also provide a helper function to check if it's been called, so that the early HVO code, to be added later, can see if there is anything to do.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |  6 ++++++
 mm/hugetlb.c            | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9cd7c9dacb88..5061279e5f73 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -175,6 +175,7 @@ extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];

 void hugetlb_bootmem_alloc(void);
+bool hugetlb_bootmem_allocated(void);

 /* arch callbacks */

@@ -1256,6 +1257,11 @@ static inline bool hugetlbfs_pagecache_present(
 static inline void hugetlb_bootmem_alloc(void)
 {
 }
+
+static inline bool hugetlb_bootmem_allocated(void)
+{
+	return false;
+}
 #endif /* CONFIG_HUGETLB_PAGE */

 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a4d29a4f3efe..18cd232b5df2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4911,16 +4911,28 @@ static int __init default_hugepagesz_setup(char *s)
 }
 hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);

+static bool __hugetlb_bootmem_allocated __initdata;
+
+bool __init hugetlb_bootmem_allocated(void)
+{
+	return __hugetlb_bootmem_allocated;
+}
+
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;

+	if (__hugetlb_bootmem_allocated)
+		return;
+
 	hugetlb_parse_params();

 	for_each_hstate(h) {
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}
+
+	__hugetlb_bootmem_allocated = true;
 }

 static unsigned int allowed_mems_nr(struct hstate *h)

From patchwork Mon Jan 27 23:21:57 2025
Date: Mon, 27 Jan 2025 23:21:57 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-18-fvdl@google.com>
Subject: [PATCH 17/27] mm/hugetlb: move huge_boot_pages list init to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Instead of initializing the per-node hugetlb bootmem pages list from the alloc function, we can now do it in a somewhat cleaner way, since there is an explicit hugetlb_bootmem_alloc
function. Initialize the lists there.

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 18cd232b5df2..2aa35c1d112b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3579,7 +3579,6 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long allocated;
-	static bool initialized __initdata;

 	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
 	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
@@ -3587,17 +3586,6 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 		return;
 	}

-	/* hugetlb_hstate_alloc_pages will be called many times, initialize huge_boot_pages once */
-	if (!initialized) {
-		int i = 0;
-
-		for (i = 0; i < MAX_NUMNODES; i++)
-			INIT_LIST_HEAD(&huge_boot_pages[i]);
-		h->next_nid_to_alloc = first_online_node;
-		h->next_nid_to_free = first_online_node;
-		initialized = true;
-	}
-
 	/* do node specific alloc */
 	if (hugetlb_hstate_alloc_pages_specific_nodes(h))
 		return;
@@ -4921,13 +4909,20 @@ bool __init hugetlb_bootmem_allocated(void)
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;
+	int i;

 	if (__hugetlb_bootmem_allocated)
 		return;

+	for (i = 0; i < MAX_NUMNODES; i++)
+		INIT_LIST_HEAD(&huge_boot_pages[i]);
+
 	hugetlb_parse_params();

 	for_each_hstate(h) {
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
+
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}

From patchwork Mon Jan 27 23:21:58 2025
Date: Mon, 27 Jan 2025 23:21:58 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-19-fvdl@google.com>
Subject: [PATCH 18/27] mm/hugetlb: add pre-HVO framework
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden
Define flags for pre-HVOed bootmem hugetlb pages, and act on them. The most important flag is the HVO flag, signalling that a bootmem allocated gigantic page has already been HVO-ed.
If this flag is seen by the hugetlb bootmem gather code, the page is marked as HVO optimized. The HVO code will then not try to optimize it again. Instead, it will just map the tail page mirror pages read-only, completing the HVO steps. No functional change, as nothing sets the flags yet. Signed-off-by: Frank van der Linden --- arch/powerpc/mm/hugetlbpage.c | 1 + include/linux/hugetlb.h | 4 +++ mm/hugetlb.c | 24 ++++++++++++++++- mm/hugetlb_vmemmap.c | 50 +++++++++++++++++++++++++++++++++-- mm/hugetlb_vmemmap.h | 15 +++++++++++ 5 files changed, 91 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 6b043180220a..d3c1b749dcfc 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate) gpage_freearray[nr_gpages] = 0; list_add(&m->list, &huge_boot_pages[0]); m->hstate = hstate; + m->flags = 0; return 1; } diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 5061279e5f73..10a7ce2b95e1 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -681,8 +681,12 @@ struct hstate { struct huge_bootmem_page { struct list_head list; struct hstate *hstate; + unsigned long flags; }; +#define HUGE_BOOTMEM_HVO 0x0001 +#define HUGE_BOOTMEM_ZONES_VALID 0x0002 + int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list); int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn); struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 2aa35c1d112b..05c5a65e605f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3220,6 +3220,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid) INIT_LIST_HEAD(&m->list); list_add(&m->list, &huge_boot_pages[node]); m->hstate = h; + m->flags = 0; return 1; } @@ -3287,7 +3288,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h, struct folio 
*folio, *tmp_f; /* Send list for bulk vmemmap optimization processing */ - hugetlb_vmemmap_optimize_folios(h, folio_list); + hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list); list_for_each_entry_safe(folio, tmp_f, folio_list, lru) { if (!folio_test_hugetlb_vmemmap_optimized(folio)) { @@ -3316,6 +3317,13 @@ static bool __init hugetlb_bootmem_page_zones_valid(int nid, unsigned long start_pfn; bool valid; + if (m->flags & HUGE_BOOTMEM_ZONES_VALID) { + /* + * Already validated, skip check. + */ + return true; + } + start_pfn = virt_to_phys(m) >> PAGE_SHIFT; valid = !pfn_range_intersects_zones(nid, start_pfn, @@ -3348,6 +3356,11 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page, } } +static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m) +{ + return (m->flags & HUGE_BOOTMEM_HVO); +} + /* * Put bootmem huge pages into the standard lists after mem_map is up. * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages. @@ -3388,6 +3401,15 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid) hugetlb_folio_init_vmemmap(folio, h, HUGETLB_VMEMMAP_RESERVE_PAGES); init_new_hugetlb_folio(h, folio); + + if (hugetlb_bootmem_page_prehvo(m)) + /* + * If pre-HVO was done, just set the + * flag, the HVO code will then skip + * this folio. 
+ */ + folio_set_hugetlb_vmemmap_optimized(folio); + list_add(&folio->lru, &folio_list); /* diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c index 326cdf94192e..4eddf3c30d62 100644 --- a/mm/hugetlb_vmemmap.c +++ b/mm/hugetlb_vmemmap.c @@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *fol return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse); } -void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list) +static void __hugetlb_vmemmap_optimize_folios(struct hstate *h, + struct list_head *folio_list, + bool boot) { struct folio *folio; + int nr_to_optimize; LIST_HEAD(vmemmap_pages); unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU; + nr_to_optimize = 0; list_for_each_entry(folio, folio_list, lru) { - int ret = hugetlb_vmemmap_split_folio(h, folio); + int ret; + unsigned long spfn, epfn; + + if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) { + /* + * Already optimized by pre-HVO, just map the + * mirrored tail page structs RO. + */ + spfn = (unsigned long)&folio->page; + epfn = spfn + pages_per_huge_page(h); + vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio), + HUGETLB_VMEMMAP_RESERVE_SIZE); + register_page_bootmem_memmap(pfn_to_section_nr(spfn), + &folio->page, + HUGETLB_VMEMMAP_RESERVE_SIZE); + static_branch_inc(&hugetlb_optimize_vmemmap_key); + continue; + } + + nr_to_optimize++; + + ret = hugetlb_vmemmap_split_folio(h, folio); /* * Spliting the PMD requires allocating a page, thus lets fail @@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l break; } + if (!nr_to_optimize) + /* + * All pre-HVO folios, nothing left to do. It's ok if + * there is a mix of pre-HVO and not yet HVO-ed folios + * here, as __hugetlb_vmemmap_optimize_folio() will + * skip any folios that already have the optimized flag + * set, see vmemmap_should_optimize_folio(). 
+ */ + goto out; + flush_tlb_all(); list_for_each_entry(folio, folio_list, lru) { @@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l } } +out: flush_tlb_all(); free_vmemmap_page_list(&vmemmap_pages); } +void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list) +{ + __hugetlb_vmemmap_optimize_folios(h, folio_list, false); +} + +void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list) +{ + __hugetlb_vmemmap_optimize_folios(h, folio_list, true); +} + static struct ctl_table hugetlb_vmemmap_sysctls[] = { { .procname = "hugetlb_optimize_vmemmap", diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h index 2fcae92d3359..a6354a27e63f 100644 --- a/mm/hugetlb_vmemmap.h +++ b/mm/hugetlb_vmemmap.h @@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h, struct list_head *non_hvo_folios); void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio); void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list); +void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list); + static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h) { @@ -64,6 +66,19 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list { } +static inline void hugetlb_vmemmap_init_early(int nid) +{ +} + +static inline void hugetlb_vmemmap_init_late(int nid) +{ +} + +static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, + struct list_head *folio_list) +{ +} + static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h) { return 0; From patchwork Mon Jan 27 23:21:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13951860 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
Date: Mon, 27 Jan 2025 23:21:59 +0000
Message-ID: <20250127232207.3888640-20-fvdl@google.com>
Subject: [PATCH 19/27] mm/hugetlb_vmemmap: fix hugetlb_vmemmap_restore_folios definition
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Make the hugetlb_vmemmap_restore_folios definition inline for the !CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP case, so that including this file in files other than hugetlb_vmemmap.c will work.
Fixes: cfb8c75099db ("hugetlb: perform vmemmap restoration on a list of pages")
Signed-off-by: Frank van der Linden
---
 mm/hugetlb_vmemmap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index a6354a27e63f..926b8b27b5cb 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -50,7 +50,7 @@ static inline int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 	return 0;
 }
 
-static long hugetlb_vmemmap_restore_folios(const struct hstate *h,
+static inline long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *folio_list,
 					struct list_head *non_hvo_folios)
 {
Date: Mon, 27 Jan 2025 23:22:00 +0000
Message-ID: <20250127232207.3888640-21-fvdl@google.com>
Subject: [PATCH 20/27] mm/hugetlb: do pre-HVO for bootmem allocated pages
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

For large systems, the overhead of vmemmap pages for hugetlb is substantial. It's about 1.5% of memory, which is about 45G for a 3T system. If you want to configure most of that system for hugetlb (e.g. to use as backing memory for VMs), there is a chance of running out of memory on boot, even though you know that the 45G will become available later.

To avoid this scenario, and since it's a waste to first allocate and then free that 45G during boot, do pre-HVO for hugetlb bootmem allocated pages ('gigantic' pages).

pre-HVO is done by adding functions that are called from sparse_vmemmap_init_nid_early and sparse_vmemmap_init_nid_late. The first is called before memmap allocation, so it takes care of allocating memmap HVO-style. The second verifies that all bootmem pages look good, specifically it checks that they do not intersect with multiple zones. This can only be done from the sparse_vmemmap_init_nid_late path, when zones have been initialized.

The hugetlb page size must be aligned to the section size, and aligned to the size of memory described by the number of page structures contained in one PMD (since pre-HVO is not prepared to split PMDs).
This should be true for most 'gigantic' pages; it is for 1G pages on x86, where both of these alignment requirements are 128M.

This will only have an effect if hugetlb_bootmem_alloc was called early in boot. If not, it won't do anything, and HVO for bootmem hugetlb pages works as before.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   2 +
 mm/hugetlb.c            |   4 +-
 mm/hugetlb_vmemmap.c    | 143 ++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h    |   6 ++
 mm/sparse-vmemmap.c     |   4 ++
 5 files changed, 157 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 10a7ce2b95e1..2512463bca49 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -687,6 +687,8 @@ struct huge_bootmem_page {
 #define HUGE_BOOTMEM_HVO		0x0001
 #define HUGE_BOOTMEM_ZONES_VALID	0x0002
 
+bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 05c5a65e605f..28653214f23d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3311,8 +3311,8 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
-static bool __init hugetlb_bootmem_page_zones_valid(int nid,
-					struct huge_bootmem_page *m)
+bool __init hugetlb_bootmem_page_zones_valid(int nid,
+					     struct huge_bootmem_page *m)
 {
 	unsigned long start_pfn;
 	bool valid;
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 4eddf3c30d62..49cbd82a2f82 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -743,6 +743,149 @@ void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
 	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
 }
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+
+/* Return true if a bootmem allocated HugeTLB page should be pre-HVO-ed */
+static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
+{
+	unsigned long section_size, psize, pmd_vmemmap_size;
+	phys_addr_t paddr;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return false;
+
+	if (!hugetlb_vmemmap_optimizable(m->hstate))
+		return false;
+
+	psize = huge_page_size(m->hstate);
+	paddr = virt_to_phys(m);
+
+	/*
+	 * Pre-HVO only works if the bootmem huge page
+	 * is aligned to the section size.
+	 */
+	section_size = (1UL << PA_SECTION_SHIFT);
+	if (!IS_ALIGNED(paddr, section_size) ||
+	    !IS_ALIGNED(psize, section_size))
+		return false;
+
+	/*
+	 * The pre-HVO code does not deal with splitting PMDS,
+	 * so the bootmem page must be aligned to the number
+	 * of base pages that can be mapped with one vmemmap PMD.
+	 */
+	pmd_vmemmap_size = (PMD_SIZE / (sizeof(struct page))) << PAGE_SHIFT;
+	if (!IS_ALIGNED(paddr, pmd_vmemmap_size) ||
+	    !IS_ALIGNED(psize, pmd_vmemmap_size))
+		return false;
+
+	return true;
+}
+
+/*
+ * Initialize memmap section for a gigantic page, HVO-style.
+ */
+void __init hugetlb_vmemmap_init_early(int nid)
+{
+	unsigned long psize, paddr, section_size;
+	unsigned long ns, i, pnum, pfn, nr_pages;
+	unsigned long start, end;
+	struct huge_bootmem_page *m = NULL;
+	void *map;
+
+	/*
+	 * Nothing to do if bootmem pages were not allocated
+	 * early in boot, or if HVO wasn't enabled in the
+	 * first place.
+	 */
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	section_size = (1UL << PA_SECTION_SHIFT);
+
+	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+		if (!vmemmap_should_optimize_bootmem_page(m))
+			continue;
+
+		nr_pages = pages_per_huge_page(m->hstate);
+		psize = nr_pages << PAGE_SHIFT;
+		paddr = virt_to_phys(m);
+		pfn = PHYS_PFN(paddr);
+		map = pfn_to_page(pfn);
+		start = (unsigned long)map;
+		end = start + nr_pages * sizeof(struct page);
+
+		if (vmemmap_populate_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0)
+			continue;
+
+		memmap_boot_pages_add(HUGETLB_VMEMMAP_RESERVE_SIZE / PAGE_SIZE);
+
+		pnum = pfn_to_section_nr(pfn);
+		ns = psize / section_size;
+
+		for (i = 0; i < ns; i++) {
+			sparse_init_early_section(nid, map, pnum,
+						  SECTION_IS_VMEMMAP_PREINIT);
+			map += section_map_size();
+			pnum++;
+		}
+
+		m->flags |= HUGE_BOOTMEM_HVO;
+	}
+}
+
+void __init hugetlb_vmemmap_init_late(int nid)
+{
+	struct huge_bootmem_page *m, *tm;
+	unsigned long phys, nr_pages, start, end;
+	unsigned long pfn, nr_mmap;
+	struct hstate *h;
+	void *map;
+
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
+		if (!(m->flags & HUGE_BOOTMEM_HVO))
+			continue;
+
+		phys = virt_to_phys(m);
+		h = m->hstate;
+		pfn = PHYS_PFN(phys);
+		nr_pages = pages_per_huge_page(h);
+
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Oops, the hugetlb page spans multiple zones.
+			 * Remove it from the list, and undo HVO.
+			 */
+			list_del(&m->list);
+
+			map = pfn_to_page(pfn);
+
+			start = (unsigned long)map;
+			end = start + nr_pages * sizeof(struct page);
+
+			vmemmap_undo_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE);
+			nr_mmap = end - start - HUGETLB_VMEMMAP_RESERVE_SIZE;
+			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
+
+			memblock_phys_free(phys, huge_page_size(h));
+			continue;
+		} else
+			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
+	}
+}
+#endif
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname = "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 926b8b27b5cb..0031e49b12f7 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -9,6 +9,8 @@
 #ifndef _LINUX_HUGETLB_VMEMMAP_H
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include
+#include
+#include
 
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
@@ -25,6 +27,10 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
 void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+void hugetlb_vmemmap_init_early(int nid);
+void hugetlb_vmemmap_init_late(int nid);
+#endif
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index bee22ca93654..29647fd3d606 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -32,6 +32,8 @@
 #include
 #include
 
+#include "hugetlb_vmemmap.h"
+
 /*
  * Flags for vmemmap_populate_range and friends.
 */
@@ -594,6 +596,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
  */
void __init sparse_vmemmap_init_nid_early(int nid)
 {
+	hugetlb_vmemmap_init_early(nid);
 }
 
 /*
@@ -604,5 +607,6 @@ void __init sparse_vmemmap_init_nid_early(int nid)
 */
void __init sparse_vmemmap_init_nid_late(int nid)
 {
+	hugetlb_vmemmap_init_late(nid);
 }
 #endif
3URWYZwQKCCABR9HCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--fvdl.bounces.google.com designates 209.85.214.201 as permitted sender) smtp.mailfrom=3URWYZwQKCCABR9HCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1738020178; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=m0+bWUW0pfRrXDtFyRBgF5DnwWrUxEqL81u9NMeAeCs=; b=TSDaKrPVPDHMc1QaQJs+GdkF7fbfUGH+iVn3wQSuSHmfpgLBEZK+sGeK9DKeiPAXz+k4um u7XoPcNmfYMu7vDpOWXN4LCljF2taIAh1/oXtiiJ/AVJ6UvrBNM1jPMiZWw1PbJWXy4xUg 47GfpQ/4K51ibU5A1hoHSrs4U0wqeIo= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=Htl3S9Kl; spf=pass (imf16.hostedemail.com: domain of 3URWYZwQKCCABR9HCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--fvdl.bounces.google.com designates 209.85.214.201 as permitted sender) smtp.mailfrom=3URWYZwQKCCABR9HCKKCHA.8KIHEJQT-IIGR68G.KNC@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1738020178; a=rsa-sha256; cv=none; b=eNPoqQcYuKaWeYwQ4/UPJ5/UG5Cb8aVfjZ4z4WSECOLyKmrChMz7sF2evRH2EN7Nqsh1tX iz+3ZqShGOJeQR69TuqWM93qpMYiUnMhvrTllv4M/M9sBRfGRX7Mxoa9qW8UQxQ6qqdsKp j5GjyJXrI/Ymve9Ze0Tp1zghY+7ejr0= Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-21648ddd461so103258655ad.0 for ; Mon, 27 Jan 2025 15:22:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738020177; x=1738624977; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=m0+bWUW0pfRrXDtFyRBgF5DnwWrUxEqL81u9NMeAeCs=; 
b=Htl3S9KlAVVCD5TMLQFo8htFwX1507Qa3bMQvMU+gQYNNjsPKryYWc7aCGVOZG3sNu wV5C/Au0yUdACiB8wzfDoBSvPCOJQc/jivdkgDIEj/AtTXkC6RrECwCFBqkMxhQy5twM swH1e/0RafzmVuY0YqOrAXcVUV1fhq5tI0lk14ryKWvfnf6iVF6BvMbR3dOAUCaM6LOL s3vI39GcXloe9GaLBZAjVA/di4hxFv3zoLoD4IQyBpWfmfX8CsfvnFJOBFUfFSRSETNU aewAlEVxuNGxY2sjOzhKzeae9vK/wQB3s2F+aHZAzLdEksyOa3o5T9f6fwoBiAssX5NC ur7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738020177; x=1738624977; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=m0+bWUW0pfRrXDtFyRBgF5DnwWrUxEqL81u9NMeAeCs=; b=fVplQGw3Ze0H+SzBAYi6rDjX0DrNeVDYxvD4YJ/c/VslihFxQR5BT/3wT13/edShga 24HI4YAaPuqMNU0CTSNEYThnrZ/xqnEG0W945SJAbPQttt67T6oEI3atRsNV2B/tTpbQ UMFoWpjPJdIWO0LSsmqQtBPsRFqu1dc2JHEBWfHYBXxX2PV6ANDGMT0ERpaudnGvEgRw X+03QJL3DjpA8lOnvSI1MTlSpbINLdOpb65zEsKxSA4eyjgx/4TvKhvo3jdHCvV3E3dy dsPEZUcEIN95xh1x60yVdAr51b+a4E8770zH3e24uY0ktncN+wgIpqSXDP2EqF5G0Epy jREQ== X-Forwarded-Encrypted: i=1; AJvYcCU4M3TfZk9GcglInlZNXRbWN08BFow3nlxN8/68+iqY+aK2J6gCTAvMr/66/V72m2Y/unsY7Y9Eig==@kvack.org X-Gm-Message-State: AOJu0Yx2l87Z7hjJD8eLtl5EMj4Cw+UR+Hlvf3b1g8QhG94e2MvNPr1R 61V4GdY0hwnCNW+EsFly1geHxJggwuZZ3dUFfGAtzqBLdR0eir3AJkGu+aGGxjBM3fj1nQ== X-Google-Smtp-Source: AGHT+IHTkWT+vKPtNmML5TD/UxEnzEGLQT8O3DJ3Xye94Aonki0v0QAKyqnXLqCJrz3+wg2puwj3NqqG X-Received: from pfav8.prod.google.com ([2002:a05:6a00:ab08:b0:728:b8e3:993f]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:3942:b0:1e1:9f77:da92 with SMTP id adf61e73a8af0-1eb215901e0mr60888783637.33.1738020177482; Mon, 27 Jan 2025 15:22:57 -0800 (PST) Date: Mon, 27 Jan 2025 23:22:01 +0000 In-Reply-To: <20250127232207.3888640-1-fvdl@google.com> Mime-Version: 1.0 References: <20250127232207.3888640-1-fvdl@google.com> X-Mailer: git-send-email 2.48.1.262.g85cc9f2d1e-goog Message-ID: <20250127232207.3888640-22-fvdl@google.com> Subject: [PATCH 21/27] x86/setup: call hugetlb_bootmem_alloc 
early From: Frank van der Linden To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden , Dave Hansen , Andy Lutomirski , Peter Zijlstra X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 8B71A18000C X-Stat-Signature: n5ot9rxs9sb8ijs76zscomqhe3zugio6 X-Rspam-User: X-HE-Tag: 1738020178-417956 X-HE-Meta: U2FsdGVkX19Ce5/tOiccZL0FKECi1HoHMr+RYFwZZ9O2Ob1QGhabTW4Gqmr80/NbEg1iNXF4wmek6j29l0rnz3vaGpDlQNJlzLfc8AAKAxVVB8ihHTnLr4FI5piha6DsBYoYAUSCkbu7fRlyc5wzzymY+Bau3AJGZKhKQbtXXDehjyK+dKIQhyqyuAhacDpblQmsfXyW6lPPJy2P3IQgLtw4ZPUMg1bIzc5ge1iHl5YFfyeRDEgyGKHoVQCjEfu6o3X8g0O0N86JZ/rJutyyFmljsKngWf+3S2PSKVotmTQoiptruUbPn6CmJCDwqDoChayn1gl/5vdpZQ1nV/B499B+DdFWr+3RnSCFg5uJmolHqOl2nOXgSGJ1IA1wbXTLElMOwCB6tQpDetidFaVmw8gzANugeTFnq6vqhBSGD7Oha6WXhkaG7CPLPfWoTeZu4y8srGKzVaD1QYtWbCSWu4MSwDM9gTQUUQJtSMLq9QesTg8bTewexevVdtAjN3LIQB4dZmzaYe7HtHGOTyLqUEzPVbUsFUWgwA0Bw4AGjcYZcMBJRsy2zfYNq9GbXpb9/MPRoGIvlRx2aCm9/4oqU8FCxCZSEwh6cnh+iz/fwjSCOKYnGE7SYmOx97C2YJeYDHCsvgSZasxiJumlywcVSw1s7FtV78T5Rx/8fSEH4nv/WUqVch/tVMk0VA+TXsO4ikwsTIqhX7xGgRtMIJRHbcs3iuSuGGsdftSo+x0D66zjAcuvyeWMhVb580pyTnGrFzTE12ETUAPTaVC3p/j7+NSGUrad69DjoaTnn2GN//LXKRb9W7CbAlyuBY3wm0eJeZDjrj/rM+j9uTe3blvPF/u+WUzaEzgCBzkEhK6BgK2tMb02tf5PqdnIJlCY+2PyBvqLOT7mkRvkdMAozDy51PxwCDzvM07oZDmhut1VyOkF6eYZFrU8W/7x37Be+LoihQFX9z7QORfoqWDy1NH iBRyiC32 
mkKrmOpXJWyt5yeYyJ2E08A+PB84XNLK2xUoOVchvBrJ+2fU77a7fybx3LBNyAtqulXdPGHuU+u4rmWs8Srrjs/EL1Fs55gpsqGyU+putV94dZ8jo3/PgTTefqhY/cORnOLt8SpVz7iymMzdDRm5t+ckcVTTrmSwHC9LYEOpVBdoOAE3X78BGwiWbVlo0CGOmV5g0F16KDERFvUBwdq8pFC1uWDkbI7AL6GFnK8L3oCX5m3h/4EpoGpUrjeUFt6OiGjeTKBkwQnw12hyNg2H+7qOkr2w28LQ2F7R3TI9vYAZi/b8TEsq+ac2qKWNGGd7/xFFc0GoWA7ADAWo7Naqzg6lhpWkjTotxhxIfH/+r9vaI3jdZX8g5s5InlF7a6uR1xqvH036colml7gPbUjXV8SBJmAvy5S+1YsQ36+PcOeWFp6ZmlL6E3XxC5JkrEOZFpcVULNH/erqs+rk11LpRcaRO7cuvPsl82RZoegwSnEjpSVL2SbSpFLSd5u4yNCDomwYbtnyrery0LCIbxI2VSpt+E5YNLpFTJHAEp8GRFh8rfY4a0DH7FrNtpi0j0Vr8s5bFGka3L3d+rYrQkms+NhWq2TWM/Y57u0qqxHikVwUMU71ZJYCxkMLfgQqg9lTyoGPZRAm0P2LIdogBmfwGTIR5hsICotFciY+ljqP38QwfJz5RmEHAnXwAZg== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Call hugetlb_bootmem_allloc in an earlier spot in setup, after hugelb_cma_reserve. This will make vmemmap preinit of the sections covered by the allocated hugetlb pages possible. 
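As an aside, the ordering this creates can be illustrated with a small
user-space sketch (all names here are hypothetical stand-ins, not the
kernel functions): the bootmem hugetlb ranges must be recorded before
section init runs, so that init can tell which sections are fully
covered by a bootmem hugetlb page and can preinit their vmemmap
(HVO-style).

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RANGES	16
#define SECTION_PFNS	512UL	/* pfns per section, illustrative */

struct bootmem_range {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

static struct bootmem_range ranges[MAX_RANGES];
static int nr_ranges;

/* Stand-in for hugetlb_bootmem_alloc(): record an allocated range. */
static void record_bootmem_range(unsigned long start_pfn,
				 unsigned long nr_pages)
{
	if (nr_ranges < MAX_RANGES) {
		ranges[nr_ranges].start_pfn = start_pfn;
		ranges[nr_ranges].nr_pages = nr_pages;
		nr_ranges++;
	}
}

/*
 * Stand-in for the decision made at section init time: is this
 * section fully covered by a recorded bootmem hugetlb range?
 */
static bool section_covered(unsigned long section_nr)
{
	unsigned long s = section_nr * SECTION_PFNS;
	unsigned long e = s + SECTION_PFNS;

	for (int i = 0; i < nr_ranges; i++) {
		unsigned long rs = ranges[i].start_pfn;
		unsigned long re = rs + ranges[i].nr_pages;

		if (s >= rs && e <= re)
			return true;
	}
	return false;
}
```

If the recording step ran after section init, section_covered() would
see an empty table and no section could be preinitialized; that is the
dependency this reordering satisfies.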
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Signed-off-by: Frank van der Linden
---
 arch/x86/kernel/setup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index cebee310e200..ff8604007b08 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1108,8 +1108,10 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
-	if (boot_cpu_has(X86_FEATURE_GBPAGES))
+	if (boot_cpu_has(X86_FEATURE_GBPAGES)) {
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+		hugetlb_bootmem_alloc();
+	}
 
 	/*
 	 * Reserve memory for crash kernel after SRAT is parsed so that it

From patchwork Mon Jan 27 23:22:02 2025
Date: Mon, 27 Jan 2025 23:22:02 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-23-fvdl@google.com>
Subject: [PATCH 22/27] x86/mm: set ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Now that hugetlb bootmem pages are allocated earlier, and are available
for section preinit (HVO-style), set ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT
for x86_64, so that it can be done. This enables pre-HVO on x86_64.

Signed-off-by: Frank van der Linden
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 87198d957e2f..ccef99c0a2ba 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -146,6 +146,7 @@ config X86
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP if X86_64
+	select ARCH_WANT_SPARSEMEM_VMEMMAP_PREINIT if X86_64
 	select ARCH_WANTS_THP_SWAP if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
 	select BUILDTIME_TABLE_SORT

From patchwork Mon Jan 27 23:22:03 2025
Date: Mon, 27 Jan 2025 23:22:03 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-24-fvdl@google.com>
Subject: [PATCH 23/27] mm/cma: simplify zone intersection check
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

cma_activate_area walks all pages in the area, checking their zone
individually to see if the area resides in more than one zone. Make
this a little more efficient by using the recently introduced
pfn_range_intersects_zones() function.
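The two approaches can be contrasted in a small user-space sketch. The
zone layout is modeled as an array of [start_pfn, end_pfn) spans; both
helpers are hypothetical stand-ins written in the spirit of the old
loop and of pfn_range_intersects_zones(), not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

struct zone_span {
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/* Helper for the old approach: look up the zone of a single pfn. */
static int pfn_zone(const struct zone_span *zones, int nr_zones,
		    unsigned long pfn)
{
	for (int i = 0; i < nr_zones; i++)
		if (pfn >= zones[i].start_pfn && pfn < zones[i].end_pfn)
			return i;
	return -1;
}

/* Old approach: compare the zone of every pfn in the range, one by one. */
static bool range_spans_zones_slow(const struct zone_span *zones, int nr_zones,
				   unsigned long base_pfn, unsigned long count)
{
	int zone = pfn_zone(zones, nr_zones, base_pfn);

	for (unsigned long pfn = base_pfn + 1; pfn < base_pfn + count; pfn++)
		if (pfn_zone(zones, nr_zones, pfn) != zone)
			return true;
	return false;
}

/*
 * New approach: a range crosses a zone boundary iff no single zone
 * span contains it entirely, so only the zone bounds are consulted.
 */
static bool range_spans_zones_fast(const struct zone_span *zones, int nr_zones,
				   unsigned long base_pfn, unsigned long count)
{
	for (int i = 0; i < nr_zones; i++)
		if (base_pfn >= zones[i].start_pfn &&
		    base_pfn + count <= zones[i].end_pfn)
			return false;
	return true;
}
```

The slow variant is O(pages in the range); the fast variant is O(zones
on the node), which is what makes the check cheap for large CMA areas.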
Store the NUMA node id (if any) in the cma structure to facilitate this.

Signed-off-by: Frank van der Linden
---
 mm/cma.c | 13 ++++++-------
 mm/cma.h |  2 ++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 1704d5be6a07..6ad631c9fdca 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -103,7 +103,6 @@ static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long pfn, base_pfn;
 	int allocrange, r;
-	struct zone *zone;
 	struct cma_memrange *cmr;
 
 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
@@ -124,12 +123,8 @@ static void __init cma_activate_area(struct cma *cma)
 		 * CMA resv range to be in the same zone.
 		 */
 		WARN_ON_ONCE(!pfn_valid(base_pfn));
-		zone = page_zone(pfn_to_page(base_pfn));
-		for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) {
-			WARN_ON_ONCE(!pfn_valid(pfn));
-			if (page_zone(pfn_to_page(pfn)) != zone)
-				goto cleanup;
-		}
+		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
+			goto cleanup;
 
 		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
@@ -261,6 +256,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	cma->ranges[0].base_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
+	cma->nid = NUMA_NO_NODE;
 
 	*res_cma = cma;
 
@@ -497,6 +493,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 	}
 
 	cma->nranges = nr;
+	cma->nid = nid;
 
 	*res_cma = cma;
 out:
@@ -684,6 +681,8 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t base,
 	if (ret)
 		memblock_phys_free(base, size);
 
+	(*res_cma)->nid = nid;
+
 	return ret;
 }
diff --git a/mm/cma.h b/mm/cma.h
index 601af7cdb495..b70a6c763f7d 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -48,6 +48,8 @@ struct cma {
 	struct cma_kobject *cma_kobj;
 #endif
 	bool reserve_pages_on_error;
+	/* NUMA node (NUMA_NO_NODE if unspecified) */
+	int nid;
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];

From patchwork Mon Jan 27 23:22:04 2025
Date: Mon, 27 Jan 2025 23:22:04 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-25-fvdl@google.com>
Subject: [PATCH 24/27] mm/cma: introduce a cma validate function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, Frank van der Linden

Define a function to check if a CMA area is valid, meaning that its
ranges do not cross any zone boundaries. Store the result in the newly
created flags for each CMA area, so that multiple calls are dealt with.
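The memoization pattern described here, one "valid" and one "invalid"
bit so that repeated calls return the cached verdict and the expensive
check runs only once, can be sketched in user space (names and types
are illustrative, not the kernel's):

```c
#include <assert.h>
#include <stdbool.h>

#define AREA_ZONES_VALID	(1UL << 0)
#define AREA_ZONES_INVALID	(1UL << 1)

struct area {
	unsigned long flags;
	bool crosses_zones;	/* stand-in for the real range checks */
};

static int nr_checks;	/* counts how often the expensive path runs */

static bool area_validate_zones(struct area *a)
{
	/* Either bit set means the check already ran: return cached result. */
	if (a->flags & AREA_ZONES_VALID)
		return true;
	if (a->flags & AREA_ZONES_INVALID)
		return false;

	/* Neither bit set: do the expensive check and cache its verdict. */
	nr_checks++;
	if (a->crosses_zones) {
		a->flags |= AREA_ZONES_INVALID;
		return false;
	}
	a->flags |= AREA_ZONES_VALID;
	return true;
}
```

Two bits are needed rather than one because the cache must distinguish
three states: not yet checked, checked and valid, checked and invalid.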
This allows for checking the validity of a CMA area early, which is
needed later in order to be able to allocate hugetlb bootmem pages from
it with pre-HVO.

Signed-off-by: Frank van der Linden
---
 include/linux/cma.h |  5 ++++
 mm/cma.c            | 60 ++++++++++++++++++++++++++++++++++++---------
 mm/cma.h            |  8 +++++-
 3 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 03d85c100dcc..62d9c1cf6326 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -60,6 +60,7 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
 #ifdef CONFIG_CMA
 struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 bool cma_free_folio(struct cma *cma, const struct folio *folio);
+bool cma_validate_zones(struct cma *cma);
 #else
 static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 {
@@ -70,6 +71,10 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
 {
 	return false;
 }
+static inline bool cma_validate_zones(struct cma *cma)
+{
+	return false;
+}
 #endif

 #endif
diff --git a/mm/cma.c b/mm/cma.c
index 6ad631c9fdca..41248dee7197 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -99,6 +99,49 @@ static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
 	spin_unlock_irqrestore(&cma->lock, flags);
 }

+/*
+ * Check if a CMA area contains no ranges that intersect with
+ * multiple zones. Store the result in the flags in case
+ * this gets called more than once.
+ */
+bool cma_validate_zones(struct cma *cma)
+{
+	int r;
+	unsigned long base_pfn;
+	struct cma_memrange *cmr;
+	bool valid_bit_set;
+
+	/*
+	 * If already validated, return result of previous check.
+	 * Either the valid or invalid bit will be set if this
+	 * check has already been done. If neither is set, the
+	 * check has not been performed yet.
+	 */
+	valid_bit_set = test_bit(CMA_ZONES_VALID, &cma->flags);
+	if (valid_bit_set || test_bit(CMA_ZONES_INVALID, &cma->flags))
+		return valid_bit_set;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		base_pfn = cmr->base_pfn;
+
+		/*
+		 * alloc_contig_range() requires the pfn range specified
+		 * to be in the same zone. Simplify by forcing the entire
+		 * CMA resv range to be in the same zone.
+		 */
+		WARN_ON_ONCE(!pfn_valid(base_pfn));
+		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count)) {
+			set_bit(CMA_ZONES_INVALID, &cma->flags);
+			return false;
+		}
+	}
+
+	set_bit(CMA_ZONES_VALID, &cma->flags);
+
+	return true;
+}
+
 static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long pfn, base_pfn;
@@ -113,19 +156,12 @@ static void __init cma_activate_area(struct cma *cma)
 			goto cleanup;
 	}

+	if (!cma_validate_zones(cma))
+		goto cleanup;
+
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
 		base_pfn = cmr->base_pfn;
-
-		/*
-		 * alloc_contig_range() requires the pfn range specified
-		 * to be in the same zone. Simplify by forcing the entire
-		 * CMA resv range to be in the same zone.
-		 */
-		WARN_ON_ONCE(!pfn_valid(base_pfn));
-		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
-			goto cleanup;
-
 		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
 			init_cma_reserved_pageblock(pfn_to_page(pfn));
@@ -145,7 +181,7 @@ static void __init cma_activate_area(struct cma *cma)
 		bitmap_free(cma->ranges[r].bitmap);

 	/* Expose all pages to the buddy, they are useless for CMA.
	 */
-	if (!cma->reserve_pages_on_error) {
+	if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
 		for (r = 0; r < allocrange; r++) {
 			cmr = &cma->ranges[r];
 			for (pfn = cmr->base_pfn;
@@ -172,7 +208,7 @@ core_initcall(cma_init_reserved_areas);

 void __init cma_reserve_pages_on_error(struct cma *cma)
 {
-	cma->reserve_pages_on_error = true;
+	set_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags);
 }

 static int __init cma_new_area(const char *name, phys_addr_t size,
diff --git a/mm/cma.h b/mm/cma.h
index b70a6c763f7d..0a1f8f8abe08 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -47,11 +47,17 @@ struct cma {
 	/* kobject requires dynamic object */
 	struct cma_kobject *cma_kobj;
 #endif
-	bool reserve_pages_on_error;
+	unsigned long flags;
 	/* NUMA node (NUMA_NO_NODE if unspecified) */
 	int nid;
 };

+enum cma_flags {
+	CMA_RESERVE_PAGES_ON_ERROR,
+	CMA_ZONES_VALID,
+	CMA_ZONES_INVALID,
+};
+
 extern struct cma cma_areas[MAX_CMA_AREAS];
 extern unsigned int cma_area_count;

From patchwork Mon Jan 27 23:22:05 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951866
Date: Mon, 27 Jan 2025 23:22:05 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-26-fvdl@google.com>
Subject: [PATCH 25/27] mm/cma: introduce interface for early reservations
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
It can be desirable to reserve memory in a CMA area before it is
activated, early in boot. Such reservations would effectively be
memblock allocations, but they can be returned to the CMA area later.
This functionality can be used to allow hugetlb bootmem allocations
from a hugetlb CMA area.

A new interface, cma_reserve_early(), is introduced. It allows for
pageblock-aligned reservations. These reservations are skipped during
the initial handoff of pages in a CMA area to the buddy allocator. The
caller is responsible for making sure that the page structures are set
up, and that the migrate type is set correctly, as with other memblock
allocations that stick around.

If the CMA area fails to activate (because it intersects with multiple
zones), the reserved memory is not given to the buddy allocator; the
caller needs to take care of that.
Signed-off-by: Frank van der Linden
---
 mm/cma.c      | 83 +++++++++++++++++++++++++++++++++++++++++++++-----
 mm/cma.h      |  8 +++++
 mm/internal.h | 16 ++++++++++
 mm/mm_init.c  |  9 ++++++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 41248dee7197..1c0a01d02a28 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -144,9 +144,10 @@ bool cma_validate_zones(struct cma *cma)

 static void __init cma_activate_area(struct cma *cma)
 {
-	unsigned long pfn, base_pfn;
+	unsigned long pfn, end_pfn;
 	int allocrange, r;
 	struct cma_memrange *cmr;
+	unsigned long bitmap_count, count;

 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
 		cmr = &cma->ranges[allocrange];
@@ -161,8 +162,13 @@ static void __init cma_activate_area(struct cma *cma)

 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		base_pfn = cmr->base_pfn;
-		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+		if (cmr->early_pfn != cmr->base_pfn) {
+			count = cmr->early_pfn - cmr->base_pfn;
+			bitmap_count = cma_bitmap_pages_to_bits(cma, count);
+			bitmap_set(cmr->bitmap, 0, bitmap_count);
+		}
+
+		for (pfn = cmr->early_pfn; pfn < cmr->base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
 			init_cma_reserved_pageblock(pfn_to_page(pfn));
 	}
@@ -173,6 +179,7 @@ static void __init cma_activate_area(struct cma *cma)
 	INIT_HLIST_HEAD(&cma->mem_head);
 	spin_lock_init(&cma->mem_head_lock);
 #endif
+	set_bit(CMA_ACTIVATED, &cma->flags);

 	return;

@@ -184,9 +191,8 @@ static void __init cma_activate_area(struct cma *cma)
 	if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
 		for (r = 0; r < allocrange; r++) {
 			cmr = &cma->ranges[r];
-			for (pfn = cmr->base_pfn;
-			     pfn < cmr->base_pfn + cmr->count;
-			     pfn++)
+			end_pfn = cmr->base_pfn + cmr->count;
+			for (pfn = cmr->early_pfn; pfn < end_pfn; pfn++)
 				free_reserved_page(pfn_to_page(pfn));
 		}
 	}
@@ -290,6 +296,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		return ret;

 	cma->ranges[0].base_pfn = PFN_DOWN(base);
+	cma->ranges[0].early_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
 	cma->nid = NUMA_NO_NODE;
@@ -509,6 +516,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 			nr, (u64)mlp->base, (u64)mlp->base + size);

 		cmrp = &cma->ranges[nr++];
 		cmrp->base_pfn = PHYS_PFN(mlp->base);
+		cmrp->early_pfn = cmrp->base_pfn;
 		cmrp->count = size >> PAGE_SHIFT;

 		sizeleft -= size;
@@ -540,7 +548,6 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,

 	pr_info("Reserved %lu MiB in %d range%s\n",
 		(unsigned long)total_size / SZ_1M, nr,
 		nr > 1 ? "s" : "");
-
 	return ret;
 }

@@ -1044,3 +1051,65 @@ bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)

 	return false;
 }
+
+/*
+ * Very basic function to reserve memory from a CMA area that has not
+ * yet been activated. This is expected to be called early, when the
+ * system is single-threaded, so there is no locking. The alignment
+ * checking is restrictive - only pageblock-aligned areas
+ * (CMA_MIN_ALIGNMENT_BYTES) may be reserved through this function.
+ * This keeps things simple, and is enough for the current use case.
+ *
+ * The CMA bitmaps have not yet been allocated, so just start
+ * reserving from the bottom up, using a PFN to keep track
+ * of what has been reserved. Unreserving is not possible.
+ *
+ * The caller is responsible for initializing the page structures
+ * in the area properly, since this just points to memblock-allocated
+ * memory. The caller should subsequently use init_cma_pageblock to
+ * set the migrate type and CMA stats for the pageblocks that were
+ * reserved.
+ *
+ * If the CMA area fails to activate later, memory obtained through
+ * this interface is not handed to the page allocator; this is
+ * the responsibility of the caller (e.g. like normal memblock-allocated
+ * memory).
+ */
+void __init *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	int r;
+	struct cma_memrange *cmr;
+	unsigned long available;
+	void *ret = NULL;
+
+	if (!cma->count)
+		return NULL;
+	/*
+	 * Can only be called early in init.
+	 */
+	if (test_bit(CMA_ACTIVATED, &cma->flags))
+		return NULL;
+
+	if (!IS_ALIGNED(size, CMA_MIN_ALIGNMENT_BYTES))
+		return NULL;
+
+	if (!IS_ALIGNED(size, (PAGE_SIZE << cma->order_per_bit)))
+		return NULL;
+
+	size >>= PAGE_SHIFT;
+
+	if (size > cma->available_count)
+		return NULL;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		available = cmr->count - (cmr->early_pfn - cmr->base_pfn);
+		if (size <= available) {
+			ret = phys_to_virt(PFN_PHYS(cmr->early_pfn));
+			cmr->early_pfn += size;
+			cma->available_count -= size;
+			return ret;
+		}
+	}
+
+	return ret;
+}
diff --git a/mm/cma.h b/mm/cma.h
index 0a1f8f8abe08..93fc76cc6068 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -16,9 +16,16 @@ struct cma_kobject {
  * and the total amount of memory requested, while smaller than the total
  * amount of memory available, is large enough that it doesn't fit in a
  * single physical memory range because of memory holes.
+ *
+ * Fields:
+ *   @base_pfn:  physical address of range
+ *   @early_pfn: first PFN not reserved through cma_reserve_early
+ *   @count:     size of range
+ *   @bitmap:    bitmap of allocated (1 << order_per_bit)-sized chunks.
  */
 struct cma_memrange {
 	unsigned long base_pfn;
+	unsigned long early_pfn;
 	unsigned long count;
 	unsigned long *bitmap;
 };
@@ -56,6 +63,7 @@ enum cma_flags {
 	CMA_RESERVE_PAGES_ON_ERROR,
 	CMA_ZONES_VALID,
 	CMA_ZONES_INVALID,
+	CMA_ACTIVATED,
 };

 extern struct cma cma_areas[MAX_CMA_AREAS];
diff --git a/mm/internal.h b/mm/internal.h
index 63fda9bb9426..8318c8e6e589 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -848,6 +848,22 @@ void init_cma_reserved_pageblock(struct page *page);

 #endif /* CONFIG_COMPACTION || CONFIG_CMA */

+struct cma;
+
+#ifdef CONFIG_CMA
+void *cma_reserve_early(struct cma *cma, unsigned long size);
+void init_cma_pageblock(struct page *page);
+#else
+static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	return NULL;
+}
+static inline void init_cma_pageblock(struct page *page)
+{
+}
+#endif
+
+
 int find_suitable_fallback(struct free_area *area, unsigned int order,
 			   int migratetype, bool only_stealable, bool *can_steal);

diff --git a/mm/mm_init.c b/mm/mm_init.c
index f7d5b4fe1ae9..f31260fd393e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,6 +2263,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
 	adjust_managed_page_count(page, pageblock_nr_pages);
 	page_zone(page)->cma_pages += pageblock_nr_pages;
 }
+/*
+ * Similar to above, but only set the migrate type and stats.
+ */
+void __init init_cma_pageblock(struct page *page)
+{
+	set_pageblock_migratetype(page, MIGRATE_CMA);
+	adjust_managed_page_count(page, pageblock_nr_pages);
+	page_zone(page)->cma_pages += pageblock_nr_pages;
+}
 #endif

 void set_zone_contiguous(struct zone *zone)

From patchwork Mon Jan 27 23:22:06 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951867
Date: Mon, 27 Jan 2025 23:22:06 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-27-fvdl@google.com>
Subject: [PATCH 26/27] mm/hugetlb: add hugetlb_cma_only cmdline option
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden
Add an option to force hugetlb gigantic pages to be allocated using CMA
only (if hugetlb_cma is enabled). This avoids a fallback to allocation
from the rest of system memory if the CMA allocation fails. This makes
the size of hugetlb_cma a hard upper boundary for gigantic hugetlb page
allocations.

This is useful because, with a large CMA area, the kernel's unmovable
allocations will have less room to work with, and it is undesirable for
new hugetlb gigantic page allocations to be done from that remaining
area: it will eat into the space available for unmovable allocations,
leading to unwanted system behavior (OOMs because the kernel fails to
do unmovable allocations).

So, with this enabled, an administrator can force a hard upper bound
for runtime gigantic page allocations, and have more predictable system
behavior.

Signed-off-by: Frank van der Linden
---
 Documentation/admin-guide/kernel-parameters.txt |  7 +++++++
 mm/hugetlb.c                                    | 11 +++++++++++
 2 files changed, 18 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d0f6c055dfcc..6a164466ec33 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1892,6 +1892,13 @@
 			hugepages using the CMA allocator. If enabled, the
 			boot-time allocation of gigantic hugepages is skipped.

+	hugetlb_cma_only=
+			[HW,CMA,EARLY] When allocating new HugeTLB pages, only
+			try to allocate from the CMA areas.
+
+			This option does nothing if hugetlb_cma= is not also
+			specified.
+
 	hugetlb_free_vmemmap=
 			[KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 			enabled.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 28653214f23d..32ebde9039e2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -60,6 +60,7 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 static struct cma *hugetlb_cma[MAX_NUMNODES];
 static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 #endif
+static bool hugetlb_cma_only;
 static unsigned long hugetlb_cma_size __initdata;

 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
@@ -1511,6 +1512,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 	}
 #endif
 	if (!folio) {
+		if (hugetlb_cma_size && hugetlb_cma_only)
+			return NULL;
+
 		folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
 		if (!folio)
 			return NULL;
@@ -7844,6 +7848,13 @@ static int __init cmdline_parse_hugetlb_cma(char *p)

 early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);

+static int __init cmdline_parse_hugetlb_cma_only(char *p)
+{
+	return kstrtobool(p, &hugetlb_cma_only);
+}
+
+early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
+
 void __init hugetlb_cma_reserve(int order)
 {
 	unsigned long size, reserved, per_node;

From patchwork Mon Jan 27 23:22:07 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13951868
in-reply-to:in-reply-to:references:references:dkim-signature; bh=Gm3hQSpmhBarlZRgBgM0zsWbfOG8zObqZGHewKZCOZk=; b=5DsyDc0n8X76keKjEkLG4YAZ7BaCS/hOm5QUGnj3TPFcOMDvXHoJl9m7YhiGwZQGoIECCw 4QPmKKs0VORhXvakKZjozj3oZTEa5c9lYDU+eJa8vxbWiYOr5Zs6d2r/6bl4Gc0PprEoTB AtiAR2Dy8H0x5P0nEPutNJEwbuiz7hE= Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-2f780a3d6e5so10078072a91.0 for ; Mon, 27 Jan 2025 15:23:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1738020186; x=1738624986; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Gm3hQSpmhBarlZRgBgM0zsWbfOG8zObqZGHewKZCOZk=; b=rQEZ0nwwTphehx9wu48RgkDDPvD5SFZ9ciLIDpbPHGZc6Bj+YvOOrzYAnt2V/aVfYD 2+Rg3ffo1Vw4knjQXX6yCKDkBwZ2WXOyLRo9/yyz0pGst48KJ4pzDcYZCDYcCpWbOQRk i7m9XfH0vAFxX7XrXYLuM5ly6PZX0/Xg1iMGns9ab6Yk4b3QijDKFQ9Rh2bpT4g/Uju3 Pb+FMpq0xTOACVHY/sT9yChsGqVmZlbtY89ehqlvdvFXCO7SwSw40FcXLPbtrBuWB+vG d1aw8bv/FZKl/6GVnEf6CoaKloBwdS6Xpv0iwkuQ1bVqKVHjDGzRw8D+wtJ5bSf+emPM LNcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1738020186; x=1738624986; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Gm3hQSpmhBarlZRgBgM0zsWbfOG8zObqZGHewKZCOZk=; b=b528bLEfwreYi382VO/mpI+s0GL1Qf3bB47JCxII47SkuYzEjKbX/BIYuB07XgKbDB k5OsFy9n580wsispah5VGBVBsnmUTraHiAe13YmJZLlQdnuY2eyv2GnWqQZgDzblwUsy sWXb/9h3ZBIjPeSgClFBbWQPawTvhhgv81oc+Psjv272wFUCHe3Z+r4L2pz2tDTzVWmH 8mrcc0wElIwPN7RqH9BScbw1PHOe+lbsf3YYtQXq3GBLVg8sRAOVDzjeNLd4gfmQNqER Xtzrtp3XvNE7Nx0PLVGxMZjLcih4t7tBv2MNG5aZ3jAHoKdgQiGkNhJOHyOqMJIaoZiH vAyQ== X-Forwarded-Encrypted: i=1; AJvYcCXxc+eQEnV0/E0w/esSJRKPRZZsFUMIBkA87uPnoBnxCX9JdXvQjGgLf6vD2Vz/pGroNjyUdFaM+w==@kvack.org X-Gm-Message-State: AOJu0YxVlYnNY3MYfJZAk1P+K6JaP/3HLtLBeKxY9yxE6bGF5BPEJ6kq 0TEB3PQG9ZGSoU7FMyplV5rKK21ZxJAH/Ly6wzP+7Vs0ne2io/PJ4wM55Qo/3Wb0he+vmQ== 
Date: Mon, 27 Jan 2025 23:22:07 +0000
In-Reply-To: <20250127232207.3888640-1-fvdl@google.com>
References: <20250127232207.3888640-1-fvdl@google.com>
Message-ID: <20250127232207.3888640-28-fvdl@google.com>
Subject: [PATCH 27/27] mm/hugetlb: enable bootmem allocation from CMA areas
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usama.arif@bytedance.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden, Madhavan Srinivasan,
 Michael Ellerman, linuxppc-dev@lists.ozlabs.org
If hugetlb_cma_only is enabled, we know that hugetlb pages can only be
allocated from CMA.
Now that there is an interface to do early reservations from a CMA area
(returning memblock memory), it can be used to allocate hugetlb pages
from CMA.

This also allows for doing pre-HVO on these pages (if enabled).

Make sure to initialize the page structures and associated data
correctly. Create a flag to signal that a hugetlb page has been
allocated from CMA to make things a little easier.

Some configurations of powerpc have a special hugetlb bootmem
allocator, so introduce a boolean arch_specific_huge_bootmem_alloc that
returns true if such an allocator is present. In that case, CMA bootmem
allocations can't be used, so check that function before trying.

Cc: Madhavan Srinivasan
Cc: Michael Ellerman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/hugetlbpage.c |   5 ++
 include/linux/hugetlb.h       |   7 ++
 mm/hugetlb.c                  | 135 +++++++++++++++++++++++++---------
 3 files changed, 114 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index d3c1b749dcfc..e53e4b4c8ef6 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -121,6 +121,11 @@ bool __init hugetlb_node_alloc_supported(void)
 {
 	return false;
 }
+
+bool __init arch_specific_huge_bootmem_alloc(struct hstate *h)
+{
+	return (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled());
+}
 #endif

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2512463bca49..bca3052fb175 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -591,6 +591,7 @@ enum hugetlb_page_flags {
 	HPG_freed,
 	HPG_vmemmap_optimized,
 	HPG_raw_hwp_unreliable,
+	HPG_cma,
 	__NR_HPAGEFLAGS,
 };
@@ -650,6 +651,7 @@ HPAGEFLAG(Temporary, temporary)
 HPAGEFLAG(Freed, freed)
 HPAGEFLAG(VmemmapOptimized, vmemmap_optimized)
 HPAGEFLAG(RawHwpUnreliable, raw_hwp_unreliable)
+HPAGEFLAG(Cma, cma)
 
 #ifdef CONFIG_HUGETLB_PAGE
@@ -678,14 +680,18 @@ struct hstate {
 	char name[HSTATE_NAME_LEN];
 };
 
+struct cma;
+
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
 	unsigned long flags;
+	struct cma *cma;
 };
 
 #define HUGE_BOOTMEM_HVO		0x0001
 #define HUGE_BOOTMEM_ZONES_VALID	0x0002
+#define HUGE_BOOTMEM_CMA		0x0004
 
 bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 
@@ -711,6 +717,7 @@ bool __init hugetlb_node_alloc_supported(void);
 void __init hugetlb_add_hstate(unsigned order);
 bool __init arch_hugetlb_valid_size(unsigned long size);
+bool __init arch_specific_huge_bootmem_alloc(struct hstate *h);
 struct hstate *size_to_hstate(unsigned long size);
 
 #ifndef HUGE_MAX_HSTATE
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 32ebde9039e2..183e8d0c2fb4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -61,7 +61,7 @@ static struct cma *hugetlb_cma[MAX_NUMNODES];
 static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 #endif
 static bool hugetlb_cma_only;
-static unsigned long hugetlb_cma_size __initdata;
+static unsigned long hugetlb_cma_size;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 __initdata unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE];
@@ -132,8 +132,10 @@ static void hugetlb_free_folio(struct folio *folio)
 #ifdef CONFIG_CMA
 	int nid = folio_nid(folio);
 
-	if (cma_free_folio(hugetlb_cma[nid], folio))
+	if (folio_test_hugetlb_cma(folio)) {
+		WARN_ON(!cma_free_folio(hugetlb_cma[nid], folio));
 		return;
+	}
 #endif
 	folio_put(folio);
 }
@@ -1509,6 +1511,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
 				break;
 			}
 		}
+
+		if (folio)
+			folio_set_hugetlb_cma(folio);
 	}
 #endif
 	if (!folio) {
@@ -3175,6 +3180,63 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 	return ERR_PTR(-ENOSPC);
 }
 
+/*
+ * Some architectures do their own bootmem allocation, so they can't use
+ * early CMA allocation. So, allow for this function to be redefined.
+ */
+bool __init __attribute((weak))
+arch_specific_huge_bootmem_alloc(struct hstate *h)
+{
+	return false;
+}
+
+static bool __init hugetlb_early_cma(struct hstate *h)
+{
+	if (arch_specific_huge_bootmem_alloc(h))
+		return false;
+
+	return (hstate_is_gigantic(h) && hugetlb_cma_size && hugetlb_cma_only);
+}
+
+static __init void *alloc_bootmem(struct hstate *h, int nid)
+{
+	struct huge_bootmem_page *m;
+	unsigned long flags;
+	struct cma *cma;
+
+#ifdef CONFIG_CMA
+	if (hugetlb_early_cma(h)) {
+		flags = HUGE_BOOTMEM_CMA;
+		cma = hugetlb_cma[nid];
+		m = cma_reserve_early(cma, huge_page_size(h));
+	} else
+#endif
+	{
+		flags = 0;
+		cma = NULL;
+		m = memblock_alloc_try_nid_raw(huge_page_size(h),
+			huge_page_size(h), 0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+	}
+
+	if (m) {
+		/*
+		 * Use the beginning of the huge page to store the
+		 * huge_bootmem_page struct (until gather_bootmem
+		 * puts them into the mem_map).
+		 *
+		 * Put them into a private list first because mem_map
+		 * is not up yet.
+		 */
+		INIT_LIST_HEAD(&m->list);
+		list_add(&m->list, &huge_boot_pages[nid]);
+		m->hstate = h;
+		m->flags = flags;
+		m->cma = cma;
+	}
+
+	return m;
+}
+
 int alloc_bootmem_huge_page(struct hstate *h, int nid)
 	__attribute__ ((weak, alias("__alloc_bootmem_huge_page")));
 int __alloc_bootmem_huge_page(struct hstate *h, int nid)
@@ -3184,17 +3246,14 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 
 	/* do node specific alloc */
 	if (nid != NUMA_NO_NODE) {
-		m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+		m = alloc_bootmem(h, node);
 		if (!m)
 			return 0;
 		goto found;
 	}
 
 	/* allocate from next node when distributing huge pages */
 	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_ONLINE]) {
-		m = memblock_alloc_try_nid_raw(
-				huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
+		m = alloc_bootmem(h, node);
 		if (m)
 			break;
 	}
@@ -3203,7 +3262,6 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		return 0;
 
 found:
-
 	/*
 	 * Only initialize the head struct page in memmap_init_reserved_pages,
 	 * rest of the struct pages will be initialized by the HugeTLB
@@ -3213,18 +3271,6 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	 */
 	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
 			huge_page_size(h) - PAGE_SIZE);
-	/*
-	 * Use the beginning of the huge page to store the
-	 * huge_bootmem_page struct (until gather_bootmem
-	 * puts them into the mem_map).
-	 *
-	 * Put them into a private list first because mem_map
-	 * is not up yet.
-	 */
-	INIT_LIST_HEAD(&m->list);
-	list_add(&m->list, &huge_boot_pages[node]);
-	m->hstate = h;
-	m->flags = 0;
 
 	return 1;
 }
@@ -3265,13 +3311,25 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
+static bool __init hugetlb_bootmem_page_earlycma(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_CMA);
+}
+
 /*
  * memblock-allocated pageblocks might not have the migrate type set
  * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABLE)
- * here.
+ * here, or MIGRATE_CMA if this was a page allocated through an early CMA
+ * reservation.
  *
- * Note that this will not write the page struct, it is ok (and necessary)
- * to do this on vmemmap optimized folios.
+ * In case of vmemmap optimized folios, the tail vmemmap pages are mapped
+ * read-only, but that's ok - for sparse vmemmap this does not write to
+ * the page structure.
  */
 static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
 		struct hstate *h)
@@ -3280,9 +3338,13 @@ static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
 
 	WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
 
-	for (i = 0; i < nr_pages; i += pageblock_nr_pages)
-		set_pageblock_migratetype(folio_page(folio, i),
+	for (i = 0; i < nr_pages; i += pageblock_nr_pages) {
+		if (folio_test_hugetlb_cma(folio))
+			init_cma_pageblock(folio_page(folio, i));
+		else
+			set_pageblock_migratetype(folio_page(folio, i),
 					MIGRATE_MOVABLE);
+	}
 }
 
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
@@ -3319,7 +3381,7 @@ bool __init hugetlb_bootmem_page_zones_valid(int nid,
 		struct huge_bootmem_page *m)
 {
 	unsigned long start_pfn;
-	bool valid;
+	bool valid = false;
 
 	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
 		/*
@@ -3328,10 +3390,16 @@ bool __init hugetlb_bootmem_page_zones_valid(int nid,
 		return true;
 	}
 
+	if (hugetlb_bootmem_page_earlycma(m)) {
+		valid = cma_validate_zones(m->cma);
+		goto out;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
 			pages_per_huge_page(m->hstate));
+out:
 	if (!valid)
 		hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
 
@@ -3360,11 +3428,6 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
-static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
-{
-	return (m->flags & HUGE_BOOTMEM_HVO);
-}
-
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3414,6 +3477,9 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		 */
 		folio_set_hugetlb_vmemmap_optimized(folio);
 
+		if (hugetlb_bootmem_page_earlycma(m))
+			folio_set_hugetlb_cma(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
@@ -3606,8 +3672,11 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long allocated;
 
-	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
-	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
+	/*
+	 * Skip gigantic hugepages allocation if early CMA
+	 * reservations are not available.
+	 */
+	if (hstate_is_gigantic(h) && hugetlb_cma_size && !hugetlb_early_cma(h)) {
 		pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
 		return;
 	}