From patchwork Thu Feb 6 18:51:05 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13963572
Date: Thu, 6 Feb 2025 18:51:05 +0000
In-Reply-To: <20250206185109.1210657-1-fvdl@google.com>
References: <20250206185109.1210657-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.502.g6dc24dfdaf-goog
Message-ID: <20250206185109.1210657-26-fvdl@google.com>
Subject: [PATCH v3 25/28] mm/cma: introduce interface for early reservations
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, Frank van der Linden

It can be desirable to reserve memory in a CMA area before it is
activated, early in boot. Such reservations would effectively be
memblock allocations, but they can be returned to the CMA area later.
This functionality can be used to allow hugetlb bootmem allocations
from a hugetlb CMA area.

A new interface, cma_reserve_early, is introduced. This allows for
pageblock-aligned reservations. These reservations are skipped during
the initial handoff of pages in a CMA area to the buddy allocator. The
caller is responsible for making sure that the page structures are set
up, and that the migrate type is set correctly, as with other memblock
allocations that stick around.

If the CMA area fails to activate (because it intersects with multiple
zones), the reserved memory is not given to the buddy allocator; the
caller needs to take care of that.

Signed-off-by: Frank van der Linden
---
 mm/cma.c      | 83 ++++++++++++++++++++++++++++++++++++++++++++++-----
 mm/cma.h      |  8 +++++
 mm/internal.h | 16 ++++++++++
 mm/mm_init.c  |  9 ++++++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 4388d941d381..34a4df29af72 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -144,9 +144,10 @@ bool cma_validate_zones(struct cma *cma)
 
 static void __init cma_activate_area(struct cma *cma)
 {
-	unsigned long pfn, base_pfn;
+	unsigned long pfn, end_pfn;
 	int allocrange, r;
 	struct cma_memrange *cmr;
+	unsigned long bitmap_count, count;
 
 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
 		cmr = &cma->ranges[allocrange];
@@ -161,8 +162,13 @@ static void __init cma_activate_area(struct cma *cma)
 
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		base_pfn = cmr->base_pfn;
-		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+		if (cmr->early_pfn != cmr->base_pfn) {
+			count = cmr->early_pfn - cmr->base_pfn;
+			bitmap_count = cma_bitmap_pages_to_bits(cma, count);
+			bitmap_set(cmr->bitmap, 0, bitmap_count);
+		}
+
+		for (pfn = cmr->early_pfn; pfn < cmr->base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
 			init_cma_reserved_pageblock(pfn_to_page(pfn));
 	}
@@ -173,6 +179,7 @@ static void __init cma_activate_area(struct cma *cma)
 	INIT_HLIST_HEAD(&cma->mem_head);
 	spin_lock_init(&cma->mem_head_lock);
 #endif
+	set_bit(CMA_ACTIVATED, &cma->flags);
 
 	return;
 
@@ -184,9 +191,8 @@ static void __init cma_activate_area(struct cma *cma)
 	if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
 		for (r = 0; r < allocrange; r++) {
 			cmr = &cma->ranges[r];
-			for (pfn = cmr->base_pfn;
-			     pfn < cmr->base_pfn + cmr->count;
-			     pfn++)
+			end_pfn = cmr->base_pfn + cmr->count;
+			for (pfn = cmr->early_pfn; pfn < end_pfn; pfn++)
 				free_reserved_page(pfn_to_page(pfn));
 		}
 	}
@@ -290,6 +296,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 		return ret;
 
 	cma->ranges[0].base_pfn = PFN_DOWN(base);
+	cma->ranges[0].early_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
 	cma->nid = NUMA_NO_NODE;
@@ -509,6 +516,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 			 nr, (u64)mlp->base, (u64)mlp->base + size);
 		cmrp = &cma->ranges[nr++];
 		cmrp->base_pfn = PHYS_PFN(mlp->base);
+		cmrp->early_pfn = cmrp->base_pfn;
 		cmrp->count = size >> PAGE_SHIFT;
 
 		sizeleft -= size;
@@ -540,7 +548,6 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
 
 	pr_info("Reserved %lu MiB in %d range%s\n",
 		(unsigned long)total_size / SZ_1M, nr,
 		nr > 1 ? "s" : "");
-
 	return ret;
 }
@@ -1034,3 +1041,65 @@ bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)
 
 	return false;
 }
+
+/*
+ * Very basic function to reserve memory from a CMA area that has not
+ * yet been activated. This is expected to be called early, when the
+ * system is single-threaded, so there is no locking. The alignment
+ * checking is restrictive - only pageblock-aligned areas
+ * (CMA_MIN_ALIGNMENT_BYTES) may be reserved through this function.
+ * This keeps things simple, and is enough for the current use case.
+ *
+ * The CMA bitmaps have not yet been allocated, so just start
+ * reserving from the bottom up, using a PFN to keep track
+ * of what has been reserved. Unreserving is not possible.
+ *
+ * The caller is responsible for initializing the page structures
+ * in the area properly, since this just points to memblock-allocated
+ * memory. The caller should subsequently use init_cma_pageblock to
+ * set the migrate type and CMA stats the pageblocks that were reserved.
+ *
+ * If the CMA area fails to activate later, memory obtained through
+ * this interface is not handed to the page allocator, this is
+ * the responsibility of the caller (e.g. like normal memblock-allocated
+ * memory).
+ */
+void __init *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	int r;
+	struct cma_memrange *cmr;
+	unsigned long available;
+	void *ret = NULL;
+
+	if (!cma || !cma->count)
+		return NULL;
+	/*
+	 * Can only be called early in init.
+	 */
+	if (test_bit(CMA_ACTIVATED, &cma->flags))
+		return NULL;
+
+	if (!IS_ALIGNED(size, CMA_MIN_ALIGNMENT_BYTES))
+		return NULL;
+
+	if (!IS_ALIGNED(size, (PAGE_SIZE << cma->order_per_bit)))
+		return NULL;
+
+	size >>= PAGE_SHIFT;
+
+	if (size > cma->available_count)
+		return NULL;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+		available = cmr->count - (cmr->early_pfn - cmr->base_pfn);
+		if (size <= available) {
+			ret = phys_to_virt(PFN_PHYS(cmr->early_pfn));
+			cmr->early_pfn += size;
+			cma->available_count -= size;
+			return ret;
+		}
+	}
+
+	return ret;
+}
diff --git a/mm/cma.h b/mm/cma.h
index bddc84b3cd96..df7fc623b7a6 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -16,9 +16,16 @@ struct cma_kobject {
  * and the total amount of memory requested, while smaller than the total
  * amount of memory available, is large enough that it doesn't fit in a
  * single physical memory range because of memory holes.
+ *
+ * Fields:
+ * @base_pfn: physical address of range
+ * @early_pfn: first PFN not reserved through cma_reserve_early
+ * @count: size of range
+ * @bitmap: bitmap of allocated (1 << order_per_bit)-sized chunks.
  */
 struct cma_memrange {
 	unsigned long base_pfn;
+	unsigned long early_pfn;
 	unsigned long count;
 	unsigned long *bitmap;
 #ifdef CONFIG_CMA_DEBUGFS
@@ -58,6 +65,7 @@ enum cma_flags {
 	CMA_RESERVE_PAGES_ON_ERROR,
 	CMA_ZONES_VALID,
 	CMA_ZONES_INVALID,
+	CMA_ACTIVATED,
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];
diff --git a/mm/internal.h b/mm/internal.h
index 63fda9bb9426..8318c8e6e589 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -848,6 +848,22 @@ void init_cma_reserved_pageblock(struct page *page);
 
 #endif /* CONFIG_COMPACTION || CONFIG_CMA */
 
+struct cma;
+
+#ifdef CONFIG_CMA
+void *cma_reserve_early(struct cma *cma, unsigned long size);
+void init_cma_pageblock(struct page *page);
+#else
+static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+	return NULL;
+}
+static inline void init_cma_pageblock(struct page *page)
+{
+}
+#endif
+
+
 int find_suitable_fallback(struct free_area *area, unsigned int order,
 			   int migratetype, bool only_stealable, bool *can_steal);
 
diff --git a/mm/mm_init.c b/mm/mm_init.c
index f7d5b4fe1ae9..f31260fd393e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,6 +2263,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
 	adjust_managed_page_count(page, pageblock_nr_pages);
 	page_zone(page)->cma_pages += pageblock_nr_pages;
 }
+
+/*
+ * Similar to above, but only set the migrate type and stats.
+ */
+void __init init_cma_pageblock(struct page *page)
+{
+	set_pageblock_migratetype(page, MIGRATE_CMA);
+	adjust_managed_page_count(page, pageblock_nr_pages);
+	page_zone(page)->cma_pages += pageblock_nr_pages;
+}
 #endif
 
 void set_zone_contiguous(struct zone *zone)
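
For context, the calling pattern described in the commit message would look
roughly like the sketch below. This is illustrative only and not part of the
patch: example_reserve_from_cma() and the early_cma argument are hypothetical
stand-ins, and the include of mm's "internal.h" assumes the caller lives under
mm/, where cma_reserve_early() and init_cma_pageblock() are declared. Per the
commit message, the intended consumer is hugetlb bootmem allocation from a
hugetlb CMA area, added later in this series.

#include <linux/cma.h>
#include <linux/mm.h>

#include "internal.h"	/* cma_reserve_early(), init_cma_pageblock() */

/*
 * Hypothetical early-boot caller (sketch only, not part of this patch).
 * @early_cma: a declared, not-yet-activated CMA area.
 * @size:      requested reservation size in bytes.
 */
static void __init example_reserve_from_cma(struct cma *early_cma,
					    unsigned long size)
{
	void *addr;
	struct page *page, *end;

	/* cma_reserve_early() only accepts pageblock-aligned sizes. */
	size = ALIGN(size, CMA_MIN_ALIGNMENT_BYTES);

	/* Reserve from the not-yet-activated CMA area (memblock-backed). */
	addr = cma_reserve_early(early_cma, size);
	if (!addr)
		return;

	/*
	 * Once the page structures for this range have been set up, mark
	 * each reserved pageblock so its migrate type and the zone's CMA
	 * statistics are correct, as the comment above cma_reserve_early()
	 * requires of the caller.
	 */
	for (page = virt_to_page(addr), end = page + (size >> PAGE_SHIFT);
	     page < end; page += pageblock_nr_pages)
		init_cma_pageblock(page);
}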