
[v2] mm: page_alloc: consume available CMA space first

Message ID 20230726150705.GA1365610@cmpxchg.org (mailing list archive)
State New
Series [v2] mm: page_alloc: consume available CMA space first

Commit Message

Johannes Weiner July 26, 2023, 3:07 p.m. UTC
On a memcache setup with heavy anon usage and no swap, we routinely
see premature OOM kills with multiple gigabytes of free space left:

    Node 0 Normal free:4978632kB [...] free_cma:4893276kB

This free space turns out to be CMA. We set CMA regions aside for
potential hugetlb users on all of our machines, figuring that even if
there aren't any, the memory is available to userspace allocations.

When the OOMs trigger, it's from unmovable and reclaimable allocations
that aren't allowed to dip into CMA. The non-CMA regions meanwhile are
dominated by the anon pages.

Movable pages can be migrated out of CMA when necessary, but we don't
have a mechanism to migrate them *into* CMA to make room for unmovable
allocations. The only recourse we have for these pages is reclaim,
which due to a lack of swap is unavailable in our case.
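
The "out of CMA" direction is the path a CMA user such as hugetlb
already exercises. A minimal, hypothetical caller sketch -
grab_cma_block() is an invented name, while cma_alloc() and its use of
alloc_contig_range() are the existing kernel interfaces:

#include <linux/mm.h>
#include <linux/cma.h>

/*
 * Hypothetical caller, for illustration: cma_alloc() ends up in
 * alloc_contig_range(), which migrates movable pages currently
 * occupying the requested range out to non-CMA memory. There is no
 * analogous path that migrates movable pages *into* CMA to make room
 * for unmovable allocations elsewhere.
 */
static struct page *grab_cma_block(struct cma *area, unsigned long nr_pages)
{
        return cma_alloc(area, nr_pages, 0, false);
}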

Because we have more options for CMA pages, change the policy to
always fill up CMA first. This reduces the risk of premature OOMs.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 53 +++++++++++++++++++------------------------------
 1 file changed, 20 insertions(+), 33 deletions(-)

I realized shortly after sending the first version that the code can
be further simplified by removing __rmqueue_cma_fallback() altogether.

Build-, boot-, and runtime-tested; verified that CMA is indeed used up
first.
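
One low-tech way to watch this at runtime is to compare MemFree and
CmaFree while generating anon pressure. A minimal user-space sketch
(assumes CONFIG_CMA=y so /proc/meminfo exposes CmaFree):

/* cmawatch.c - print MemFree vs. CmaFree from /proc/meminfo.
 * With this patch applied, CmaFree should be drained well before
 * MemFree under movable (e.g. anon) allocation pressure.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
        char line[256];
        long memfree = -1, cmafree = -1;
        FILE *f = fopen("/proc/meminfo", "r");

        if (!f) {
                perror("/proc/meminfo");
                return 1;
        }
        while (fgets(line, sizeof(line), f)) {
                if (!strncmp(line, "MemFree:", 8))
                        sscanf(line + 8, "%ld", &memfree);
                else if (!strncmp(line, "CmaFree:", 8))
                        sscanf(line + 8, "%ld", &cmafree);
        }
        fclose(f);
        printf("MemFree: %ld kB  CmaFree: %ld kB\n", memfree, cmafree);
        return 0;
}

Roughly speaking, the old heuristic only preferred CMA once it held
more than half of the zone's free memory; after this change, CmaFree
should drop toward zero first.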

Comments

Andrew Morton July 26, 2023, 8:06 p.m. UTC | #1
On Wed, 26 Jul 2023 11:07:05 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:

> On a memcache setup with heavy anon usage and no swap, we routinely
> see premature OOM kills with multiple gigabytes of free space left:
> 
>     Node 0 Normal free:4978632kB [...] free_cma:4893276kB
> 
> This free space turns out to be CMA. We set CMA regions aside for
> potential hugetlb users on all of our machines, figuring that even if
> there aren't any, the memory is available to userspace allocations.
> 
> When the OOMs trigger, it's from unmovable and reclaimable allocations
> that aren't allowed to dip into CMA. The non-CMA regions meanwhile are
> dominated by the anon pages.
> 
> Movable pages can be migrated out of CMA when necessary, but we don't
> have a mechanism to migrate them *into* CMA to make room for unmovable
> allocations. The only recourse we have for these pages is reclaim,
> which due to a lack of swap is unavailable in our case.
> 
> Because we have more options for CMA pages, change the policy to
> always fill up CMA first. This reduces the risk of premature OOMs.

This conflicts significantly (and more than textually) with "mm:
optimization on page allocation when CMA enabled", which has been
languishing in mm-unstable in an inadequately reviewed state since May
11.  Please suggest a way forward?

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7d3460c7a480..b257f9651ce9 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1634,17 +1634,6 @@  static int fallbacks[MIGRATE_TYPES][MIGRATE_PCPTYPES - 1] = {
 	[MIGRATE_RECLAIMABLE] = { MIGRATE_UNMOVABLE,   MIGRATE_MOVABLE   },
 };
 
-#ifdef CONFIG_CMA
-static __always_inline struct page *__rmqueue_cma_fallback(struct zone *zone,
-					unsigned int order)
-{
-	return __rmqueue_smallest(zone, order, MIGRATE_CMA);
-}
-#else
-static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
-					unsigned int order) { return NULL; }
-#endif
-
 /*
  * Move the free pages in a range to the freelist tail of the requested type.
  * Note that start_page and end_pages are not aligned on a pageblock
@@ -2124,29 +2113,27 @@  __rmqueue(struct zone *zone, unsigned int order, int migratetype,
 {
 	struct page *page;
 
-	if (IS_ENABLED(CONFIG_CMA)) {
-		/*
-		 * Balance movable allocations between regular and CMA areas by
-		 * allocating from CMA when over half of the zone's free memory
-		 * is in the CMA area.
-		 */
-		if (alloc_flags & ALLOC_CMA &&
-		    zone_page_state(zone, NR_FREE_CMA_PAGES) >
-		    zone_page_state(zone, NR_FREE_PAGES) / 2) {
-			page = __rmqueue_cma_fallback(zone, order);
-			if (page)
-				return page;
-		}
+#ifdef CONFIG_CMA
+	/*
+	 * Use up CMA first. Movable pages can be migrated out of CMA
+	 * if necessary, but they cannot migrate into it to make room
+	 * for unmovables elsewhere. The only recourse for them is
+	 * then reclaim, which might be unavailable without swap. We
+	 * want to reduce the risk of OOM with free CMA space left.
+	 */
+	if (alloc_flags & ALLOC_CMA) {
+		page = __rmqueue_smallest(zone, order, MIGRATE_CMA);
+		if (page)
+			return page;
 	}
-retry:
-	page = __rmqueue_smallest(zone, order, migratetype);
-	if (unlikely(!page)) {
-		if (alloc_flags & ALLOC_CMA)
-			page = __rmqueue_cma_fallback(zone, order);
-
-		if (!page && __rmqueue_fallback(zone, order, migratetype,
-								alloc_flags))
-			goto retry;
+#endif
+
+	for (;;) {
+		page = __rmqueue_smallest(zone, order, migratetype);
+		if (page)
+			break;
+		if (!__rmqueue_fallback(zone, order, migratetype, alloc_flags))
+			break;
 	}
 	return page;
 }