
mm: page_alloc: fix CMA pageblock being stolen in rmqueue fallback

Message ID 20230830111332.7599-1-lecopzer.chen@mediatek.com (mailing list archive)
State New
Series mm: page_alloc: fix CMA pageblock being stolen in rmqueue fallback

Commit Message

Lecopzer Chen Aug. 30, 2023, 11:13 a.m. UTC
Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
spinlock") falls back to freeing pages via free_one_page() if the
pcp trylock fails. This lets a MIGRATE_CMA page land on the wrong
freelist, from which its whole pageblock can be stolen by
MIGRATE_UNMOVABLE during page allocation.

PCP free is fine because free_pcppages_bulk() always looks up the
migratetype again before freeing the page; the problem only occurs
when a CMA page is put on another migratetype's freelist.

Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Reported-by: Joe Liu <joe.liu@mediatek.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Mark-pk Tsai <mark-pk.tsai@mediatek.com>
Cc: Joe Liu <joe.liu@mediatek.com>
---
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)
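
For context, the buggy flow condenses to the sketch below (not the
verbatim source: the free_unref_page_prepare() call and the UP_flags
trylock bookkeeping are omitted; names follow the free_unref_page()
code quoted later in this thread):

	void free_unref_page(struct page *page, unsigned int order)
	{
		struct zone *zone = page_zone(page);
		unsigned long pfn = page_to_pfn(page);
		struct per_cpu_pages *pcp;
		/* free_unref_page_prepare() stashed the true type on the page */
		int migratetype = get_pcppage_migratetype(page);

		/* CMA/HIGHATOMIC have no pcplist of their own ... */
		if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
			if (unlikely(is_migrate_isolate(migratetype))) {
				free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
				return;
			}
			/* ... so they ride the MOVABLE pcplist */
			migratetype = MIGRATE_MOVABLE;
		}

		pcp = pcp_spin_trylock(zone->per_cpu_pageset);
		if (pcp) {
			/* Fine: free_pcppages_bulk() re-reads the stashed type. */
			free_unref_page_commit(zone, pcp, page, migratetype, order);
			pcp_spin_unlock(pcp);
		} else {
			/*
			 * Bug: migratetype was already clobbered to MOVABLE, so a
			 * CMA page lands on the MOVABLE buddy list, where its
			 * pageblock can later be claimed by rmqueue_fallback().
			 */
			free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
		}
	}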

Comments

Mel Gorman Sept. 5, 2023, 9:09 a.m. UTC | #1
On Wed, Aug 30, 2023 at 07:13:33PM +0800, Lecopzer Chen wrote:
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
> spinlock") falls back to freeing pages via free_one_page() if the
> pcp trylock fails. This lets a MIGRATE_CMA page land on the wrong
> freelist, from which its whole pageblock can be stolen by
> MIGRATE_UNMOVABLE during page allocation.
> 
> PCP free is fine because free_pcppages_bulk() always looks up the
> migratetype again before freeing the page; the problem only occurs
> when a CMA page is put on another migratetype's freelist.
> 
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> Reported-by: Joe Liu <joe.liu@mediatek.com>
> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
> Cc: Mark-pk Tsai <mark-pk.tsai@mediatek.com>
> Cc: Joe Liu <joe.liu@mediatek.com>

Sorry for the long delay and thanks Lecopzer for the patch.

This changelog is difficult to parse, but the fix may also be too
specific and could be made more robust against types other than CMA.
It is true that a failed PCP acquire may return a !is_migrate_isolate
page to the wrong list, but it's more straightforward to
unconditionally look up the PCP migratetype if the spinlock is not
acquired.

How about this? It unconditionally looks up the PCP migratetype after
spinlock contention. It's build-tested only.

--8<--
mm: page_alloc: Free pages to correct buddy list after PCP lock contention

Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
returns pages to the buddy list on PCP lock contention. However, for
migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
been clobbered already for pages that are not being isolated. In
practice, this means that CMA pages may be returned to the wrong
buddy list. While this might be harmless in some cases as it is
MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
and prevent a future CMA allocation. Look up the PCP migratetype
unconditionally if the PCP lock is contended.

[lecopzer.chen@mediatek.com: CMA-specific fix]
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Reported-by: Joe Liu <joe.liu@mediatek.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page_alloc.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 452459836b71..4053c377fee8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
 		free_unref_page_commit(zone, pcp, page, migratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
-		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
+		/*
+		 * The page migratetype may have been clobbered for types
+		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
+		 * must be rechecked.
+		 */
+		free_one_page(zone, page, pfn, order,
+			      get_pcppage_migratetype(page), FPI_NONE);
 	}
 	pcp_trylock_finish(UP_flags);
 }
Lecopzer Chen Sept. 5, 2023, 9:22 a.m. UTC | #2
Hi Mel,

Of course; we only tested the CMA case, so we only confirmed that the
CMA part was wrong.
Your patch is definitely better.
Please go ahead with your patch, thank you.

> On Wed, Aug 30, 2023 at 07:13:33PM +0800, Lecopzer Chen wrote:
> > Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
> > spinlock") falls back to freeing pages via free_one_page() if the
> > pcp trylock fails. This lets a MIGRATE_CMA page land on the wrong
> > freelist, from which its whole pageblock can be stolen by
> > MIGRATE_UNMOVABLE during page allocation.
> > 
> > PCP free is fine because free_pcppages_bulk() always looks up the
> > migratetype again before freeing the page; the problem only occurs
> > when a CMA page is put on another migratetype's freelist.
> > 
> > Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> > Reported-by: Joe Liu <joe.liu@mediatek.com>
> > Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
> > Cc: Mark-pk Tsai <mark-pk.tsai@mediatek.com>
> > Cc: Joe Liu <joe.liu@mediatek.com>
> 
> Sorry for the long delay and thanks Lecopzer for the patch.
> 
> This changelog is difficult to parse, but the fix may also be too
> specific and could be made more robust against types other than CMA.
> It is true that a failed PCP acquire may return a !is_migrate_isolate
> page to the wrong list, but it's more straightforward to
> unconditionally look up the PCP migratetype if the spinlock is not
> acquired.
> 
> How about this? It unconditionally looks up the PCP migratetype after
> spinlock contention. It's build-tested only.
> 
> --8<--
> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> 
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> returns pages to the buddy list on PCP lock contention. However, for
> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> been clobbered already for pages that are not being isolated. In
> practice, this means that CMA pages may be returned to the wrong
> buddy list. While this might be harmless in some cases as it is
> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> and prevent a future CMA allocation. Look up the PCP migratetype
> unconditionally if the PCP lock is contended.
> 
> [lecopzer.chen@mediatek.com: CMA-specific fix]
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> Reported-by: Joe Liu <joe.liu@mediatek.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  mm/page_alloc.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 452459836b71..4053c377fee8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>  		pcp_spin_unlock(pcp);
>  	} else {
> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> +		/*
> +		 * The page migratetype may have been clobbered for types
> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> +		 * must be rechecked.
> +		 */
> +		free_one_page(zone, page, pfn, order,
> +			      get_pcppage_migratetype(page), FPI_NONE);
>  	}
>  	pcp_trylock_finish(UP_flags);
>  }
Vlastimil Babka Sept. 5, 2023, 9:37 a.m. UTC | #3
On 9/5/23 11:09, Mel Gorman wrote:
> On Wed, Aug 30, 2023 at 07:13:33PM +0800, Lecopzer Chen wrote:
>> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
>> spinlock") falls back to freeing pages via free_one_page() if the
>> pcp trylock fails. This lets a MIGRATE_CMA page land on the wrong
>> freelist, from which its whole pageblock can be stolen by
>> MIGRATE_UNMOVABLE during page allocation.
>> 
>> PCP free is fine because free_pcppages_bulk() always looks up the
>> migratetype again before freeing the page; the problem only occurs
>> when a CMA page is put on another migratetype's freelist.
>> 
>> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> Reported-by: Joe Liu <joe.liu@mediatek.com>
>> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
>> Cc: Mark-pk Tsai <mark-pk.tsai@mediatek.com>
>> Cc: Joe Liu <joe.liu@mediatek.com>
> 
> Sorry for the long delay and thanks Lecopzer for the patch.
> 
> This changelog is difficult to parse, but the fix may also be too
> specific and could be made more robust against types other than CMA.
> It is true that a failed PCP acquire may return a !is_migrate_isolate
> page to the wrong list, but it's more straightforward to
> unconditionally look up the PCP migratetype if the spinlock is not
> acquired.
> 
> How about this? It unconditionally looks up the PCP migratetype after
> spinlock contention. It's build-tested only.
> 
> --8<--
> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> 
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> returns pages to the buddy list on PCP lock contention. However, for
> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> been clobbered already for pages that are not being isolated. In
> practice, this means that CMA pages may be returned to the wrong
> buddy list. While this might be harmless in some cases as it is
> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> and prevent a future CMA allocation. Look up the PCP migratetype
> unconditionally if the PCP lock is contended.
> 
> [lecopzer.chen@mediatek.com: CMA-specific fix]
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")

I think we should Cc: stable for the sake of 6.1 LTS?

> Reported-by: Joe Liu <joe.liu@mediatek.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 452459836b71..4053c377fee8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>  		pcp_spin_unlock(pcp);
>  	} else {
> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> +		/*
> +		 * The page migratetype may have been clobbered for types
> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> +		 * must be rechecked.
> +		 */
> +		free_one_page(zone, page, pfn, order,
> +			      get_pcppage_migratetype(page), FPI_NONE);
>  	}
>  	pcp_trylock_finish(UP_flags);
>  }
Mel Gorman Sept. 5, 2023, 12:28 p.m. UTC | #4
On Tue, Sep 05, 2023 at 11:37:30AM +0200, Vlastimil Babka wrote:
> On 9/5/23 11:09, Mel Gorman wrote:
> > On Wed, Aug 30, 2023 at 07:13:33PM +0800, Lecopzer Chen wrote:
> >> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
> >> spinlock") falls back to freeing pages via free_one_page() if the
> >> pcp trylock fails. This lets a MIGRATE_CMA page land on the wrong
> >> freelist, from which its whole pageblock can be stolen by
> >> MIGRATE_UNMOVABLE during page allocation.
> >> 
> >> PCP free is fine because free_pcppages_bulk() always looks up the
> >> migratetype again before freeing the page; the problem only occurs
> >> when a CMA page is put on another migratetype's freelist.
> >> 
> >> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> >> Reported-by: Joe Liu <joe.liu@mediatek.com>
> >> Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
> >> Cc: Mark-pk Tsai <mark-pk.tsai@mediatek.com>
> >> Cc: Joe Liu <joe.liu@mediatek.com>
> > 
> > Sorry for the long delay and thanks Lecopzer for the patch.
> > 
> > This changelog is difficult to parse, but the fix may also be too
> > specific and could be made more robust against types other than CMA.
> > It is true that a failed PCP acquire may return a !is_migrate_isolate
> > page to the wrong list, but it's more straightforward to
> > unconditionally look up the PCP migratetype if the spinlock is not
> > acquired.
> > 
> > How about this? It unconditionally looks up the PCP migratetype after
> > spinlock contention. It's build-tested only.
> > 
> > --8<--
> > mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> > 
> > Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> > returns pages to the buddy list on PCP lock contention. However, for
> > migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> > been clobbered already for pages that are not being isolated. In
> > practice, this means that CMA pages may be returned to the wrong
> > buddy list. While this might be harmless in some cases as it is
> > MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> > and prevent a future CMA allocation. Look up the PCP migratetype
> > unconditionally if the PCP lock is contended.
> > 
> > [lecopzer.chen@mediatek.com: CMA-specific fix]
> > Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> 
> I think we should Cc: stable for the sake of 6.1 LTS?
> 

Yep.

> > Reported-by: Joe Liu <joe.liu@mediatek.com>
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 

Thanks.
Johannes Weiner Sept. 11, 2023, 3:57 p.m. UTC | #5
On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> 
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> returns pages to the buddy list on PCP lock contention. However, for
> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> been clobbered already for pages that are not being isolated. In
> practice, this means that CMA pages may be returned to the wrong
> buddy list. While this might be harmless in some cases as it is
> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> and prevent a future CMA allocation. Look up the PCP migratetype
> unconditionally if the PCP lock is contended.
> 
> [lecopzer.chen@mediatek.com: CMA-specific fix]
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> Reported-by: Joe Liu <joe.liu@mediatek.com>
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> ---
>  mm/page_alloc.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 452459836b71..4053c377fee8 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>  		pcp_spin_unlock(pcp);
>  	} else {
> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> +		/*
> +		 * The page migratetype may have been clobbered for types
> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> +		 * must be rechecked.
> +		 */
> +		free_one_page(zone, page, pfn, order,
> +			      get_pcppage_migratetype(page), FPI_NONE);
>  	}
>  	pcp_trylock_finish(UP_flags);
>  }
> 

I had sent a (similar) fix for this here:

https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/

The context wasn't CMA, but HIGHATOMIC pages going to the movable
freelist. But the class of bug is the same: the migratetype tweaking
really only applies to the pcplist, not the buddy slowpath; I added a
local pcpmigratetype to make it more clear, and hopefully prevent bugs
of this nature down the line.
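
For reference, the pcplist flush re-derives each page's buddy list from
the type stashed on the page; condensed from the free_pcppages_bulk()
inner loop (a sketch, not the verbatim source, and the isolation
recheck is omitted):

	while (count > 0) {
		struct page *page = list_last_entry(list, struct page, pcp_list);
		/*
		 * The stashed type is the true one (e.g. CMA or HIGHATOMIC),
		 * not the MOVABLE override used for pcplist placement.
		 */
		int mt = get_pcppage_migratetype(page);

		list_del(&page->pcp_list);
		count -= 1 << order;
		__free_one_page(page, page_to_pfn(page), zone, order, mt, FPI_NONE);
	}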

I'm just preparing v2 of the above series. Do you want me to break
this change out and send it separately?
Vlastimil Babka Sept. 11, 2023, 4:13 p.m. UTC | #6
On 9/11/23 17:57, Johannes Weiner wrote:
> On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
>> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>> 
>> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> returns pages to the buddy list on PCP lock contention. However, for
>> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
>> been clobbered already for pages that are not being isolated. In
>> practice, this means that CMA pages may be returned to the wrong
>> buddy list. While this might be harmless in some cases as it is
>> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
>> and prevent a future CMA allocation. Look up the PCP migratetype
>> unconditionally if the PCP lock is contended.
>> 
>> [lecopzer.chen@mediatek.com: CMA-specific fix]
>> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> Reported-by: Joe Liu <joe.liu@mediatek.com>
>> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>> ---
>>  mm/page_alloc.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 452459836b71..4053c377fee8 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>>  		pcp_spin_unlock(pcp);
>>  	} else {
>> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>> +		/*
>> +		 * The page migratetype may have been clobbered for types
>> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
>> +		 * must be rechecked.
>> +		 */
>> +		free_one_page(zone, page, pfn, order,
>> +			      get_pcppage_migratetype(page), FPI_NONE);
>>  	}
>>  	pcp_trylock_finish(UP_flags);
>>  }
>> 
> 
> I had sent a (similar) fix for this here:
> 
> https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
> 
> The context wasn't CMA, but HIGHATOMIC pages going to the movable
> freelist. But the class of bug is the same: the migratetype tweaking
> really only applies to the pcplist, not the buddy slowpath; I added a
> local pcpmigratetype to make it more clear, and hopefully prevent bugs
> of this nature down the line.

Seems to be the cleanest solution to me, indeed.

> I'm just preparing v2 of the above series. Do you want me to break
> this change out and send it separately?

Works for me, if you combine it with the information about which commit
it fixes, the CMA implications reported, and Cc stable.
Johannes Weiner Sept. 11, 2023, 6:11 p.m. UTC | #7
On Mon, Sep 11, 2023 at 06:13:59PM +0200, Vlastimil Babka wrote:
> On 9/11/23 17:57, Johannes Weiner wrote:
> > On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
> >> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
> >> 
> >> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> >> returns pages to the buddy list on PCP lock contention. However, for
> >> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
> >> been clobbered already for pages that are not being isolated. In
> >> practice, this means that CMA pages may be returned to the wrong
> >> buddy list. While this might be harmless in some cases as it is
> >> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
> >> and prevent a future CMA allocation. Look up the PCP migratetype
> >> unconditionally if the PCP lock is contended.
> >> 
> >> [lecopzer.chen@mediatek.com: CMA-specific fix]
> >> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> >> Reported-by: Joe Liu <joe.liu@mediatek.com>
> >> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> >> ---
> >>  mm/page_alloc.c | 8 +++++++-
> >>  1 file changed, 7 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 452459836b71..4053c377fee8 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
> >>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
> >>  		pcp_spin_unlock(pcp);
> >>  	} else {
> >> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
> >> +		/*
> >> +		 * The page migratetype may have been clobbered for types
> >> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
> >> +		 * must be rechecked.
> >> +		 */
> >> +		free_one_page(zone, page, pfn, order,
> >> +			      get_pcppage_migratetype(page), FPI_NONE);
> >>  	}
> >>  	pcp_trylock_finish(UP_flags);
> >>  }
> >> 
> > 
> > I had sent a (similar) fix for this here:
> > 
> > https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
> > 
> > The context wasn't CMA, but HIGHATOMIC pages going to the movable
> > freelist. But the class of bug is the same: the migratetype tweaking
> > really only applies to the pcplist, not the buddy slowpath; I added a
> > local pcpmigratetype to make it more clear, and hopefully prevent bugs
> > of this nature down the line.
> 
> Seems to be the cleanest solution to me, indeed.
> 
> > I'm just preparing v2 of the above series. Do you want me to break
> > this change out and send it separately?
> 
> Works for me, if you combine it with the information about which commit
> it fixes, the CMA implications reported, and Cc stable.

How about this? Based on v6.6-rc1.

---

From 84e4490095ed3d1f2991e7f0e58e2968e56cc7c0 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Fri, 28 Jul 2023 14:29:41 -0400
Subject: [PATCH] mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong
 buddy list

Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
spinlock") bypasses the pcplist on lock contention and returns the
page directly to the buddy list of the page's migratetype.

For pages that don't have their own pcplist, such as CMA and
HIGHATOMIC, the migratetype is temporarily updated such that the page
can hitch a ride on the MOVABLE pcplist. Their true type is later
reassessed when flushing in free_pcppages_bulk(). However, when lock
contention is detected after the type was already overridden, the
bypass will then put the page on the wrong buddy list.

Once on the MOVABLE buddy list, the page becomes eligible for
fallbacks and even stealing. In the case of HIGHATOMIC, otherwise
ineligible allocations can dip into the highatomic reserves. In the
case of CMA, the page can be lost from the CMA region permanently.

Use a separate pcpmigratetype variable for the pcplist override. Use
the original migratetype when going directly to the buddy. This fixes
the bug and should make the intentions more obvious in the code.

Originally sent here to address the HIGHATOMIC case:
https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/

Changelog updated in response to the CMA-specific bug report.

[mgorman@techsingularity.net: updated changelog]
Reported-by: Joe Liu <joe.liu@mediatek.com>
Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0c5be12f9336..95546f376302 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2400,7 +2400,7 @@ void free_unref_page(struct page *page, unsigned int order)
 	struct per_cpu_pages *pcp;
 	struct zone *zone;
 	unsigned long pfn = page_to_pfn(page);
-	int migratetype;
+	int migratetype, pcpmigratetype;
 
 	if (!free_unref_page_prepare(page, pfn, order))
 		return;
@@ -2408,24 +2408,24 @@ void free_unref_page(struct page *page, unsigned int order)
 	/*
 	 * We only track unmovable, reclaimable and movable on pcp lists.
 	 * Place ISOLATE pages on the isolated list because they are being
-	 * offlined but treat HIGHATOMIC as movable pages so we can get those
-	 * areas back if necessary. Otherwise, we may have to free
+	 * offlined but treat HIGHATOMIC and CMA as movable pages so we can
+	 * get those areas back if necessary. Otherwise, we may have to free
 	 * excessively into the page allocator
 	 */
-	migratetype = get_pcppage_migratetype(page);
+	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
 	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
 		if (unlikely(is_migrate_isolate(migratetype))) {
 			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
 			return;
 		}
-		migratetype = MIGRATE_MOVABLE;
+		pcpmigratetype = MIGRATE_MOVABLE;
 	}
 
 	zone = page_zone(page);
 	pcp_trylock_prepare(UP_flags);
 	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
 	if (pcp) {
-		free_unref_page_commit(zone, pcp, page, migratetype, order);
+		free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
 		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
Vlastimil Babka Sept. 11, 2023, 7:59 p.m. UTC | #8
On 9/11/23 20:11, Johannes Weiner wrote:
> On Mon, Sep 11, 2023 at 06:13:59PM +0200, Vlastimil Babka wrote:
>> On 9/11/23 17:57, Johannes Weiner wrote:
>> > On Tue, Sep 05, 2023 at 10:09:22AM +0100, Mel Gorman wrote:
>> >> mm: page_alloc: Free pages to correct buddy list after PCP lock contention
>> >> 
>> >> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> >> returns pages to the buddy list on PCP lock contention. However, for
>> >> migratetypes that are not MIGRATE_PCPTYPES, the migratetype may have
>> >> been clobbered already for pages that are not being isolated. In
>> >> practice, this means that CMA pages may be returned to the wrong
>> >> buddy list. While this might be harmless in some cases as it is
>> >> MIGRATE_MOVABLE, the pageblock could be reassigned in rmqueue_fallback
>> >> and prevent a future CMA allocation. Look up the PCP migratetype
>> >> unconditionally if the PCP lock is contended.
>> >> 
>> >> [lecopzer.chen@mediatek.com: CMA-specific fix]
>> >> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
>> >> Reported-by: Joe Liu <joe.liu@mediatek.com>
>> >> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
>> >> ---
>> >>  mm/page_alloc.c | 8 +++++++-
>> >>  1 file changed, 7 insertions(+), 1 deletion(-)
>> >> 
>> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >> index 452459836b71..4053c377fee8 100644
>> >> --- a/mm/page_alloc.c
>> >> +++ b/mm/page_alloc.c
>> >> @@ -2428,7 +2428,13 @@ void free_unref_page(struct page *page, unsigned int order)
>> >>  		free_unref_page_commit(zone, pcp, page, migratetype, order);
>> >>  		pcp_spin_unlock(pcp);
>> >>  	} else {
>> >> -		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
>> >> +		/*
>> >> +		 * The page migratetype may have been clobbered for types
>> >> +		 * (type >= MIGRATE_PCPTYPES && !is_migrate_isolate) so
>> >> +		 * must be rechecked.
>> >> +		 */
>> >> +		free_one_page(zone, page, pfn, order,
>> >> +			      get_pcppage_migratetype(page), FPI_NONE);
>> >>  	}
>> >>  	pcp_trylock_finish(UP_flags);
>> >>  }
>> >> 
>> > 
>> > I had sent a (similar) fix for this here:
>> > 
>> > https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
>> > 
>> > The context wasn't CMA, but HIGHATOMIC pages going to the movable
>> > freelist. But the class of bug is the same: the migratetype tweaking
>> > really only applies to the pcplist, not the buddy slowpath; I added a
>> > local pcpmigratetype to make it more clear, and hopefully prevent bugs
>> > of this nature down the line.
>> 
>> Seems to be the cleanest solution to me, indeed.
>> 
>> > I'm just preparing v2 of the above series. Do you want me to break
>> > this change out and send it separately?
>> 
>> Works for me, if you combine it with the information about which commit
>> it fixes, the CMA implications reported, and Cc stable.
> 
> How about this? Based on v6.6-rc1.
> 
> ---
> 
> From 84e4490095ed3d1f2991e7f0e58e2968e56cc7c0 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Fri, 28 Jul 2023 14:29:41 -0400
> Subject: [PATCH] mm: page_alloc: fix CMA and HIGHATOMIC landing on the wrong
>  buddy list
> 
> Commit 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a
> spinlock") bypasses the pcplist on lock contention and returns the
> page directly to the buddy list of the page's migratetype.
> 
> For pages that don't have their own pcplist, such as CMA and
> HIGHATOMIC, the migratetype is temporarily updated such that the page
> can hitch a ride on the MOVABLE pcplist. Their true type is later
> reassessed when flushing in free_pcppages_bulk(). However, when lock
> contention is detected after the type was already overridden, the
> bypass will then put the page on the wrong buddy list.
> 
> Once on the MOVABLE buddy list, the page becomes eligible for
> fallbacks and even stealing. In the case of HIGHATOMIC, otherwise
> ineligible allocations can dip into the highatomic reserves. In the
> case of CMA, the page can be lost from the CMA region permanently.
> 
> Use a separate pcpmigratetype variable for the pcplist override. Use
> the original migratetype when going directly to the buddy. This fixes
> the bug and should make the intentions more obvious in the code.
> 
> Originally sent here to address the HIGHATOMIC case:
> https://lore.kernel.org/lkml/20230821183733.106619-4-hannes@cmpxchg.org/
> 
> Changelog updated in response to the CMA-specific bug report.
> 
> [mgorman@techsingularity.net: updated changelog]
> Reported-by: Joe Liu <joe.liu@mediatek.com>
> Fixes: 4b23a68f9536 ("mm/page_alloc: protect PCP lists with a spinlock")
> Cc: stable@vger.kernel.org
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0c5be12f9336..95546f376302 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2400,7 +2400,7 @@ void free_unref_page(struct page *page, unsigned int order)
>  	struct per_cpu_pages *pcp;
>  	struct zone *zone;
>  	unsigned long pfn = page_to_pfn(page);
> -	int migratetype;
> +	int migratetype, pcpmigratetype;
>  
>  	if (!free_unref_page_prepare(page, pfn, order))
>  		return;
> @@ -2408,24 +2408,24 @@ void free_unref_page(struct page *page, unsigned int order)
>  	/*
>  	 * We only track unmovable, reclaimable and movable on pcp lists.
>  	 * Place ISOLATE pages on the isolated list because they are being
> -	 * offlined but treat HIGHATOMIC as movable pages so we can get those
> -	 * areas back if necessary. Otherwise, we may have to free
> +	 * offlined but treat HIGHATOMIC and CMA as movable pages so we can
> +	 * get those areas back if necessary. Otherwise, we may have to free
>  	 * excessively into the page allocator
>  	 */
> -	migratetype = get_pcppage_migratetype(page);
> +	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
>  	if (unlikely(migratetype >= MIGRATE_PCPTYPES)) {
>  		if (unlikely(is_migrate_isolate(migratetype))) {
>  			free_one_page(page_zone(page), page, pfn, order, migratetype, FPI_NONE);
>  			return;
>  		}
> -		migratetype = MIGRATE_MOVABLE;
> +		pcpmigratetype = MIGRATE_MOVABLE;
>  	}
>  
>  	zone = page_zone(page);
>  	pcp_trylock_prepare(UP_flags);
>  	pcp = pcp_spin_trylock(zone->per_cpu_pageset);
>  	if (pcp) {
> -		free_unref_page_commit(zone, pcp, page, migratetype, order);
> +		free_unref_page_commit(zone, pcp, page, pcpmigratetype, order);
>  		pcp_spin_unlock(pcp);
>  	} else {
>  		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 452459836b71..0ea88c031838 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2428,6 +2428,14 @@  void free_unref_page(struct page *page, unsigned int order)
 		free_unref_page_commit(zone, pcp, page, migratetype, order);
 		pcp_spin_unlock(pcp);
 	} else {
+#ifdef CONFIG_CMA
+		/*
+		 * CMA pages must go back to their own freelist; otherwise the CMA
+		 * pageblock may be stolen by the fallback flow during allocation.
+		 */
+		if (get_pcppage_migratetype(page) == MIGRATE_CMA)
+			migratetype = MIGRATE_CMA;
+#endif
 		free_one_page(zone, page, pfn, order, migratetype, FPI_NONE);
 	}
 	pcp_trylock_finish(UP_flags);