mm, isolation: avoid checking unmovable pages across pageblock boundary

Message ID 20200824065811.383266-1-lixinhai.lxh@gmail.com (mailing list archive)
State New, archived
Series mm, isolation: avoid checking unmovable pages across pageblock boundary

Commit Message

Li Xinhai Aug. 24, 2020, 6:58 a.m. UTC
In has_unmovable_pages(), the page parameter is not always the first
page within a pageblock (see how the page pointer is passed in from
start_isolate_page_range() after calling __first_valid_page()), so the
check for unmovable pages can span two pageblocks.

After this patch, the check is confined to a single pageblock, whether
or not the page is the first one in the block, obeying the semantics
of this function.

This issue was found by code inspection.

Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
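
To illustrate the bug and the fix, here is a minimal user-space sketch
(not kernel code; the pageblock size below is an assumed example value
standing in for pageblock_nr_pages):

#include <stdio.h>

#define PAGEBLOCK_NR_PAGES 512UL	/* example value, not the kernel macro */

int main(void)
{
	unsigned long pfn = 0x10080;	/* 128 pages into pageblock [0x10000, 0x10200) */
	unsigned long offset = pfn % PAGEBLOCK_NR_PAGES;

	/* Old bound: a full pageblock of iterations regardless of offset,
	 * so the last pfn checked lands in the *next* pageblock. */
	unsigned long old_last = pfn + PAGEBLOCK_NR_PAGES - 1;

	/* New bound: only the pages remaining in the current pageblock,
	 * so the last pfn checked stays within this pageblock. */
	unsigned long new_last = pfn + (PAGEBLOCK_NR_PAGES - offset) - 1;

	printf("old last pfn: %#lx, new last pfn: %#lx\n", old_last, new_last);
	return 0;
}

Here old_last is 0x1027f, inside the next pageblock [0x10200, 0x10400),
while new_last is 0x101ff, the last pfn of the current pageblock.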

Comments

Oscar Salvador Aug. 24, 2020, 8:17 a.m. UTC | #1
On Mon, Aug 24, 2020 at 02:58:11PM +0800, Li Xinhai wrote:
> In has_unmovable_pages(), the page parameter is not always the first
> page within a pageblock (see how the page pointer is passed in from
> start_isolate_page_range() after calling __first_valid_page()), so the
> check for unmovable pages can span two pageblocks.
> 
> After this patch, the check is confined to a single pageblock, whether
> or not the page is the first one in the block, obeying the semantics
> of this function.
> 
> This issue was found by code inspection.
> 
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>

Unless I am missing something, this looks good to me.

Reviewed-by: Oscar Salvador <osalvador@suse.de>
Michal Hocko Aug. 25, 2020, 8:26 a.m. UTC | #2
On Mon 24-08-20 14:58:11, Li Xinhai wrote:
> In has_unmovable_pages(), the page parameter is not always the first
> page within a pageblock (see how the page pointer is passed in from
> start_isolate_page_range() after calling __first_valid_page()), so the
> check for unmovable pages can span two pageblocks.

This might lead to false negatives when an unrelated block would cause
an isolation failure.

> After this patch, the check is confined to a single pageblock, whether
> or not the page is the first one in the block, obeying the semantics
> of this function.
> 
> This issue was found by code inspection.
> 
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>

I have tried to find the commit which has introduced this but there was
so much churn around that I gave up. Not that we care all that much
because it seems that we simply never try to isolate pageblocks with
holes. Memory hotplug disallows that explicitly and the CMA allocator
doesn't trip over that either. Or maybe we were just lucky or a silent
failure didn't really trigger any attention.

Anyway, good to have it fixed.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 0e2bab486fea..c2c5b565f1f3 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8213,6 +8213,7 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>  {
>  	unsigned long iter = 0;
>  	unsigned long pfn = page_to_pfn(page);
> +	unsigned long offset = pfn % pageblock_nr_pages;
>  
>  	/*
>  	 * TODO we could make this much more efficient by not checking every
> @@ -8234,7 +8235,7 @@ struct page *has_unmovable_pages(struct zone *zone, struct page *page,
>  		return page;
>  	}
>  
> -	for (; iter < pageblock_nr_pages; iter++) {
> +	for (; iter < pageblock_nr_pages - offset; iter++) {
>  		if (!pfn_valid_within(pfn + iter))
>  			continue;
>  
> -- 
> 2.18.4
>
David Hildenbrand Sept. 8, 2020, 8:30 a.m. UTC | #3
On 25.08.20 10:26, Michal Hocko wrote:
> On Mon 24-08-20 14:58:11, Li Xinhai wrote:
>> In has_unmovable_pages(), the page parameter is not always the first
>> page within a pageblock (see how the page pointer is passed in from
>> start_isolate_page_range() after calling __first_valid_page()), so the
>> check for unmovable pages can span two pageblocks.
> 
> This might lead to false negatives when an unrelated block would cause
> an isolation failure.
> 
>> After this patch, the check is confined to a single pageblock, whether
>> or not the page is the first one in the block, obeying the semantics
>> of this function.
>>
>> This issue was found by code inspection.
>>
>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> 
> I have tried to find the commit which has introduced this but there was
> so much churn around that I gave up. Not that we care all that much
> because it seems that we simply never try to isolate pageblocks with
> holes. Memory hotplug disallows that explicitly and the CMA allocator
> doesn't trip over that either. Or maybe we were just lucky or a silent
> failure didn't really trigger any attention.

It will never happen.

- memory offlining disallows ranges with holes.
- virtio-mem (alloc_contig_range()) never has ranges with holes.
- gigantic pages (alloc_contig_range()) never use ranges with holes.
- CMA (alloc_contig_range()) never uses ranges with holes.

Trying to allocate something that's not there would be troublesome already.

I *guess* the handling is in place for corner cases of
alloc_contig_range(), whereby one tries to allocate a sub-MAX_ORDER-1
page, and alloc_contig_range() automatically tries to isolate all
pageblocks spanning full MAX_ORDER-1 pages. See
pfn_max_align_down/pfn_max_align_up. If there were a memory hole close
by, that could trigger.

It would, however, only trigger for CMA, as that is the only user
allocating in sub-MAX_ORDER-1 granularity right now.
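
For reference, the alignment helpers look roughly like this (a sketch
of the mm/page_alloc.c code of that time; details may differ):

static unsigned long pfn_max_align_down(unsigned long pfn)
{
	return pfn & ~(max_t(unsigned long, MAX_ORDER_NR_PAGES,
			     pageblock_nr_pages) - 1);
}

static unsigned long pfn_max_align_up(unsigned long pfn)
{
	return ALIGN(pfn, max_t(unsigned long, MAX_ORDER_NR_PAGES,
				pageblock_nr_pages));
}

With MAX_ORDER_NR_PAGES == 1024, for example, a request for pfns
[0x10100, 0x10180) gets its isolation range widened to
[0x10000, 0x10400); a memory hole inside that widened range would then
be walked by the isolation code.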

I think we should just disallow/check for holes in the extended range
(pfn_max_align_down/pfn_max_align_up) in alloc_contig_range() and
document the expected behavior for start_isolate_page_range() - that
the range must not contain any holes. This gets rid of a whole bunch
of unnecessary pfn_to_online_page() calls.

We would no longer be able to alloc_contig_range() pages close to a
memory hole in case of CMA (in corner cases) - but I doubt this is an
actual issue.

If we agree, I can send patches.
Michal Hocko Sept. 14, 2020, 3 p.m. UTC | #5
On Tue 08-09-20 10:30:35, David Hildenbrand wrote:
[...]
> - gigantic pages (alloc_contig_range()) never use ranges with holes
> - CMA (alloc_contig_range()) never uses ranges with holes.

Where is that enforced?
David Hildenbrand Sept. 14, 2020, 4:36 p.m. UTC | #6
On 14.09.20 17:00, Michal Hocko wrote:
> On Tue 08-09-20 10:30:35, David Hildenbrand wrote:
> [...]
>> - gigantic pages (alloc_contig_range()) never use ranges with holes
>> - CMA (alloc_contig_range()) never uses ranges with holes.
> 
> Where is that enforced?
> 

You mean with CMA? I think the way CMA obtains regions from memblock
would not allow for that (allocating something that's not actually
there does not make too much sense).

For example, we do a couple of unchecked pfn_to_page() calls directly in
alloc_contig_range(), most prominently right at the beginning
(page_zone(pfn_to_page(start))).

Also, __test_page_isolated_in_pageblock() would never succeed if the
range contains memory holes that have a valid memmap (PageReserved()
pages).
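
Roughly, the relevant walk looks like this (a simplified sketch, not
the exact kernel code): anything that is neither a free buddy page nor
an already-isolated page - such as a PageReserved() hole page - stops
the walk early, so the range is reported as not isolated.

	while (pfn < end_pfn) {
		page = pfn_to_page(pfn);
		if (PageBuddy(page))
			pfn += 1UL << page_order(page);	/* skip the free chunk */
		else if (is_migrate_isolate_page(page))
			pfn++;
		else
			break;	/* e.g. a reserved hole page */
	}
	/* isolated only if pfn reached end_pfn */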

I think we should update the documentation of alloc_contig_range()

"The PFN range must belong to a single zone and must not contain any holes."

And simplify all that code. Of course, assuming there isn't a CMA use
case with holes I am too blind to see.
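
A rough sketch of the kind of upfront check I have in mind for
alloc_contig_range() (the function name is made up; this is not an
existing kernel helper):

static bool range_has_holes(unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long pfn;

	/* Any pfn without an online page means the range has a hole. */
	for (pfn = start_pfn; pfn < end_pfn; pfn++) {
		if (!pfn_to_online_page(pfn))
			return true;
	}
	return false;
}

alloc_contig_range() could then fail early (say, with -EINVAL) for the
extended range instead of relying on per-pfn checks deep inside the
isolation and migration code.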

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0e2bab486fea..c2c5b565f1f3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8213,6 +8213,7 @@  struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 {
 	unsigned long iter = 0;
 	unsigned long pfn = page_to_pfn(page);
+	unsigned long offset = pfn % pageblock_nr_pages;
 
 	/*
 	 * TODO we could make this much more efficient by not checking every
@@ -8234,7 +8235,7 @@  struct page *has_unmovable_pages(struct zone *zone, struct page *page,
 		return page;
 	}
 
-	for (; iter < pageblock_nr_pages; iter++) {
+	for (; iter < pageblock_nr_pages - offset; iter++) {
 		if (!pfn_valid_within(pfn + iter))
 			continue;