
[1/2] hugetlb: fix update_and_free_page contig page struct assumption

Message ID 20210217184926.33567-1-mike.kravetz@oracle.com (mailing list archive)
State New, archived

Commit Message

Mike Kravetz Feb. 17, 2021, 6:49 p.m. UTC
page structs are not guaranteed to be contiguous for gigantic pages.  The
routine update_and_free_page can encounter a gigantic page, yet it assumes
page structs are contiguous when setting page flags in subpages.

If update_and_free_page encounters non-contiguous page structs, we can
see “BUG: Bad page state in process …” errors.

Non-contiguous page structs are generally not an issue.  However, they can
exist with a specific kernel configuration and hotplug operations.  For
example: Configure the kernel with CONFIG_SPARSEMEM and
!CONFIG_SPARSEMEM_VMEMMAP.  Then, hotplug add memory for the area where the
gigantic page will be allocated.
Zi Yan outlined steps to reproduce here [1].

[1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/

Fixes: 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: <stable@vger.kernel.org>
---
 mm/hugetlb.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Andrew Morton Feb. 17, 2021, 7:02 p.m. UTC | #1
On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:

> page structs are not guaranteed to be contiguous for gigantic pages.  The
> routine update_and_free_page can encounter a gigantic page, yet it assumes
> page structs are contiguous when setting page flags in subpages.
> 
> If update_and_free_page encounters non-contiguous page structs, we can
> see “BUG: Bad page state in process …” errors.
> 
> Non-contiguous page structs are generally not an issue.  However, they can
> exist with a specific kernel configuration and hotplug operations.  For
> example: Configure the kernel with CONFIG_SPARSEMEM and
> !CONFIG_SPARSEMEM_VMEMMAP.  Then, hotplug add memory for the area where the
> gigantic page will be allocated.
> Zi Yan outlined steps to reproduce here [1].
> 
> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
> 
> Fixes: 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime")

June 2014.  That's a long lurk time for a bug.  I wonder if some later
commit revealed it.

I guess it doesn't matter a lot, but some -stable kernel maintainers
might wonder if they really need this fix...
Mike Kravetz Feb. 17, 2021, 7:38 p.m. UTC | #2
On 2/17/21 11:02 AM, Andrew Morton wrote:
> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> 
>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>> routine update_and_free_page can encounter a gigantic page, yet it assumes
>> page structs are contiguous when setting page flags in subpages.
>>
>> If update_and_free_page encounters non-contiguous page structs, we can
>> see “BUG: Bad page state in process …” errors.
>>
>> Non-contiguous page structs are generally not an issue.  However, they can
>> exist with a specific kernel configuration and hotplug operations.  For
>> example: Configure the kernel with CONFIG_SPARSEMEM and
>> !CONFIG_SPARSEMEM_VMEMMAP.  Then, hotplug add memory for the area where the
>> gigantic page will be allocated.
>> Zi Yan outlined steps to reproduce here [1].
>>
>> [1] https://lore.kernel.org/linux-mm/16F7C58B-4D79-41C5-9B64-A1A1628F4AF2@nvidia.com/
>>
>> Fixes: 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime")
> 
> June 2014.  That's a long lurk time for a bug.  I wonder if some later
> commit revealed it.
> 
> I guess it doesn't matter a lot, but some -stable kernel maintainers
> might wonder if they really need this fix...

I am not sure how common a CONFIG_SPARSEMEM and !CONFIG_SPARSEMEM_VMEMMAP
config is.  On the more popular architectures, this is not the default.
But, you can build a kernel with such options.  And, then you need to
hotplug memory add and allocate a gigantic page there.

It is unlikely to happen, but possible since Zi could force the BUG.

The copy_huge_page_from_user bug requires the same non-normal configuration
and is just as unlikely to occur.  But, since it can overwrite somewhat
random pages I would feel better if it were fixed.
Matthew Wilcox Feb. 18, 2021, 2:45 p.m. UTC | #3
On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > page structs are not guaranteed to be contiguous for gigantic pages.  The
>
> June 2014.  That's a long lurk time for a bug.  I wonder if some later
> commit revealed it.

I would suggest that gigantic pages have not seen much use.  Certainly
performance with Intel CPUs on benchmarks that I've been involved with
showed lower performance with 1GB pages than with 2MB pages until quite
recently.
Jason Gunthorpe Feb. 18, 2021, 5:25 p.m. UTC | #4
On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
> > On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> > > page structs are not guaranteed to be contiguous for gigantic pages.  The
> >
> > June 2014.  That's a long lurk time for a bug.  I wonder if some later
> > commit revealed it.
> 
> I would suggest that gigantic pages have not seen much use.  Certainly
> performance with Intel CPUs on benchmarks that I've been involved with
> showed lower performance with 1GB pages than with 2MB pages until quite
> recently.

I suggested in another thread that maybe it is time to consider
dropping this "feature"

If it has been slightly broken for 7 years it seems a good bet it
isn't actually being used.

The cost to fix GUP to be compatible with this will hurt normal
GUP performance - and again, that nobody has hit this bug in GUP
further suggests the feature isn't used..

Jason
Zi Yan Feb. 18, 2021, 5:27 p.m. UTC | #5
On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:

> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>
>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>> commit revealed it.
>>
>> I would suggest that gigantic pages have not seen much use.  Certainly
>> performance with Intel CPUs on benchmarks that I've been involved with
>> showed lower performance with 1GB pages than with 2MB pages until quite
>> recently.
>
> I suggested in another thread that maybe it is time to consider
> dropping this "feature"

You mean dropping gigantic page support in hugetlb?

>
> If it has been slightly broken for 7 years it seems a good bet it
> isn't actually being used.
>
> The cost to fix GUP to be compatible with this will hurt normal
> GUP performance - and again, that nobody has hit this bug in GUP
> further suggests the feature isn't used..

An easy fix might be to make gigantic hugetlb pages depend on
CONFIG_SPARSEMEM_VMEMMAP, which guarantees all struct pages are contiguous.
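Concretely, that suggestion might look like the following hypothetical Kconfig sketch (symbol name and placement are illustrative; the kernel's actual gating goes through ARCH_HAS_GIGANTIC_PAGE and is wired per arch):

```
config ARCH_HAS_GIGANTIC_PAGE
	bool
	# Contiguous struct pages are only guaranteed when the memmap is
	# flat or virtually mapped; SPARSEMEM without VMEMMAP is the
	# problematic case.
	depends on !SPARSEMEM || SPARSEMEM_VMEMMAP
```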


—
Best Regards,
Yan Zi
Jason Gunthorpe Feb. 18, 2021, 5:32 p.m. UTC | #6
On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
> On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
> 
> > On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
> >> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
> >>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
> >>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
> >>>
> >>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
> >>> commit revealed it.
> >>
> >> I would suggest that gigantic pages have not seen much use.  Certainly
> >> performance with Intel CPUs on benchmarks that I've been involved with
> >> showed lower performance with 1GB pages than with 2MB pages until quite
> >> recently.
> >
> > I suggested in another thread that maybe it is time to consider
> > dropping this "feature"
>
> You mean dropping gigantic page support in hugetlb?

No, I mean dropping support for arches that want to do:

   tail_page != head_page + tail_page_nr

because they can't allocate the required page array either virtually
or physically contiguously.

It seems like quite a burden on the core mm for a very niche, and
maybe even non-existent, case.

It was originally done for PPC, can these PPC systems use VMEMMAP now?

> > The cost to fix GUP to be compatible with this will hurt normal
> > GUP performance - and again, that nobody has hit this bug in GUP
> > further suggests the feature isn't used..
> 
> An easy fix might be to make gigantic hugetlb pages depend on
> CONFIG_SPARSEMEM_VMEMMAP, which guarantees all struct pages are contiguous.

Yes, exactly.

Jason
Mike Kravetz Feb. 18, 2021, 5:34 p.m. UTC | #7
On 2/18/21 9:25 AM, Jason Gunthorpe wrote:
> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>
>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>> commit revealed it.
>>
>> I would suggest that gigantic pages have not seen much use.  Certainly
>> performance with Intel CPUs on benchmarks that I've been involved with
>> showed lower performance with 1GB pages than with 2MB pages until quite
>> recently.
> 
> I suggested in another thread that maybe it is time to consider
> dropping this "feature"
> 
> If it has been slightly broken for 7 years it seems a good bet it
> isn't actually being used.
> 
> The cost to fix GUP to be compatible with this will hurt normal
> GUP performance - and again, that nobody has hit this bug in GUP
> further suggests the feature isn't used..

I was thinking that we could detect these 'unusual' configurations and only
do the slower page struct walking in those cases.  However, we would need to
do some research to make sure we have taken into account all possible config
options which can produce non-contiguous page structs.  That should have zero
performance impact in the 'normal' cases.

I suppose we could prohibit gigantic pages in these 'unusual' configurations.
It would require some research to see whether this might impact someone.
Zi Yan Feb. 18, 2021, 5:40 p.m. UTC | #8
On 18 Feb 2021, at 12:32, Jason Gunthorpe wrote:

> On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
>> On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
>>
>>> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>>>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>>>
>>>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>>>> commit revealed it.
>>>>
>>>> I would suggest that gigantic pages have not seen much use.  Certainly
>>>> performance with Intel CPUs on benchmarks that I've been involved with
>>>> showed lower performance with 1GB pages than with 2MB pages until quite
>>>> recently.
>>>
>>> I suggested in another thread that maybe it is time to consider
>>> dropping this "feature"
>>
>> You mean dropping gigantic page support in hugetlb?
>
> No, I mean dropping support for arches that want to do:
>
>    tail_page != head_page + tail_page_nr
>
> because they can't allocate the required page array either virtually
> or physically contiguously.
>
> It seems like quite a burden on the core mm for a very niche, and
> maybe even non-existent, case.
>
> It was originally done for PPC, can these PPC systems use VMEMMAP now?
>
>>> The cost to fix GUP to be compatible with this will hurt normal
>>> GUP performance - and again, that nobody has hit this bug in GUP
>>> further suggests the feature isn't used..
>>
>> An easy fix might be to make gigantic hugetlb pages depend on
>> CONFIG_SPARSEMEM_VMEMMAP, which guarantees all struct pages are contiguous.
>
> Yes, exactly.

I actually have a question on CONFIG_SPARSEMEM_VMEMMAP. Can we assume
PFN_A - PFN_B == struct_page_A - struct_page_B, meaning all struct pages
are ordered based on physical addresses? I just wonder for two PFN ranges,
e.g., [0 - 128MB], [128MB - 256MB], if it is possible to first online
[128MB - 256MB] then [0 - 128MB] and the struct pages of [128MB - 256MB]
are in front of [0 - 128MB] in the vmemmap due to online ordering.


—
Best Regards,
Yan Zi
Mike Kravetz Feb. 18, 2021, 5:51 p.m. UTC | #9
On 2/18/21 9:40 AM, Zi Yan wrote:
> On 18 Feb 2021, at 12:32, Jason Gunthorpe wrote:
> 
>> On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
>>> On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
>>>
>>>> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>>>>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>>>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>>>>
>>>>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>>>>> commit revealed it.
>>>>>
>>>>> I would suggest that gigantic pages have not seen much use.  Certainly
>>>>> performance with Intel CPUs on benchmarks that I've been involved with
>>>>> showed lower performance with 1GB pages than with 2MB pages until quite
>>>>> recently.
>>>>
>>>> I suggested in another thread that maybe it is time to consider
>>>> dropping this "feature"
>>>
>>> You mean dropping gigantic page support in hugetlb?
>>
>> No, I mean dropping support for arches that want to do:
>>
>>    tail_page != head_page + tail_page_nr
>>
>> because they can't allocate the required page array either virtually
>> or physically contiguously.
>>
>> It seems like quite a burden on the core mm for a very niche, and
>> maybe even non-existent, case.
>>
>> It was originally done for PPC, can these PPC systems use VMEMMAP now?
>>
>>>> The cost to fix GUP to be compatible with this will hurt normal
>>>> GUP performance - and again, that nobody has hit this bug in GUP
>>>> further suggests the feature isn't used..
>>>
>>> An easy fix might be to make gigantic hugetlb pages depend on
>>> CONFIG_SPARSEMEM_VMEMMAP, which guarantees all struct pages are contiguous.
>>
>> Yes, exactly.
> 
> I actually have a question on CONFIG_SPARSEMEM_VMEMMAP. Can we assume
> PFN_A - PFN_B == struct_page_A - struct_page_B, meaning all struct pages
> are ordered based on physical addresses? I just wonder for two PFN ranges,
> e.g., [0 - 128MB], [128MB - 256MB], if it is possible to first online
> [128MB - 256MB] then [0 - 128MB] and the struct pages of [128MB - 256MB]
> are in front of [0 - 128MB] in the vmemmap due to online ordering.

I have not looked at the code which does the onlining and vmemmap setup.
But, these definitions make me believe it is true:

#elif defined(CONFIG_SPARSEMEM_VMEMMAP)

/* memmap is virtually contiguous.  */
#define __pfn_to_page(pfn)      (vmemmap + (pfn))
#define __page_to_pfn(page)     (unsigned long)((page) - vmemmap)
Zi Yan Feb. 18, 2021, 6:50 p.m. UTC | #10
On 18 Feb 2021, at 12:51, Mike Kravetz wrote:

> On 2/18/21 9:40 AM, Zi Yan wrote:
>> On 18 Feb 2021, at 12:32, Jason Gunthorpe wrote:
>>
>>> On Thu, Feb 18, 2021 at 12:27:58PM -0500, Zi Yan wrote:
>>>> On 18 Feb 2021, at 12:25, Jason Gunthorpe wrote:
>>>>
>>>>> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>>>>>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>>>>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>>>>>
>>>>>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>>>>>> commit revealed it.
>>>>>>
>>>>>> I would suggest that gigantic pages have not seen much use.  Certainly
>>>>>> performance with Intel CPUs on benchmarks that I've been involved with
>>>>>> showed lower performance with 1GB pages than with 2MB pages until quite
>>>>>> recently.
>>>>>
>>>>> I suggested in another thread that maybe it is time to consider
>>>>> dropping this "feature"
>>>>
>>>> You mean dropping gigantic page support in hugetlb?
>>>
>>> No, I mean dropping support for arches that want to do:
>>>
>>>    tail_page != head_page + tail_page_nr
>>>
>>> because they can't allocate the required page array either virtually
>>> or physically contiguously.
>>>
>>> It seems like quite a burden on the core mm for a very niche, and
>>> maybe even non-existent, case.
>>>
>>> It was originally done for PPC, can these PPC systems use VMEMMAP now?
>>>
>>>>> The cost to fix GUP to be compatible with this will hurt normal
>>>>> GUP performance - and again, that nobody has hit this bug in GUP
>>>>> further suggests the feature isn't used..
>>>>
>>>> An easy fix might be to make gigantic hugetlb pages depend on
>>>> CONFIG_SPARSEMEM_VMEMMAP, which guarantees all struct pages are contiguous.
>>>
>>> Yes, exactly.
>>
>> I actually have a question on CONFIG_SPARSEMEM_VMEMMAP. Can we assume
>> PFN_A - PFN_B == struct_page_A - struct_page_B, meaning all struct pages
>> are ordered based on physical addresses? I just wonder for two PFN ranges,
>> e.g., [0 - 128MB], [128MB - 256MB], if it is possible to first online
>> [128MB - 256MB] then [0 - 128MB] and the struct pages of [128MB - 256MB]
>> are in front of [0 - 128MB] in the vmemmap due to online ordering.
>
> I have not looked at the code which does the onlining and vmemmap setup.
> But, these definitions make me believe it is true:
>
> #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
>
> /* memmap is virtually contiguous.  */
> #define __pfn_to_page(pfn)      (vmemmap + (pfn))
> #define __page_to_pfn(page)     (unsigned long)((page) - vmemmap)

Makes sense. Thank you for checking.

I guess making gigantic pages depend on CONFIG_SPARSEMEM_VMEMMAP might
be a good way of simplifying code and avoiding future bugs, unless
there is an arch that really needs gigantic pages and cannot have VMEMMAP.

—
Best Regards,
Yan Zi
Mike Kravetz Feb. 18, 2021, 9:43 p.m. UTC | #11
On 2/18/21 9:34 AM, Mike Kravetz wrote:
> On 2/18/21 9:25 AM, Jason Gunthorpe wrote:
>> On Thu, Feb 18, 2021 at 02:45:54PM +0000, Matthew Wilcox wrote:
>>> On Wed, Feb 17, 2021 at 11:02:52AM -0800, Andrew Morton wrote:
>>>> On Wed, 17 Feb 2021 10:49:25 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:
>>>>> page structs are not guaranteed to be contiguous for gigantic pages.  The
>>>>
>>>> June 2014.  That's a long lurk time for a bug.  I wonder if some later
>>>> commit revealed it.
>>>
>>> I would suggest that gigantic pages have not seen much use.  Certainly
>>> performance with Intel CPUs on benchmarks that I've been involved with
>>> showed lower performance with 1GB pages than with 2MB pages until quite
>>> recently.
>>
>> I suggested in another thread that maybe it is time to consider
>> dropping this "feature"
>>
>> If it has been slightly broken for 7 years it seems a good bet it
>> isn't actually being used.
>>
>> The cost to fix GUP to be compatible with this will hurt normal
>> GUP performance - and again, that nobody has hit this bug in GUP
>> further suggests the feature isn't used..
> 
> I was thinking that we could detect these 'unusual' configurations and only
> do the slower page struct walking in those cases.  However, we would need to
> do some research to make sure we have taken into account all possible config
> options which can produce non-contiguous page structs.  That should have zero
> performance impact in the 'normal' cases.

What about something like the following patch, and making all code that
wants to scan gigantic page subpages use mem_map_next()?

From 95b0384bd5d7f0435546bdd3c01c478724ae0166 Mon Sep 17 00:00:00 2001
From: Mike Kravetz <mike.kravetz@oracle.com>
Date: Thu, 18 Feb 2021 13:35:02 -0800
Subject: [PATCH] mm: define PFN_PAGE_MAP_LINEAR to optimize gigantic page
 scans

Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
---
 arch/ia64/include/asm/page.h       | 1 +
 arch/m68k/include/asm/page_no.h    | 1 +
 include/asm-generic/memory_model.h | 2 ++
 mm/internal.h                      | 2 ++
 4 files changed, 6 insertions(+)

diff --git a/arch/ia64/include/asm/page.h b/arch/ia64/include/asm/page.h
index b69a5499d75b..8f4288862ec8 100644
--- a/arch/ia64/include/asm/page.h
+++ b/arch/ia64/include/asm/page.h
@@ -106,6 +106,7 @@ extern struct page *vmem_map;
 #ifdef CONFIG_DISCONTIGMEM
 # define page_to_pfn(page)	((unsigned long) (page - vmem_map))
 # define pfn_to_page(pfn)	(vmem_map + (pfn))
+# define PFN_PAGE_MAP_LINEAR
 # define __pfn_to_phys(pfn)	PFN_PHYS(pfn)
 #else
 # include <asm-generic/memory_model.h>
diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h
index 6bbe52025de3..cafc0731a42c 100644
--- a/arch/m68k/include/asm/page_no.h
+++ b/arch/m68k/include/asm/page_no.h
@@ -28,6 +28,7 @@ extern unsigned long memory_end;
 
 #define pfn_to_page(pfn)	virt_to_page(pfn_to_virt(pfn))
 #define page_to_pfn(page)	virt_to_pfn(page_to_virt(page))
+#define PFN_PAGE_MAP_LINEAR
 #define pfn_valid(pfn)	        ((pfn) < max_mapnr)
 
 #define	virt_addr_valid(kaddr)	(((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
diff --git a/include/asm-generic/memory_model.h b/include/asm-generic/memory_model.h
index 7637fb46ba4f..8ac4c48dbf22 100644
--- a/include/asm-generic/memory_model.h
+++ b/include/asm-generic/memory_model.h
@@ -33,6 +33,7 @@
 #define __pfn_to_page(pfn)	(mem_map + ((pfn) - ARCH_PFN_OFFSET))
 #define __page_to_pfn(page)	((unsigned long)((page) - mem_map) + \
 				 ARCH_PFN_OFFSET)
+#define PFN_PAGE_MAP_LINEAR
 #elif defined(CONFIG_DISCONTIGMEM)
 
 #define __pfn_to_page(pfn)			\
@@ -53,6 +54,7 @@
 /* memmap is virtually contiguous.  */
 #define __pfn_to_page(pfn)	(vmemmap + (pfn))
 #define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)
+#define PFN_PAGE_MAP_LINEAR
 
 #elif defined(CONFIG_SPARSEMEM)
 /*
diff --git a/mm/internal.h b/mm/internal.h
index 25d2b2439f19..64cc5069047c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -454,12 +454,14 @@ static inline struct page *mem_map_offset(struct page *base, int offset)
 static inline struct page *mem_map_next(struct page *iter,
 						struct page *base, int offset)
 {
+#ifndef PFN_PAGE_MAP_LINEAR
 	if (unlikely((offset & (MAX_ORDER_NR_PAGES - 1)) == 0)) {
 		unsigned long pfn = page_to_pfn(base) + offset;
 		if (!pfn_valid(pfn))
 			return NULL;
 		return pfn_to_page(pfn);
 	}
+#endif
 	return iter + 1;
 }

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4bdb58ab14cb..94e9fa803294 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1312,14 +1312,16 @@  static inline void destroy_compound_gigantic_page(struct page *page,
 static void update_and_free_page(struct hstate *h, struct page *page)
 {
 	int i;
+	struct page *subpage = page;
 
 	if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
 		return;
 
 	h->nr_huge_pages--;
 	h->nr_huge_pages_node[page_to_nid(page)]--;
-	for (i = 0; i < pages_per_huge_page(h); i++) {
-		page[i].flags &= ~(1 << PG_locked | 1 << PG_error |
+	for (i = 0; i < pages_per_huge_page(h);
+	     i++, subpage = mem_map_next(subpage, page, i)) {
+		subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
 				1 << PG_referenced | 1 << PG_dirty |
 				1 << PG_active | 1 << PG_private |
 				1 << PG_writeback);