[2/2] mm: Create non-atomic version of SetPageReserved for init use

Message ID: 20180904183345.4416.76515.stgit@localhost.localdomain (mailing list archive)
State: New, archived
Series: Address issues slowing memory init

Commit Message

Alexander Duyck Sept. 4, 2018, 6:33 p.m. UTC
From: Alexander Duyck <alexander.h.duyck@intel.com>

It doesn't make much sense to use the atomic SetPageReserved at init time
when we are using memset to clear the memory and manipulating the page
flags via simple "&=" and "|=" operations in __init_single_page.

This patch adds a non-atomic version __SetPageReserved that can be used
during page init and shows about a 10% improvement in initialization times
on the systems I have available for testing.
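
For reference, the two flavors boil down to set_bit() vs. __set_bit().
A minimal sketch of the difference (simplified stand-ins, not the
actual kernel bitops; assume nr is below BITS_PER_LONG):

/* SetPageReserved() ends up in an atomic set_bit(); on x86 that is a
 * LOCK-prefixed read-modify-write, which also acts as a full barrier. */
static inline void set_bit_sketch(long nr, unsigned long *addr)
{
	__atomic_fetch_or(addr, 1UL << nr, __ATOMIC_SEQ_CST);
}

/* __SetPageReserved() uses the non-atomic __set_bit(): a plain OR that
 * the compiler is free to merge with neighboring flag updates. */
static inline void __set_bit_sketch(long nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}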

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---
 include/linux/page-flags.h |    1 +
 mm/page_alloc.c            |    4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

Comments

Dave Hansen Sept. 4, 2018, 7:27 p.m. UTC | #1
On 09/04/2018 11:33 AM, Alexander Duyck wrote:
> +++ b/mm/page_alloc.c
> @@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
>  			/* Avoid false-positive PageTail() */
>  			INIT_LIST_HEAD(&page->lru);
>  
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>  		}
>  	}
>  }
> @@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);

Comments needed, please.  SetPageReserved() is opaque enough by itself,
but having to discern between it and an __ variant is even less fun.
Michal Hocko Sept. 5, 2018, 6:24 a.m. UTC | #2
On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
> 
> It doesn't make much sense to use the atomic SetPageReserved at init time
> when we are using memset to clear the memory and manipulating the page
> flags via simple "&=" and "|=" operations in __init_single_page.
> 
> This patch adds a non-atomic version __SetPageReserved that can be used
> during page init and shows about a 10% improvement in initialization times
> on the systems I have available for testing.

I agree with Dave that a comment is due. I am also quite surprised that
this leads to such a large improvement. Could you be more specific about
your test and the machines you were testing on?

Other than that the patch makes sense to me.

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

With the above addressed, feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  include/linux/page-flags.h |    1 +
>  mm/page_alloc.c            |    4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 74bee8cecf4c..57ec3fef7e9f 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -292,6 +292,7 @@ static inline int PagePoisoned(const struct page *page)
>  
>  PAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
>  	__CLEARPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
> +	__SETPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
>  PAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
>  	__CLEARPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
>  	__SETPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 05e983f42316..9c7d6e971630 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
>  			/* Avoid false-positive PageTail() */
>  			INIT_LIST_HEAD(&page->lru);
>  
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>  		}
>  	}
>  }
> @@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>  
>  		/*
>  		 * Mark the block movable so that blocks are reserved for
>
Alexander Duyck Sept. 5, 2018, 8:18 p.m. UTC | #3
On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> > From: Alexander Duyck <alexander.h.duyck@intel.com>
> >
> > It doesn't make much sense to use the atomic SetPageReserved at init time
> > when we are using memset to clear the memory and manipulating the page
> > flags via simple "&=" and "|=" operations in __init_single_page.
> >
> > This patch adds a non-atomic version __SetPageReserved that can be used
> > during page init and shows about a 10% improvement in initialization times
> > on the systems I have available for testing.
>
> I agree with Dave that a comment is due. I am also quite surprised that
> this leads to such a large improvement. Could you be more specific about
> your test and the machines you were testing on?

So my test case has been just initializing 4 3TB blocks of persistent
memory with a few trace_printk values added to track total time in
move_pfn_range_to_zone.

What I have been seeing is that the time needed for the call drops on
average from 35-36 seconds down to around 31-32.
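
Roughly the kind of instrumentation described above (a hypothetical
sketch; the actual debug patch was not posted, and the exact signature
may differ by kernel version):

void __ref move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
		unsigned long nr_pages, struct vmem_altmap *altmap)
{
	u64 start = sched_clock();	/* ns timestamp */

	/* ... existing body: zone resizing, memmap_init_zone(), ... */

	trace_printk("moved %lu pfns to zone in %llu ns\n",
		     nr_pages, sched_clock() - start);
}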

> Other than that the patch makes sense to me.
>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>
> With the above addressed, feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Thanks!

As far as adding a comment, are we just talking about why it is
reserved, or do we need a description of __SetPageReserved versus
SetPageReserved? For now I was looking at adding a comment like:
@@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 not_early:
                page = pfn_to_page(pfn);
                __init_single_page(page, pfn, zone, nid);
+
+               /*
+                * Mark page reserved as it will need to wait for onlining
+                * phase for it to be fully associated with a zone.
+                */
                if (context == MEMMAP_HOTPLUG)
-                       SetPageReserved(page);
+                       __SetPageReserved(page);

                /*
                 * Mark the block movable so that blocks are reserved for

Any thoughts on this?

Thanks.

- Alex
Pasha Tatashin Sept. 5, 2018, 8:22 p.m. UTC | #4
On 9/5/18 4:18 PM, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
>>
>> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
>>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>>
>>> It doesn't make much sense to use the atomic SetPageReserved at init time
>>> when we are using memset to clear the memory and manipulating the page
>>> flags via simple "&=" and "|=" operations in __init_single_page.
>>>
>>> This patch adds a non-atomic version __SetPageReserved that can be used
>>> during page init and shows about a 10% improvement in initialization times
>>> on the systems I have available for testing.
>>
>> I agree with Dave that a comment is due. I am also quite surprised that
>> this leads to such a large improvement. Could you be more specific about
>> your test and the machines you were testing on?
> 
> So my test case has been just initializing 4 3TB blocks of persistent
> memory with a few trace_printk values added to track total time in
> move_pfn_range_to_zone.
> 
> What I have been seeing is that the time needed for the call drops on
> average from 35-36 seconds down to around 31-32.

Just curious: why is there variance? During boot, timing is usually
pretty consistent, as there is only one thread and the system is in
pretty much the same state.

A dmesg output in the commit log would be helpful.

Thank you,
Pavel

> 
>> Other than that the patch makes sense to me.
>>
>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> With the above addressed, feel free to add
>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> Thanks!
> 
> As far as adding a comment, are we just talking about why it is
> reserved, or do we need a description of __SetPageReserved versus
> SetPageReserved? For now I was looking at adding a comment like:
> @@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  not_early:
>                 page = pfn_to_page(pfn);
>                 __init_single_page(page, pfn, zone, nid);
> +
> +               /*
> +                * Mark page reserved as it will need to wait for onlining
> +                * phase for it to be fully associated with a zone.
> +                */
>                 if (context == MEMMAP_HOTPLUG)
> -                       SetPageReserved(page);
> +                       __SetPageReserved(page);
> 
>                 /*
>                  * Mark the block movable so that blocks are reserved for
> 
> Any thoughts on this?
> 
> Thanks.
> 
> - Alex
>
Alexander Duyck Sept. 5, 2018, 8:35 p.m. UTC | #5
On Wed, Sep 5, 2018 at 1:22 PM Pasha Tatashin
<Pavel.Tatashin@microsoft.com> wrote:
>
>
>
> On 9/5/18 4:18 PM, Alexander Duyck wrote:
> > On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
> >>
> >> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> >>> From: Alexander Duyck <alexander.h.duyck@intel.com>
> >>>
> >>> It doesn't make much sense to use the atomic SetPageReserved at init time
> >>> when we are using memset to clear the memory and manipulating the page
> >>> flags via simple "&=" and "|=" operations in __init_single_page.
> >>>
> >>> This patch adds a non-atomic version __SetPageReserved that can be used
> >>> during page init and shows about a 10% improvement in initialization times
> >>> on the systems I have available for testing.
> >>
> >> I agree with Dave that a comment is due. I am also quite surprised that
> >> this leads to such a large improvement. Could you be more specific about
> >> your test and the machines you were testing on?
> >
> > So my test case has been just initializing 4 3TB blocks of persistent
> > memory with a few trace_printk values added to track total time in
> > move_pfn_range_to_zone.
> >
> > What I have been seeing is that the time needed for the call drops on
> > average from 35-36 seconds down to around 31-32.
>
> Just curious: why is there variance? During boot, timing is usually
> pretty consistent, as there is only one thread and the system is in
> pretty much the same state.
>
> A dmesg output in the commit log would be helpful.
>
> Thank you,
> Pavel

The variance has to do with the fact that the memory is being added
via hot-plug. So in this case the system boots, and then after 5
minutes it goes about hot-plugging the memory. The memmap_init_zone
call makes regular calls into cond_resched(), and it seems that any
other active threads can end up impacting the timings, introducing a
few hundred ms of variation between runs.

In addition, NUMA locality also plays a role. I have seen values as
low as 25.5s pre-patch and 23.2s after, and values as high as 39.17s
pre-patch and 37.3s after. I am assuming that the lowest values just
happened to luck into being node local, and the highest values end up
being 2 nodes away on the 4-node system I am testing. I'm planning to
try to address the NUMA issues with an approach similar to what
deferred_init already does: start a kernel thread on the correct node
and then just wait on that to complete outside of the hotplug lock.
The solution will probably end up being a hybrid between the work Dan
Williams submitted a couple of months ago and the existing
deferred_init code. But I will be targeting that for 4.20 at the
earliest.
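
A very rough sketch of that direction (purely hypothetical -- the
argument struct and names are illustrative, and this is not a posted
patch):

/* hypothetical per-node init thread */
static int memmap_init_fn(void *data)
{
	struct memmap_init_args *args = data;	/* hypothetical struct */

	memmap_init_zone(args->size, args->nid, args->zone,
			 args->start_pfn, MEMMAP_HOTPLUG, args->altmap);
	complete(&args->done);
	return 0;
}

	/* caller: run the init on the target node ... */
	struct task_struct *t;

	t = kthread_create_on_node(memmap_init_fn, &args, args.nid,
				   "memmap_init/%d", args.nid);
	if (!IS_ERR(t))
		wake_up_process(t);
	/* ... then wait for it outside of the hotplug lock: */
	wait_for_completion(&args.done);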

- Alex
Michal Hocko Sept. 6, 2018, 5:41 a.m. UTC | #6
On Wed 05-09-18 13:18:24, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexander.h.duyck@intel.com>
> > >
> > > It doesn't make much sense to use the atomic SetPageReserved at init time
> > > when we are using memset to clear the memory and manipulating the page
> > > flags via simple "&=" and "|=" operations in __init_single_page.
> > >
> > > This patch adds a non-atomic version __SetPageReserved that can be used
> > > during page init and shows about a 10% improvement in initialization times
> > > on the systems I have available for testing.
> >
> > I agree with Dave that a comment is due. I am also quite surprised that
> > this leads to such a large improvement. Could you be more specific about
> > your test and the machines you were testing on?
> 
> So my test case has been just initializing 4 3TB blocks of persistent
> memory with a few trace_printk values added to track total time in
> move_pfn_range_to_zone.
> 
> What I have been seeing is that the time needed for the call drops on
> average from 35-36 seconds down to around 31-32.

This information belongs in the changelog.

> 
> > Other than that the patch makes sense to me.
> >
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >
> > With the above addressed, feel free to add
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> > Thanks!
> 
> As far as adding a comment, are we just talking about why it is
> reserved, or do we need a description of __SetPageReserved versus
> SetPageReserved? For now I was looking at adding a comment like:

The latter. The reason why we make it reserved should be quite clear.
A comment wouldn't hurt, of course, and what you have is a good start.
But it is usually the atomic vs. non-atomic SetPage$Foo distinction
that needs some clarification.
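
Something along these lines would do, for instance (a suggested
wording only, not necessarily what gets merged):

	/*
	 * No need for the atomic SetPageReserved() here: the page is
	 * not visible to anybody else yet, so nothing can race with
	 * this non-atomic flags update.
	 */
	if (context == MEMMAP_HOTPLUG)
		__SetPageReserved(page);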

> @@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  not_early:
>                 page = pfn_to_page(pfn);
>                 __init_single_page(page, pfn, zone, nid);
> +
> +               /*
> +                * Mark page reserved as it will need to wait for onlining
> +                * phase for it to be fully associated with a zone.
> +                */
>                 if (context == MEMMAP_HOTPLUG)
> -                       SetPageReserved(page);
> +                       __SetPageReserved(page);
> 
>                 /*
>                  * Mark the block movable so that blocks are reserved for
> 
> Any thoughts on this?
> 
> Thanks.
> 
> - Alex

Patch

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 74bee8cecf4c..57ec3fef7e9f 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -292,6 +292,7 @@ static inline int PagePoisoned(const struct page *page)
 
 PAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
 	__CLEARPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
+	__SETPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
 PAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
 	__CLEARPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
 	__SETPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05e983f42316..9c7d6e971630 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
 			/* Avoid false-positive PageTail() */
 			INIT_LIST_HEAD(&page->lru);
 
-			SetPageReserved(page);
+			__SetPageReserved(page);
 		}
 	}
 }
@@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
 		if (context == MEMMAP_HOTPLUG)
-			SetPageReserved(page);
+			__SetPageReserved(page);
 
 		/*
 		 * Mark the block movable so that blocks are reserved for
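
For readers not fluent in the page-flags macro machinery: the
__SETPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND) line added above
generates the non-atomic setter. Roughly, with the PF_NO_COMPOUND
policy check elided, it expands to:

static __always_inline void __SetPageReserved(struct page *page)
{
	/* non-atomic __set_bit(), unlike the atomic set_bit() behind
	 * SetPageReserved() */
	__set_bit(PG_reserved, &page->flags);
}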