| Message ID | 20180904183345.4416.76515.stgit@localhost.localdomain (mailing list archive) |
|---|---|
| State | New, archived |
| Series | Address issues slowing memory init |
On 09/04/2018 11:33 AM, Alexander Duyck wrote:
> +++ b/mm/page_alloc.c
> @@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
>  		/* Avoid false-positive PageTail() */
>  		INIT_LIST_HEAD(&page->lru);
>
> -		SetPageReserved(page);
> +		__SetPageReserved(page);
>  	}
>  }
>  }
> @@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);

Comments needed, please.  SetPageReserved() is opaque enough by itself, but having to discern between it and an __ variant is even less fun.
On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@intel.com>
>
> It doesn't make much sense to use the atomic SetPageReserved at init time
> when we are using memset to clear the memory and manipulating the page
> flags via simple "&=" and "|=" operations in __init_single_page.
>
> This patch adds a non-atomic version __SetPageReserved that can be used
> during page init and shows about a 10% improvement in initialization times
> on the systems I have available for testing.

I agree with Dave that a comment is due. I am also quite surprised that
this leads to such a large improvement. Could you be more specific about
your test and machines you were testing on?

Other than that the patch makes sense to me.

> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>

With the above addressed, feel free to add
Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  include/linux/page-flags.h |    1 +
>  mm/page_alloc.c            |    4 ++--
>  2 files changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 74bee8cecf4c..57ec3fef7e9f 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -292,6 +292,7 @@ static inline int PagePoisoned(const struct page *page)
>
>  PAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
>  	__CLEARPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
> +	__SETPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
>  PAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
>  	__CLEARPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
>  	__SETPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 05e983f42316..9c7d6e971630 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
>  		/* Avoid false-positive PageTail() */
>  		INIT_LIST_HEAD(&page->lru);
>
> -		SetPageReserved(page);
> +		__SetPageReserved(page);
>  	}
>  }
>  }
> @@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>
>  		/*
>  		 * Mark the block movable so that blocks are reserved for
>
On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> > From: Alexander Duyck <alexander.h.duyck@intel.com>
> >
> > It doesn't make much sense to use the atomic SetPageReserved at init time
> > when we are using memset to clear the memory and manipulating the page
> > flags via simple "&=" and "|=" operations in __init_single_page.
> >
> > This patch adds a non-atomic version __SetPageReserved that can be used
> > during page init and shows about a 10% improvement in initialization times
> > on the systems I have available for testing.
>
> I agree with Dave that a comment is due. I am also quite surprised that
> this leads to such a large improvement. Could you be more specific about
> your test and machines you were testing on?

So my test case has been just initializing 4 3TB blocks of persistent
memory with a few trace_printk values added to track total time in
move_pfn_range_to_zone.

What I have been seeing is that the time needed for the call drops on
average from 35-36 seconds down to around 31-32.

> Other than that the patch makes sense to me.
>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>
> With the above addressed, feel free to add
> Acked-by: Michal Hocko <mhocko@suse.com>
>
> Thanks!

As far as adding a comment goes, are we just talking about why it is
reserved, or do we need a description of __SetPageReserved versus
SetPageReserved? For now I was looking at adding a comment like:

@@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 not_early:
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
+
+		/*
+		 * Mark page reserved as it will need to wait for onlining
+		 * phase for it to be fully associated with a zone.
+		 */
 		if (context == MEMMAP_HOTPLUG)
-			SetPageReserved(page);
+			__SetPageReserved(page);

 		/*
 		 * Mark the block movable so that blocks are reserved for

Any thoughts on this?

Thanks.

- Alex
On 9/5/18 4:18 PM, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
>>
>> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
>>> From: Alexander Duyck <alexander.h.duyck@intel.com>
>>>
>>> It doesn't make much sense to use the atomic SetPageReserved at init time
>>> when we are using memset to clear the memory and manipulating the page
>>> flags via simple "&=" and "|=" operations in __init_single_page.
>>>
>>> This patch adds a non-atomic version __SetPageReserved that can be used
>>> during page init and shows about a 10% improvement in initialization times
>>> on the systems I have available for testing.
>>
>> I agree with Dave that a comment is due. I am also quite surprised that
>> this leads to such a large improvement. Could you be more specific about
>> your test and machines you were testing on?
>
> So my test case has been just initializing 4 3TB blocks of persistent
> memory with a few trace_printk values added to track total time in
> move_pfn_range_to_zone.
>
> What I have been seeing is that the time needed for the call drops on
> average from 35-36 seconds down to around 31-32.

Just curious, why is there variance? During boot, timing is usually pretty
consistent, as there is only one thread and the system is in pretty much
the same state.

A dmesg output in the commit log would be helpful.

Thank you,
Pavel

>
>> Other than that the patch makes sense to me.
>>
>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>>
>> With the above addressed, feel free to add
>> Acked-by: Michal Hocko <mhocko@suse.com>
>>
>> Thanks!
>
> As far as adding a comment goes, are we just talking about why it is
> reserved, or do we need a description of __SetPageReserved versus
> SetPageReserved? For now I was looking at adding a comment like:
>
> @@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>  not_early:
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
> +
> +		/*
> +		 * Mark page reserved as it will need to wait for onlining
> +		 * phase for it to be fully associated with a zone.
> +		 */
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>
>  		/*
>  		 * Mark the block movable so that blocks are reserved for
>
> Any thoughts on this?
>
> Thanks.
>
> - Alex
>
On Wed, Sep 5, 2018 at 1:22 PM Pasha Tatashin <Pavel.Tatashin@microsoft.com> wrote:
>
> On 9/5/18 4:18 PM, Alexander Duyck wrote:
> > On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
> >>
> >> On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> >>> From: Alexander Duyck <alexander.h.duyck@intel.com>
> >>>
> >>> It doesn't make much sense to use the atomic SetPageReserved at init time
> >>> when we are using memset to clear the memory and manipulating the page
> >>> flags via simple "&=" and "|=" operations in __init_single_page.
> >>>
> >>> This patch adds a non-atomic version __SetPageReserved that can be used
> >>> during page init and shows about a 10% improvement in initialization times
> >>> on the systems I have available for testing.
> >>
> >> I agree with Dave that a comment is due. I am also quite surprised that
> >> this leads to such a large improvement. Could you be more specific about
> >> your test and machines you were testing on?
> >
> > So my test case has been just initializing 4 3TB blocks of persistent
> > memory with a few trace_printk values added to track total time in
> > move_pfn_range_to_zone.
> >
> > What I have been seeing is that the time needed for the call drops on
> > average from 35-36 seconds down to around 31-32.
>
> Just curious, why is there variance? During boot, timing is usually pretty
> consistent, as there is only one thread and the system is in pretty much
> the same state.
>
> A dmesg output in the commit log would be helpful.
>
> Thank you,
> Pavel

The variance has to do with the fact that the memory is being added via
hot-plug. In this case the system boots, and then after 5 minutes it goes
about hot-plugging the memory. The memmap_init_zone call makes regular
calls into cond_resched(), and it seems like any other active threads can
end up impacting the timings and providing a few hundred ms of variation
between runs. In addition, NUMA locality also plays a role. I have seen
values as low as 25.5s pre-patch, 23.2s after, and values as high as
39.17s pre-patch, 37.3s after. I am assuming that the lowest values just
happened to luck into being node local, and the highest values ended up
being 2 nodes away on the 4-node system I am testing.

I'm planning to try to address the NUMA issues using an approach similar
to what deferred_init is already doing: starting a kernel thread on the
correct node and then waiting on it to complete outside of the hotplug
lock. The solution will probably end up being a hybrid between the work
Dan Williams submitted a couple of months ago and the existing
deferred_init code. But I will be targeting that for 4.20 at the earliest.

- Alex
On Wed 05-09-18 13:18:24, Alexander Duyck wrote:
> On Tue, Sep 4, 2018 at 11:24 PM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Tue 04-09-18 11:33:45, Alexander Duyck wrote:
> > > From: Alexander Duyck <alexander.h.duyck@intel.com>
> > >
> > > It doesn't make much sense to use the atomic SetPageReserved at init time
> > > when we are using memset to clear the memory and manipulating the page
> > > flags via simple "&=" and "|=" operations in __init_single_page.
> > >
> > > This patch adds a non-atomic version __SetPageReserved that can be used
> > > during page init and shows about a 10% improvement in initialization times
> > > on the systems I have available for testing.
> >
> > I agree with Dave that a comment is due. I am also quite surprised that
> > this leads to such a large improvement. Could you be more specific about
> > your test and machines you were testing on?
>
> So my test case has been just initializing 4 3TB blocks of persistent
> memory with a few trace_printk values added to track total time in
> move_pfn_range_to_zone.
>
> What I have been seeing is that the time needed for the call drops on
> average from 35-36 seconds down to around 31-32.

This information belongs in the changelog.

> > Other than that the patch makes sense to me.
> >
> > > Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >
> > With the above addressed, feel free to add
> > Acked-by: Michal Hocko <mhocko@suse.com>
> >
> > Thanks!
>
> As far as adding a comment goes, are we just talking about why it is
> reserved, or do we need a description of __SetPageReserved versus
> SetPageReserved? For now I was looking at adding a comment like:

The latter. The reason why we make it reserved should be quite clear. A
comment wouldn't hurt of course, and what you have is a good start. But it
is usually atomic vs. non-atomic SetPage$Foo which needs some
clarification.

> @@ -5517,8 +5517,13 @@ void __meminit memmap_init_zone(unsigned long
> size, int nid, unsigned long zone,
>  not_early:
>  		page = pfn_to_page(pfn);
>  		__init_single_page(page, pfn, zone, nid);
> +
> +		/*
> +		 * Mark page reserved as it will need to wait for onlining
> +		 * phase for it to be fully associated with a zone.
> +		 */
>  		if (context == MEMMAP_HOTPLUG)
> -			SetPageReserved(page);
> +			__SetPageReserved(page);
>
>  		/*
>  		 * Mark the block movable so that blocks are reserved for
>
> Any thoughts on this?
>
> Thanks.
>
> - Alex
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 74bee8cecf4c..57ec3fef7e9f 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -292,6 +292,7 @@ static inline int PagePoisoned(const struct page *page)
 
 PAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
 	__CLEARPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
+	__SETPAGEFLAG(Reserved, reserved, PF_NO_COMPOUND)
 PAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
 	__CLEARPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
 	__SETPAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05e983f42316..9c7d6e971630 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1231,7 +1231,7 @@ void __meminit reserve_bootmem_region(phys_addr_t start, phys_addr_t end)
 		/* Avoid false-positive PageTail() */
 		INIT_LIST_HEAD(&page->lru);
 
-		SetPageReserved(page);
+		__SetPageReserved(page);
 	}
 }
 
@@ -5518,7 +5518,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
 		page = pfn_to_page(pfn);
 		__init_single_page(page, pfn, zone, nid);
 		if (context == MEMMAP_HOTPLUG)
-			SetPageReserved(page);
+			__SetPageReserved(page);
 
 		/*
 		 * Mark the block movable so that blocks are reserved for