Message ID | 1596372896-15336-1-git-send-email-charante@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm, memory_hotplug: update pcp lists everytime onlining a memory block | expand |
On 02.08.20 14:54, Charan Teja Reddy wrote: > When onlining a first memory block in a zone, pcp lists are not updated > thus pcp struct will have the default setting of ->high = 0,->batch = 1. > This means till the second memory block in a zone(if it have) is onlined > the pcp lists of this zone will not contain any pages because pcp's > ->count is always greater than ->high thus free_pcppages_bulk() is > called to free batch size(=1) pages every time system wants to add a > page to the pcp list through free_unref_page(). To put this in a word, > system is not using benefits offered by the pcp lists when there is a > single onlineable memory block in a zone. Correct this by always > updating the pcp lists when memory block is onlined. I guess such setups are rare ... but I can imagine it being the case with virtio-mem in the future ... not 100% if this performance optimization is really relevant in practice ... how did you identify this? > > Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> > --- > mm/memory_hotplug.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index dcdf327..7f62d69 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -854,8 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, > node_states_set_node(nid, &arg); > if (need_zonelists_rebuild) > build_all_zonelists(NULL); > - else > - zone_pcp_update(zone); > + zone_pcp_update(zone); > > init_per_zone_wmark_min(); > > Does, in general, look sane to me. Reviewed-by: David Hildenbrand <david@redhat.com>
Thanks David for the comments. On 8/3/2020 1:35 PM, David Hildenbrand wrote: > On 02.08.20 14:54, Charan Teja Reddy wrote: >> When onlining a first memory block in a zone, pcp lists are not updated >> thus pcp struct will have the default setting of ->high = 0,->batch = 1. >> This means till the second memory block in a zone(if it have) is onlined >> the pcp lists of this zone will not contain any pages because pcp's >> ->count is always greater than ->high thus free_pcppages_bulk() is >> called to free batch size(=1) pages every time system wants to add a >> page to the pcp list through free_unref_page(). To put this in a word, >> system is not using benefits offered by the pcp lists when there is a >> single onlineable memory block in a zone. Correct this by always >> updating the pcp lists when memory block is onlined. > > I guess such setups are rare ... but I can imagine it being the case > with virtio-mem in the future ... not 100% if this performance > optimization is really relevant in practice ... how did you identify this? Even the Snapdragon hardware that I had tested on contain multiple onlineable memory blocks. But we have the use case in which we online single memory block and once it is filled then online the next block. In the step where single block is onlined, we observed the below pageset params. pagesets cpu: 0 count: 0 high: 0 batch: 1 Once the second block is onlined then only seeing some sane values as below. cpu: 0 count: 32 high: 378 batch: 63 In the above case, till the second block is onlined, no page is held in the pcp list. So, updating the pcp params every time when onlining the memory block is required, as an example in the usecase that I had mentioned. > >> >> Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> >> --- >> mm/memory_hotplug.c | 3 +-- >> 1 file changed, 1 insertion(+), 2 deletions(-) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index dcdf327..7f62d69 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -854,8 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, >> node_states_set_node(nid, &arg); >> if (need_zonelists_rebuild) >> build_all_zonelists(NULL); >> - else >> - zone_pcp_update(zone); >> + zone_pcp_update(zone); >> >> init_per_zone_wmark_min(); >> >> > > Does, in general, look sane to me. > > Reviewed-by: David Hildenbrand <david@redhat.com> Thanks for the ACK. >
On 8/2/20 2:54 PM, Charan Teja Reddy wrote: > When onlining a first memory block in a zone, pcp lists are not updated > thus pcp struct will have the default setting of ->high = 0,->batch = 1. > This means till the second memory block in a zone(if it have) is onlined > the pcp lists of this zone will not contain any pages because pcp's > ->count is always greater than ->high thus free_pcppages_bulk() is > called to free batch size(=1) pages every time system wants to add a > page to the pcp list through free_unref_page(). To put this in a word, > system is not using benefits offered by the pcp lists when there is a > single onlineable memory block in a zone. Correct this by always > updating the pcp lists when memory block is onlined. > > Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Makes sense to me. Acked-by: Vlastimil Babka <vbabka@suse.cz> > --- > mm/memory_hotplug.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index dcdf327..7f62d69 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -854,8 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, > node_states_set_node(nid, &arg); > if (need_zonelists_rebuild) > build_all_zonelists(NULL); > - else > - zone_pcp_update(zone); > + zone_pcp_update(zone); > > init_per_zone_wmark_min(); > >
On 03.08.20 15:28, Charan Teja Kalla wrote: > Thanks David for the comments. > > On 8/3/2020 1:35 PM, David Hildenbrand wrote: >> On 02.08.20 14:54, Charan Teja Reddy wrote: >>> When onlining a first memory block in a zone, pcp lists are not updated >>> thus pcp struct will have the default setting of ->high = 0,->batch = 1. >>> This means till the second memory block in a zone(if it have) is onlined >>> the pcp lists of this zone will not contain any pages because pcp's >>> ->count is always greater than ->high thus free_pcppages_bulk() is >>> called to free batch size(=1) pages every time system wants to add a >>> page to the pcp list through free_unref_page(). To put this in a word, >>> system is not using benefits offered by the pcp lists when there is a >>> single onlineable memory block in a zone. Correct this by always >>> updating the pcp lists when memory block is onlined. >> >> I guess such setups are rare ... but I can imagine it being the case >> with virtio-mem in the future ... not 100% if this performance >> optimization is really relevant in practice ... how did you identify this? > > Even the Snapdragon hardware that I had tested on contain multiple > onlineable memory blocks. But we have the use case in which we online > single memory block and once it is filled then online the next block. In > the step where single block is onlined, we observed the below pageset Out of interest, why? Is it to optimize energy consumption?
On Sun 02-08-20 18:24:56, Charan Teja Reddy wrote: > When onlining a first memory block in a zone, pcp lists are not updated > thus pcp struct will have the default setting of ->high = 0,->batch = 1. > This means till the second memory block in a zone(if it have) is onlined > the pcp lists of this zone will not contain any pages because pcp's > ->count is always greater than ->high thus free_pcppages_bulk() is > called to free batch size(=1) pages every time system wants to add a > page to the pcp list through free_unref_page(). To put this in a word, > system is not using benefits offered by the pcp lists when there is a > single onlineable memory block in a zone. Correct this by always > updating the pcp lists when memory block is onlined. Yes this seems like an ancient bug Fixes: 1f522509c77a ("mem-hotplug: avoid multiple zones sharing same boot strapping boot_pageset") Just nobody has noticed because a single block memory zone is really rare. > Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> Acked-by: Michal Hocko <mhocko@suse.com> Thanks > --- > mm/memory_hotplug.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index dcdf327..7f62d69 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -854,8 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, > node_states_set_node(nid, &arg); > if (need_zonelists_rebuild) > build_all_zonelists(NULL); > - else > - zone_pcp_update(zone); > + zone_pcp_update(zone); > > init_per_zone_wmark_min(); > > -- > QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a > member of the Code Aurora Forum, hosted by The Linux Foundation >
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index dcdf327..7f62d69 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -854,8 +854,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, node_states_set_node(nid, &arg); if (need_zonelists_rebuild) build_all_zonelists(NULL); - else - zone_pcp_update(zone); + zone_pcp_update(zone); init_per_zone_wmark_min();
When onlining a first memory block in a zone, pcp lists are not updated thus pcp struct will have the default setting of ->high = 0,->batch = 1. This means till the second memory block in a zone(if it have) is onlined the pcp lists of this zone will not contain any pages because pcp's ->count is always greater than ->high thus free_pcppages_bulk() is called to free batch size(=1) pages every time system wants to add a page to the pcp list through free_unref_page(). To put this in a word, system is not using benefits offered by the pcp lists when there is a single onlineable memory block in a zone. Correct this by always updating the pcp lists when memory block is onlined. Signed-off-by: Charan Teja Reddy <charante@codeaurora.org> --- mm/memory_hotplug.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)