Message ID: 20220715170312.13931-4-julien@xen.org (mailing list archive)
State: New, archived
Series: xen/mm: Optimize init_heap_pages()
Hi Julien,

On 2022/7/16 1:03, Julien Grall wrote:
> From: Hongyan Xia <hongyxia@amazon.com>
>
> The idea is to split the range into multiple aligned power-of-2 regions
> which only need to call free_heap_pages() once each. We check the least
> significant set bit of the start address and use its bit index as the
> order of this increment. This makes sure that each increment is both
> power-of-2 and properly aligned, which can be safely passed to
> free_heap_pages(). Of course, the order also needs to be sanity checked
> against the upper bound and MAX_ORDER.
>
> Tested on a nested environment on c5.metal with various amounts
> of RAM and CONFIG_DEBUG=n. Time for end_boot_allocator() to complete:
>             Before      After
>     - 90GB: 1445 ms     96 ms
>     -  8GB:  126 ms      8 ms
>     -  4GB:   62 ms      4 ms
>
> Signed-off-by: Hongyan Xia <hongyxia@amazon.com>
> Signed-off-by: Julien Grall <jgrall@amazon.com>
>
> ---
>
> Changes in v2:
>     - Update comment
>     - Update the numbers. They are slightly better as is_contig_page()
>       has been folded in init_heap_pages().
> ---
>  xen/common/page_alloc.c | 35 ++++++++++++++++++++++++++++++++---
>  1 file changed, 32 insertions(+), 3 deletions(-)
>
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index eedb2fed77c3..2b99801d2ea3 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1779,7 +1779,7 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
>
>  /*
>   * This function should only be called with valid pages from the same NUMA
> - * node.
> + * node and zone.
>   */
>  static void _init_heap_pages(const struct page_info *pg,
>                               unsigned long nr_pages,
> @@ -1806,8 +1806,22 @@ static void _init_heap_pages(const struct page_info *pg,
>
>      while ( s < e )
>      {
> -        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
> -        s += 1UL;
> +        /*
> +         * For s == 0, we simply use the largest increment by checking the
> +         * MSB of the region size. For s != 0, we also need to ensure that
> +         * the chunk is properly sized to end at power-of-two alignment. We
> +         * do this by checking the LSB of the start address and use its
> +         * index as the increment. Both cases need to be guarded by
> +         * MAX_ORDER.
> +         *
> +         * Note that the value of ffsl() and flsl() starts from 1 so we
> +         * need to decrement it by 1.
> +         */
> +        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
> +
> +        if ( s )
> +            inc_order = min(inc_order, ffsl(s) - 1);
> +        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);
> +        s += (1UL << inc_order);
>      }
>  }
>
> @@ -1844,6 +1858,9 @@ static void init_heap_pages(
>
>      for ( i = 0; i < nr_pages; )
>      {
> +#ifdef CONFIG_SEPARATE_XENHEAP
> +        unsigned int zone = page_to_zone(pg);
> +#endif
>          unsigned int nid = phys_to_nid(page_to_maddr(pg));
>          unsigned long left = nr_pages - i;
>          unsigned long contig_pages;
> @@ -1856,6 +1873,18 @@ static void init_heap_pages(
>           */
>          for ( contig_pages = 1; contig_pages < left; contig_pages++ )
>          {
> +            /*
> +             * No need to check for the zone when !CONFIG_SEPARATE_XENHEAP
> +             * because free_heap_pages() can only take power-of-two ranges
> +             * which never cross zone boundaries. But for separate xenheap
> +             * which is manually defined, it is possible for a power-of-two
> +             * range to cross zones.
> +             */
> +#ifdef CONFIG_SEPARATE_XENHEAP
> +            if ( zone != page_to_zone(pg) )
> +                break;
> +#endif
> +
>              if ( nid != (phys_to_nid(page_to_maddr(pg))) )
>                  break;
>          }

Reviewed-by: Wei Chen <Wei.Chen@arm.com>
On 15.07.2022 19:03, Julien Grall wrote:
> @@ -1806,8 +1806,22 @@ static void _init_heap_pages(const struct page_info *pg,
>
>      while ( s < e )
>      {
> -        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
> -        s += 1UL;
> +        /*
> +         * For s == 0, we simply use the largest increment by checking the
> +         * MSB of the region size. For s != 0, we also need to ensure that
> +         * the chunk is properly sized to end at power-of-two alignment. We
> +         * do this by checking the LSB of the start address and use its
> +         * index as the increment. Both cases need to be guarded by
> +         * MAX_ORDER.

s/guarded/bounded/ ?

> +         * Note that the value of ffsl() and flsl() starts from 1 so we
> +         * need to decrement it by 1.
> +         */
> +        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);

unsigned int, since ...

> +        if ( s )
> +            inc_order = min(inc_order, ffsl(s) - 1);
> +        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);

... that's what the function parameter says, and the value also can go
negative?

Preferably with these adjustments
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
Hi Jan,

On 18/07/2022 10:43, Jan Beulich wrote:
> On 15.07.2022 19:03, Julien Grall wrote:
>> @@ -1806,8 +1806,22 @@ static void _init_heap_pages(const struct page_info *pg,
>>
>>      while ( s < e )
>>      {
>> -        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
>> -        s += 1UL;
>> +        /*
>> +         * For s == 0, we simply use the largest increment by checking the
>> +         * MSB of the region size. For s != 0, we also need to ensure that
>> +         * the chunk is properly sized to end at power-of-two alignment. We
>> +         * do this by checking the LSB of the start address and use its
>> +         * index as the increment. Both cases need to be guarded by
>> +         * MAX_ORDER.
>
> s/guarded/bounded/ ?
>
>> +         * Note that the value of ffsl() and flsl() starts from 1 so we
>> +         * need to decrement it by 1.
>> +         */
>> +        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
>
> unsigned int, since ...

MAX_ORDER, flsl() and ffsl() all return signed values. So inc_order
should be 'int' to avoid any compilation issue.

>
>> +        if ( s )
>> +            inc_order = min(inc_order, ffsl(s) - 1);
>> +        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);
>
> ... that's what the function parameter says, and the value also can go
> negative?

AFAICT, none of the values are negative. But per above, if we use min()
then the local variable should be 'int'. The two possible alternatives are:
 1) Use min_t()
 2) Update MAX_ORDER, flsl(), ffsl() to return an unsigned value

The ideal would be #2, but it will require at least an extra patch and
effort. If we use #1, then its use may become stale if we implement #2.

So I would prefer to keep min(). That said, I would be open to using
min_t() if you strongly prefer it.

> Preferably with these adjustments
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

Thanks!

Cheers,
On 18.07.2022 12:24, Julien Grall wrote:
> Hi Jan,
>
> On 18/07/2022 10:43, Jan Beulich wrote:
>> On 15.07.2022 19:03, Julien Grall wrote:
>>> +        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
>>
>> unsigned int, since ...
>
> MAX_ORDER, flsl() and ffsl() all return signed values. So inc_order
> should be 'int' to avoid any compilation issue.
>
>>> +        if ( s )
>>> +            inc_order = min(inc_order, ffsl(s) - 1);
>>> +        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);
>>
>> ... that's what the function parameter says, and the value also can go
>> negative?
>
> AFAICT, none of the values are negative. But per above, if we use min()
> then the local variable should be 'int'. The two possible alternatives are:
>  1) Use min_t()
>  2) Update MAX_ORDER, flsl(), ffsl() to return an unsigned value
>
> The ideal would be #2, but it will require at least an extra patch and
> effort. If we use #1, then its use may become stale if we implement #2.

I'm not sure #2 is a good option - we'd deviate from conventions used
elsewhere on what flsl() and ffsl() return. And constants would imo best
remain suffix-less unless a suffix is very relevant to have (for
MAX_ORDER, which isn't at risk of being subject to ~ or -, it may be
warranted).

 3)
    unsigned int inc_order = min(MAX_ORDER, flsl(e - s) - 1);

    if ( s )
        inc_order = min(inc_order, ffsl(s) - 1U);

No compilation issues to expect here, afaict.

> So I would prefer to keep min(). That said, I would be open to using
> min_t() if you strongly prefer it.

I consider max_t() / min_t() dangerous, so I'd like to limit their use
to cases where no good alternatives exist.

Jan
Hi Jan,

On 18/07/2022 12:02, Jan Beulich wrote:
> On 18.07.2022 12:24, Julien Grall wrote:
>  3)
>     unsigned int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
>
>     if ( s )
>         inc_order = min(inc_order, ffsl(s) - 1U);

I like this idea!

>
> No compilation issues to expect here, afaict.

Correct, GCC is happy with this approach. Assuming there are no other
comments, are you happy if I make the change on commit?

Cheers,
On 18.07.2022 19:39, Julien Grall wrote:
> On 18/07/2022 12:02, Jan Beulich wrote:
>> On 18.07.2022 12:24, Julien Grall wrote:
>>  3)
>>     unsigned int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
>>
>>     if ( s )
>>         inc_order = min(inc_order, ffsl(s) - 1U);
>
> I like this idea!
>
>>
>> No compilation issues to expect here, afaict.
>
> Correct, GCC is happy with this approach. Assuming there are no other
> comments, are you happy if I make the change on commit?

Sure - iirc I gave my R-b already.

Jan
Hi Jan,

On 19/07/2022 07:01, Jan Beulich wrote:
> On 18.07.2022 19:39, Julien Grall wrote:
>> On 18/07/2022 12:02, Jan Beulich wrote:
>>> On 18.07.2022 12:24, Julien Grall wrote:
>>>  3)
>>>     unsigned int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
>>>
>>>     if ( s )
>>>         inc_order = min(inc_order, ffsl(s) - 1U);
>>
>> I like this idea!
>>
>>>
>>> No compilation issues to expect here, afaict.
>>
>> Correct, GCC is happy with this approach. Assuming there are no other
>> comments, are you happy if I make the change on commit?
>
> Sure - iirc I gave my R-b already.

You did. Just wanted to make sure your reviewed-by was valid with your
proposal. I have now committed the series.

Cheers,
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index eedb2fed77c3..2b99801d2ea3 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1779,7 +1779,7 @@ int query_page_offline(mfn_t mfn, uint32_t *status)
 
 /*
  * This function should only be called with valid pages from the same NUMA
- * node.
+ * node and zone.
  */
 static void _init_heap_pages(const struct page_info *pg,
                              unsigned long nr_pages,
@@ -1806,8 +1806,22 @@ static void _init_heap_pages(const struct page_info *pg,
 
     while ( s < e )
     {
-        free_heap_pages(mfn_to_page(_mfn(s)), 0, need_scrub);
-        s += 1UL;
+        /*
+         * For s == 0, we simply use the largest increment by checking the
+         * MSB of the region size. For s != 0, we also need to ensure that
+         * the chunk is properly sized to end at power-of-two alignment. We
+         * do this by checking the LSB of the start address and use its
+         * index as the increment. Both cases need to be guarded by
+         * MAX_ORDER.
+         *
+         * Note that the value of ffsl() and flsl() starts from 1 so we
+         * need to decrement it by 1.
+         */
+        int inc_order = min(MAX_ORDER, flsl(e - s) - 1);
+
+        if ( s )
+            inc_order = min(inc_order, ffsl(s) - 1);
+        free_heap_pages(mfn_to_page(_mfn(s)), inc_order, need_scrub);
+        s += (1UL << inc_order);
     }
 }
 
@@ -1844,6 +1858,9 @@ static void init_heap_pages(
 
     for ( i = 0; i < nr_pages; )
     {
+#ifdef CONFIG_SEPARATE_XENHEAP
+        unsigned int zone = page_to_zone(pg);
+#endif
         unsigned int nid = phys_to_nid(page_to_maddr(pg));
         unsigned long left = nr_pages - i;
         unsigned long contig_pages;
@@ -1856,6 +1873,18 @@ static void init_heap_pages(
          */
         for ( contig_pages = 1; contig_pages < left; contig_pages++ )
         {
+            /*
+             * No need to check for the zone when !CONFIG_SEPARATE_XENHEAP
+             * because free_heap_pages() can only take power-of-two ranges
+             * which never cross zone boundaries. But for separate xenheap
+             * which is manually defined, it is possible for a power-of-two
+             * range to cross zones.
+             */
+#ifdef CONFIG_SEPARATE_XENHEAP
+            if ( zone != page_to_zone(pg) )
+                break;
+#endif
+
             if ( nid != (phys_to_nid(page_to_maddr(pg))) )
                 break;
         }