mm, page_alloc: cleanup usemap_size() when SPARSEMEM is not set

Message ID 20190118234905.27597-1-richard.weiyang@gmail.com (mailing list archive)
State New, archived
Series mm, page_alloc: cleanup usemap_size() when SPARSEMEM is not set

Commit Message

Wei Yang Jan. 18, 2019, 11:49 p.m. UTC
Two cleanups in this patch:

  * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
    followed by a right shift by pageblock_order can be replaced with a
    single DIV_ROUND_UP()
  * use BITS_TO_LONGS() to get the number of bytes for the bitmap

This patch also fixes one typo in a comment.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_alloc.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
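
For reference, the two transformations above rest on simple arithmetic
identities. A minimal user-space sketch follows (not kernel code: the
ROUNDUP, DIV_ROUND_UP and BITS_TO_LONGS macros below are stand-ins for
the kernel's helpers, and the constants are illustrative, chosen to
match the disassembly quoted later in this thread):

#include <assert.h>

#define BITS_PER_LONG      (8 * sizeof(unsigned long))
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define ROUNDUP(x, y)      (DIV_ROUND_UP(x, y) * (y))
#define BITS_TO_LONGS(nr)  DIV_ROUND_UP(nr, BITS_PER_LONG)

int main(void)
{
	unsigned long order = 9;		/* illustrative pageblock_order */
	unsigned long nr = 1UL << order;	/* pageblock_nr_pages */
	unsigned long zonesize = 1000;		/* arbitrary sample input */
	unsigned long bits;

	/* identity 1: roundup-then-shift is just a ceiling division */
	assert((ROUNDUP(zonesize, nr) >> order) ==
	       DIV_ROUND_UP(zonesize, nr));

	/* identity 2: rounding a bit count up to a whole long and dividing
	 * by 8 equals counting longs and multiplying by their byte size */
	bits = DIV_ROUND_UP(zonesize, nr) * 4;	/* 4 == NR_PAGEBLOCK_BITS */
	assert(ROUNDUP(bits, BITS_PER_LONG) / 8 ==
	       BITS_TO_LONGS(bits) * sizeof(unsigned long));

	return 0;
}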

Comments

Michal Hocko Jan. 22, 2019, 8:55 a.m. UTC | #1
On Sat 19-01-19 07:49:05, Wei Yang wrote:
> Two cleanups in this patch:
> 
>   * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>     followed by a right shift by pageblock_order can be replaced with a
>     single DIV_ROUND_UP()

Why is this change worth it?

>   * use BITS_TO_LONGS() to get the number of bytes for the bitmap
> 
> This patch also fixes one typo in a comment.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/page_alloc.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..d7073cedd087 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>  /*
>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally

why do you change this?

>   * round what is now in bits to nearest long in bits, then return it in
>   * bytes.
>   */
> @@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
>  	unsigned long usemapsize;
>  
>  	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
> -	usemapsize = roundup(zonesize, pageblock_nr_pages);
> -	usemapsize = usemapsize >> pageblock_order;
> +	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
>  	usemapsize *= NR_PAGEBLOCK_BITS;
> -	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
> -
> -	return usemapsize / 8;
> +	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
>  }
>  
>  static void __ref setup_usemap(struct pglist_data *pgdat,
> -- 
> 2.15.1
>
Wei Yang Jan. 22, 2019, 3:07 p.m. UTC | #2
On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
>On Sat 19-01-19 07:49:05, Wei Yang wrote:
>> Two cleanups in this patch:
>> 
>>   * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>>     followed by a right shift by pageblock_order can be replaced with a
>>     single DIV_ROUND_UP()
>
>Why is this change worth it?
>

To make it directly show that usemapsize is the number of
pageblock_nr_pages-sized chunks, i.e. the number of pageblocks.

>>   * use BITS_TO_LONGS() to get the number of bytes for the bitmap
>> 
>> This patch also fixes one typo in a comment.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  mm/page_alloc.c | 9 +++------
>>  1 file changed, 3 insertions(+), 6 deletions(-)
>> 
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index d295c9bc01a8..d7073cedd087 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>>  /*
>>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
>> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
>> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
>
>why do you change this?
>

Is the original comment not correct? Or did I misunderstand the
English word?

>>   * round what is now in bits to nearest long in bits, then return it in
>>   * bytes.
>>   */
>> @@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
>>  	unsigned long usemapsize;
>>  
>>  	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
>> -	usemapsize = roundup(zonesize, pageblock_nr_pages);
>> -	usemapsize = usemapsize >> pageblock_order;
>> +	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
>>  	usemapsize *= NR_PAGEBLOCK_BITS;
>> -	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
>> -
>> -	return usemapsize / 8;
>> +	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
>>  }
>>  
>>  static void __ref setup_usemap(struct pglist_data *pgdat,
>> -- 
>> 2.15.1
>> 
>
>-- 
>Michal Hocko
>SUSE Labs
Michal Hocko Jan. 22, 2019, 3:16 p.m. UTC | #3
On Tue 22-01-19 15:07:17, Wei Yang wrote:
> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
> >> Two cleanups in this patch:
> >> 
> >>   * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
> >>     followed by a right shift by pageblock_order can be replaced with a
> >>     single DIV_ROUND_UP()
> >
> >Why is this change worth it?
> >
> 
> To make it directly show that usemapsize is the number of
> pageblock_nr_pages-sized chunks, i.e. the number of pageblocks.

Does this lead to better code generation? Does it make the code easier
to read/maintain?

> >>   * use BITS_TO_LONGS() to get the number of bytes for the bitmap
> >> 
> >> This patch also fixes one typo in a comment.
> >> 
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >> ---
> >>  mm/page_alloc.c | 9 +++------
> >>  1 file changed, 3 insertions(+), 6 deletions(-)
> >> 
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index d295c9bc01a8..d7073cedd087 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
> >>  /*
> >>   * Calculate the size of the zone->blockflags rounded to an unsigned long
> >>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
> >> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
> >> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
> >
> >why do you change this?
> >
> 
> Is the original comment not correct? Or did I misunderstand the
> English word?

Yes, AFAICS.
Wei Yang Jan. 22, 2019, 3:56 p.m. UTC | #4
On Tue, Jan 22, 2019 at 04:16:28PM +0100, Michal Hocko wrote:
>On Tue 22-01-19 15:07:17, Wei Yang wrote:
>> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
>> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
>> >> Two cleanups in this patch:
>> >> 
>> >>   * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>> >>     followed by a right shift by pageblock_order can be replaced with a
>> >>     single DIV_ROUND_UP()
>> >
>> >Why is this change worth it?
>> >
>> 
>> To make it directly show that usemapsize is the number of
>> pageblock_nr_pages-sized chunks, i.e. the number of pageblocks.
>
>Does this lead to better code generation? Does it make the code easier
>to read/maintain?
>

I think the answer is yes.

  * it reduces the code from 6 lines to 3 lines, a 50% reduction
  * by avoiding the back-and-forth calculation, it is easier for
    readers to see what the code is doing

>> >>   * use BITS_TO_LONGS() to get the number of bytes for the bitmap
>> >> 
>> >> This patch also fixes one typo in a comment.
>> >> 
>> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >> ---
>> >>  mm/page_alloc.c | 9 +++------
>> >>  1 file changed, 3 insertions(+), 6 deletions(-)
>> >> 
>> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >> index d295c9bc01a8..d7073cedd087 100644
>> >> --- a/mm/page_alloc.c
>> >> +++ b/mm/page_alloc.c
>> >> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>> >>  /*
>> >>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>> >>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
>> >> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
>> >> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
>> >
>> >why do you change this?
>> >
>> 
>> Is the original comment not correct? Or did I misunderstand the
>> English word?
>
>yes AFAICS

OK, maybe this is the first time I have come across this. So I guess
they mean the same thing? I searched on Google, but found no specific
explanation.

>
>-- 
>Michal Hocko
>SUSE Labs
Michal Hocko Jan. 23, 2019, 9:55 a.m. UTC | #5
On Tue 22-01-19 15:56:28, Wei Yang wrote:
> On Tue, Jan 22, 2019 at 04:16:28PM +0100, Michal Hocko wrote:
> >On Tue 22-01-19 15:07:17, Wei Yang wrote:
> >> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
> >> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
> >> >> Two cleanups in this patch:
> >> >> 
> >> >>   * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
> >> >>     followed by a right shift by pageblock_order can be replaced with a
> >> >>     single DIV_ROUND_UP()
> >> >
> >> >Why is this change worth it?
> >> >
> >> 
> >> To make it directly show that usemapsize is the number of
> >> pageblock_nr_pages-sized chunks, i.e. the number of pageblocks.
> >
> >Does this lead to better code generation? Does it make the code easier
> >to read/maintain?
> >
> 
> I think the answer is yes.
> 
>   * it reduces the code from 6 lines to 3 lines, a 50% reduction
>   * by avoiding the back-and-forth calculation, it is easier for
>     readers to see what the code is doing

To be honest, I really do not see this as sufficient to justify touching
the code unless the resulting _generated_ code is better/more efficient.
Wei Yang Jan. 24, 2019, 2:13 p.m. UTC | #6
On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
>On Tue 22-01-19 15:56:28, Wei Yang wrote:
>> 
>> I think the answer is yes.
>> 
>>   * it reduces the code from 6 lines to 3 lines, a 50% reduction
>>   * by avoiding the back-and-forth calculation, it is easier for
>>     readers to see what the code is doing
>
>To be honest, I really do not see this as sufficient to justify touching
>the code unless the resulting _generated_ code is better/more efficient.

I tried objdump to compare the two versions.

                     Base       Patched      Reduced
Code size (bytes)    48         39           18.7%
Instructions         12         10           16.6%

Here is the raw output.

00000000000001be <usemap_size_ywtest1>:
{
 1be:	e8 00 00 00 00       	callq  1c3 <usemap_size_ywtest1+0x5>
	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
 1c3:	81 e7 ff 01 00 00    	and    $0x1ff,%edi
{
 1c9:	55                   	push   %rbp
	return usemapsize / 8;
 1ca:	48 ba f8 ff ff ff ff 	movabs $0x1ffffffffffffff8,%rdx
 1d1:	ff ff 1f 
	usemapsize = roundup(zonesize, pageblock_nr_pages);
 1d4:	48 8d 84 3e ff 01 00 	lea    0x1ff(%rsi,%rdi,1),%rax
 1db:	00 
	usemapsize *= NR_PAGEBLOCK_BITS;
 1dc:	48 c1 e8 09          	shr    $0x9,%rax
	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
 1e0:	48 8d 04 85 3f 00 00 	lea    0x3f(,%rax,4),%rax
 1e7:	00 
{
 1e8:	48 89 e5             	mov    %rsp,%rbp
}
 1eb:	5d                   	pop    %rbp
	return usemapsize / 8;
 1ec:	48 c1 e8 03          	shr    $0x3,%rax
 1f0:	48 21 d0             	and    %rdx,%rax
}
 1f3:	c3                   	retq   

00000000000001f4 <usemap_size_ywtest2>:
{
 1f4:	e8 00 00 00 00       	callq  1f9 <usemap_size_ywtest2+0x5>
	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
 1f9:	81 e7 ff 01 00 00    	and    $0x1ff,%edi
{
 1ff:	55                   	push   %rbp
	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
 200:	48 8d 84 3e ff 01 00 	lea    0x1ff(%rsi,%rdi,1),%rax
 207:	00 
 208:	48 c1 e8 09          	shr    $0x9,%rax
	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 20c:	48 8d 04 85 3f 00 00 	lea    0x3f(,%rax,4),%rax
 213:	00 
{
 214:	48 89 e5             	mov    %rsp,%rbp
}
 217:	5d                   	pop    %rbp
	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 218:	48 c1 e8 06          	shr    $0x6,%rax
 21c:	48 c1 e0 03          	shl    $0x3,%rax
}
 220:	c3                   	retq
Michal Hocko Jan. 24, 2019, 2:30 p.m. UTC | #7
On Thu 24-01-19 14:13:41, Wei Yang wrote:
> On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
> >On Tue 22-01-19 15:56:28, Wei Yang wrote:
> >> 
> >> I think the answer is yes.
> >> 
> >>   * it reduces the code from 6 lines to 3 lines, a 50% reduction
> >>   * by avoiding the back-and-forth calculation, it is easier for
> >>     readers to see what the code is doing
> >
> >To be honest, I really do not see this as sufficient to justify touching
> >the code unless the resulting _generated_ code is better/more efficient.
> 
> I tried objdump to compare the two versions.
> 
>                      Base       Patched      Reduced
> Code size (bytes)    48         39           18.7%
> Instructions         12         10           16.6%

How have you compiled the code? (compiler version, any specific configs).
Because I do not see any difference.

CONFIG_CC_OPTIMIZE_FOR_SIZE:
   text    data     bss     dec     hex filename
  47087    2085      72   49244    c05c mm/page_alloc.o
  47087    2085      72   49244    c05c mm/page_alloc.o.prev

CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE:
   text    data     bss     dec     hex filename
  55046    2085      72   57203    df73 mm/page_alloc.o
  55046    2085      72   57203    df73 mm/page_alloc.o.prev

And that would actually match my expectations because I am pretty sure
the compiler can figure out what to do with those operations even
without any help.

Really, is this worth touching and spending non-trivial time to
discuss? I do not see the benefit.
Wei Yang Jan. 25, 2019, 1:12 a.m. UTC | #8
On Thu, Jan 24, 2019 at 03:30:08PM +0100, Michal Hocko wrote:
>On Thu 24-01-19 14:13:41, Wei Yang wrote:
>> On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
>> >On Tue 22-01-19 15:56:28, Wei Yang wrote:
>> >> 
>> >> I think the answer is yes.
>> >> 
>> >>   * it reduces the code from 6 lines to 3 lines, a 50% reduction
>> >>   * by avoiding the back-and-forth calculation, it is easier for
>> >>     readers to see what the code is doing
>> >
>> >To be honest, I really do not see this as sufficient to justify touching
>> >the code unless the resulting _generated_ code is better/more efficient.
>> 
>> I tried objdump to compare the two versions.
>> 
>>                      Base       Patched      Reduced
>> Code size (bytes)    48         39           18.7%
>> Instructions         12         10           16.6%
>
>How have you compiled the code? (compiler version, any specific configs).
>Because I do not see any difference.

Yes, of course I hacked the code and compiled it.

I guess you compiled the code on x86, where SPARSEMEM is configured
by default. This means those changes are not compiled in.

To get the result, I hacked the code to add the definition to
mm/sparse.c and called the new function, to make sure the compiler
would not optimize it out.

Below is the result from readelf -S mm/sparse.o

>
>CONFIG_CC_OPTIMIZE_FOR_SIZE:
>   text    data     bss     dec     hex filename
>  47087    2085      72   49244    c05c mm/page_alloc.o
>  47087    2085      72   49244    c05c mm/page_alloc.o.prev

text: 0x2c7 -> 0x2be   reduced 9 bytes

>
>CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE:
>   text    data     bss     dec     hex filename
>  55046    2085      72   57203    df73 mm/page_alloc.o
>  55046    2085      72   57203    df73 mm/page_alloc.o.prev

text: 0x35b -> 0x34b   reduced 16 bytes

>
>And that would actually match my expectations because I am pretty sure
>the compiler can figure out what to do with those operations even
>without any help.
>
>Really, is this worth touching and spending non-trivial time to
>discuss? I do not see the benefit.

I thought this was a trivial change and that we had the same taste
in code.

I agree we should put an end to this thread.

Thanks for your time.

>-- 
>Michal Hocko
>SUSE Labs

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d295c9bc01a8..d7073cedd087 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 /*
  * Calculate the size of the zone->blockflags rounded to an unsigned long
  * Start by making sure zonesize is a multiple of pageblock_order by rounding
- * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
+ * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
  * round what is now in bits to nearest long in bits, then return it in
  * bytes.
  */
@@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
 	unsigned long usemapsize;
 
 	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
-	usemapsize = roundup(zonesize, pageblock_nr_pages);
-	usemapsize = usemapsize >> pageblock_order;
+	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
 	usemapsize *= NR_PAGEBLOCK_BITS;
-	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
-
-	return usemapsize / 8;
+	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 }
 
 static void __ref setup_usemap(struct pglist_data *pgdat,
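
As a sanity check that the cleanup is behavior-preserving, here is a
quick user-space harness (illustrative only, not kernel code; the
macros mirror the kernel's helpers, and the constants match the
disassembly quoted above: pageblock_order == 9, NR_PAGEBLOCK_BITS == 4,
64-bit longs) comparing the old and new arithmetic over a range of
inputs:

#include <assert.h>
#include <stdio.h>

#define PAGEBLOCK_ORDER    9
#define PAGEBLOCK_NR_PAGES (1UL << PAGEBLOCK_ORDER)
#define NR_PAGEBLOCK_BITS  4
#define BITS_PER_LONG      (8 * sizeof(unsigned long))
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define ROUNDUP(x, y)      (DIV_ROUND_UP(x, y) * (y))
#define BITS_TO_LONGS(nr)  DIV_ROUND_UP(nr, BITS_PER_LONG)

/* the original computation, transcribed from the removed lines */
static unsigned long usemap_size_old(unsigned long zone_start_pfn,
				     unsigned long zonesize)
{
	unsigned long usemapsize;

	zonesize += zone_start_pfn & (PAGEBLOCK_NR_PAGES - 1);
	usemapsize = ROUNDUP(zonesize, PAGEBLOCK_NR_PAGES);
	usemapsize = usemapsize >> PAGEBLOCK_ORDER;
	usemapsize *= NR_PAGEBLOCK_BITS;
	usemapsize = ROUNDUP(usemapsize, 8 * sizeof(unsigned long));
	return usemapsize / 8;
}

/* the patched computation, transcribed from the added lines */
static unsigned long usemap_size_new(unsigned long zone_start_pfn,
				     unsigned long zonesize)
{
	unsigned long usemapsize;

	zonesize += zone_start_pfn & (PAGEBLOCK_NR_PAGES - 1);
	usemapsize = DIV_ROUND_UP(zonesize, PAGEBLOCK_NR_PAGES);
	usemapsize *= NR_PAGEBLOCK_BITS;
	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
}

int main(void)
{
	unsigned long pfn, size;

	for (pfn = 0; pfn < 4 * PAGEBLOCK_NR_PAGES; pfn += 7)
		for (size = 1; size < 8 * PAGEBLOCK_NR_PAGES; size += 13)
			assert(usemap_size_old(pfn, size) ==
			       usemap_size_new(pfn, size));

	printf("old and new usemap_size() agree\n");
	return 0;
}

If the two versions ever disagreed for some input, the assert would
fire, so this doubles as a cheap regression check for the cleanup.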