Message ID | 20190118234905.27597-1-richard.weiyang@gmail.com (mailing list archive)
---|---
State | New, archived
Series | mm, page_alloc: cleanup usemap_size() when SPARSEMEM is not set
On Sat 19-01-19 07:49:05, Wei Yang wrote:
> Two cleanups in this patch:
>
> * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>   and right shift pageblock_order could be replaced with
>   DIV_ROUND_UP()

Why is this change worth it?

> * use BITS_TO_LONGS() to get number of bytes for bitmap
>
> This patch also fix one typo in comment.
>
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/page_alloc.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d295c9bc01a8..d7073cedd087 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>  /*
>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally

why do you change this?

>   * round what is now in bits to nearest long in bits, then return it in
>   * bytes.
>   */
> @@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
>  	unsigned long usemapsize;
>
>  	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
> -	usemapsize = roundup(zonesize, pageblock_nr_pages);
> -	usemapsize = usemapsize >> pageblock_order;
> +	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
>  	usemapsize *= NR_PAGEBLOCK_BITS;
> -	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
> -
> -	return usemapsize / 8;
> +	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
>  }
>
>  static void __ref setup_usemap(struct pglist_data *pgdat,
> --
> 2.15.1
>
On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
>On Sat 19-01-19 07:49:05, Wei Yang wrote:
>> Two cleanups in this patch:
>>
>> * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>>   and right shift pageblock_order could be replaced with
>>   DIV_ROUND_UP()
>
>Why is this change worth it?
>

To make it directly show usemapsize is number of times of
pageblock_nr_pages.

>> * use BITS_TO_LONGS() to get number of bytes for bitmap
>>
>> This patch also fix one typo in comment.
>>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> ---
>>  mm/page_alloc.c | 9 +++------
>>  1 file changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index d295c9bc01a8..d7073cedd087 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>>  /*
>>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
>> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
>> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
>
>why do you change this?
>

Is the original comment not correct? Or I misunderstand the English
word?

>>   * round what is now in bits to nearest long in bits, then return it in
>>   * bytes.
>>   */
>> @@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
>>  	unsigned long usemapsize;
>>
>>  	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
>> -	usemapsize = roundup(zonesize, pageblock_nr_pages);
>> -	usemapsize = usemapsize >> pageblock_order;
>> +	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
>>  	usemapsize *= NR_PAGEBLOCK_BITS;
>> -	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
>> -
>> -	return usemapsize / 8;
>> +	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
>>  }
>>
>>  static void __ref setup_usemap(struct pglist_data *pgdat,
>> --
>> 2.15.1
>>
>
>--
>Michal Hocko
>SUSE Labs
On Tue 22-01-19 15:07:17, Wei Yang wrote:
> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
> >> Two cleanups in this patch:
> >>
> >> * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
> >>   and right shift pageblock_order could be replaced with
> >>   DIV_ROUND_UP()
> >
> >Why is this change worth it?
> >
>
> To make it directly show usemapsize is number of times of
> pageblock_nr_pages.

Does this lead to a better code generation? Does it make the code easier
to read/maintain?

> >> * use BITS_TO_LONGS() to get number of bytes for bitmap
> >>
> >> This patch also fix one typo in comment.
> >>
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >> ---
> >>  mm/page_alloc.c | 9 +++------
> >>  1 file changed, 3 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index d295c9bc01a8..d7073cedd087 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
> >>  /*
> >>   * Calculate the size of the zone->blockflags rounded to an unsigned long
> >>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
> >> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
> >> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
> >
> >why do you change this?
> >
>
> Is the original comment not correct? Or I misunderstand the English
> word?

yes AFAICS
On Tue, Jan 22, 2019 at 04:16:28PM +0100, Michal Hocko wrote:
>On Tue 22-01-19 15:07:17, Wei Yang wrote:
>> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
>> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
>> >> Two cleanups in this patch:
>> >>
>> >> * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
>> >>   and right shift pageblock_order could be replaced with
>> >>   DIV_ROUND_UP()
>> >
>> >Why is this change worth it?
>> >
>>
>> To make it directly show usemapsize is number of times of
>> pageblock_nr_pages.
>
>Does this lead to a better code generation? Does it make the code easier
>to read/maintain?
>

I think the answer is yes.

* it reduce the code from 6 lines to 3 lines, 50% off
* by reducing calculation back and forth, it would be easier for
  audience to catch what it tries to do

>> >> * use BITS_TO_LONGS() to get number of bytes for bitmap
>> >>
>> >> This patch also fix one typo in comment.
>> >>
>> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> >> ---
>> >>  mm/page_alloc.c | 9 +++------
>> >>  1 file changed, 3 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> >> index d295c9bc01a8..d7073cedd087 100644
>> >> --- a/mm/page_alloc.c
>> >> +++ b/mm/page_alloc.c
>> >> @@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
>> >>  /*
>> >>   * Calculate the size of the zone->blockflags rounded to an unsigned long
>> >>   * Start by making sure zonesize is a multiple of pageblock_order by rounding
>> >> - * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
>> >> + * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
>> >
>> >why do you change this?
>> >
>>
>> Is the original comment not correct? Or I misunderstand the English
>> word?
>
>yes AFAICS

ok, maybe the first time to know this. So I guess they are the same
meaning? I searched in google, but no specific explanation on this.

>
>--
>Michal Hocko
>SUSE Labs
On Tue 22-01-19 15:56:28, Wei Yang wrote:
> On Tue, Jan 22, 2019 at 04:16:28PM +0100, Michal Hocko wrote:
> >On Tue 22-01-19 15:07:17, Wei Yang wrote:
> >> On Tue, Jan 22, 2019 at 09:55:24AM +0100, Michal Hocko wrote:
> >> >On Sat 19-01-19 07:49:05, Wei Yang wrote:
> >> >> Two cleanups in this patch:
> >> >>
> >> >> * since pageblock_nr_pages == (1 << pageblock_order), the roundup()
> >> >>   and right shift pageblock_order could be replaced with
> >> >>   DIV_ROUND_UP()
> >> >
> >> >Why is this change worth it?
> >> >
> >>
> >> To make it directly show usemapsize is number of times of
> >> pageblock_nr_pages.
> >
> >Does this lead to a better code generation? Does it make the code easier
> >to read/maintain?
> >
>
> I think the answer is yes.
>
> * it reduce the code from 6 lines to 3 lines, 50% off
> * by reducing calculation back and forth, it would be easier for
>   audience to catch what it tries to do

To be honest, I really do not see this sufficient to justify touching
the code unless the resulting _generated_ code is better/more efficient.
On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
>On Tue 22-01-19 15:56:28, Wei Yang wrote:
>>
>> I think the answer is yes.
>>
>> * it reduce the code from 6 lines to 3 lines, 50% off
>> * by reducing calculation back and forth, it would be easier for
>>   audience to catch what it tries to do
>
>To be honest, I really do not see this sufficient to justify touching
>the code unless the resulting _generated_ code is better/more efficient.

Tried objdump to compare two version.

                 Base    Patched    Reduced
Code Size(B)     48      39         18.7%
Instructions     12      10         16.6%

Here is the raw output.

00000000000001be <usemap_size_ywtest1>:
{
 1be:   e8 00 00 00 00          callq  1c3 <usemap_size_ywtest1+0x5>
        zonesize += zone_start_pfn & (pageblock_nr_pages-1);
 1c3:   81 e7 ff 01 00 00       and    $0x1ff,%edi
{
 1c9:   55                      push   %rbp
        return usemapsize / 8;
 1ca:   48 ba f8 ff ff ff ff    movabs $0x1ffffffffffffff8,%rdx
 1d1:   ff ff 1f
        usemapsize = roundup(zonesize, pageblock_nr_pages);
 1d4:   48 8d 84 3e ff 01 00    lea    0x1ff(%rsi,%rdi,1),%rax
 1db:   00
        usemapsize *= NR_PAGEBLOCK_BITS;
 1dc:   48 c1 e8 09             shr    $0x9,%rax
        usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
 1e0:   48 8d 04 85 3f 00 00    lea    0x3f(,%rax,4),%rax
 1e7:   00
{
 1e8:   48 89 e5                mov    %rsp,%rbp
}
 1eb:   5d                      pop    %rbp
        return usemapsize / 8;
 1ec:   48 c1 e8 03             shr    $0x3,%rax
 1f0:   48 21 d0                and    %rdx,%rax
}
 1f3:   c3                      retq

00000000000001f4 <usemap_size_ywtest2>:
{
 1f4:   e8 00 00 00 00          callq  1f9 <usemap_size_ywtest2+0x5>
        zonesize += zone_start_pfn & (pageblock_nr_pages-1);
 1f9:   81 e7 ff 01 00 00       and    $0x1ff,%edi
{
 1ff:   55                      push   %rbp
        usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
 200:   48 8d 84 3e ff 01 00    lea    0x1ff(%rsi,%rdi,1),%rax
 207:   00
 208:   48 c1 e8 09             shr    $0x9,%rax
        return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 20c:   48 8d 04 85 3f 00 00    lea    0x3f(,%rax,4),%rax
 213:   00
{
 214:   48 89 e5                mov    %rsp,%rbp
}
 217:   5d                      pop    %rbp
        return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 218:   48 c1 e8 06             shr    $0x6,%rax
 21c:   48 c1 e0 03             shl    $0x3,%rax
}
 220:   c3                      retq
On Thu 24-01-19 14:13:41, Wei Yang wrote:
> On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
> >On Tue 22-01-19 15:56:28, Wei Yang wrote:
> >>
> >> I think the answer is yes.
> >>
> >> * it reduce the code from 6 lines to 3 lines, 50% off
> >> * by reducing calculation back and forth, it would be easier for
> >>   audience to catch what it tries to do
> >
> >To be honest, I really do not see this sufficient to justify touching
> >the code unless the resulting _generated_ code is better/more efficient.
>
> Tried objdump to compare two version.
>
>                  Base    Patched    Reduced
> Code Size(B)     48      39         18.7%
> Instructions     12      10         16.6%

How have you compiled the code? (compiler version, any specific configs).
Because I do not see any difference.

CONFIG_CC_OPTIMIZE_FOR_SIZE:
   text    data     bss     dec     hex filename
  47087    2085      72   49244    c05c mm/page_alloc.o
  47087    2085      72   49244    c05c mm/page_alloc.o.prev

CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE:
   text    data     bss     dec     hex filename
  55046    2085      72   57203    df73 mm/page_alloc.o
  55046    2085      72   57203    df73 mm/page_alloc.o.prev

And that would actually match my expectations because I am pretty sure
the compiler can figure out what to do with those operations even
without any help.

Really, is this really worth touching and spending a non-trivial time to
discuss? I do not see the benefit.
On Thu, Jan 24, 2019 at 03:30:08PM +0100, Michal Hocko wrote:
>On Thu 24-01-19 14:13:41, Wei Yang wrote:
>> On Wed, Jan 23, 2019 at 10:55:03AM +0100, Michal Hocko wrote:
>> >On Tue 22-01-19 15:56:28, Wei Yang wrote:
>> >>
>> >> I think the answer is yes.
>> >>
>> >> * it reduce the code from 6 lines to 3 lines, 50% off
>> >> * by reducing calculation back and forth, it would be easier for
>> >>   audience to catch what it tries to do
>> >
>> >To be honest, I really do not see this sufficient to justify touching
>> >the code unless the resulting _generated_ code is better/more efficient.
>>
>> Tried objdump to compare two version.
>>
>>                  Base    Patched    Reduced
>> Code Size(B)     48      39         18.7%
>> Instructions     12      10         16.6%
>
>How have you compiled the code? (compiler version, any specific configs).
>Because I do not see any difference.

Yes, of course I have hacked and compiled the code.

I guess you compile the code on x86, which by default SPARSEMEM is
configured. This means those changes are not compiled.

To get the result, I have hacked the code to add the definition to
mm/sparse.c and call this new function to make sure compile will not
optimize this out.

Below is the result from readelf -S mm/sparse.o

>
>CONFIG_CC_OPTIMIZE_FOR_SIZE:
>   text    data     bss     dec     hex filename
>  47087    2085      72   49244    c05c mm/page_alloc.o
>  47087    2085      72   49244    c05c mm/page_alloc.o.prev

text: 0x2c7 -> 0x2be reduced 9 bytes

>
>CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE:
>   text    data     bss     dec     hex filename
>  55046    2085      72   57203    df73 mm/page_alloc.o
>  55046    2085      72   57203    df73 mm/page_alloc.o.prev

text: 0x35b -> 0x34b reduced 16 bytes

>
>And that would actually match my expectations because I am pretty sure
>the compiler can figure out what to do with those operations even
>without any help.
>
>Really, is this really worth touching and spending a non-trivial time to
>discuss? I do not see the benefit.

I thought this is a trivial change and we have the same taste of the
code.

I agree to put an end to this thread.

Thanks for your time.

>--
>Michal Hocko
>SUSE Labs
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d295c9bc01a8..d7073cedd087 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6352,7 +6352,7 @@ static void __init calculate_node_totalpages(struct pglist_data *pgdat,
 /*
  * Calculate the size of the zone->blockflags rounded to an unsigned long
  * Start by making sure zonesize is a multiple of pageblock_order by rounding
- * up. Then use 1 NR_PAGEBLOCK_BITS worth of bits per pageblock, finally
+ * up. Then use 1 NR_PAGEBLOCK_BITS width of bits per pageblock, finally
  * round what is now in bits to nearest long in bits, then return it in
  * bytes.
  */
@@ -6361,12 +6361,9 @@ static unsigned long __init usemap_size(unsigned long zone_start_pfn, unsigned l
 	unsigned long usemapsize;
 
 	zonesize += zone_start_pfn & (pageblock_nr_pages-1);
-	usemapsize = roundup(zonesize, pageblock_nr_pages);
-	usemapsize = usemapsize >> pageblock_order;
+	usemapsize = DIV_ROUND_UP(zonesize, pageblock_nr_pages);
 	usemapsize *= NR_PAGEBLOCK_BITS;
-	usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));
-
-	return usemapsize / 8;
+	return BITS_TO_LONGS(usemapsize) * sizeof(unsigned long);
 }
 
 static void __ref setup_usemap(struct pglist_data *pgdat,
Two cleanups in this patch:

* since pageblock_nr_pages == (1 << pageblock_order), the roundup()
  and right shift pageblock_order could be replaced with DIV_ROUND_UP()
* use BITS_TO_LONGS() to get number of bytes for bitmap

This patch also fix one typo in comment.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_alloc.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)