Message ID | 20210701135543epcms1p84a043bf49757bafada0a773372611d69@epcms1p8 (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | mm: sparse: pass section_nr to section_mark_present | expand |
On 01.07.21 15:55, 권오훈 wrote: > With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts > mem_section to section_nr could be costly since it iterates all > sections to check if the given mem_section is in its range. It actually iterates all section roots. > > On the other hand, __nr_to_section which converts section_nr to > mem_section can be done in O(1). > > Let's pass section_nr instead of mem_section ptr to section_mark_present > in order to reduce needless iterations. I'd expect this to be mostly noise, especially as we iterate section roots and for most (smallish) machines we might just work on the lowest section roots only. Can you actually observe an improvement regarding boot times? Anyhow, looks straight forward to me, although we might just reintroduce similar patterns again easily if it's really just noise (see find_memory_block() as used by). And it might allow for a nice cleanup (see below). Reviewed-by: David Hildenbrand <david@redhat.com> Can you send 1) a patch to convert find_memory_block() as well and 2) a patch to rip out __section_nr() completely? > > Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com> > --- > mm/sparse.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/mm/sparse.c b/mm/sparse.c > index 55c18aff3e42..4a2700e9a65f 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, > * those loops early. > */ > unsigned long __highest_present_section_nr; > -static void section_mark_present(struct mem_section *ms) > +static void section_mark_present(unsigned long section_nr) > { > - unsigned long section_nr = __section_nr(ms); > + struct mem_section *ms; > > if (section_nr > __highest_present_section_nr) > __highest_present_section_nr = section_nr; > > + ms = __nr_to_section(section_nr); > ms->section_mem_map |= SECTION_MARKED_PRESENT; > } > > @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en > if (!ms->section_mem_map) { > ms->section_mem_map = sparse_encode_early_nid(nid) | > SECTION_IS_ONLINE; > - section_mark_present(ms); > + section_mark_present(section); > } > } > } > @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > ms = __nr_to_section(section_nr); > set_section_nid(section_nr, nid); > - section_mark_present(ms); > + section_mark_present(section_nr); > > /* Align memmap to section boundary in the subsection case */ > if (section_nr_to_pfn(section_nr) != start_pfn) >
On Thu, Jul 01, 2021 at 04:34:13PM +0200, David Hildenbrand wrote: > On 01.07.21 15:55, 권오훈 wrote: > > With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts > > mem_section to section_nr could be costly since it iterates all > > sections to check if the given mem_section is in its range. > > It actually iterates all section roots. > > > > > On the other hand, __nr_to_section which converts section_nr to > > mem_section can be done in O(1). > > > > Let's pass section_nr instead of mem_section ptr to section_mark_present > > in order to reduce needless iterations. > > I'd expect this to be mostly noise, especially as we iterate section > roots and for most (smallish) machines we might just work on the lowest > section roots only. > > Can you actually observe an improvement regarding boot times? > > Anyhow, looks straight forward to me, although we might just reintroduce > similar patterns again easily if it's really just noise (see > find_memory_block() as used by). And it might allow for a nice cleanup > (see below). > > Reviewed-by: David Hildenbrand <david@redhat.com> > > > Can you send 1) a patch to convert find_memory_block() as well and 2) a > patch to rip out __section_nr() completely? > > > > > Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com> > > --- > > mm/sparse.c | 9 +++++---- > > 1 file changed, 5 insertions(+), 4 deletions(-) > > > > diff --git a/mm/sparse.c b/mm/sparse.c > > index 55c18aff3e42..4a2700e9a65f 100644 > > --- a/mm/sparse.c > > +++ b/mm/sparse.c > > @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, > > * those loops early. > > */ > > unsigned long __highest_present_section_nr; > > -static void section_mark_present(struct mem_section *ms) > > +static void section_mark_present(unsigned long section_nr) > > { > > - unsigned long section_nr = __section_nr(ms); > > + struct mem_section *ms; > > > > if (section_nr > __highest_present_section_nr) > > __highest_present_section_nr = section_nr; > > > > + ms = __nr_to_section(section_nr); > > ms->section_mem_map |= SECTION_MARKED_PRESENT; > > } > > > > @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en > > if (!ms->section_mem_map) { > > ms->section_mem_map = sparse_encode_early_nid(nid) | > > SECTION_IS_ONLINE; > > - section_mark_present(ms); > > + section_mark_present(section); > > } > > } > > } > > @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, > > > > ms = __nr_to_section(section_nr); > > set_section_nid(section_nr, nid); > > - section_mark_present(ms); > > + section_mark_present(section_nr); > > > > /* Align memmap to section boundary in the subsection case */ > > if (section_nr_to_pfn(section_nr) != start_pfn) > > > > > -- > Thanks, > > David / dhildenb > Dear David. I tried to check on time for memblocks_present, but when I tested with mobile phones with 8GB ram, the original binary took 0us either as well as the patched binary. I'm not sure how the results would differ on huge systems with bigger ram. I agree that it could turn out to be just a noise, as you expected. However as you also mentioned, the patches will be straight forward when all codes using __section_nr() are cleaned up nicely. Below are the two patches that you asked for. Please tell me if you need me to send the patches in separate e-mails. Thanks, Ohhoon Kwon. > Can you send 1) a patch to convert find_memory_block() as well and 2) a > patch to rip out __section_nr() completely? ----------------------------------------------------------------------------------------- [PATCH] mm: sparse: pass section_nr to find_memory_block With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts mem_section to section_nr could be costly since it iterates all sections to check if the given mem_section is in its range. On the other hand, __nr_to_section which converts section_nr to mem_section can be done in O(1). Let's pass section_nr instead of mem_section ptr to find_memory_block in order to reduce needless iterations. Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com> --- arch/powerpc/platforms/pseries/hotplug-memory.c | 4 +--- drivers/base/memory.c | 4 ++-- include/linux/memory.h | 2 +- 3 files changed, 4 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c index 8377f1f7c78e..905790092e0e 100644 --- a/arch/powerpc/platforms/pseries/hotplug-memory.c +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c @@ -211,13 +211,11 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb) static struct memory_block *lmb_to_memblock(struct drmem_lmb *lmb) { unsigned long section_nr; - struct mem_section *mem_sect; struct memory_block *mem_block; section_nr = pfn_to_section_nr(PFN_DOWN(lmb->base_addr)); - mem_sect = __nr_to_section(section_nr); - mem_block = find_memory_block(mem_sect); + mem_block = find_memory_block(section_nr); return mem_block; } diff --git a/drivers/base/memory.c b/drivers/base/memory.c index d5ffaab3cb61..e31598955cc4 100644 --- a/drivers/base/memory.c +++ b/drivers/base/memory.c @@ -578,9 +578,9 @@ static struct memory_block *find_memory_block_by_id(unsigned long block_id) /* * Called under device_hotplug_lock. */ -struct memory_block *find_memory_block(struct mem_section *section) +struct memory_block *find_memory_block(unsigned long section_nr) { - unsigned long block_id = memory_block_id(__section_nr(section)); + unsigned long block_id = memory_block_id(section_nr); return find_memory_block_by_id(block_id); } diff --git a/include/linux/memory.h b/include/linux/memory.h index 97e92e8b556a..d9a0b61cd432 100644 --- a/include/linux/memory.h +++ b/include/linux/memory.h @@ -90,7 +90,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size, void remove_memory_block_devices(unsigned long start, unsigned long size); extern void memory_dev_init(void); extern int memory_notify(unsigned long val, void *v); -extern struct memory_block *find_memory_block(struct mem_section *); +extern struct memory_block *find_memory_block(unsigned long section_nr); typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *); extern int walk_memory_blocks(unsigned long start, unsigned long size, void *arg, walk_memory_blocks_func_t func);
On 01.07.21 17:41, 권오훈 wrote: > On Thu, Jul 01, 2021 at 04:34:13PM +0200, David Hildenbrand wrote: >> On 01.07.21 15:55, 권오훈 wrote: >>> With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts >>> mem_section to section_nr could be costly since it iterates all >>> sections to check if the given mem_section is in its range. >> >> It actually iterates all section roots. >> >>> >>> On the other hand, __nr_to_section which converts section_nr to >>> mem_section can be done in O(1). >>> >>> Let's pass section_nr instead of mem_section ptr to section_mark_present >>> in order to reduce needless iterations. >> >> I'd expect this to be mostly noise, especially as we iterate section >> roots and for most (smallish) machines we might just work on the lowest >> section roots only. >> >> Can you actually observe an improvement regarding boot times? >> >> Anyhow, looks straight forward to me, although we might just reintroduce >> similar patterns again easily if it's really just noise (see >> find_memory_block() as used by). And it might allow for a nice cleanup >> (see below). >> >> Reviewed-by: David Hildenbrand <david@redhat.com> >> >> >> Can you send 1) a patch to convert find_memory_block() as well and 2) a >> patch to rip out __section_nr() completely? >> >>> >>> Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com> >>> --- >>> mm/sparse.c | 9 +++++---- >>> 1 file changed, 5 insertions(+), 4 deletions(-) >>> >>> diff --git a/mm/sparse.c b/mm/sparse.c >>> index 55c18aff3e42..4a2700e9a65f 100644 >>> --- a/mm/sparse.c >>> +++ b/mm/sparse.c >>> @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, >>> * those loops early. >>> */ >>> unsigned long __highest_present_section_nr; >>> -static void section_mark_present(struct mem_section *ms) >>> +static void section_mark_present(unsigned long section_nr) >>> { >>> - unsigned long section_nr = __section_nr(ms); >>> + struct mem_section *ms; >>> >>> if (section_nr > __highest_present_section_nr) >>> __highest_present_section_nr = section_nr; >>> >>> + ms = __nr_to_section(section_nr); >>> ms->section_mem_map |= SECTION_MARKED_PRESENT; >>> } >>> >>> @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en >>> if (!ms->section_mem_map) { >>> ms->section_mem_map = sparse_encode_early_nid(nid) | >>> SECTION_IS_ONLINE; >>> - section_mark_present(ms); >>> + section_mark_present(section); >>> } >>> } >>> } >>> @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, >>> >>> ms = __nr_to_section(section_nr); >>> set_section_nid(section_nr, nid); >>> - section_mark_present(ms); >>> + section_mark_present(section_nr); >>> >>> /* Align memmap to section boundary in the subsection case */ >>> if (section_nr_to_pfn(section_nr) != start_pfn) >>> >> >> >> -- >> Thanks, >> >> David / dhildenb >> > Dear David. > > I tried to check on time for memblocks_present, but when I tested with mobile > phones with 8GB ram, the original binary took 0us either as well as the > patched binary. > I'm not sure how the results would differ on huge systems with bigger ram. > I agree that it could turn out to be just a noise, as you expected. > > However as you also mentioned, the patches will be straight forward when all > codes using __section_nr() are cleaned up nicely. > > Below are the two patches that you asked for. > Please tell me if you need me to send the patches in separate e-mails. Yes, please send them separately. Maybe sent all 3 patches combined in a single series, so Andrew can pick them easily, and reviewers can review more easily. Thanks!
diff --git a/mm/sparse.c b/mm/sparse.c index 55c18aff3e42..4a2700e9a65f 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, * those loops early. */ unsigned long __highest_present_section_nr; -static void section_mark_present(struct mem_section *ms) +static void section_mark_present(unsigned long section_nr) { - unsigned long section_nr = __section_nr(ms); + struct mem_section *ms; if (section_nr > __highest_present_section_nr) __highest_present_section_nr = section_nr; + ms = __nr_to_section(section_nr); ms->section_mem_map |= SECTION_MARKED_PRESENT; } @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en if (!ms->section_mem_map) { ms->section_mem_map = sparse_encode_early_nid(nid) | SECTION_IS_ONLINE; - section_mark_present(ms); + section_mark_present(section); } } } @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn, ms = __nr_to_section(section_nr); set_section_nid(section_nr, nid); - section_mark_present(ms); + section_mark_present(section_nr); /* Align memmap to section boundary in the subsection case */ if (section_nr_to_pfn(section_nr) != start_pfn)
With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts mem_section to section_nr could be costly since it iterates all sections to check if the given mem_section is in its range. On the other hand, __nr_to_section which converts section_nr to mem_section can be done in O(1). Let's pass section_nr instead of mem_section ptr to section_mark_present in order to reduce needless iterations. Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com> --- mm/sparse.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)