
mm: sparse: pass section_nr to section_mark_present

Message ID 20210701135543epcms1p84a043bf49757bafada0a773372611d69@epcms1p8 (mailing list archive)
State New
Series mm: sparse: pass section_nr to section_mark_present

Commit Message

권오훈 July 1, 2021, 1:55 p.m. UTC
With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
mem_section to section_nr could be costly since it iterates all
sections to check if the given mem_section is in its range.

On the other hand, __nr_to_section which converts section_nr to
mem_section can be done in O(1).

Let's pass section_nr instead of mem_section ptr to section_mark_present
in order to reduce needless iterations.

Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
---
 mm/sparse.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)
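
For context on the cost asymmetry the commit message describes: with
CONFIG_SPARSEMEM_EXTREME, mem_section is a two-level table, so the
nr-to-pointer direction is a direct index while the pointer-to-nr direction
has to scan the roots. A minimal sketch of the two conversions, simplified
from include/linux/mmzone.h and mm/sparse.c around v5.13 (details vary across
kernel versions):

/* O(1): index the two-level table directly. */
static inline struct mem_section *__nr_to_section(unsigned long nr)
{
	return &mem_section[SECTION_NR_TO_ROOT(nr)][nr & SECTION_ROOT_MASK];
}

/* Scans up to NR_SECTION_ROOTS roots to locate the matching root. */
unsigned long __section_nr(struct mem_section *ms)
{
	unsigned long root_nr;
	struct mem_section *root = NULL;

	for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
		root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
		if (!root)
			continue;
		if ((ms >= root) && (ms < (root + SECTIONS_PER_ROOT)))
			break;
	}
	VM_BUG_ON(!root);
	return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
}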

Comments

David Hildenbrand July 1, 2021, 2:34 p.m. UTC | #1
On 01.07.21 15:55, 권오훈 wrote:
> With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
> mem_section to section_nr could be costly since it iterates all
> sections to check if the given mem_section is in its range.

It actually iterates all section roots.

> 
> On the other hand, __nr_to_section which converts section_nr to
> mem_section can be done in O(1).
> 
> Let's pass section_nr instead of mem_section ptr to section_mark_present
> in order to reduce needless iterations.

I'd expect this to be mostly noise, especially as we iterate section 
roots and for most (smallish) machines we might just work on the lowest 
section roots only.
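
For a rough sense of scale, assuming x86-64 defaults (128 MiB sections, 4 KiB
pages, and a 16-byte struct mem_section, so SECTIONS_PER_ROOT = 4096 / 16 =
256): an 8 GiB machine has 8 GiB / 128 MiB = 64 present sections, all of which
fit in the very first root, so every lookup terminates on the first iteration.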

Can you actually observe an improvement regarding boot times?

Anyhow, looks straightforward to me, although we might just reintroduce
similar patterns again easily if it's really just noise (see
find_memory_block() as used by arch/powerpc/platforms/pseries/hotplug-memory.c).
And it might allow for a nice cleanup (see below).

Reviewed-by: David Hildenbrand <david@redhat.com>


Can you send 1) a patch to convert find_memory_block() as well and 2) a 
patch to rip out __section_nr() completely?

> 
> Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
> ---
>   mm/sparse.c | 9 +++++----
>   1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index 55c18aff3e42..4a2700e9a65f 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
>    * those loops early.
>    */
>   unsigned long __highest_present_section_nr;
> -static void section_mark_present(struct mem_section *ms)
> +static void section_mark_present(unsigned long section_nr)
>   {
> -	unsigned long section_nr = __section_nr(ms);
> +	struct mem_section *ms;
>   
>   	if (section_nr > __highest_present_section_nr)
>   		__highest_present_section_nr = section_nr;
>   
> +	ms = __nr_to_section(section_nr);
>   	ms->section_mem_map |= SECTION_MARKED_PRESENT;
>   }
>   
> @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en
>   		if (!ms->section_mem_map) {
>   			ms->section_mem_map = sparse_encode_early_nid(nid) |
>   							SECTION_IS_ONLINE;
> -			section_mark_present(ms);
> +			section_mark_present(section);
>   		}
>   	}
>   }
> @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
>   
>   	ms = __nr_to_section(section_nr);
>   	set_section_nid(section_nr, nid);
> -	section_mark_present(ms);
> +	section_mark_present(section_nr);
>   
>   	/* Align memmap to section boundary in the subsection case */
>   	if (section_nr_to_pfn(section_nr) != start_pfn)
>
권오훈 July 1, 2021, 3:41 p.m. UTC | #2
On Thu, Jul 01, 2021 at 04:34:13PM +0200, David Hildenbrand wrote:
> On 01.07.21 15:55, 권오훈 wrote:
> > With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
> > mem_section to section_nr could be costly since it iterates all
> > sections to check if the given mem_section is in its range.
>  
> It actually iterates all section roots.
>  
> > 
> > On the other hand, __nr_to_section which converts section_nr to
> > mem_section can be done in O(1).
> > 
> > Let's pass section_nr instead of mem_section ptr to section_mark_present
> > in order to reduce needless iterations.
>  
> I'd expect this to be mostly noise, especially as we iterate section 
> roots and for most (smallish) machines we might just work on the lowest 
> section roots only.
>  
> Can you actually observe an improvement regarding boot times?
>  
> Anyhow, looks straightforward to me, although we might just reintroduce 
> similar patterns again easily if it's really just noise (see 
> find_memory_block() as used by arch/powerpc/platforms/pseries/hotplug-memory.c). 
> And it might allow for a nice cleanup (see below).
>  
> Reviewed-by: David Hildenbrand <david@redhat.com>
>  
>  
> Can you send 1) a patch to convert find_memory_block() as well and 2) a 
> patch to rip out __section_nr() completely?
>  
> > 
> > Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
> > ---
> >   mm/sparse.c | 9 +++++----
> >   1 file changed, 5 insertions(+), 4 deletions(-)
> > 
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index 55c18aff3e42..4a2700e9a65f 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
> >    * those loops early.
> >    */
> >   unsigned long __highest_present_section_nr;
> > -static void section_mark_present(struct mem_section *ms)
> > +static void section_mark_present(unsigned long section_nr)
> >   {
> > -        unsigned long section_nr = __section_nr(ms);
> > +        struct mem_section *ms;
> >   
> >           if (section_nr > __highest_present_section_nr)
> >                   __highest_present_section_nr = section_nr;
> >   
> > +        ms = __nr_to_section(section_nr);
> >           ms->section_mem_map |= SECTION_MARKED_PRESENT;
> >   }
> >   
> > @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en
> >                   if (!ms->section_mem_map) {
> >                           ms->section_mem_map = sparse_encode_early_nid(nid) |
> >                                                           SECTION_IS_ONLINE;
> > -                        section_mark_present(ms);
> > +                        section_mark_present(section);
> >                   }
> >           }
> >   }
> > @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
> >   
> >           ms = __nr_to_section(section_nr);
> >           set_section_nid(section_nr, nid);
> > -        section_mark_present(ms);
> > +        section_mark_present(section_nr);
> >   
> >           /* Align memmap to section boundary in the subsection case */
> >           if (section_nr_to_pfn(section_nr) != start_pfn)
> > 
>  
>  
> -- 
> Thanks,
>  
> David / dhildenb
>  
Dear David,

I tried to measure the time spent in memblocks_present(), but when I tested
on mobile phones with 8GB of RAM, the original and the patched binaries both
took 0us.
I'm not sure how the results would differ on huge systems with more RAM.
I agree that it could turn out to be just noise, as you expected.

However, as you also mentioned, the patches are straightforward, and they
allow all code using __section_nr() to be cleaned up nicely.

Below are the two patches that you asked for.
Please tell me if you need me to send them in separate e-mails.

Thanks,
Ohhoon Kwon.

> Can you send 1) a patch to convert find_memory_block() as well and 2) a 
> patch to rip out __section_nr() completely?


-----------------------------------------------------------------------------------------


[PATCH] mm: sparse: pass section_nr to find_memory_block

With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
mem_section to section_nr could be costly since it iterates all
sections to check if the given mem_section is in its range.

On the other hand, __nr_to_section which converts section_nr to
mem_section can be done in O(1).

Let's pass section_nr instead of mem_section ptr to find_memory_block
in order to reduce needless iterations.

Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
---
 arch/powerpc/platforms/pseries/hotplug-memory.c | 4 +---
 drivers/base/memory.c                           | 4 ++--
 include/linux/memory.h                          | 2 +-
 3 files changed, 4 insertions(+), 6 deletions(-)
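
For context, find_memory_block() now feeds the section number straight into
the block-id computation, so the mem_section pointer is no longer needed and
the __section_nr() call can simply disappear. Roughly, from
drivers/base/memory.c (exact shape may differ across kernel versions):

/* sections_per_block is fixed at memory_dev_init() time. */
static inline unsigned long memory_block_id(unsigned long section_nr)
{
	return section_nr / sections_per_block;
}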

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..905790092e0e 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -211,13 +211,11 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
 static struct memory_block *lmb_to_memblock(struct drmem_lmb *lmb)
 {
 	unsigned long section_nr;
-	struct mem_section *mem_sect;
 	struct memory_block *mem_block;
 
 	section_nr = pfn_to_section_nr(PFN_DOWN(lmb->base_addr));
-	mem_sect = __nr_to_section(section_nr);
 
-	mem_block = find_memory_block(mem_sect);
+	mem_block = find_memory_block(section_nr);
 	return mem_block;
 }
 
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index d5ffaab3cb61..e31598955cc4 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -578,9 +578,9 @@ static struct memory_block *find_memory_block_by_id(unsigned long block_id)
 /*
  * Called under device_hotplug_lock.
  */
-struct memory_block *find_memory_block(struct mem_section *section)
+struct memory_block *find_memory_block(unsigned long section_nr)
 {
-	unsigned long block_id = memory_block_id(__section_nr(section));
+	unsigned long block_id = memory_block_id(section_nr);
 
 	return find_memory_block_by_id(block_id);
 }
diff --git a/include/linux/memory.h b/include/linux/memory.h
index 97e92e8b556a..d9a0b61cd432 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -90,7 +90,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
 void remove_memory_block_devices(unsigned long start, unsigned long size);
 extern void memory_dev_init(void);
 extern int memory_notify(unsigned long val, void *v);
-extern struct memory_block *find_memory_block(struct mem_section *);
+extern struct memory_block *find_memory_block(unsigned long section_nr);
 typedef int (*walk_memory_blocks_func_t)(struct memory_block *, void *);
 extern int walk_memory_blocks(unsigned long start, unsigned long size,
 			      void *arg, walk_memory_blocks_func_t func);
David Hildenbrand July 1, 2021, 4:12 p.m. UTC | #3
On 01.07.21 17:41, 권오훈 wrote:
> On Thu, Jul 01, 2021 at 04:34:13PM +0200, David Hildenbrand wrote:
>> On 01.07.21 15:55, 권오훈 wrote:
>>> With CONFIG_SPARSEMEM_EXTREME enabled, __section_nr() which converts
>>> mem_section to section_nr could be costly since it iterates all
>>> sections to check if the given mem_section is in its range.
>>   
>> It actually iterates all section roots.
>>   
>>>
>>> On the other hand, __nr_to_section which converts section_nr to
>>> mem_section can be done in O(1).
>>>
>>> Let's pass section_nr instead of mem_section ptr to section_mark_present
>>> in order to reduce needless iterations.
>>   
>> I'd expect this to be mostly noise, especially as we iterate section
>> roots and for most (smallish) machines we might just work on the lowest
>> section roots only.
>>   
>> Can you actually observe an improvement regarding boot times?
>>   
>> Anyhow, looks straightforward to me, although we might just reintroduce
>> similar patterns again easily if it's really just noise (see
>> find_memory_block() as used by arch/powerpc/platforms/pseries/hotplug-memory.c).
>> And it might allow for a nice cleanup (see below).
>>   
>> Reviewed-by: David Hildenbrand <david@redhat.com>
>>   
>>   
>> Can you send 1) a patch to convert find_memory_block() as well and 2) a
>> patch to rip out __section_nr() completely?
>>   
>>>
>>> Signed-off-by: Ohhoon Kwon <ohoono.kwon@samsung.com>
>>> ---
>>>    mm/sparse.c | 9 +++++----
>>>    1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/mm/sparse.c b/mm/sparse.c
>>> index 55c18aff3e42..4a2700e9a65f 100644
>>> --- a/mm/sparse.c
>>> +++ b/mm/sparse.c
>>> @@ -186,13 +186,14 @@ void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
>>>     * those loops early.
>>>     */
>>>    unsigned long __highest_present_section_nr;
>>> -static void section_mark_present(struct mem_section *ms)
>>> +static void section_mark_present(unsigned long section_nr)
>>>    {
>>> -        unsigned long section_nr = __section_nr(ms);
>>> +        struct mem_section *ms;
>>>    
>>>            if (section_nr > __highest_present_section_nr)
>>>                    __highest_present_section_nr = section_nr;
>>>    
>>> +        ms = __nr_to_section(section_nr);
>>>            ms->section_mem_map |= SECTION_MARKED_PRESENT;
>>>    }
>>>    
>>> @@ -279,7 +280,7 @@ static void __init memory_present(int nid, unsigned long start, unsigned long en
>>>                    if (!ms->section_mem_map) {
>>>                            ms->section_mem_map = sparse_encode_early_nid(nid) |
>>>                                                            SECTION_IS_ONLINE;
>>> -                        section_mark_present(ms);
>>> +                        section_mark_present(section);
>>>                    }
>>>            }
>>>    }
>>> @@ -933,7 +934,7 @@ int __meminit sparse_add_section(int nid, unsigned long start_pfn,
>>>    
>>>            ms = __nr_to_section(section_nr);
>>>            set_section_nid(section_nr, nid);
>>> -        section_mark_present(ms);
>>> +        section_mark_present(section_nr);
>>>    
>>>            /* Align memmap to section boundary in the subsection case */
>>>            if (section_nr_to_pfn(section_nr) != start_pfn)
>>>
>>   
>>   
>> -- 
>> Thanks,
>>   
>> David / dhildenb
>>   
> Dear David,
> 
> I tried to measure the time spent in memblocks_present(), but when I tested
> on mobile phones with 8GB of RAM, the original and the patched binaries both
> took 0us.
> I'm not sure how the results would differ on huge systems with more RAM.
> I agree that it could turn out to be just noise, as you expected.
> 
> However, as you also mentioned, the patches are straightforward, and they
> allow all code using __section_nr() to be cleaned up nicely.
> 
> Below are the two patches that you asked for.
> Please tell me if you need me to send them in separate e-mails.

Yes, please send them separately. Maybe send all 3 patches combined in a 
single series, so Andrew can pick them up easily and reviewers can review 
more easily.

Thanks!

Patch

diff --git a/mm/sparse.c b/mm/sparse.c
index 55c18aff3e42..4a2700e9a65f 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -186,13 +186,14 @@  void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn,
  * those loops early.
  */
 unsigned long __highest_present_section_nr;
-static void section_mark_present(struct mem_section *ms)
+static void section_mark_present(unsigned long section_nr)
 {
-	unsigned long section_nr = __section_nr(ms);
+	struct mem_section *ms;
 
 	if (section_nr > __highest_present_section_nr)
 		__highest_present_section_nr = section_nr;
 
+	ms = __nr_to_section(section_nr);
 	ms->section_mem_map |= SECTION_MARKED_PRESENT;
 }
 
@@ -279,7 +280,7 @@  static void __init memory_present(int nid, unsigned long start, unsigned long en
 		if (!ms->section_mem_map) {
 			ms->section_mem_map = sparse_encode_early_nid(nid) |
 							SECTION_IS_ONLINE;
-			section_mark_present(ms);
+			section_mark_present(section);
 		}
 	}
 }
@@ -933,7 +934,7 @@  int __meminit sparse_add_section(int nid, unsigned long start_pfn,
 
 	ms = __nr_to_section(section_nr);
 	set_section_nid(section_nr, nid);
-	section_mark_present(ms);
+	section_mark_present(section_nr);
 
 	/* Align memmap to section boundary in the subsection case */
 	if (section_nr_to_pfn(section_nr) != start_pfn)