[v2,2/3] x86: probe memblock size advisement value during mm init

Message ID 20241016192445.3118-3-gourry@gourry.net (mailing list archive)
State Handled Elsewhere, archived
Series mm/memblock,x86,acpi: hotplug memory alignment advisement

Commit Message

Gregory Price Oct. 16, 2024, 7:24 p.m. UTC
Systems with hotplug may provide an advisement value on what the
memblock size should be.  Probe this value when the rest of the
configuration values are considered.

The new heuristic is as follows

1) set_memory_block_size_order value if already set (cmdline param)
2) minimum block size if memory is less than large block limit
3) [new] hotplug advise: lesser of advise value or memory alignment
4) Max block size if system is bare-metal
5) Largest size that aligns to end of memory.

Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
---
 arch/x86/mm/init_64.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
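
Step 5 of the heuristic ("largest size that aligns to end of memory") can be illustrated in userspace. This is only a sketch: the 128 MiB minimum, the helper names and the uint64_t types are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed minimum block size for illustration (128 MiB). */
#define MIN_BLOCK_SIZE (128ULL << 20)

/* Userspace stand-in for the kernel's IS_ALIGNED(); a must be a power of two. */
static int is_aligned(uint64_t x, uint64_t a)
{
	return (x & (a - 1)) == 0;
}

/*
 * Hypothetical helper: starting from a power-of-two candidate that the
 * caller has already clamped to the supported range, halve it until it
 * divides the end of boot memory or the minimum block size is reached.
 */
static uint64_t largest_aligned_block(uint64_t boot_mem_end, uint64_t start)
{
	uint64_t bz;

	assert(start && (start & (start - 1)) == 0);
	for (bz = start; bz > MIN_BLOCK_SIZE; bz >>= 1) {
		if (is_aligned(boot_mem_end, bz))
			break;
	}
	return bz;
}
```

With these assumed values, memory ending at 6 GiB keeps a 2 GiB candidate, memory ending at 6.5 GiB drops to 512 MiB, and memory ending at 4 GiB + 64 MiB falls through to the 128 MiB minimum.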

Comments

David Hildenbrand Oct. 21, 2024, 11:12 a.m. UTC | #1
On 16.10.24 at 21:24, Gregory Price wrote:
> Systems with hotplug may provide an advisement value on what the
> memblock size should be.  Probe this value when the rest of the
> configuration values are considered.
> 
> The new heuristic is as follows
> 
> 1) set_memory_block_size_order value if already set (cmdline param)
> 2) minimum block size if memory is less than large block limit
> 3) [new] hotplug advise: lesser of advise value or memory alignment
> 4) Max block size if system is bare-metal
> 5) Largest size that aligns to end of memory.
> 
> Suggested-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Gregory Price <gourry@gourry.net>
> ---
>   arch/x86/mm/init_64.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)
> 
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index ff253648706f..b72923b12d99 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1439,6 +1439,7 @@ static unsigned long probe_memory_block_size(void)
>   {
>   	unsigned long boot_mem_end = max_pfn << PAGE_SHIFT;
>   	unsigned long bz;
> +	int order;
>   
>   	/* If memory block size has been set, then use it */
>   	bz = set_memory_block_size;
> @@ -1451,6 +1452,21 @@ static unsigned long probe_memory_block_size(void)
>   		goto done;
>   	}
>   
> +	/* Consider hotplug advisement value (if set) */
> +	order = memblock_probe_size_order();

"size_order" is a very weird name. Just return a size?

memory_block_advised_max_size()

or sth like that?

> +	bz = order > 0 ? (1UL << order) : 0;
> +	if (bz) {
> +		/* Align down to max and up to min supported */
> +		bz = max(min(bz, MAX_BLOCK_SIZE), MIN_MEMORY_BLOCK_SIZE);
> +		/* Use lesser of advisement and end of memory alignment */
> +		for (; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
> +			if (IS_ALIGNED(boot_mem_end, bz))
> +				goto done;

This looks like duplicate code with the loop below.

Could we refactor it into something like:

advised_max_size = memory_block_advised_max_size();
if (!advised_max_size) {
	bz = MAX_BLOCK_SIZE;
	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
		goto done;
} else {
	bz = max(min(advised_max_size, MAX_BLOCK_SIZE), MIN_MEMORY_BLOCK_SIZE);
}

for (; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
	if (IS_ALIGNED(boot_mem_end, bz))
		break;
}
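
For readers following along, the single-loop flow can be tried out in userspace. This is a rough sketch, not kernel code: the constants (128 MiB minimum, 2 GiB maximum, 64 GiB large-block limit), the function name and the parameters standing in for set_memory_block_size, the advisement and boot_cpu_has(X86_FEATURE_HYPERVISOR) are all assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define MIN_BLOCK_SIZE    (128ULL << 20) /* 128 MiB */
#define MAX_BLOCK_SIZE    (2ULL << 30)   /* 2 GiB */
#define LARGE_BLOCK_LIMIT (64ULL << 30)  /* 64 GiB */

/*
 * Hypothetical userspace model of the probe. A cmdline_size or
 * advised_max_size of 0 means "not set"; advised_max_size is assumed
 * to be a power of two when it is set.
 */
static uint64_t probe_block_size(uint64_t cmdline_size, uint64_t boot_mem_end,
				 uint64_t advised_max_size, bool hypervisor)
{
	uint64_t bz;

	assert(!advised_max_size ||
	       (advised_max_size & (advised_max_size - 1)) == 0);

	/* 1) An explicit command-line value wins. */
	if (cmdline_size)
		return cmdline_size;

	/* 2) Small systems use the minimum block size. */
	if (boot_mem_end < LARGE_BLOCK_LIMIT)
		return MIN_BLOCK_SIZE;

	if (!advised_max_size) {
		/* 4) No advisement: bare metal takes the maximum. */
		bz = MAX_BLOCK_SIZE;
		if (!hypervisor)
			return bz;
	} else {
		/* 3) Clamp the advisement to the supported range. */
		bz = advised_max_size;
		if (bz > MAX_BLOCK_SIZE)
			bz = MAX_BLOCK_SIZE;
		if (bz < MIN_BLOCK_SIZE)
			bz = MIN_BLOCK_SIZE;
	}

	/* 5) Shared loop: largest size that aligns to the end of memory. */
	for (; bz > MIN_BLOCK_SIZE; bz >>= 1) {
		if ((boot_mem_end & (bz - 1)) == 0)
			break;
	}
	return bz;
}
```

Under these assumptions, a 128 GiB guest with a 256 MiB advisement ends up with 256 MiB blocks, while the same guest without an advisement and with memory ending at 128 GiB + 512 MiB settles on 512 MiB.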
Gregory Price Oct. 21, 2024, 2:46 p.m. UTC | #2
On Mon, Oct 21, 2024 at 01:12:26PM +0200, David Hildenbrand wrote:
> 
> 
> On 16.10.24 at 21:24, Gregory Price wrote:
> > Systems with hotplug may provide an advisement value on what the
> > memblock size should be.  Probe this value when the rest of the
> > configuration values are considered.
> > 
> > The new heuristic is as follows
> > 
> > 1) set_memory_block_size_order value if already set (cmdline param)
> > 2) minimum block size if memory is less than large block limit
> > 3) [new] hotplug advise: lesser of advise value or memory alignment
> > 4) Max block size if system is bare-metal
> > 5) Largest size that aligns to end of memory.
> > 
> > Suggested-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Gregory Price <gourry@gourry.net>
> > ---
> >   arch/x86/mm/init_64.c | 16 ++++++++++++++++
> >   1 file changed, 16 insertions(+)
> > 
> > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > index ff253648706f..b72923b12d99 100644
> > --- a/arch/x86/mm/init_64.c
> > +++ b/arch/x86/mm/init_64.c
> > @@ -1439,6 +1439,7 @@ static unsigned long probe_memory_block_size(void)
> >   {
> >   	unsigned long boot_mem_end = max_pfn << PAGE_SHIFT;
> >   	unsigned long bz;
> > +	int order;
> >   	/* If memory block size has been set, then use it */
> >   	bz = set_memory_block_size;
> > @@ -1451,6 +1452,21 @@ static unsigned long probe_memory_block_size(void)
> >   		goto done;
> >   	}
> > +	/* Consider hotplug advisement value (if set) */
> > +	order = memblock_probe_size_order();
> 
> "size_order" is a very weird name. Just return a size?
> 
> memory_block_advised_max_size()
> 
> or sth like that?
> 

There isn't technically an overall "max block size", nor any alignment
requirements - so order was a nice way of enforcing power-of-2 alignment
while also having the ability to get a -1/-EBUSY/whatever out.

I can change it if it's a big sticking point - but that's my reasoning.
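
To make that reasoning concrete, here is a minimal userspace sketch of the order-based convention. The helper name is hypothetical and the bounds check is an addition for the example; the conversion itself mirrors the `order > 0 ? (1UL << order) : 0` expression in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical helper: convert an advised order into a size in bytes.
 * Returns 0 for "nothing advised", which covers order 0, negative
 * error codes such as -EBUSY, and orders too large to shift safely.
 */
static unsigned long order_to_block_size(int order)
{
	unsigned long size;

	if (order <= 0 || order >= (int)(sizeof(unsigned long) * CHAR_BIT))
		return 0;

	size = 1UL << order;
	/* A shifted 1 is a power of two by construction. */
	assert((size & (size - 1)) == 0);
	return size;
}
```

The point of the sketch is that an order cannot encode a non-power-of-two size, and the sign bit is free for an error code, so neither needs a separate check.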

> > +	bz = order > 0 ? (1UL << order) : 0;
> > +	if (bz) {
> > +		/* Align down to max and up to min supported */
> > +		bz = max(min(bz, MAX_BLOCK_SIZE), MIN_MEMORY_BLOCK_SIZE);
> > +		/* Use lesser of advisement and end of memory alignment */
> > +		for (; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
> > +			if (IS_ALIGNED(boot_mem_end, bz))
> > +				goto done;
> 
> This looks like duplicate code with the loop below.
> 
> Could we refactor it into something like:
> 
> advised_max_size = memory_block_advised_max_size();
> if (!advised_max_size) {
> 	bz = MAX_BLOCK_SIZE;
> 	if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
> 		goto done;
> } else {
> 	bz = max(min(advised_max_size, MAX_BLOCK_SIZE), MIN_MEMORY_BLOCK_SIZE);
> }
> 
> for (; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
> 	if (IS_ALIGNED(boot_mem_end, bz))
> 		break;
> }
> 
>

this is better, will update.

> 
> -- 
> Cheers,
> 
> David / dhildenb
>
David Hildenbrand Oct. 21, 2024, 3:57 p.m. UTC | #3
On 21.10.24 16:46, Gregory Price wrote:
> On Mon, Oct 21, 2024 at 01:12:26PM +0200, David Hildenbrand wrote:
>>
>>
>> On 16.10.24 at 21:24, Gregory Price wrote:
>>> Systems with hotplug may provide an advisement value on what the
>>> memblock size should be.  Probe this value when the rest of the
>>> configuration values are considered.
>>>
>>> The new heuristic is as follows
>>>
>>> 1) set_memory_block_size_order value if already set (cmdline param)
>>> 2) minimum block size if memory is less than large block limit
>>> 3) [new] hotplug advise: lesser of advise value or memory alignment
>>> 4) Max block size if system is bare-metal
>>> 5) Largest size that aligns to end of memory.
>>>
>>> Suggested-by: David Hildenbrand <david@redhat.com>
>>> Signed-off-by: Gregory Price <gourry@gourry.net>
>>> ---
>>>    arch/x86/mm/init_64.c | 16 ++++++++++++++++
>>>    1 file changed, 16 insertions(+)
>>>
>>> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
>>> index ff253648706f..b72923b12d99 100644
>>> --- a/arch/x86/mm/init_64.c
>>> +++ b/arch/x86/mm/init_64.c
>>> @@ -1439,6 +1439,7 @@ static unsigned long probe_memory_block_size(void)
>>>    {
>>>    	unsigned long boot_mem_end = max_pfn << PAGE_SHIFT;
>>>    	unsigned long bz;
>>> +	int order;
>>>    	/* If memory block size has been set, then use it */
>>>    	bz = set_memory_block_size;
>>> @@ -1451,6 +1452,21 @@ static unsigned long probe_memory_block_size(void)
>>>    		goto done;
>>>    	}
>>> +	/* Consider hotplug advisement value (if set) */
>>> +	order = memblock_probe_size_order();
>>
>> "size_order" is a very weird name. Just return a size?
>>
>> memory_block_advised_max_size()
>>
>> or sth like that?
>>
> 
> There isn't technically an overall "max block size", nor any alignment
> requirements - so order was a nice way of enforcing 2-order alignment
> while also having the ability to get a -1/-EBUSY/whatever out.

I see. But we (MM) just call it "order" then, like pageblock_order, 
max_order, compound_order ... but here we use "size" everywhere, so I 
prefer to just stick to that.

> 
> I can change it if it's a big sticking point - but that's my reasoning.

Simply enforce it when setting the size. We call it "memory_block_size" 
everywhere, it's also a power of 2 etc., and we sanity-check that in 
memory_dev_init().
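
A size-based interface that enforces the power-of-two rule at set time could look roughly like the following userspace sketch. The function names are borrowed from suggestions in this thread, but the bodies, the -EINVAL return and the static variable are assumptions for illustration and not an existing kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* 0 means that nothing has been advised yet. */
static unsigned long advised_max_size;

static bool is_power_of_two(unsigned long v)
{
	return v && !(v & (v - 1));
}

/* Hypothetical setter: reject anything that is not a power of two. */
static int memory_block_advise_max_size(unsigned long size)
{
	if (!is_power_of_two(size))
		return -EINVAL;
	advised_max_size = size;
	return 0;
}

/* Hypothetical getter: returns the advised size in bytes, or 0 if unset. */
static unsigned long memory_block_advised_max_size(void)
{
	/* The setter guarantees this invariant; check it anyway. */
	assert(!advised_max_size || is_power_of_two(advised_max_size));
	return advised_max_size;
}
```

Because invalid values never get stored, the probe code can consume the size directly without converting from an order.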
Gregory Price Oct. 21, 2024, 4:04 p.m. UTC | #4
On Mon, Oct 21, 2024 at 10:46:38AM -0400, Gregory Price wrote:
> On Mon, Oct 21, 2024 at 01:12:26PM +0200, David Hildenbrand wrote:
> > 
> > > +	/* Consider hotplug advisement value (if set) */
> > > +	order = memblock_probe_size_order();
> > 
> > "size_order" is a very weird name. Just return a size?
> > 
> > memory_block_advised_max_size()
> > 
> > or sth like that?
> > 
> 
> There isn't technically an overall "max block size", nor any alignment
> requirements - so order was a nice way of enforcing 2-order alignment
> while also having the ability to get a -1/-EBUSY/whatever out.
> 
> I can change it if it's a big sticking point - but that's my reasoning.
> 

maybe change to

memory_block_advise_max_size
memory_block_probe_max_size

but still take in / return an order?

~Gregory
Gregory Price Oct. 21, 2024, 4:17 p.m. UTC | #5
On Mon, Oct 21, 2024 at 05:57:28PM +0200, David Hildenbrand wrote:
> On 21.10.24 16:46, Gregory Price wrote:
> > On Mon, Oct 21, 2024 at 01:12:26PM +0200, David Hildenbrand wrote:
> > > 
> > > 
> > > On 16.10.24 at 21:24, Gregory Price wrote:
> > > > Systems with hotplug may provide an advisement value on what the
> > > > memblock size should be.  Probe this value when the rest of the
> > > > configuration values are considered.
> > > > 
> > > > The new heuristic is as follows
> > > > 
> > > > 1) set_memory_block_size_order value if already set (cmdline param)
> > > > 2) minimum block size if memory is less than large block limit
> > > > 3) [new] hotplug advise: lesser of advise value or memory alignment
> > > > 4) Max block size if system is bare-metal
> > > > 5) Largest size that aligns to end of memory.
> > > > 
> > > > Suggested-by: David Hildenbrand <david@redhat.com>
> > > > Signed-off-by: Gregory Price <gourry@gourry.net>
> > > > ---
> > > >    arch/x86/mm/init_64.c | 16 ++++++++++++++++
> > > >    1 file changed, 16 insertions(+)
> > > > 
> > > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > > > index ff253648706f..b72923b12d99 100644
> > > > --- a/arch/x86/mm/init_64.c
> > > > +++ b/arch/x86/mm/init_64.c
> > > > @@ -1439,6 +1439,7 @@ static unsigned long probe_memory_block_size(void)
> > > >    {
> > > >    	unsigned long boot_mem_end = max_pfn << PAGE_SHIFT;
> > > >    	unsigned long bz;
> > > > +	int order;
> > > >    	/* If memory block size has been set, then use it */
> > > >    	bz = set_memory_block_size;
> > > > @@ -1451,6 +1452,21 @@ static unsigned long probe_memory_block_size(void)
> > > >    		goto done;
> > > >    	}
> > > > +	/* Consider hotplug advisement value (if set) */
> > > > +	order = memblock_probe_size_order();
> > > 
> > > "size_order" is a very weird name. Just return a size?
> > > 
> > > memory_block_advised_max_size()
> > > 
> > > or sth like that?
> > > 
> > 
> > There isn't technically an overall "max block size", nor any alignment
> > requirements - so order was a nice way of enforcing 2-order alignment
> > while also having the ability to get a -1/-EBUSY/whatever out.
> 
> I see. But we (MM) just call it "order" then, like pageblock_order,
> max_order, compound_order ... but here we use "size everywhere" so I prefer
> to just sticking to that.
> 
> > 
> > I can change it if it's a big sticking point - but that's my reasoning.
> 
> Simply enforce it when setting the size. We call it "memory_block_size"
> everywhere and it's also a power-of-2 etc and sanity-check that in
> memory_dev_init().
> 
>

Disregard my other email.  Didn't see this one come through.

I'll switch to a size and check alignment. I probably need to play
with the locking mechanism to avoid changing it after it's probed the
first time, but I'll poke at it.

So I'll probably change to an ssize_t for the arg and return value.

~Gregory
Patch

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index ff253648706f..b72923b12d99 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1439,6 +1439,7 @@  static unsigned long probe_memory_block_size(void)
 {
 	unsigned long boot_mem_end = max_pfn << PAGE_SHIFT;
 	unsigned long bz;
+	int order;
 
 	/* If memory block size has been set, then use it */
 	bz = set_memory_block_size;
@@ -1451,6 +1452,21 @@  static unsigned long probe_memory_block_size(void)
 		goto done;
 	}
 
+	/* Consider hotplug advisement value (if set) */
+	order = memblock_probe_size_order();
+	bz = order > 0 ? (1UL << order) : 0;
+	if (bz) {
+		/* Align down to max and up to min supported */
+		bz = max(min(bz, MAX_BLOCK_SIZE), MIN_MEMORY_BLOCK_SIZE);
+		/* Use lesser of advisement and end of memory alignment */
+		for (; bz > MIN_MEMORY_BLOCK_SIZE; bz >>= 1) {
+			if (IS_ALIGNED(boot_mem_end, bz))
+				goto done;
+		}
+		/* Barring clean alignment, default to min block size */
+		goto done;
+	}
+
 	/*
 	 * Use max block size to minimize overhead on bare metal, where
 	 * alignment for memory hotplug isn't a concern.