diff mbox

[v3,2/5] memblock: Improve memblock to support allocation from lower address.

Message ID 1379064655-20874-3-git-send-email-tangchen@cn.fujitsu.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

tangchen Sept. 13, 2013, 9:30 a.m. UTC
This patch modifies the memblock_find_in_range_node() to support two
different allocation directions. After this patch, memblock will check
memblock.current_direction, and decide in which direction to allocate
memory.

Now it supports two allocation directions: bottom up and top down.
When direction is top down, it acts as before.
When direction is bottom up, the start address should be greater than
the end of the kernel image. Otherwise, it will be trimmed to kernel
image end address.

Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
---
 mm/memblock.c |  107 ++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 95 insertions(+), 12 deletions(-)

Comments

Toshi Kani Sept. 13, 2013, 9:53 p.m. UTC | #1
On Fri, 2013-09-13 at 17:30 +0800, Tang Chen wrote:
 :
> @@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>  					phys_addr_t end, phys_addr_t size,
>  					phys_addr_t align, int nid)
>  {
> -	phys_addr_t this_start, this_end, cand;
> -	u64 i;
> +	phys_addr_t ret;
>  
>  	/* pump up @end */
>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
> @@ -111,18 +190,22 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>  	start = max_t(phys_addr_t, start, PAGE_SIZE);
>  	end = max(start, end);
>  
> -	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
> -		this_start = clamp(this_start, start, end);
> -		this_end = clamp(this_end, start, end);
> +	if (memblock_direction_bottom_up()) {
> +		/*
> +		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
> +		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
> +		 * as @start is OK.
> +		 */
> +		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
>  
> -		if (this_end < size)
> -			continue;
> +		ret = __memblock_find_range(start, end, size, align, nid);
> +		if (ret)
> +			return ret;
>  
> -		cand = round_down(this_end - size, align);
> -		if (cand >= this_start)
> -			return cand;
> +		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");

Is there any chance that this retry will succeed given that start and
end are still the same?

Thanks,
-Toshi


>  	}
> -	return 0;
> +
> +	return __memblock_find_range_rev(start, end, size, align, nid);
>  }
>  
>  /**


--
To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanfei Zhang Sept. 16, 2013, 1:28 a.m. UTC | #2
Hello toshi-san,

On 09/14/2013 05:53 AM, Toshi Kani wrote:
> On Fri, 2013-09-13 at 17:30 +0800, Tang Chen wrote:
>  :
>> @@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  					phys_addr_t end, phys_addr_t size,
>>  					phys_addr_t align, int nid)
>>  {
>> -	phys_addr_t this_start, this_end, cand;
>> -	u64 i;
>> +	phys_addr_t ret;
>>  
>>  	/* pump up @end */
>>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
>> @@ -111,18 +190,22 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  	start = max_t(phys_addr_t, start, PAGE_SIZE);
>>  	end = max(start, end);
>>  
>> -	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
>> -		this_start = clamp(this_start, start, end);
>> -		this_end = clamp(this_end, start, end);
>> +	if (memblock_direction_bottom_up()) {
>> +		/*
>> +		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
>> +		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
>> +		 * as @start is OK.
>> +		 */
>> +		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
>>  
>> -		if (this_end < size)
>> -			continue;
>> +		ret = __memblock_find_range(start, end, size, align, nid);
>> +		if (ret)
>> +			return ret;
>>  
>> -		cand = round_down(this_end - size, align);
>> -		if (cand >= this_start)
>> -			return cand;
>> +		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");
> 
> Is there any chance that this retry will succeed given that start and
> end are still the same?

Thanks for pointing this. We've made a mistake here. If the original start
is less than __pa_symbol(_end), and if the bottom up search fails, then
the top down search deserves a try with the original start argument.

> 
> Thanks,
> -Toshi
> 
> 
>>  	}
>> -	return 0;
>> +
>> +	return __memblock_find_range_rev(start, end, size, align, nid);
>>  }
>>  
>>  /**
> 
> 
>
Tejun Heo Sept. 23, 2013, 3:50 p.m. UTC | #3
Hello,

Please separate out factoring out of top-down allocation.  That change
is an equivalent conversion which shouldn't involve any functional
difference.  Mixing that with introduction of new feature isn't a good
idea, so the patch split should be 1. split out top-down allocation
from memblock_find_in_range_node() 2. introduce bottom-up flag and
implement the feature.

On Fri, Sep 13, 2013 at 05:30:52PM +0800, Tang Chen wrote:
> +/**
>   * memblock_find_in_range_node - find free area in given range and node
> - * @start: start of candidate range
> + * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE

The only reason @end has special ACCESSIBLE flag is because we don't
know how high is actually accessible and it needs to be distinguished
from ANYWHERE.  We assume that the lower addresses are always mapped,
so using ACCESSIBLE for @start is weird.  I think it'd be clearer to
make the memblock interface to set the direction explicitly state what
it's doing - ie. something like set_memblock_alloc_above_kernel(bool
enable).  We clearly don't want pure bottom-up allocation and the
@start/@end params in memblock interface are used to impose extra
limitations for each allocation, not the overall allocator behavior.

> @@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>  					phys_addr_t end, phys_addr_t size,
>  					phys_addr_t align, int nid)
>  {
> -	phys_addr_t this_start, this_end, cand;
> -	u64 i;
> +	phys_addr_t ret;
>  
>  	/* pump up @end */
>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
> @@ -111,18 +190,22 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>  	start = max_t(phys_addr_t, start, PAGE_SIZE);
>  	end = max(start, end);
>  
> -	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
> -		this_start = clamp(this_start, start, end);
> -		this_end = clamp(this_end, start, end);
> +	if (memblock_direction_bottom_up()) {
> +		/*
> +		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
> +		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
> +		 * as @start is OK.
> +		 */
> +		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
>  
> -		if (this_end < size)
> -			continue;
> +		ret = __memblock_find_range(start, end, size, align, nid);
> +		if (ret)
> +			return ret;
>  
> -		cand = round_down(this_end - size, align);
> -		if (cand >= this_start)
> -			return cand;
> +		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");

You probably wanna explain why retrying top-down allocation may
succeed when bottom-up failed.

Thanks.
Zhang Yanfei Sept. 23, 2013, 4:44 p.m. UTC | #4
Hello tejun,

On 09/23/2013 11:50 PM, Tejun Heo wrote:
> Hello,
> 
> Please separate out factoring out of top-down allocation.  That change
> is an equivalent conversion which shouldn't involve any functional
> difference.  Mixing that with introduction of new feature isn't a good
> idea, so the patch split should be 1. split out top-down allocation
> from memblock_find_in_range_node() 2. introduce bottom-up flag and
> implement the feature.

Ok, will do the split.

> 
> On Fri, Sep 13, 2013 at 05:30:52PM +0800, Tang Chen wrote:
>> +/**
>>   * memblock_find_in_range_node - find free area in given range and node
>> - * @start: start of candidate range
>> + * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
> 
> The only reason @end has special ACCESSIBLE flag is because we don't
> know how high is actually accessible and it needs to be distinguished
> from ANYWHERE.  We assume that the lower addresses are always mapped,
> so using ACCESSIBLE for @start is weird.  I think it'd be clearer to
> make the memblock interface to set the direction explicitly state what
> it's doing - ie. something like set_memblock_alloc_above_kernel(bool
> enable).  We clearly don't want pure bottom-up allocation and the
> @start/@end params in memblock interface are used to impose extra
> limitations for each allocation, not the overall allocator behavior.
> 
>> @@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  					phys_addr_t end, phys_addr_t size,
>>  					phys_addr_t align, int nid)
>>  {
>> -	phys_addr_t this_start, this_end, cand;
>> -	u64 i;
>> +	phys_addr_t ret;
>>  
>>  	/* pump up @end */
>>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
>> @@ -111,18 +190,22 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  	start = max_t(phys_addr_t, start, PAGE_SIZE);
>>  	end = max(start, end);
>>  
>> -	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
>> -		this_start = clamp(this_start, start, end);
>> -		this_end = clamp(this_end, start, end);
>> +	if (memblock_direction_bottom_up()) {
>> +		/*
>> +		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
>> +		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
>> +		 * as @start is OK.
>> +		 */
>> +		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
>>  
>> -		if (this_end < size)
>> -			continue;
>> +		ret = __memblock_find_range(start, end, size, align, nid);
>> +		if (ret)
>> +			return ret;
>>  
>> -		cand = round_down(this_end - size, align);
>> -		if (cand >= this_start)
>> -			return cand;
>> +		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");
> 
> You probably wanna explain why retrying top-down allocation may
> succeed when bottom-up failed.

ok. The reason is we always limit bottom-up allocation from
the kernel image end, but to-down allocation doesn't have the
limit, so retrying top-down allocation may succeed when
bottom-up failed.

Will add the comment to explain the retry.

Thanks.

> 
> Thanks.
>
Zhang Yanfei Sept. 23, 2013, 6:07 p.m. UTC | #5
Hello tejun,

On 09/23/2013 11:50 PM, Tejun Heo wrote:
> Hello,
> 
> Please separate out factoring out of top-down allocation.  That change
> is an equivalent conversion which shouldn't involve any functional
> difference.  Mixing that with introduction of new feature isn't a good
> idea, so the patch split should be 1. split out top-down allocation
> from memblock_find_in_range_node() 2. introduce bottom-up flag and
> implement the feature.
> 
> On Fri, Sep 13, 2013 at 05:30:52PM +0800, Tang Chen wrote:
>> +/**
>>   * memblock_find_in_range_node - find free area in given range and node
>> - * @start: start of candidate range
>> + * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
> 
> The only reason @end has special ACCESSIBLE flag is because we don't
> know how high is actually accessible and it needs to be distinguished
> from ANYWHERE.  We assume that the lower addresses are always mapped,
> so using ACCESSIBLE for @start is weird.  I think it'd be clearer to
> make the memblock interface to set the direction explicitly state what
> it's doing - ie. something like set_memblock_alloc_above_kernel(bool
> enable).  We clearly don't want pure bottom-up allocation and the
> @start/@end params in memblock interface are used to impose extra
> limitations for each allocation, not the overall allocator behavior.

Forgot this one...

Yes, I am following your advice in principle but kind of confused by
something you said above. Where should the set_memblock_alloc_above_kernel
be used? IMO, the function is like:

find_in_range_node()
{
     if (ok) {
           /* bottom-up */
           ret = __memblock_find_in_range(max(start, _end_of_kernel), end...);
           if (!ret)
                 return ret;
     }

     /* top-down retry */
     return __memblock_find_in_range_rev(start, end...)
}

For bottom-up allocation, we always start from max(start, _end_of_kernel).

Thanks.

> 
>> @@ -100,8 +180,7 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  					phys_addr_t end, phys_addr_t size,
>>  					phys_addr_t align, int nid)
>>  {
>> -	phys_addr_t this_start, this_end, cand;
>> -	u64 i;
>> +	phys_addr_t ret;
>>  
>>  	/* pump up @end */
>>  	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
>> @@ -111,18 +190,22 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
>>  	start = max_t(phys_addr_t, start, PAGE_SIZE);
>>  	end = max(start, end);
>>  
>> -	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
>> -		this_start = clamp(this_start, start, end);
>> -		this_end = clamp(this_end, start, end);
>> +	if (memblock_direction_bottom_up()) {
>> +		/*
>> +		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
>> +		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
>> +		 * as @start is OK.
>> +		 */
>> +		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
>>  
>> -		if (this_end < size)
>> -			continue;
>> +		ret = __memblock_find_range(start, end, size, align, nid);
>> +		if (ret)
>> +			return ret;
>>  
>> -		cand = round_down(this_end - size, align);
>> -		if (cand >= this_start)
>> -			return cand;
>> +		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");
> 
> You probably wanna explain why retrying top-down allocation may
> succeed when bottom-up failed.
> 
> Thanks.
>
Tejun Heo Sept. 23, 2013, 8:21 p.m. UTC | #6
Hello,

On Tue, Sep 24, 2013 at 02:07:13AM +0800, Zhang Yanfei wrote:
> Yes, I am following your advice in principle but kind of confused by
> something you said above. Where should the set_memblock_alloc_above_kernel
> be used? IMO, the function is like:
> 
> find_in_range_node()
> {
>      if (ok) {
>            /* bottom-up */
>            ret = __memblock_find_in_range(max(start, _end_of_kernel), end...);
>            if (!ret)
>                  return ret;
>      }
> 
>      /* top-down retry */
>      return __memblock_find_in_range_rev(start, end...)
> }
> 
> For bottom-up allocation, we always start from max(start, _end_of_kernel).

Oh, I was talking about naming of the memblock_set_bottom_up()
function.  We aren't really doing pure bottom up allocations, so I
think it probably would be clearer if the name clearly denotes that
we're doing above-kernel allocation.

Thanks.
Yanfei Zhang Sept. 24, 2013, 2:41 a.m. UTC | #7
Hello tejun,

On 09/24/2013 04:21 AM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Sep 24, 2013 at 02:07:13AM +0800, Zhang Yanfei wrote:
>> Yes, I am following your advice in principle but kind of confused by
>> something you said above. Where should the set_memblock_alloc_above_kernel
>> be used? IMO, the function is like:
>>
>> find_in_range_node()
>> {
>>      if (ok) {
>>            /* bottom-up */
>>            ret = __memblock_find_in_range(max(start, _end_of_kernel), end...);
>>            if (!ret)
>>                  return ret;
>>      }
>>
>>      /* top-down retry */
>>      return __memblock_find_in_range_rev(start, end...)
>> }
>>
>> For bottom-up allocation, we always start from max(start, _end_of_kernel).
> 
> Oh, I was talking about naming of the memblock_set_bottom_up()
> function.  We aren't really doing pure bottom up allocations, so I
> think it probably would be clearer if the name clearly denotes that
> we're doing above-kernel allocation.

I see. But I think memblock_set_alloc_above_kernel may lose the info
that we are doing bottom-up allocation. So my idea is we introduce
pure bottom-up allocation mode in previous patches and we use the
bottom-up allocation here and limit the start address above the kernel
, with explicit comments to indicate this.

How do you think?

Thanks.

> 
> Thanks.
>
Tejun Heo Sept. 24, 2013, 2:46 a.m. UTC | #8
Hello,

On Tue, Sep 24, 2013 at 10:41:51AM +0800, Zhang Yanfei wrote:
> I see. But I think memblock_set_alloc_above_kernel may lose the info
> that we are doing bottom-up allocation. So my idea is we introduce
> pure bottom-up allocation mode in previous patches and we use the
> bottom-up allocation here and limit the start address above the kernel
> , with explicit comments to indicate this.

It probably doesn't matter either way.  I was just a bit bothered that
it's called bottom-up when it implies more including the retry logic.
Its use of bottom-up allocation is really an implementation logic to
achieve the goal of allocating memory above kernel image after all,
but yeah minor point either way.

Thanks.
diff mbox

Patch

diff --git a/mm/memblock.c b/mm/memblock.c
index f24ca2e..87a7f04 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -20,6 +20,8 @@ 
 #include <linux/seq_file.h>
 #include <linux/memblock.h>
 
+#include <asm-generic/sections.h>
+
 static struct memblock_region memblock_memory_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
 static struct memblock_region memblock_reserved_init_regions[INIT_MEMBLOCK_REGIONS] __initdata_memblock;
 
@@ -84,8 +86,81 @@  static long __init_memblock memblock_overlaps_region(struct memblock_type *type,
 }
 
 /**
+ * __memblock_find_range - find free area utility
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
+ * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
+ * @size: size of free area to find
+ * @align: alignment of free area to find
+ * @nid: nid of the free area to find, %MAX_NUMNODES for any node
+ *
+ * Utility called from memblock_find_in_range_node(), find free area from
+ * lower address to higher address.
+ *
+ * RETURNS:
+ * Found address on success, %0 on failure.
+ */
+static phys_addr_t __init_memblock
+__memblock_find_range(phys_addr_t start, phys_addr_t end,
+		      phys_addr_t size, phys_addr_t align, int nid)
+{
+	phys_addr_t this_start, this_end, cand;
+	u64 i;
+
+	for_each_free_mem_range(i, nid, &this_start, &this_end, NULL) {
+		this_start = clamp(this_start, start, end);
+		this_end = clamp(this_end, start, end);
+
+		cand = round_up(this_start, align);
+		if (cand < this_end && this_end - cand >= size)
+			return cand;
+	}
+
+	return 0;
+}
+
+/**
+ * __memblock_find_range_rev - find free area utility, in reverse order
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
+ * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
+ * @size: size of free area to find
+ * @align: alignment of free area to find
+ * @nid: nid of the free area to find, %MAX_NUMNODES for any node
+ *
+ * Utility called from memblock_find_in_range_node(), find free area from
+ * higher address to lower address.
+ *
+ * RETURNS:
+ * Found address on success, %0 on failure.
+ */
+static phys_addr_t __init_memblock
+__memblock_find_range_rev(phys_addr_t start, phys_addr_t end,
+			  phys_addr_t size, phys_addr_t align, int nid)
+{
+	phys_addr_t this_start, this_end, cand;
+	u64 i;
+
+	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
+		this_start = clamp(this_start, start, end);
+		this_end = clamp(this_end, start, end);
+
+		/*
+		 * Just in case that (this_end - size) underflows and cause
+		 * (cand >= this_start) to be true incorrectly.
+		 */
+		if (this_end < size)
+			break;
+
+		cand = round_down(this_end - size, align);
+		if (cand >= this_start)
+			return cand;
+	}
+
+	return 0;
+}
+
+/**
  * memblock_find_in_range_node - find free area in given range and node
- * @start: start of candidate range
+ * @start: start of candidate range, can be %MEMBLOCK_ALLOC_ACCESSIBLE
  * @end: end of candidate range, can be %MEMBLOCK_ALLOC_{ANYWHERE|ACCESSIBLE}
  * @size: size of free area to find
  * @align: alignment of free area to find
@@ -93,6 +168,11 @@  static long __init_memblock memblock_overlaps_region(struct memblock_type *type,
  *
  * Find @size free area aligned to @align in the specified range and node.
  *
+ * When allocation direction is from low to high, the @start should be greater
+ * than the end of the kernel image. Otherwise, it will be trimmed. And also,
+ * if allocation from low to high failed, will try to allocate memory from high
+ * to low again.
+ *
  * RETURNS:
  * Found address on success, %0 on failure.
  */
@@ -100,8 +180,7 @@  phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 					phys_addr_t end, phys_addr_t size,
 					phys_addr_t align, int nid)
 {
-	phys_addr_t this_start, this_end, cand;
-	u64 i;
+	phys_addr_t ret;
 
 	/* pump up @end */
 	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
@@ -111,18 +190,22 @@  phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 	start = max_t(phys_addr_t, start, PAGE_SIZE);
 	end = max(start, end);
 
-	for_each_free_mem_range_reverse(i, nid, &this_start, &this_end, NULL) {
-		this_start = clamp(this_start, start, end);
-		this_end = clamp(this_end, start, end);
+	if (memblock_direction_bottom_up()) {
+		/*
+		 * MEMBLOCK_ALLOC_ACCESSIBLE is 0, which is less than the end
+		 * of kernel image. So callers specify MEMBLOCK_ALLOC_ACCESSIBLE
+		 * as @start is OK.
+		 */
+		start =	max(start, __pa_symbol(_end)); /* End of kernel image. */
 
-		if (this_end < size)
-			continue;
+		ret = __memblock_find_range(start, end, size, align, nid);
+		if (ret)
+			return ret;
 
-		cand = round_down(this_end - size, align);
-		if (cand >= this_start)
-			return cand;
+		pr_warn("memblock: Failed to allocate memory in bottom up direction. Now try top down direction.\n");
 	}
-	return 0;
+
+	return __memblock_find_range_rev(start, end, size, align, nid);
 }
 
 /**