[v27,1/9] memblock: add memblock_cap_memory_range()

Message ID	20161117022023.GA5704@linaro.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org> Date: Thu, 17 Nov 2016 14:34:24 +0900 From: AKASHI Takahiro <takahiro.akashi@linaro.org> To: Will Deacon <will.deacon@arm.com> Subject: Re: [PATCH v27 1/9] memblock: add memblock_cap_memory_range() Message-ID: <20161117022023.GA5704@linaro.org> Mail-Followup-To: AKASHI Takahiro <takahiro.akashi@linaro.org>, Will Deacon <will.deacon@arm.com>, Dennis Chen <dennis.chen@arm.com>, catalin.marinas@arm.com, akpm@linux-foundation.org, james.morse@arm.com, geoff@infradead.org, bauerman@linux.vnet.ibm.com, dyoung@redhat.com, mark.rutland@arm.com, kexec@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, nd@arm.com References: <20161102044959.11954-1-takahiro.akashi@linaro.org> <20161102045153.12008-1-takahiro.akashi@linaro.org> <20161110172720.GB17134@arm.com> <20161111025049.GG381@linaro.org> <20161111031903.GB15997@arm.com> <20161114055515.GH381@linaro.org> <20161116163015.GM7928@arm.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20161116163015.GM7928@arm.com> User-Agent: Mutt/1.5.24 (2015-08-30) Precedence: list Cc: mark.rutland@arm.com, geoff@infradead.org, catalin.marinas@arm.com, kexec@lists.infradead.org, linux-mm@kvack.org, james.morse@arm.com, bauerman@linux.vnet.ibm.com, Dennis Chen <dennis.chen@arm.com>, akpm@linux-foundation.org, nd@arm.com, dyoung@redhat.com, linux-arm-kernel@lists.infradead.org Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org> Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Message ID

20161117022023.GA5704@linaro.org (mailing list archive)

State

New, archived

Headers

Date: Thu, 17 Nov 2016 14:34:24 +0900
From: AKASHI Takahiro <takahiro.akashi@linaro.org>
To: Will Deacon <will.deacon@arm.com>
Subject: Re: [PATCH v27 1/9] memblock: add memblock_cap_memory_range()
Message-ID: <20161117022023.GA5704@linaro.org>
Mail-Followup-To: AKASHI Takahiro <takahiro.akashi@linaro.org>,
	Will Deacon <will.deacon@arm.com>,
	Dennis Chen <dennis.chen@arm.com>, catalin.marinas@arm.com,
	akpm@linux-foundation.org, james.morse@arm.com, geoff@infradead.org, 
	bauerman@linux.vnet.ibm.com, dyoung@redhat.com,
	mark.rutland@arm.com, kexec@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	nd@arm.com
References: <20161102044959.11954-1-takahiro.akashi@linaro.org>
	<20161102045153.12008-1-takahiro.akashi@linaro.org>
	<20161110172720.GB17134@arm.com> <20161111025049.GG381@linaro.org>
	<20161111031903.GB15997@arm.com> <20161114055515.GH381@linaro.org>
	<20161116163015.GM7928@arm.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20161116163015.GM7928@arm.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Precedence: list
Cc: mark.rutland@arm.com, geoff@infradead.org, catalin.marinas@arm.com,
	kexec@lists.infradead.org, linux-mm@kvack.org, james.morse@arm.com,
	bauerman@linux.vnet.ibm.com, Dennis Chen <dennis.chen@arm.com>,
	akpm@linux-foundation.org, nd@arm.com, dyoung@redhat.com,
	linux-arm-kernel@lists.infradead.org
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: "linux-arm-kernel" <linux-arm-kernel-bounces@lists.infradead.org>
Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org

Commit Message

AKASHI Takahiro Nov. 17, 2016, 5:34 a.m. UTC

Will,

On Wed, Nov 16, 2016 at 04:30:15PM +0000, Will Deacon wrote:
> Hi Akashi,
> 
> On Mon, Nov 14, 2016 at 02:55:16PM +0900, AKASHI Takahiro wrote:
> > On Fri, Nov 11, 2016 at 11:19:04AM +0800, Dennis Chen wrote:
> > > On Fri, Nov 11, 2016 at 11:50:50AM +0900, AKASHI Takahiro wrote:
> > > > On Thu, Nov 10, 2016 at 05:27:20PM +0000, Will Deacon wrote:
> > > > > On Wed, Nov 02, 2016 at 01:51:53PM +0900, AKASHI Takahiro wrote:
> > > > > > +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
> > > > > > +{
> > > > > > +	int start_rgn, end_rgn;
> > > > > > +	int i, ret;
> > > > > > +
> > > > > > +	if (!size)
> > > > > > +		return;
> > > > > > +
> > > > > > +	ret = memblock_isolate_range(&memblock.memory, base, size,
> > > > > > +						&start_rgn, &end_rgn);
> > > > > > +	if (ret)
> > > > > > +		return;
> > > > > > +
> > > > > > +	/* remove all the MAP regions */
> > > > > > +	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
> > > > > > +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
> > > > > > +			memblock_remove_region(&memblock.memory, i);
> > > > > > +
> > > > > > +	for (i = start_rgn - 1; i >= 0; i--)
> > > > > > +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
> > > > > > +			memblock_remove_region(&memblock.memory, i);
> > > > > > +
> > > > > > +	/* truncate the reserved regions */
> > > > > > +	memblock_remove_range(&memblock.reserved, 0, base);
> > > > > > +	memblock_remove_range(&memblock.reserved,
> > > > > > +			base + size, (phys_addr_t)ULLONG_MAX);
> > > > > > +}
> > > > > 
> > > > > This duplicates a bunch of the logic in memblock_mem_limit_remove_map. Can
> > > > > you not implement that in terms of your new, more general, function? e.g.
> > > > > by passing base == 0, and size == limit?
> > > > 
> > > > Obviously it's possible.
> > > > I actually talked to Dennis before about merging them,
> > > > but he was against my idea.
> > > >
> > > Oops! I thought we have reached agreement in the thread:http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/442817.html
> > > So feel free to do that as Will'll do
> > 
> > OK, but I found that the two functions have a bit different semantics
> > in clipping memory range, in particular, when the range [base,base+size)
> > goes across several regions with a gap.
> > (This does not happen in my arm64 kdump, though.)
> > That is, 'limit' in memblock_mem_limit_remove_map() means total size of
> > available memory, while 'size' in memblock_cap_memory_range() indicates
> > the size of _continuous_ memory range.
> 
> I thought limit was just a physical address, and then

No, it's not.

> memblock_mem_limit_remove_map operated on the end of the nearest memblock?

No, but "max_addr" returned by __find_max_addr() is a physical address
and the end address of memory of "limit" size in total.

> You could leave the __find_max_addr call in memblock_mem_limit_remove_map,
> given that I don't think you need/want it for memblock_cap_memory_range.
> 
> > So I added an extra argument, exact, to a common function to specify
> > distinct behaviors. Confusing? Please see the patch below.
> 
> Oh yikes, this certainly wasn't what I had in mind! My observation was
> just that memblock_mem_limit_remove_map(limit) does:
> 
> 
>   1. memblock_isolate_range(limit - limit+ULLONG_MAX)
>   2. memblock_remove_region(all non-nomap regions in the isolated region)
>   3. truncate reserved regions to limit
> 
> and your memblock_cap_memory_range(base, size) does:
> 
>   1. memblock_isolate_range(base - base+size)
>   2, memblock_remove_region(all non-nomap regions above and below the
>      isolated region)
>   3. truncate reserved regions around the isolated region
> 
> so, assuming we can invert the isolation in one of the cases, then they
> could share the same underlying implementation.

Please see my simplified patch below which would explain what I meant.
(Note that the size is calculated by 'max_addr - 0'.)

> I'm probably just missing something here, because the patch you've ended
> up with is far more involved than I anticipated...

I hope that it will meet almost your anticipation.

Thanks,
-Takahiro AKASHI

> 
> Will

Comments

Will Deacon Nov. 17, 2016, 11:19 a.m. UTC | #1

Hi Akashi,

On Thu, Nov 17, 2016 at 02:34:24PM +0900, AKASHI Takahiro wrote:
> On Wed, Nov 16, 2016 at 04:30:15PM +0000, Will Deacon wrote:
> > I thought limit was just a physical address, and then
> 
> No, it's not.

Quite right, it's a size. Sorry about that.

> > memblock_mem_limit_remove_map operated on the end of the nearest memblock?
> 
> No, but "max_addr" returned by __find_max_addr() is a physical address
> and the end address of memory of "limit" size in total.
> 
> > You could leave the __find_max_addr call in memblock_mem_limit_remove_map,
> > given that I don't think you need/want it for memblock_cap_memory_range.
> > 
> > > So I added an extra argument, exact, to a common function to specify
> > > distinct behaviors. Confusing? Please see the patch below.
> > 
> > Oh yikes, this certainly wasn't what I had in mind! My observation was
> > just that memblock_mem_limit_remove_map(limit) does:
> > 
> > 
> >   1. memblock_isolate_range(limit - limit+ULLONG_MAX)
> >   2. memblock_remove_region(all non-nomap regions in the isolated region)
> >   3. truncate reserved regions to limit
> > 
> > and your memblock_cap_memory_range(base, size) does:
> > 
> >   1. memblock_isolate_range(base - base+size)
> >   2, memblock_remove_region(all non-nomap regions above and below the
> >      isolated region)
> >   3. truncate reserved regions around the isolated region
> > 
> > so, assuming we can invert the isolation in one of the cases, then they
> > could share the same underlying implementation.
> 
> Please see my simplified patch below which would explain what I meant.
> (Note that the size is calculated by 'max_addr - 0'.)
> 
> > I'm probably just missing something here, because the patch you've ended
> > up with is far more involved than I anticipated...
> 
> I hope that it will meet almost your anticipation.

It looks much better, thanks! Just one question below.

> diff --git a/mm/memblock.c b/mm/memblock.c
> index 7608bc3..fea1688 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1514,11 +1514,37 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
>  			      (phys_addr_t)ULLONG_MAX);
>  }
>  
> +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
> +{
> +	int start_rgn, end_rgn;
> +	int i, ret;
> +
> +	if (!size)
> +		return;
> +
> +	ret = memblock_isolate_range(&memblock.memory, base, size,
> +						&start_rgn, &end_rgn);
> +	if (ret)
> +		return;
> +
> +	/* remove all the MAP regions */
> +	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
> +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
> +			memblock_remove_region(&memblock.memory, i);

In the case that we have only one, giant memblock that covers base all
of base + size, can't we end up with start_rgn = end_rgn = 0? In which
case, we'd end up accidentally removing the map regions here.

The existing code:

> -	/* remove all the MAP regions above the limit */
> -	for (i = end_rgn - 1; i >= start_rgn; i--) {
> -		if (!memblock_is_nomap(&type->regions[i]))
> -			memblock_remove_region(type, i);
> -	}

seems to handle this.

Will

James Morse Nov. 17, 2016, 6 p.m. UTC | #2

Hi Will, Akashi,

On 17/11/16 11:19, Will Deacon wrote:
> It looks much better, thanks! Just one question below.
> 

> On Thu, Nov 17, 2016 at 02:34:24PM +0900, AKASHI Takahiro wrote:
>> diff --git a/mm/memblock.c b/mm/memblock.c
>> index 7608bc3..fea1688 100644
>> --- a/mm/memblock.c
>> +++ b/mm/memblock.c
>> @@ -1514,11 +1514,37 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
>>  			      (phys_addr_t)ULLONG_MAX);
>>  }
>>  
>> +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
>> +{
>> +	int start_rgn, end_rgn;
>> +	int i, ret;
>> +
>> +	if (!size)
>> +		return;
>> +
>> +	ret = memblock_isolate_range(&memblock.memory, base, size,
>> +						&start_rgn, &end_rgn);
>> +	if (ret)
>> +		return;
>> +
>> +	/* remove all the MAP regions */
>> +	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
>> +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
>> +			memblock_remove_region(&memblock.memory, i);
> 
> In the case that we have only one, giant memblock that covers base all
> of base + size, can't we end up with start_rgn = end_rgn = 0? In which

Can this happen? If we only have one memblock that exactly spans
base:(base+size), memblock_isolate_range() will hit the '@rgn is fully
contained, record it' code and set start_rgn=0,end_rgn=1. (rbase==base,
rend==end). We only go round the loop once.

If we only have one memblock that is bigger than base:(base+size) we end up with
three regions, start_rgn=1,end_rgn=2. The trickery here is the '@rgn intersects
from above' code decreases the loop counter so we process the same entry twice,
hitting '@rgn is fully contained, record it' the second time round... so we go
round the loop four times.

I can't see how we hit the:
> 	if (rbase >= end)
> 		break;
> 	if (rend <= base)
> 		continue;

code in either case...



Thanks,

James


> case, we'd end up accidentally removing the map regions here.
> 
> The existing code:
> 
>> -	/* remove all the MAP regions above the limit */
>> -	for (i = end_rgn - 1; i >= start_rgn; i--) {
>> -		if (!memblock_is_nomap(&type->regions[i]))
>> -			memblock_remove_region(type, i);
>> -	}
> 
> seems to handle this.

AKASHI Takahiro Nov. 18, 2016, 1:03 a.m. UTC | #3

James,

On Thu, Nov 17, 2016 at 06:00:58PM +0000, James Morse wrote:
> Hi Will, Akashi,
> 
> On 17/11/16 11:19, Will Deacon wrote:
> > It looks much better, thanks! Just one question below.
> > 
> 
> > On Thu, Nov 17, 2016 at 02:34:24PM +0900, AKASHI Takahiro wrote:
> >> diff --git a/mm/memblock.c b/mm/memblock.c
> >> index 7608bc3..fea1688 100644
> >> --- a/mm/memblock.c
> >> +++ b/mm/memblock.c
> >> @@ -1514,11 +1514,37 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
> >>  			      (phys_addr_t)ULLONG_MAX);
> >>  }
> >>  
> >> +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
> >> +{
> >> +	int start_rgn, end_rgn;
> >> +	int i, ret;
> >> +
> >> +	if (!size)
> >> +		return;
> >> +
> >> +	ret = memblock_isolate_range(&memblock.memory, base, size,
> >> +						&start_rgn, &end_rgn);
> >> +	if (ret)
> >> +		return;
> >> +
> >> +	/* remove all the MAP regions */
> >> +	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
> >> +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
> >> +			memblock_remove_region(&memblock.memory, i);
> > 
> > In the case that we have only one, giant memblock that covers base all
> > of base + size, can't we end up with start_rgn = end_rgn = 0? In which
> 
> Can this happen? If we only have one memblock that exactly spans
> base:(base+size), memblock_isolate_range() will hit the '@rgn is fully
> contained, record it' code and set start_rgn=0,end_rgn=1. (rbase==base,
> rend==end). We only go round the loop once.
> 
> If we only have one memblock that is bigger than base:(base+size) we end up with
> three regions, start_rgn=1,end_rgn=2. The trickery here is the '@rgn intersects
> from above' code decreases the loop counter so we process the same entry twice,
> hitting '@rgn is fully contained, record it' the second time round... so we go
> round the loop four times.

Thank you for your observation.

> I can't see how we hit the:
> > 	if (rbase >= end)
> > 		break;
> > 	if (rend <= base)
> > 		continue;
> 
> code in either case...

Right. So 'end_rgn' will never be expected to be 0 as far as some
intersection exists.

-Takahiro AKASHI

> 
> 
> Thanks,
> 
> James
> 
> 
> > case, we'd end up accidentally removing the map regions here.
> > 
> > The existing code:
> > 
> >> -	/* remove all the MAP regions above the limit */
> >> -	for (i = end_rgn - 1; i >= start_rgn; i--) {
> >> -		if (!memblock_is_nomap(&type->regions[i]))
> >> -			memblock_remove_region(type, i);
> >> -	}
> > 
> > seems to handle this.

Will Deacon Nov. 18, 2016, 12:10 p.m. UTC | #4

On Thu, Nov 17, 2016 at 06:00:58PM +0000, James Morse wrote:
> On 17/11/16 11:19, Will Deacon wrote:
> > It looks much better, thanks! Just one question below.
> > 
> 
> > On Thu, Nov 17, 2016 at 02:34:24PM +0900, AKASHI Takahiro wrote:
> >> diff --git a/mm/memblock.c b/mm/memblock.c
> >> index 7608bc3..fea1688 100644
> >> --- a/mm/memblock.c
> >> +++ b/mm/memblock.c
> >> @@ -1514,11 +1514,37 @@ void __init memblock_enforce_memory_limit(phys_addr_t limit)
> >>  			      (phys_addr_t)ULLONG_MAX);
> >>  }
> >>  
> >> +void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
> >> +{
> >> +	int start_rgn, end_rgn;
> >> +	int i, ret;
> >> +
> >> +	if (!size)
> >> +		return;
> >> +
> >> +	ret = memblock_isolate_range(&memblock.memory, base, size,
> >> +						&start_rgn, &end_rgn);
> >> +	if (ret)
> >> +		return;
> >> +
> >> +	/* remove all the MAP regions */
> >> +	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
> >> +		if (!memblock_is_nomap(&memblock.memory.regions[i]))
> >> +			memblock_remove_region(&memblock.memory, i);
> > 
> > In the case that we have only one, giant memblock that covers base all
> > of base + size, can't we end up with start_rgn = end_rgn = 0? In which
> 
> Can this happen? If we only have one memblock that exactly spans
> base:(base+size), memblock_isolate_range() will hit the '@rgn is fully
> contained, record it' code and set start_rgn=0,end_rgn=1. (rbase==base,
> rend==end). We only go round the loop once.
> 
> If we only have one memblock that is bigger than base:(base+size) we end up with
> three regions, start_rgn=1,end_rgn=2. The trickery here is the '@rgn intersects
> from above' code decreases the loop counter so we process the same entry twice,
> hitting '@rgn is fully contained, record it' the second time round... so we go
> round the loop four times.
> 
> I can't see how we hit the:
> > 	if (rbase >= end)
> > 		break;
> > 	if (rend <= base)
> > 		continue;
> 
> code in either case...

I consistently misread that as rend >= end and rbase <= base! In which case,
I agree with your analysis:

Reviewed-by: Will Deacon <will.deacon@arm.com>

The patch could probably still use an ack from an mm person.

Will

===8<===
diff --git a/mm/memblock.c b/mm/memblock.c
index 7608bc3..fea1688 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1514,11 +1514,37 @@  void __init memblock_enforce_memory_limit(phys_addr_t limit)
 			      (phys_addr_t)ULLONG_MAX);
 }
 
+void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
+{
+	int start_rgn, end_rgn;
+	int i, ret;
+
+	if (!size)
+		return;
+
+	ret = memblock_isolate_range(&memblock.memory, base, size,
+						&start_rgn, &end_rgn);
+	if (ret)
+		return;
+
+	/* remove all the MAP regions */
+	for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
+		if (!memblock_is_nomap(&memblock.memory.regions[i]))
+			memblock_remove_region(&memblock.memory, i);
+
+	for (i = start_rgn - 1; i >= 0; i--)
+		if (!memblock_is_nomap(&memblock.memory.regions[i]))
+			memblock_remove_region(&memblock.memory, i);
+
+	/* truncate the reserved regions */
+	memblock_remove_range(&memblock.reserved, 0, base);
+	memblock_remove_range(&memblock.reserved,
+			base + size, (phys_addr_t)ULLONG_MAX);
+}
+
 void __init memblock_mem_limit_remove_map(phys_addr_t limit)
 {
-	struct memblock_type *type = &memblock.memory;
 	phys_addr_t max_addr;
-	int i, ret, start_rgn, end_rgn;
 
 	if (!limit)
 		return;
@@ -1529,19 +1555,7 @@  void __init memblock_mem_limit_remove_map(phys_addr_t limit)
 	if (max_addr == (phys_addr_t)ULLONG_MAX)
 		return;
 
-	ret = memblock_isolate_range(type, max_addr, (phys_addr_t)ULLONG_MAX,
-				&start_rgn, &end_rgn);
-	if (ret)
-		return;
-
-	/* remove all the MAP regions above the limit */
-	for (i = end_rgn - 1; i >= start_rgn; i--) {
-		if (!memblock_is_nomap(&type->regions[i]))
-			memblock_remove_region(type, i);
-	}
-	/* truncate the reserved regions */
-	memblock_remove_range(&memblock.reserved, max_addr,
-			      (phys_addr_t)ULLONG_MAX);
+	memblock_cap_memory_range(0, max_addr);
 }
 
 static int __init_memblock memblock_search(struct memblock_type *type, phys_addr_t addr)

[v27,1/9] memblock: add memblock_cap_memory_range()

Commit Message

Comments

Patch