diff mbox

arm64:free_initrd_mem should also free the memblock

Message ID 35FD53F367049845BC99AC72306C23D103CDBFBFB029@CNBJMBX05.corpusers.net (mailing list archive)
State New, archived
Headers show

Commit Message

Wang, Yalin Sept. 12, 2014, 10:17 a.m. UTC
this patch fix the memblock statics for memblock
in file /sys/kernel/debug/memblock/reserved
if we don't call memblock_free the initrd will still
be marked as reserved, even they are freed.

Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
---
 arch/arm64/mm/init.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Will Deacon Sept. 15, 2014, 6:33 p.m. UTC | #1
On Fri, Sep 12, 2014 at 11:17:18AM +0100, Wang, Yalin wrote:
> this patch fix the memblock statics for memblock
> in file /sys/kernel/debug/memblock/reserved
> if we don't call memblock_free the initrd will still
> be marked as reserved, even they are freed.
> 
> Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> ---
>  arch/arm64/mm/init.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 5472c24..34605c8 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -334,8 +334,10 @@ static int keep_initrd;
>  
>  void free_initrd_mem(unsigned long start, unsigned long end)
>  {
> -	if (!keep_initrd)
> +	if (!keep_initrd) {
>  		free_reserved_area((void *)start, (void *)end, 0, "initrd");
> +		memblock_free(__pa(start), end - start);
> +	}

I don't think it makes any technical difference, but doing the memblock_free
before the free_reserved_area makes more sense to me.

Will
Russell King - ARM Linux Sept. 15, 2014, 6:40 p.m. UTC | #2
On Mon, Sep 15, 2014 at 07:33:34PM +0100, Will Deacon wrote:
> On Fri, Sep 12, 2014 at 11:17:18AM +0100, Wang, Yalin wrote:
> > this patch fix the memblock statics for memblock
> > in file /sys/kernel/debug/memblock/reserved
> > if we don't call memblock_free the initrd will still
> > be marked as reserved, even they are freed.
> > 
> > Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> > ---
> >  arch/arm64/mm/init.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 5472c24..34605c8 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -334,8 +334,10 @@ static int keep_initrd;
> >  
> >  void free_initrd_mem(unsigned long start, unsigned long end)
> >  {
> > -	if (!keep_initrd)
> > +	if (!keep_initrd) {
> >  		free_reserved_area((void *)start, (void *)end, 0, "initrd");
> > +		memblock_free(__pa(start), end - start);
> > +	}
> 
> I don't think it makes any technical difference, but doing the memblock_free
> before the free_reserved_area makes more sense to me.

A better question is... should we even be doing this.  The memblock
information is used as a method to bring up the kernel and provide
early allocation.  Once the memory is handed over from memblock to
the normal kernel page allocators, we no longer care what happens to
memblock.

There is no need to free the initrd memory back into memblock.  In
fact, seeing the initrd location in /sys/kernel/debug/memblock/reserved
can be useful debug information in itself.
Will Deacon Sept. 15, 2014, 6:50 p.m. UTC | #3
On Mon, Sep 15, 2014 at 07:40:23PM +0100, Russell King - ARM Linux wrote:
> On Mon, Sep 15, 2014 at 07:33:34PM +0100, Will Deacon wrote:
> > On Fri, Sep 12, 2014 at 11:17:18AM +0100, Wang, Yalin wrote:
> > > this patch fix the memblock statics for memblock
> > > in file /sys/kernel/debug/memblock/reserved
> > > if we don't call memblock_free the initrd will still
> > > be marked as reserved, even they are freed.
> > > 
> > > Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> > > ---
> > >  arch/arm64/mm/init.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > > index 5472c24..34605c8 100644
> > > --- a/arch/arm64/mm/init.c
> > > +++ b/arch/arm64/mm/init.c
> > > @@ -334,8 +334,10 @@ static int keep_initrd;
> > >  
> > >  void free_initrd_mem(unsigned long start, unsigned long end)
> > >  {
> > > -	if (!keep_initrd)
> > > +	if (!keep_initrd) {
> > >  		free_reserved_area((void *)start, (void *)end, 0, "initrd");
> > > +		memblock_free(__pa(start), end - start);
> > > +	}
> > 
> > I don't think it makes any technical difference, but doing the memblock_free
> > before the free_reserved_area makes more sense to me.
> 
> A better question is... should we even be doing this.  The memblock
> information is used as a method to bring up the kernel and provide
> early allocation.  Once the memory is handed over from memblock to
> the normal kernel page allocators, we no longer care what happens to
> memblock.
> 
> There is no need to free the initrd memory back into memblock.  In
> fact, seeing the initrd location in /sys/kernel/debug/memblock/reserved
> can be useful debug information in itself.

That's a fair point. Yang: do you have a specific use-case in mind for this?

I wondered if it might interact with our pfn_valid implementation, which
uses memblock_is_memory, however memblock_free only deals with the reserved
regions, so I now I can't see why this change is required either.

Will
Wang, Yalin Sept. 16, 2014, 1:53 a.m. UTC | #4
Hi

The reason that a want merge this patch is that
It confuse me when I debug memory issue by 
/sys/kernel/debug/memblock/reserved  debug file,
It show lots of un-correct reserved memory.
In fact, I also send a patch to cma driver part
For this issue too:
http://ozlabs.org/~akpm/mmots/broken-out/free-the-reserved-memblock-when-free-cma-pages.patch

I want to remove these un-correct memblock parts as much as possible,
so that I can see more correct info from /sys/kernel/debug/memblock/reserved
debug file .

Thanks



-----Original Message-----

On Mon, Sep 15, 2014 at 07:40:23PM +0100, Russell King - ARM Linux wrote:
> On Mon, Sep 15, 2014 at 07:33:34PM +0100, Will Deacon wrote:
> > On Fri, Sep 12, 2014 at 11:17:18AM +0100, Wang, Yalin wrote:
> > > this patch fix the memblock statics for memblock in file 
> > > /sys/kernel/debug/memblock/reserved
> > > if we don't call memblock_free the initrd will still be marked as 
> > > reserved, even they are freed.
> > > 
> > > Signed-off-by: Yalin Wang <yalin.wang@sonymobile.com>
> > > ---
> > >  arch/arm64/mm/init.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 
> > > 5472c24..34605c8 100644
> > > --- a/arch/arm64/mm/init.c
> > > +++ b/arch/arm64/mm/init.c
> > > @@ -334,8 +334,10 @@ static int keep_initrd;
> > >  
> > >  void free_initrd_mem(unsigned long start, unsigned long end)  {
> > > -	if (!keep_initrd)
> > > +	if (!keep_initrd) {
> > >  		free_reserved_area((void *)start, (void *)end, 0, "initrd");
> > > +		memblock_free(__pa(start), end - start);
> > > +	}
> > 
> > I don't think it makes any technical difference, but doing the 
> > memblock_free before the free_reserved_area makes more sense to me.
> 
> A better question is... should we even be doing this.  The memblock 
> information is used as a method to bring up the kernel and provide 
> early allocation.  Once the memory is handed over from memblock to the 
> normal kernel page allocators, we no longer care what happens to 
> memblock.
> 
> There is no need to free the initrd memory back into memblock.  In 
> fact, seeing the initrd location in 
> /sys/kernel/debug/memblock/reserved
> can be useful debug information in itself.

That's a fair point. Yang: do you have a specific use-case in mind for this?

I wondered if it might interact with our pfn_valid implementation, which uses memblock_is_memory, however memblock_free only deals with the reserved regions, so I now I can't see why this change is required either.

Will
Catalin Marinas Sept. 17, 2014, 4:28 p.m. UTC | #5
On Tue, Sep 16, 2014 at 02:53:55AM +0100, Wang, Yalin wrote:
> The reason that a want merge this patch is that
> It confuse me when I debug memory issue by 
> /sys/kernel/debug/memblock/reserved  debug file,
> It show lots of un-correct reserved memory.
> In fact, I also send a patch to cma driver part
> For this issue too:
> http://ozlabs.org/~akpm/mmots/broken-out/free-the-reserved-memblock-when-free-cma-pages.patch
> 
> I want to remove these un-correct memblock parts as much as possible,
> so that I can see more correct info from /sys/kernel/debug/memblock/reserved
> debug file .

Could we not always call memblock_free() from free_reserved_area() (with
a dummy definition when !CONFIG_HAVE_MEMBLOCK)?
Russell King - ARM Linux Sept. 17, 2014, 6:12 p.m. UTC | #6
On Wed, Sep 17, 2014 at 05:28:23PM +0100, Catalin Marinas wrote:
> On Tue, Sep 16, 2014 at 02:53:55AM +0100, Wang, Yalin wrote:
> > The reason that a want merge this patch is that
> > It confuse me when I debug memory issue by 
> > /sys/kernel/debug/memblock/reserved  debug file,
> > It show lots of un-correct reserved memory.
> > In fact, I also send a patch to cma driver part
> > For this issue too:
> > http://ozlabs.org/~akpm/mmots/broken-out/free-the-reserved-memblock-when-free-cma-pages.patch
> > 
> > I want to remove these un-correct memblock parts as much as possible,
> > so that I can see more correct info from /sys/kernel/debug/memblock/reserved
> > debug file .
> 
> Could we not always call memblock_free() from free_reserved_area() (with
> a dummy definition when !CONFIG_HAVE_MEMBLOCK)?

Why bother?

The next thing is that people will want to have memblock's reserved areas
track whether the kernel allocates a page so that the memblock debugging
follows the kernel's allocation state.

This is utterly rediculous.  Memblock is purely a method to get the system
up and running.  Once it hands memory over to the normal kernel allocators,
the reservation information in memblock is no longer valid.

The /useful/ information that it provides is the state of memory passed
over to the kernel allocators, which in itself is valuable information.
Destroying it by freeing stuff after that point is not useful.
Wang, Yalin Sept. 18, 2014, 9:38 a.m. UTC | #7
Hi Russell,

mm..
I see your meaning,
But how to debug reserved memory,
I mean how to know which physical memory are reserved in kernel if 
Not use /sys/kernel/debug/memblock/reserved  debug file ?

I think memblock provides a debug interface, so it should keep it 
Correct for debug .

For Catalin 's suggestion,
I am not sure if it is always true to call memblock_free() in
free_reserved_area() , maybe some caller don't want this
behaviors .
So maybe we can introduce another function like free_reserved_area_and_memblock() to implement it.


-----Original Message-----
From: Russell King - ARM Linux [mailto:linux@arm.linux.org.uk] 
Sent: Thursday, September 18, 2014 2:13 AM
To: Catalin Marinas
Cc: Wang, Yalin; Will Deacon; 'linux-mm@kvack.org'; 'linux-kernel@vger.kernel.org'; 'linux-arm-kernel@lists.infradead.org'
Subject: Re: [PATCH] arm64:free_initrd_mem should also free the memblock

On Wed, Sep 17, 2014 at 05:28:23PM +0100, Catalin Marinas wrote:
> On Tue, Sep 16, 2014 at 02:53:55AM +0100, Wang, Yalin wrote:
> > The reason that a want merge this patch is that It confuse me when I 
> > debug memory issue by /sys/kernel/debug/memblock/reserved  debug 
> > file, It show lots of un-correct reserved memory.
> > In fact, I also send a patch to cma driver part For this issue too:
> > http://ozlabs.org/~akpm/mmots/broken-out/free-the-reserved-memblock-
> > when-free-cma-pages.patch
> > 
> > I want to remove these un-correct memblock parts as much as 
> > possible, so that I can see more correct info from 
> > /sys/kernel/debug/memblock/reserved
> > debug file .
> 
> Could we not always call memblock_free() from free_reserved_area() 
> (with a dummy definition when !CONFIG_HAVE_MEMBLOCK)?

Why bother?

The next thing is that people will want to have memblock's reserved areas track whether the kernel allocates a page so that the memblock debugging follows the kernel's allocation state.

This is utterly rediculous.  Memblock is purely a method to get the system up and running.  Once it hands memory over to the normal kernel allocators, the reservation information in memblock is no longer valid.

The /useful/ information that it provides is the state of memory passed over to the kernel allocators, which in itself is valuable information.
Destroying it by freeing stuff after that point is not useful.

--
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up according to speedtest.net.
Russell King - ARM Linux Sept. 18, 2014, 9:59 a.m. UTC | #8
On Thu, Sep 18, 2014 at 05:38:54PM +0800, Wang, Yalin wrote:
> Hi Russell,
> 
> mm..
> I see your meaning,
> But how to debug reserved memory,
> I mean how to know which physical memory are reserved in kernel if 
> Not use /sys/kernel/debug/memblock/reserved  debug file ?

What are you trying to do when you say "debug reserved memory" ?
Let's first sort out what you mean by "reserved memory".

We have pages which are allocated on demand, which are then marked with
the PG_reserved flag to indicate that they are something special.  These
get counted in the statistics as "reserved" pages.  These pages may be
freed at a later time.  These never appear in memblock.

We have pages which cover the kernel text/data.  These are never freed
once the system is running.  Memblock records a chunk of memory reserved
at boot time for the kernel, so that memblock does not try to allocate
from that region.

We also have pages which cover the DT and initrd images.  Again, we need
to mark them reserved in memblock so that it doesn't try to allocate from
those regions while we're bringing the kernel up.  However, once the
kernel is running, these areas are freed into the normal memory kernel
memory allocators, but they remain in memblock.

So, even if you solve the third case, your picture of reserved memory
from memblock is still very much incomplete, and adding memblock calls
to all the sites which allocate and then reserve the pages is (a) going
to add unnecessary overhead, and (b) is going to add quite a bit of
complexity all over the kernel.

Let me re-iterate.  memblock is a stepping stone for bringing the kernel
up.  It is used early in boot to provide trivial memory allocation, and
once the main kernel memory allocators are initialised (using allocations
from memblock), memblock becomes mostly redundant - memblocks idea of
reserved areas becomes completely redundant and unnecessary since that
information is now tracked by the main kernel allocators.

It's useful to leave memblock reserved memory in place, because then you
have a picture at early kernel boot which you can then compare with the
page arrays, and see what's changed between early boot and the current
kernels state.  Yes, there is no code to dump out the page array - when
you realise that dumping the page array (which is 262144 entries per
gigabyte of memory) in some generic way becomes quite cumbersome.  The
array itself is around 8MB per gigabyte of memory.

If you want this information, it's probably best to write a custom debugfs
entry which scans the page array, and dumps the regions of reserved memory
(in other words, detects where the PG_reserved flag changes state between
two consecutive pages.)

With such a dump, you can then compare it with the memblock reserved
debugfs file, and check whether the initrd was properly freed, etc.  If
you dump it in the same format as the memblock reserved debugfs, you
can compare the two using diff(1).

The other advantage of this approach is that you're not asking lots of
places in the kernel to gain additional calls to change the state of
something that is never otherwise used.
Wang, Yalin Sept. 18, 2014, 12:13 p.m. UTC | #9
hi 

(a)
We have pages which are allocated on demand, which are then marked with
the PG_reserved flag to indicate that they are something special.  These
get counted in the statistics as "reserved" pages.  These pages may be
freed at a later time.  These never appear in memblock.

(b)
We have pages which cover the kernel text/data.  These are never freed
once the system is running.  Memblock records a chunk of memory reserved
at boot time for the kernel, so that memblock does not try to allocate
from that region.
(c)
We also have pages which cover the DT and initrd images.  Again, we need
to mark them reserved in memblock so that it doesn't try to allocate from
those regions while we're bringing the kernel up.  However, once the
kernel is running, these areas are freed into the normal memory kernel
memory allocators, but they remain in memblock.


generically to say, the reserved memory i want to know  means all the physical memory whose
struct page are never placed into buddy system ,
this means memblock.reserve  should include (b) , but not include (a) and (c) .

use debugfs to print all struct page which has  PG_reserved is a good idea .
But there is a little problem here :
the struct page which has PG_reserved set maybe come from (a) or (b) ,
if memblock.reserve also mark both (a) (b) as reserved ,
we will assume it come from (b) , even it come from (a) ,
this is wrong .

but this special case is very rare , should not a serious problem .

i will think of your new idea for reserved memory debug .

Thanks !
diff mbox

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5472c24..34605c8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -334,8 +334,10 @@  static int keep_initrd;
 
 void free_initrd_mem(unsigned long start, unsigned long end)
 {
-	if (!keep_initrd)
+	if (!keep_initrd) {
 		free_reserved_area((void *)start, (void *)end, 0, "initrd");
+		memblock_free(__pa(start), end - start);
+	}
 }
 
 static int __init keepinitrd_setup(char *__unused)