diff mbox

[kvm-unit-tests,v2,07/11] s390x: initialize the physical allocator

Message ID 20180112143417.16743-8-david@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

David Hildenbrand Jan. 12, 2018, 2:34 p.m. UTC
We now have the range of free memory, let's initialize the physical
allocator. It is now possible to use alloc_page()/alloc_pages().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 lib/s390x/sclp.c | 12 ++++++++++++
 s390x/Makefile   |  3 +++
 2 files changed, 15 insertions(+)

Comments

Andrew Jones Jan. 15, 2018, 10:16 a.m. UTC | #1
On Fri, Jan 12, 2018 at 03:34:13PM +0100, David Hildenbrand wrote:
> We now have the range of free memory, let's initialize the physical
> allocator. It is now possible to use alloc_page()/alloc_pages().

Commit message isn't quite right. After this patch it's possible to
use malloc and memalign, based on the early-ops, but alloc_pages()
requires the freelist to be initialized first by free_pages(), done
by setup_vm().

Actually, since we have four states of memory management setup,
then I think we either need three setup calls:

state 1: no setup done, no call
state 2: phys-alloc is set up by the phys_alloc_init() call
state 3: alloc_page() is set up, need call that allocates chunk with the
                                 phys-allocator for free_pages()

(Instead of a chunk, all memory could be given, but then either a
 new set of early-ops should be provided that are based on
 alloc_page(), or all early memory allocations must use alloc_page())

state 4: virtual memory is set up by the setup_vm() call

Or, simply remove state 2 and its code and alloc-ops, changing all early
allocations to use alloc_page(). If each arch provides an mmu_enabled()
call, then the alloc_ops->memalign call in memalign could be replaced
with

 if (mmu_enabled())
    vm_memalign(...)
 else
    /* fallback to alloc_pages() */

I probably should have written all this in a separate thread, because
I wouldn't want to hold this series up on a rework of the general
memory management framework. But, anyway, thoughts on that? Paolo?

Thanks,
drew

> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  lib/s390x/sclp.c | 12 ++++++++++++
>  s390x/Makefile   |  3 +++
>  2 files changed, 15 insertions(+)
> 
> diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
> index 199405c..c7471b1 100644
> --- a/lib/s390x/sclp.c
> +++ b/lib/s390x/sclp.c
> @@ -15,11 +15,21 @@
>  #include <asm/arch_def.h>
>  #include <asm/interrupt.h>
>  #include "sclp.h"
> +#include <alloc_phys.h>
> +
> +extern unsigned long stacktop;
>  
>  static uint64_t storage_increment_size;
>  static uint64_t max_ram_size;
>  static uint64_t ram_size;
>  
> +static void mem_init(phys_addr_t mem_end)
> +{
> +	phys_addr_t freemem_start = (phys_addr_t)&stacktop & PAGE_MASK;
> +
> +	phys_alloc_init(freemem_start, mem_end - freemem_start);
> +}
> +
>  void sclp_memory_setup(void)
>  {
>  	ReadInfo *ri = (void *)_sccb;
> @@ -55,4 +65,6 @@ void sclp_memory_setup(void)
>  			break;
>  		ram_size += storage_increment_size;
>  	}
> +
> +	mem_init(ram_size);
>  }
> diff --git a/s390x/Makefile b/s390x/Makefile
> index ce63dd1..4198fdc 100644
> --- a/s390x/Makefile
> +++ b/s390x/Makefile
> @@ -21,6 +21,9 @@ asm-offsets = lib/$(ARCH)/asm-offsets.h
>  include $(SRCDIR)/scripts/asm-offsets.mak
>  
>  cflatobjs += lib/util.o
> +cflatobjs += lib/alloc.o
> +cflatobjs += lib/alloc_phys.o
> +cflatobjs += lib/alloc_page.o
>  cflatobjs += lib/alloc_phys.o
>  cflatobjs += lib/s390x/io.o
>  cflatobjs += lib/s390x/stack.o
> -- 
> 2.14.3
>
David Hildenbrand Jan. 15, 2018, 10:21 a.m. UTC | #2
On 15.01.2018 11:16, Andrew Jones wrote:
> On Fri, Jan 12, 2018 at 03:34:13PM +0100, David Hildenbrand wrote:
>> We now have the range of free memory, let's initialize the physical
>> allocator. It is now possible to use alloc_page()/alloc_pages().
> 
> Commit message isn't quite right. After this patch it's possible to
> use malloc and memalign, based on the early-ops, but alloc_pages()
> requires the freelist to be initialized first by free_pages(), done
> by setup_vm().

That's interesting, as I am using alloc_pages() during setup_vm() I
haven't noticed it.

> 
> Actually, since we have four states of memory management setup,
> then I think we either need three setup calls:
> 
> state 1: no setup done, no call
> state 2: phys-alloc is set up by the phys_alloc_init() call
> state 3: alloc_page() is set up, need call that allocates chunk with the
>                                  phys-allocator for free_pages()
> 
> (Instead of a chunk, all memory could be given, but then either a
>  new set of early-ops should be provided that are based on
>  alloc_page(), or all early memory allocations must use alloc_page())
> 
> state 4: virtual memory is set up by the setup_vm() call
> 
> Or, simply remove state 2 and its code and alloc-ops, changing all early
> allocations to use alloc_page(). If each arch provides an mmu_enabled()
> call, then the alloc_ops->memalign call in memalign could be replaced
> with

I think I'd prefer that. Unless somebody needs the flexibility of this
extra state.

But let's hear what the others say.

> 
>  if (mmu_enabled())
>     vm_memalign(...)
>  else
>     /* fallback to alloc_pages() */
> 
> I probably should have written all this in a separate thread, because
> I wouldn't want to hold this series up on a rework of the general
> memory management framework. But, anyway, thoughts on that? Paolo?>
> Thanks,
> drew
>
Paolo Bonzini Jan. 15, 2018, 10:31 a.m. UTC | #3
On 15/01/2018 11:21, David Hildenbrand wrote:
>> Commit message isn't quite right. After this patch it's possible to
>> use malloc and memalign, based on the early-ops, but alloc_pages()
>> requires the freelist to be initialized first by free_pages(), done
>> by setup_vm().
> 
> That's interesting, as I am using alloc_pages() during setup_vm() I
> haven't noticed it.

You're using it during setup_mmu, not setup_vm.

Paolo
David Hildenbrand Jan. 15, 2018, 10:35 a.m. UTC | #4
On 15.01.2018 11:31, Paolo Bonzini wrote:
> On 15/01/2018 11:21, David Hildenbrand wrote:
>>> Commit message isn't quite right. After this patch it's possible to
>>> use malloc and memalign, based on the early-ops, but alloc_pages()
>>> requires the freelist to be initialized first by free_pages(), done
>>> by setup_vm().
>>
>> That's interesting, as I am using alloc_pages() during setup_vm() I
>> haven't noticed it.
> 
> You're using it during setup_mmu, not setup_vm.

... which is only called from setup_vm().

> 
> Paolo
>
Paolo Bonzini Jan. 15, 2018, 10:38 a.m. UTC | #5
On 15/01/2018 11:16, Andrew Jones wrote:
> state 2: phys-alloc is set up by the phys_alloc_init() call
>
> state 3: alloc_page() is set up, need call that allocates chunk with the
>                                  phys-allocator for free_pages()
> 
> (Instead of a chunk, all memory could be given, but then either a
>  new set of early-ops should be provided that are based on
>  alloc_page(), or all early memory allocations must use alloc_page())
> 
> state 4: virtual memory is set up by the setup_vm() call
> 
> Or, simply remove state 2 and its code and alloc-ops, changing all early
> allocations to use alloc_page(). If each arch provides an mmu_enabled()
> call, then the alloc_ops->memalign call in memalign could be replaced
> with
> 
>  if (mmu_enabled())
>     vm_memalign(...)
>  else
>     /* fallback to alloc_pages() */
> 
> I probably should have written all this in a separate thread, because
> I wouldn't want to hold this series up on a rework of the general
> memory management framework. But, anyway, thoughts on that? Paolo?

State 3 should only exist in setup_mmu, so indeed the commit message
here is not accurate.  setup_vm does the transition from phys_alloc to
page_alloc before calling setup_mmu, exactly because setup_mmu already
needs to allocate pages:

        phys_alloc_get_unused(&base, &top);
        base = (base + PAGE_SIZE - 1) & -PAGE_SIZE;
        top = top & -PAGE_SIZE;
        free_pages(phys_to_virt(base), top - base);
        page_root = setup_mmu(top);
        alloc_ops = &vmalloc_ops;

The reason for state 2 to exist is that right now on x86 it is optional
to use the MMU.  You could use page-granularity allocation, but that
would be a waste of memory if we switch to a GTest-like API that needs
to dynamically allocate memory.

However, maybe it would make sense to make it internal just like state
3.  That would mean enabling the MMU directly where we now call
phys_alloc_init.  This is independent of course of this s390 series.

Paolo
David Hildenbrand Jan. 15, 2018, 10:42 a.m. UTC | #6
On 15.01.2018 11:16, Andrew Jones wrote:
> On Fri, Jan 12, 2018 at 03:34:13PM +0100, David Hildenbrand wrote:
>> We now have the range of free memory, let's initialize the physical
>> allocator. It is now possible to use alloc_page()/alloc_pages().

Will rephrase to:

"It is now possible to use use malloc and memalign, based on the
early-ops. alloc_pages() cannot be used yet (as it has to be initialized
via free_pages() - e.g. via setup_vm() - first)."

Thanks!

> 
> Commit message isn't quite right. After this patch it's possible to
> use malloc and memalign, based on the early-ops, but alloc_pages()
> requires the freelist to be initialized first by free_pages(), done
> by setup_vm().
> 
> Actually, since we have four states of memory management setup,
> then I think we either need three setup calls:
> 
> state 1: no setup done, no call
> state 2: phys-alloc is set up by the phys_alloc_init() call
> state 3: alloc_page() is set up, need call that allocates chunk with the
>                                  phys-allocator for free_pages()
> 
> (Instead of a chunk, all memory could be given, but then either a
>  new set of early-ops should be provided that are based on
>  alloc_page(), or all early memory allocations must use alloc_page())
> 
> state 4: virtual memory is set up by the setup_vm() call
> 
> Or, simply remove state 2 and its code and alloc-ops, changing all early
> allocations to use alloc_page(). If each arch provides an mmu_enabled()
> call, then the alloc_ops->memalign call in memalign could be replaced
> with
> 
>  if (mmu_enabled())
>     vm_memalign(...)
>  else
>     /* fallback to alloc_pages() */
> 
> I probably should have written all this in a separate thread, because
> I wouldn't want to hold this series up on a rework of the general
> memory management framework. But, anyway, thoughts on that? Paolo?
> 
> Thanks,
> drew
Andrew Jones Jan. 15, 2018, 10:50 a.m. UTC | #7
On Mon, Jan 15, 2018 at 11:38:30AM +0100, Paolo Bonzini wrote:
> On 15/01/2018 11:16, Andrew Jones wrote:
> > state 2: phys-alloc is set up by the phys_alloc_init() call
> >
> > state 3: alloc_page() is set up, need call that allocates chunk with the
> >                                  phys-allocator for free_pages()
> > 
> > (Instead of a chunk, all memory could be given, but then either a
> >  new set of early-ops should be provided that are based on
> >  alloc_page(), or all early memory allocations must use alloc_page())
> > 
> > state 4: virtual memory is set up by the setup_vm() call
> > 
> > Or, simply remove state 2 and its code and alloc-ops, changing all early
> > allocations to use alloc_page(). If each arch provides an mmu_enabled()
> > call, then the alloc_ops->memalign call in memalign could be replaced
> > with
> > 
> >  if (mmu_enabled())
> >     vm_memalign(...)
> >  else
> >     /* fallback to alloc_pages() */
> > 
> > I probably should have written all this in a separate thread, because
> > I wouldn't want to hold this series up on a rework of the general
> > memory management framework. But, anyway, thoughts on that? Paolo?
> 
> State 3 should only exist in setup_mmu, so indeed the commit message
> here is not accurate.  setup_vm does the transition from phys_alloc to
> page_alloc before calling setup_mmu, exactly because setup_mmu already
> needs to allocate pages:
> 
>         phys_alloc_get_unused(&base, &top);
>         base = (base + PAGE_SIZE - 1) & -PAGE_SIZE;
>         top = top & -PAGE_SIZE;
>         free_pages(phys_to_virt(base), top - base);
>         page_root = setup_mmu(top);
>         alloc_ops = &vmalloc_ops;
> 
> The reason for state 2 to exist is that right now on x86 it is optional
> to use the MMU.  You could use page-granularity allocation, but that
> would be a waste of memory if we switch to a GTest-like API that needs
> to dynamically allocate memory.
> 
> However, maybe it would make sense to make it internal just like state
> 3.  That would mean enabling the MMU directly where we now call
> phys_alloc_init.  This is independent of course of this s390 series.
>

I'm still looking at making the MMU enablement of ARM optional, and,
like you say, it already is optional for x86, so I don't think we'd
want to reduce the flexibility by making more setup internal. If we
need the phys-allocator, then I think introducing an external setup
call for state 3, allowing alloc_page() to be used before (without)
the MMU being enabled, probably makes the most sense.

Thanks,
drew
diff mbox

Patch

diff --git a/lib/s390x/sclp.c b/lib/s390x/sclp.c
index 199405c..c7471b1 100644
--- a/lib/s390x/sclp.c
+++ b/lib/s390x/sclp.c
@@ -15,11 +15,21 @@ 
 #include <asm/arch_def.h>
 #include <asm/interrupt.h>
 #include "sclp.h"
+#include <alloc_phys.h>
+
+extern unsigned long stacktop;
 
 static uint64_t storage_increment_size;
 static uint64_t max_ram_size;
 static uint64_t ram_size;
 
+static void mem_init(phys_addr_t mem_end)
+{
+	phys_addr_t freemem_start = (phys_addr_t)&stacktop & PAGE_MASK;
+
+	phys_alloc_init(freemem_start, mem_end - freemem_start);
+}
+
 void sclp_memory_setup(void)
 {
 	ReadInfo *ri = (void *)_sccb;
@@ -55,4 +65,6 @@  void sclp_memory_setup(void)
 			break;
 		ram_size += storage_increment_size;
 	}
+
+	mem_init(ram_size);
 }
diff --git a/s390x/Makefile b/s390x/Makefile
index ce63dd1..4198fdc 100644
--- a/s390x/Makefile
+++ b/s390x/Makefile
@@ -21,6 +21,9 @@  asm-offsets = lib/$(ARCH)/asm-offsets.h
 include $(SRCDIR)/scripts/asm-offsets.mak
 
 cflatobjs += lib/util.o
+cflatobjs += lib/alloc.o
+cflatobjs += lib/alloc_phys.o
+cflatobjs += lib/alloc_page.o
 cflatobjs += lib/alloc_phys.o
 cflatobjs += lib/s390x/io.o
 cflatobjs += lib/s390x/stack.o