diff mbox

xen/arm: setup: initialize xenheap mappings after boot pages avaiable

Message ID 20160531095834.GA10141@linux-7smt.suse (mailing list archive)
State New, archived
Headers show

Commit Message

Peng Fan May 31, 2016, 9:58 a.m. UTC
Hi Julien,

On Mon, May 30, 2016 at 10:53:24PM +0100, Julien Grall wrote:
>Hi Peng,
>
>On 27/05/2016 06:31, Peng Fan wrote:
>>To ARM64, setup_xenheap_mappings may call alloc_boot_pages to allocate
>>first level page table, if there is a big chunk memory (ie, >512GB).
>
>Out of interest, have you found the bug by testing Xen on a platform with
>512GB of RAM? :)

Actually not. My board does not have so large memory. When I am reading
the piece code, I think there is possibility that setup_xenheap_mappings
may need boot pages, but boot pages are not ready.

>
>>So, need to make sure boot pages are ready before setup xenheap mappings.
>
>init_boot_pages is using mfn_to_virt (see bootmem_region_add), which cannot
>work until xenheap_mfn_start is initialized. This is done by
>setup_xenheap_mappings.

My bad. I did not catch this point. Thanks for correcting me.

>
>I would be happy to give you hint on how to solve this, but I am not sure to
>fully understand your issue. Can you give more details?

I did not met issue on my platform. I just think the logic of this piece code
may cause errors on platform with large memory (>512GB).

How about the following patch?
First loop all the memory banks and calculate ram_size/ram_start/ram_end,
then set xenheap_virt_end/start and xenheap_mfn_end.
Now readly for init boot pages and setup_xenheap_mappings.


Thanks,
Peng.

Comments

Julien Grall May 31, 2016, 5:08 p.m. UTC | #1
Hi Peng,

On 31/05/16 10:58, Peng Fan wrote:
>>
>>> So, need to make sure boot pages are ready before setup xenheap mappings.
>>
>> init_boot_pages is using mfn_to_virt (see bootmem_region_add), which cannot
>> work until xenheap_mfn_start is initialized. This is done by
>> setup_xenheap_mappings.
>
> My bad. I did not catch this point. Thanks for correcting me.
>
>>
>> I would be happy to give you hint on how to solve this, but I am not sure to
>> fully understand your issue. Can you give more details?
>
> I did not met issue on my platform. I just think the logic of this piece code
> may cause errors on platform with large memory (>512GB).
>
> How about the following patch?
> First loop all the memory banks and calculate ram_size/ram_start/ram_end,
> then set xenheap_virt_end/start and xenheap_mfn_end.
> Now readly for init boot pages and setup_xenheap_mappings.

Have you tested this patch? I would be surprised to see it working.

>
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index dcb23b7..d3a3af3 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -635,13 +635,24 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>           paddr_t bank_start = bootinfo.mem.bank[bank].start;
>           paddr_t bank_size = bootinfo.mem.bank[bank].size;
>           paddr_t bank_end = bank_start + bank_size;
> -        paddr_t s, e;
>
>           ram_size = ram_size + bank_size;
>           ram_start = min(ram_start,bank_start);
>           ram_end = max(ram_end,bank_end);
> +    }
>
> -        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
> +    total_pages += ram_size >> PAGE_SHIFT;
> +
> +    xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;

XENHEAP_VIRT_START will be replaced by xenheap_virt_start which is not 
initialized at this time. It is done by setup_xenheap_mappings.

In any case, even if the variable are correctly setup the underlying 
page table are not present.

The easiest way I can think to fix the problem is splitting the bank 
with chunk of 512GB and calling setup_xenheap_mappings, init_boot_pages 
for each chunk.

It is not the nicest way, so I will happy if you find a better one.

Regards,
Peng Fan June 1, 2016, 7:40 a.m. UTC | #2
Hi Julien,

On Tue, May 31, 2016 at 06:08:38PM +0100, Julien Grall wrote:
>Hi Peng,
>
>On 31/05/16 10:58, Peng Fan wrote:
>>>
>>>>So, need to make sure boot pages are ready before setup xenheap mappings.
>>>
>>>init_boot_pages is using mfn_to_virt (see bootmem_region_add), which cannot
>>>work until xenheap_mfn_start is initialized. This is done by
>>>setup_xenheap_mappings.
>>
>>My bad. I did not catch this point. Thanks for correcting me.
>>
>>>
>>>I would be happy to give you hint on how to solve this, but I am not sure to
>>>fully understand your issue. Can you give more details?
>>
>>I did not met issue on my platform. I just think the logic of this piece code
>>may cause errors on platform with large memory (>512GB).
>>
>>How about the following patch?
>>First loop all the memory banks and calculate ram_size/ram_start/ram_end,
>>then set xenheap_virt_end/start and xenheap_mfn_end.
>>Now readly for init boot pages and setup_xenheap_mappings.
>
>Have you tested this patch? I would be surprised to see it working.

Not tested (:- Just an idea in my mind.

>
>>
>>diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>>index dcb23b7..d3a3af3 100644
>>--- a/xen/arch/arm/setup.c
>>+++ b/xen/arch/arm/setup.c
>>@@ -635,13 +635,24 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>          paddr_t bank_start = bootinfo.mem.bank[bank].start;
>>          paddr_t bank_size = bootinfo.mem.bank[bank].size;
>>          paddr_t bank_end = bank_start + bank_size;
>>-        paddr_t s, e;
>>
>>          ram_size = ram_size + bank_size;
>>          ram_start = min(ram_start,bank_start);
>>          ram_end = max(ram_end,bank_end);
>>+    }
>>
>>-        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
>>+    total_pages += ram_size >> PAGE_SHIFT;
>>+
>>+    xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
>
>XENHEAP_VIRT_START will be replaced by xenheap_virt_start which is not
>initialized at this time. It is done by setup_xenheap_mappings.

Could we move the piece code out to setup_mm, before init_boot_pages and setup_xenheap_mappings?
"
        xenheap_virt_start = DIRECTMAP_VIRT_START +
            (base_mfn - mfn) * PAGE_SIZE;
"

>
>In any case, even if the variable are correctly setup the underlying page
>table are not present.

Yeah, but setup_mm is only executed on one cpu and this is only for intialization.
Does that really matter if underlying page table are not ready?

>
>The easiest way I can think to fix the problem is splitting the bank with
>chunk of 512GB and calling setup_xenheap_mappings, init_boot_pages for each
>chunk.

Do you mean let process_memory_node handle the >512GB memory bank?

>
>It is not the nicest way, so I will happy if you find a better one.

I have no good idea now (:
I can follow your suggestion to split the >512GB into smaller banks when filling
bootinfo.mem.bank[xx].start and bootinfo.mem.bank[xx].size, if you are happy
with this way.

Thanks,
Peng.
Julien Grall June 1, 2016, 11:13 a.m. UTC | #3
Hello Peng,

On 01/06/16 08:40, Peng Fan wrote:
> On Tue, May 31, 2016 at 06:08:38PM +0100, Julien Grall wrote:
>> On 31/05/16 10:58, Peng Fan wrote:
>>>>
>>>>> So, need to make sure boot pages are ready before setup xenheap mappings.
>>>>
>>>> init_boot_pages is using mfn_to_virt (see bootmem_region_add), which cannot
>>>> work until xenheap_mfn_start is initialized. This is done by
>>>> setup_xenheap_mappings.
>>>
>>> My bad. I did not catch this point. Thanks for correcting me.
>>>
>>>>
>>>> I would be happy to give you hint on how to solve this, but I am not sure to
>>>> fully understand your issue. Can you give more details?
>>>
>>> I did not met issue on my platform. I just think the logic of this piece code
>>> may cause errors on platform with large memory (>512GB).
>>>
>>> How about the following patch?
>>> First loop all the memory banks and calculate ram_size/ram_start/ram_end,
>>> then set xenheap_virt_end/start and xenheap_mfn_end.
>>> Now readly for init boot pages and setup_xenheap_mappings.
>>
>> Have you tested this patch? I would be surprised to see it working.
>
> Not tested (:- Just an idea in my mind.

When you send the patch to the ML, please either test it or clearly mark 
them as untested.

>>
>>>
>>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>>> index dcb23b7..d3a3af3 100644
>>> --- a/xen/arch/arm/setup.c
>>> +++ b/xen/arch/arm/setup.c
>>> @@ -635,13 +635,24 @@ static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
>>>           paddr_t bank_start = bootinfo.mem.bank[bank].start;
>>>           paddr_t bank_size = bootinfo.mem.bank[bank].size;
>>>           paddr_t bank_end = bank_start + bank_size;
>>> -        paddr_t s, e;
>>>
>>>           ram_size = ram_size + bank_size;
>>>           ram_start = min(ram_start,bank_start);
>>>           ram_end = max(ram_end,bank_end);
>>> +    }
>>>
>>> -        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
>>> +    total_pages += ram_size >> PAGE_SHIFT;
>>> +
>>> +    xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
>>
>> XENHEAP_VIRT_START will be replaced by xenheap_virt_start which is not
>> initialized at this time. It is done by setup_xenheap_mappings.
>
> Could we move the piece code out to setup_mm, before init_boot_pages and setup_xenheap_mappings?
> "
>          xenheap_virt_start = DIRECTMAP_VIRT_START +
>              (base_mfn - mfn) * PAGE_SIZE;
> "
>

Again no, see why below.

>>
>> In any case, even if the variable are correctly setup the underlying page
>> table are not present.
>
> Yeah, but setup_mm is only executed on one cpu and this is only for intialization.
> Does that really matter if underlying page table are not ready?

Even if only one CPU is running, the page tables are already setup for 
this CPU (see the call to setup_pagetables just before setup_mm).

The boot allocator requires a bit of memory to keep track of the bootmem 
region (see bootmem_region_add). This memory will be taken from the xenheap.

So the xenheap has to be mapped (at least partially) into the page 
tables before calling for the first time init_boot_pages.

Regards,
diff mbox

Patch

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index dcb23b7..d3a3af3 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -635,13 +635,24 @@  static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
         paddr_t bank_start = bootinfo.mem.bank[bank].start;
         paddr_t bank_size = bootinfo.mem.bank[bank].size;
         paddr_t bank_end = bank_start + bank_size;
-        paddr_t s, e;
 
         ram_size = ram_size + bank_size;
         ram_start = min(ram_start,bank_start);
         ram_end = max(ram_end,bank_end);
+    }
 
-        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
+    total_pages += ram_size >> PAGE_SHIFT;
+
+    xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
+    xenheap_mfn_start = ram_start >> PAGE_SHIFT;
+    xenheap_mfn_end = ram_end >> PAGE_SHIFT;
+
+    for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+    {
+        paddr_t bank_start = bootinfo.mem.bank[bank].start;
+        paddr_t bank_size = bootinfo.mem.bank[bank].size;
+        paddr_t bank_end = bank_start + bank_size;
+        paddr_t s, e;
 
         s = bank_start;
         while ( s < bank_end )
@@ -658,18 +669,12 @@  static void __init setup_mm(unsigned long dtb_paddr, size_t dtb_size)
             if ( e > bank_end )
                 e = bank_end;
 
-            xenheap_mfn_end = e;
-
             dt_unreserved_regions(s, e, init_boot_pages, 0);
             s = n;
         }
-    }
-
-    total_pages += ram_size >> PAGE_SHIFT;
 
-    xenheap_virt_end = XENHEAP_VIRT_START + ram_end - ram_start;
-    xenheap_mfn_start = ram_start >> PAGE_SHIFT;
-    xenheap_mfn_end = ram_end >> PAGE_SHIFT;
+        setup_xenheap_mappings(bank_start>>PAGE_SHIFT, bank_size>>PAGE_SHIFT);
+    }
 
     /*
      * Need enough mapped pages for copying the DTB.