
[v2] xen/riscv: Increase XEN_VIRT_SIZE

Message ID 808d1b346bc90dde38fd19a6b92ab78d78e42936.1743766932.git.oleksii.kurochko@gmail.com (mailing list archive)
State New
Series [v2] xen/riscv: Increase XEN_VIRT_SIZE

Commit Message

Oleksii Kurochko April 4, 2025, 4:04 p.m. UTC
A randconfig job failed with the following issue:
  riscv64-linux-gnu-ld: Xen too large for early-boot assumptions

The reason is that enabling the UBSAN config increased the size of
the Xen binary.

Increase XEN_VIRT_SIZE to reserve enough space, allowing both UBSAN
and GCOV to be enabled together, with some slack for future growth.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
 - Increase XEN_VIRT_SIZE to 16 MB to also cover the case where 2M mappings are
   used for .text (rx), .rodata (r), and .data (rw).
 - Update layout table in config.h.
 - s/xen_virt_starn_vpn/xen_virt_start_vpn
 - Update BUILD_BUG_ON(... != MB(8)) check to "... > GB(1)".
 - Update definition of PGTBL_INITIAL_COUNT and the comment above.
---
 xen/arch/riscv/include/asm/config.h |  8 ++++----
 xen/arch/riscv/include/asm/mm.h     | 12 +++++++++---
 xen/arch/riscv/mm.c                 | 25 ++++++++++++++++---------
 3 files changed, 29 insertions(+), 16 deletions(-)

Comments

Jan Beulich April 7, 2025, 10:09 a.m. UTC | #1
On 04.04.2025 18:04, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/include/asm/config.h
> +++ b/xen/arch/riscv/include/asm/config.h
> @@ -41,11 +41,11 @@
>   * Start addr          | End addr         | Slot       | area description
>   * ============================================================================
>   *                   .....                 L2 511          Unused
> - *  0xffffffffc0a00000  0xffffffffc0bfffff L2 511          Fixmap
> + *  0xffffffffc1800000  0xffffffffc1afffff L2 511          Fixmap

Isn't the upper bound 0xffffffffc19fffff now?

> --- a/xen/arch/riscv/include/asm/mm.h
> +++ b/xen/arch/riscv/include/asm/mm.h
> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>   */
>  static inline unsigned long virt_to_maddr(unsigned long va)
>  {
> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
> +    const unsigned long va_vpn = va >> vpn1_shift;
> +    const unsigned long xen_virt_start_vpn =
> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
> +    const unsigned long xen_virt_end_vpn =
> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
> +
>      if ((va >= DIRECTMAP_VIRT_START) &&
>          (va <= DIRECTMAP_VIRT_END))
>          return directmapoff_to_maddr(va - directmap_virt_start);
>  
> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));

Not all of the range is backed by memory, and for the excess space the
translation is therefore (likely) wrong. Which better would be caught by
the assertion?

> --- a/xen/arch/riscv/mm.c
> +++ b/xen/arch/riscv/mm.c
> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>  #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>  
>  /*
> - * It is expected that Xen won't be more then 2 MB.
> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>   * The check in xen.lds.S guarantees that.
> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
> - * One for each page level table with PAGE_SIZE = 4 Kb.
>   *
> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
> + * Root page table is shared with the initial mapping and is declared
> + * separetely. (look at stage1_pgtbl_root)
>   *
> - * It might be needed one more page table in case when Xen load address
> - * isn't 2 MB aligned.
> + * An amount of page tables between root page table and L0 page table
> + * (in the case of Sv39 it covers L1 table):
> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
> + *   the same amount are needed for Xen.
>   *
> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
> - * except that the root page table is shared with the initial mapping
> + * An amount of L0 page tables:
> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
> + *   one L0 is needed for indenity mapping.
> + *
> + *   It might be needed one more page table in case when Xen load
> + *   address isn't 2 MB aligned.

Shouldn't we guarantee that? What may require an extra page table is when Xen
crosses a 1Gb boundary (unless we also guaranteed that it won't).

Jan
Oleksii Kurochko April 8, 2025, 11:51 a.m. UTC | #2
On 4/7/25 12:09 PM, Jan Beulich wrote:
> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/include/asm/config.h
>> +++ b/xen/arch/riscv/include/asm/config.h
>> @@ -41,11 +41,11 @@
>>    * Start addr          | End addr         | Slot       | area description
>>    * ============================================================================
>>    *                   .....                 L2 511          Unused
>> - *  0xffffffffc0a00000  0xffffffffc0bfffff L2 511          Fixmap
>> + *  0xffffffffc1800000  0xffffffffc1afffff L2 511          Fixmap
> Isn't the upper bound 0xffffffffc19fffff now?

Yes, it should be updated to 0xffffffffc19fffff.
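(With the fixmap area still covering 2 MB, as in the old table, that is
0xffffffffc1800000 + 0x200000 - 1 = 0xffffffffc19fffff.)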

>
>> --- a/xen/arch/riscv/include/asm/mm.h
>> +++ b/xen/arch/riscv/include/asm/mm.h
>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>>    */
>>   static inline unsigned long virt_to_maddr(unsigned long va)
>>   {
>> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
>> +    const unsigned long va_vpn = va >> vpn1_shift;
>> +    const unsigned long xen_virt_start_vpn =
>> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
>> +    const unsigned long xen_virt_end_vpn =
>> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
>> +
>>       if ((va >= DIRECTMAP_VIRT_START) &&
>>           (va <= DIRECTMAP_VIRT_END))
>>           return directmapoff_to_maddr(va - directmap_virt_start);
>>   
>> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
>> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
>> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
>> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
>> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
> Not all of the range is backed by memory, and for the excess space the
> translation is therefore (likely) wrong. Which better would be caught by
> the assertion?

Backed here means that the memory is actually mapped?

IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size]
where xen_phys_size=(unsigned long)_end - (unsigned long)_start.

Did I understand you correctly?
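
Something along these lines then (only a sketch, assuming _start/_end are
usable from this header):

    ASSERT((va >= _AC(XEN_VIRT_START, UL)) &&
           (va < _AC(XEN_VIRT_START, UL) +
                 ((unsigned long)_end - (unsigned long)_start)));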

>
>> --- a/xen/arch/riscv/mm.c
>> +++ b/xen/arch/riscv/mm.c
>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>   #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>   
>>   /*
>> - * It is expected that Xen won't be more then 2 MB.
>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>    * The check in xen.lds.S guarantees that.
>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>    *
>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>> + * Root page table is shared with the initial mapping and is declared
>> + * separetely. (look at stage1_pgtbl_root)
>>    *
>> - * It might be needed one more page table in case when Xen load address
>> - * isn't 2 MB aligned.
>> + * An amount of page tables between root page table and L0 page table
>> + * (in the case of Sv39 it covers L1 table):
>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>> + *   the same amount are needed for Xen.
>>    *
>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>> - * except that the root page table is shared with the initial mapping
>> + * An amount of L0 page tables:
>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>> + *   one L0 is needed for indenity mapping.
>> + *
>> + *   It might be needed one more page table in case when Xen load
>> + *   address isn't 2 MB aligned.
> Shouldn't we guarantee that?

I think it's sufficient to guarantee 4KB alignment.

The only real benefit I see in enforcing larger alignment is that it likely enables
the use of superpages for mapping, which would reduce TLB pressure.
But perhaps I'm missing something?

Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra
page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB?
Then yes one more page table should be added to PGTBL_INITIAL_COUNT.
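
For reference, this is how I read the v2 PGTBL_INITIAL_COUNT for Sv39
(CONFIG_PAGING_LEVELS = 3, XEN_VIRT_SIZE = 16 MB), with the trailing +2 split
the way the comment describes it:

      (CONFIG_PAGING_LEVELS - 2) * 2              /* L1 for Xen + L1 for identity mapping: 2 */
    + (XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1))    /* L0 tables covering 16 MB of Xen:      8 */
    + 1                                           /* one L0 for the identity mapping          */
    + 1                                           /* spare in case the load address isn't 2 MB aligned */
    = 12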

> What may require an extra page table is when Xen
> crosses a 1Gb boundary (unless we also guaranteed that it won't).

You're right—I also need to add an extra page table if Xen crosses a 1GB boundary.

Thanks!

~ Oleksii
Jan Beulich April 8, 2025, 12:02 p.m. UTC | #3
On 08.04.2025 13:51, Oleksii Kurochko wrote:
> On 4/7/25 12:09 PM, Jan Beulich wrote:
>> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/mm.h
>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>>>    */
>>>   static inline unsigned long virt_to_maddr(unsigned long va)
>>>   {
>>> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
>>> +    const unsigned long va_vpn = va >> vpn1_shift;
>>> +    const unsigned long xen_virt_start_vpn =
>>> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
>>> +    const unsigned long xen_virt_end_vpn =
>>> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
>>> +
>>>       if ((va >= DIRECTMAP_VIRT_START) &&
>>>           (va <= DIRECTMAP_VIRT_END))
>>>           return directmapoff_to_maddr(va - directmap_virt_start);
>>>   
>>> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
>>> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
>>> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
>>> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
>>> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
>> Not all of the range is backed by memory, and for the excess space the
>> translation is therefore (likely) wrong. Which better would be caught by
>> the assertion?
> 
> Backed here means that the memory is actually mapped?
> 
> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size]
> where xen_phys_size=(unsigned long)_end - (unsigned long)_start.
> 
> Did I understand you correctly?

I think so, yes. Depending on what you (intend to) do to .init.* at the
end of boot, that range may later also want excluding.

>>> --- a/xen/arch/riscv/mm.c
>>> +++ b/xen/arch/riscv/mm.c
>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>>   #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>>   
>>>   /*
>>> - * It is expected that Xen won't be more then 2 MB.
>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>>    * The check in xen.lds.S guarantees that.
>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>>    *
>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>>> + * Root page table is shared with the initial mapping and is declared
>>> + * separetely. (look at stage1_pgtbl_root)
>>>    *
>>> - * It might be needed one more page table in case when Xen load address
>>> - * isn't 2 MB aligned.
>>> + * An amount of page tables between root page table and L0 page table
>>> + * (in the case of Sv39 it covers L1 table):
>>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>>> + *   the same amount are needed for Xen.
>>>    *
>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>>> - * except that the root page table is shared with the initial mapping
>>> + * An amount of L0 page tables:
>>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>>> + *   one L0 is needed for indenity mapping.
>>> + *
>>> + *   It might be needed one more page table in case when Xen load
>>> + *   address isn't 2 MB aligned.
>> Shouldn't we guarantee that?
> 
> I think it's sufficient to guarantee 4KB alignment.
> 
> The only real benefit I see in enforcing larger alignment is that it likely enables
> the use of superpages for mapping, which would reduce TLB pressure.
> But perhaps I'm missing something?

No, it's indeed mainly that.

> Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra
> page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB?
> Then yes one more page table should be added to PGTBL_INITIAL_COUNT.

Well, of course - if alignment isn't guaranteed, crossing whatever boundaries
of course needs accounting for.

Jan
Oleksii Kurochko April 8, 2025, 1:46 p.m. UTC | #4
On 4/8/25 2:02 PM, Jan Beulich wrote:
> On 08.04.2025 13:51, Oleksii Kurochko wrote:
>> On 4/7/25 12:09 PM, Jan Beulich wrote:
>>> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>>>> --- a/xen/arch/riscv/include/asm/mm.h
>>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>>>>     */
>>>>    static inline unsigned long virt_to_maddr(unsigned long va)
>>>>    {
>>>> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
>>>> +    const unsigned long va_vpn = va >> vpn1_shift;
>>>> +    const unsigned long xen_virt_start_vpn =
>>>> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
>>>> +    const unsigned long xen_virt_end_vpn =
>>>> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
>>>> +
>>>>        if ((va >= DIRECTMAP_VIRT_START) &&
>>>>            (va <= DIRECTMAP_VIRT_END))
>>>>            return directmapoff_to_maddr(va - directmap_virt_start);
>>>>    
>>>> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
>>>> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
>>>> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
>>>> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
>>>> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
>>> Not all of the range is backed by memory, and for the excess space the
>>> translation is therefore (likely) wrong. Which better would be caught by
>>> the assertion?
>> Backed here means that the memory is actually mapped?
>>
>> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size]
>> where xen_phys_size=(unsigned long)_end - (unsigned long)_start.
>>
>> Did I understand you correctly?
> I think so, yes. Depending on what you (intend to) do to .init.* at the
> end of boot, that range may later also want excluding.

I planned to release everything between __init_begin and __init_end in the following way:
   destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end);

So yes, then I think I have to come up with a new ASSERT, add an is_init_memory_freed variable, and,
if is_init_memory_freed is set, also check that `va` isn't from the .init.* range.
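
Roughly like this, on top of the [_start, _end) range check discussed above
(only a sketch, the is_init_memory_freed flag doesn't exist yet):

    /* Set once the mappings of [__init_begin, __init_end] have been destroyed. */
    extern bool is_init_memory_freed;

    ASSERT((va >= _AC(XEN_VIRT_START, UL)) &&
           (va < _AC(XEN_VIRT_START, UL) +
                 ((unsigned long)_end - (unsigned long)_start)));
    ASSERT(!is_init_memory_freed ||
           (va < (unsigned long)__init_begin) ||
           (va >= (unsigned long)__init_end));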

But I'm not quite sure that the mapping for .got* should be destroyed after the end of boot. (Now it is
part of the [__init_begin, __init_end] range.)

>>>> --- a/xen/arch/riscv/mm.c
>>>> +++ b/xen/arch/riscv/mm.c
>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>>>    #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>>>    
>>>>    /*
>>>> - * It is expected that Xen won't be more then 2 MB.
>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>>>     * The check in xen.lds.S guarantees that.
>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>>>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>>>     *
>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>>>> + * Root page table is shared with the initial mapping and is declared
>>>> + * separetely. (look at stage1_pgtbl_root)
>>>>     *
>>>> - * It might be needed one more page table in case when Xen load address
>>>> - * isn't 2 MB aligned.
>>>> + * An amount of page tables between root page table and L0 page table
>>>> + * (in the case of Sv39 it covers L1 table):
>>>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>>>> + *   the same amount are needed for Xen.
>>>>     *
>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>>>> - * except that the root page table is shared with the initial mapping
>>>> + * An amount of L0 page tables:
>>>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>>>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>>>> + *   one L0 is needed for indenity mapping.
>>>> + *
>>>> + *   It might be needed one more page table in case when Xen load
>>>> + *   address isn't 2 MB aligned.
>>> Shouldn't we guarantee that?
>> I think it's sufficient to guarantee 4KB alignment.
>>
>> The only real benefit I see in enforcing larger alignment is that it likely enables
>> the use of superpages for mapping, which would reduce TLB pressure.
>> But perhaps I'm missing something?
> No, it's indeed mainly that.

But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary.
This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the
load address? In that case, might it be difficult for the platform to find a suitable place in
memory to load Xen for some reason? (I don't think so but maybe I'm missing something)

These changes should probably be part of a separate patch, as setup_initial_mapping()
currently only works with 4KB mappings.
Perhaps it would make sense to add a comment around setup_initial_mapping() indicating
that if this function is modified, it may require updating PGTBL_INITIAL_COUNT.
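
E.g. something along these lines above setup_initial_mapping() (wording to be
polished, of course):

    /*
     * setup_initial_mapping() currently creates only 4KB mappings. If it is
     * ever changed to use 2MB (or larger) mappings, or if the set of ranges
     * it maps changes, re-check that PGTBL_INITIAL_COUNT still covers the
     * worst case number of page tables.
     */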

~ Oleksii

>
>> Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra
>> page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB?
>> Then yes one more page table should be added to PGTBL_INITIAL_COUNT.
> Well, of course - if alignment isn't guaranteed, crossing whatever boundaries
> of course needs accounting for.
>
> Jan
Jan Beulich April 8, 2025, 2:04 p.m. UTC | #5
On 08.04.2025 15:46, Oleksii Kurochko wrote:
> On 4/8/25 2:02 PM, Jan Beulich wrote:
>> On 08.04.2025 13:51, Oleksii Kurochko wrote:
>>> On 4/7/25 12:09 PM, Jan Beulich wrote:
>>>> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>>>>> --- a/xen/arch/riscv/include/asm/mm.h
>>>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>>>>>     */
>>>>>    static inline unsigned long virt_to_maddr(unsigned long va)
>>>>>    {
>>>>> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
>>>>> +    const unsigned long va_vpn = va >> vpn1_shift;
>>>>> +    const unsigned long xen_virt_start_vpn =
>>>>> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
>>>>> +    const unsigned long xen_virt_end_vpn =
>>>>> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
>>>>> +
>>>>>        if ((va >= DIRECTMAP_VIRT_START) &&
>>>>>            (va <= DIRECTMAP_VIRT_END))
>>>>>            return directmapoff_to_maddr(va - directmap_virt_start);
>>>>>    
>>>>> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
>>>>> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
>>>>> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
>>>>> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
>>>>> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
>>>> Not all of the range is backed by memory, and for the excess space the
>>>> translation is therefore (likely) wrong. Which better would be caught by
>>>> the assertion?
>>> Backed here means that the memory is actually mapped?
>>>
>>> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size]
>>> where xen_phys_size=(unsigned long)_end - (unsigned long)_start.
>>>
>>> Did I understand you correctly?
>> I think so, yes. Depending on what you (intend to) do to .init.* at the
>> end of boot, that range may later also want excluding.
> 
> I planned to release everything between __init_begin and __init_end in the following way:
>    destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end);
> 
> So yes, then I think I have to come up with a new ASSERT, add an is_init_memory_freed variable, and,
> if is_init_memory_freed is set, also check that `va` isn't from the .init.* range.
> 
> But I'm not quite sure that the mapping for .got* should be destroyed after the end of boot. (Now it is
> part of the [__init_begin, __init_end] range.)

Isn't this a non-issue considering

ASSERT(!SIZEOF(.got),      ".got non-empty")
ASSERT(!SIZEOF(.got.plt),  ".got.plt non-empty")

near the bottom of xen.lds.S?

>>>>> --- a/xen/arch/riscv/mm.c
>>>>> +++ b/xen/arch/riscv/mm.c
>>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>>>>    #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>>>>    
>>>>>    /*
>>>>> - * It is expected that Xen won't be more then 2 MB.
>>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>>>>     * The check in xen.lds.S guarantees that.
>>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>>>>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>>>>     *
>>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>>>>> + * Root page table is shared with the initial mapping and is declared
>>>>> + * separetely. (look at stage1_pgtbl_root)
>>>>>     *
>>>>> - * It might be needed one more page table in case when Xen load address
>>>>> - * isn't 2 MB aligned.
>>>>> + * An amount of page tables between root page table and L0 page table
>>>>> + * (in the case of Sv39 it covers L1 table):
>>>>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>>>>> + *   the same amount are needed for Xen.
>>>>>     *
>>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>>>>> - * except that the root page table is shared with the initial mapping
>>>>> + * An amount of L0 page tables:
>>>>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>>>>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>>>>> + *   one L0 is needed for indenity mapping.
>>>>> + *
>>>>> + *   It might be needed one more page table in case when Xen load
>>>>> + *   address isn't 2 MB aligned.
>>>> Shouldn't we guarantee that?
>>> I think it's sufficient to guarantee 4KB alignment.
>>>
>>> The only real benefit I see in enforcing larger alignment is that it likely enables
>>> the use of superpages for mapping, which would reduce TLB pressure.
>>> But perhaps I'm missing something?
>> No, it's indeed mainly that.
> 
> But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary.
> This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the
> load address? In that case, might it be difficult for the platform to find a suitable place in
> memory to load Xen for some reason? (I don't think so but maybe I'm missing something)

Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment
matters only once you set up paging?

> These changes should probably be part of a separate patch, as setup_initial_mapping() currently only works with 4KB mappings.

That's fine; it's just that - as said - the calculation of how many page tables
you may need has to cover for the worst case.

Jan
Oleksii Kurochko April 9, 2025, 9:06 a.m. UTC | #6
On 4/8/25 4:04 PM, Jan Beulich wrote:
> On 08.04.2025 15:46, Oleksii Kurochko wrote:
>> On 4/8/25 2:02 PM, Jan Beulich wrote:
>>> On 08.04.2025 13:51, Oleksii Kurochko wrote:
>>>> On 4/7/25 12:09 PM, Jan Beulich wrote:
>>>>> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>>>>>> --- a/xen/arch/riscv/include/asm/mm.h
>>>>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>>>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma)
>>>>>>      */
>>>>>>     static inline unsigned long virt_to_maddr(unsigned long va)
>>>>>>     {
>>>>>> +    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
>>>>>> +    const unsigned long va_vpn = va >> vpn1_shift;
>>>>>> +    const unsigned long xen_virt_start_vpn =
>>>>>> +        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
>>>>>> +    const unsigned long xen_virt_end_vpn =
>>>>>> +        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
>>>>>> +
>>>>>>         if ((va >= DIRECTMAP_VIRT_START) &&
>>>>>>             (va <= DIRECTMAP_VIRT_END))
>>>>>>             return directmapoff_to_maddr(va - directmap_virt_start);
>>>>>>     
>>>>>> -    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
>>>>>> -    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
>>>>>> -           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
>>>>>> +    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
>>>>>> +    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
>>>>> Not all of the range is backed by memory, and for the excess space the
>>>>> translation is therefore (likely) wrong. Which better would be caught by
>>>>> the assertion?
>>>> Backed here means that the memory is actually mapped?
>>>>
>>>> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size]
>>>> where xen_phys_size=(unsigned long)_end - (unsigned long)_start.
>>>>
>>>> Did I understand you correctly?
>>> I think so, yes. Depending on what you (intend to) do to .init.* at the
>>> end of boot, that range may later also want excluding.
>> I planned to release everything between __init_begin and __init_end in the following way:
>>     destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end);
>>
>> So yes, then I think I have to come up with a new ASSERT, add an is_init_memory_freed variable, and,
>> if is_init_memory_freed is set, also check that `va` isn't from the .init.* range.
>>
>> But I'm not quite sure that the mapping for .got* should be destroyed after the end of boot. (Now it is
>> part of the [__init_begin, __init_end] range.)
> Isn't this a non-issue considering
>
> ASSERT(!SIZEOF(.got),      ".got non-empty")
> ASSERT(!SIZEOF(.got.plt),  ".got.plt non-empty")
>
> near the bottom of xen.lds.S?

I forgot about that ASSERT(), so it's expected that .got* isn't used in Xen anyway.
Therefore, it shouldn't be an issue to destroy the mapping for the [__init_begin, __init_end] range.

>
>>>>>> --- a/xen/arch/riscv/mm.c
>>>>>> +++ b/xen/arch/riscv/mm.c
>>>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>>>>>     #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>>>>>     
>>>>>>     /*
>>>>>> - * It is expected that Xen won't be more then 2 MB.
>>>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>>>>>      * The check in xen.lds.S guarantees that.
>>>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>>>>>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>>>>>      *
>>>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>>>>>> + * Root page table is shared with the initial mapping and is declared
>>>>>> + * separetely. (look at stage1_pgtbl_root)
>>>>>>      *
>>>>>> - * It might be needed one more page table in case when Xen load address
>>>>>> - * isn't 2 MB aligned.
>>>>>> + * An amount of page tables between root page table and L0 page table
>>>>>> + * (in the case of Sv39 it covers L1 table):
>>>>>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>>>>>> + *   the same amount are needed for Xen.
>>>>>>      *
>>>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>>>>>> - * except that the root page table is shared with the initial mapping
>>>>>> + * An amount of L0 page tables:
>>>>>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>>>>>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>>>>>> + *   one L0 is needed for indenity mapping.
>>>>>> + *
>>>>>> + *   It might be needed one more page table in case when Xen load
>>>>>> + *   address isn't 2 MB aligned.
>>>>> Shouldn't we guarantee that?
>>>> I think it's sufficient to guarantee 4KB alignment.
>>>>
>>>> The only real benefit I see in enforcing larger alignment is that it likely enables
>>>> the use of superpages for mapping, which would reduce TLB pressure.
>>>> But perhaps I'm missing something?
>>> No, it's indeed mainly that.
>> But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary.
>> This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the
>> load address? In that case, might it be difficult for the platform to find a suitable place in
>> memory to load Xen for some reason? (I don't think so but maybe I'm missing something)
> Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment
> matters only once you set up paging?

Mostly yes, it matters only once during paging set up.

I was thinking that if, one day, 2MB (or larger) alignment is used and the load address isn't
properly aligned, some space in a page might be lost.
(The word "should" above wasn't entirely accurate.)

But this likely isn't a big deal and can be safely ignored.

~ Oleksii
Jan Beulich April 9, 2025, 10:05 a.m. UTC | #7
On 09.04.2025 11:06, Oleksii Kurochko wrote:
> On 4/8/25 4:04 PM, Jan Beulich wrote:
>> On 08.04.2025 15:46, Oleksii Kurochko wrote:
>>> On 4/8/25 2:02 PM, Jan Beulich wrote:
>>>> On 08.04.2025 13:51, Oleksii Kurochko wrote:
>>>>> On 4/7/25 12:09 PM, Jan Beulich wrote:
>>>>>> On 04.04.2025 18:04, Oleksii Kurochko wrote:
>>>>>>> --- a/xen/arch/riscv/mm.c
>>>>>>> +++ b/xen/arch/riscv/mm.c
>>>>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
>>>>>>>     #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
>>>>>>>     
>>>>>>>     /*
>>>>>>> - * It is expected that Xen won't be more then 2 MB.
>>>>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
>>>>>>>      * The check in xen.lds.S guarantees that.
>>>>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
>>>>>>> - * One for each page level table with PAGE_SIZE = 4 Kb.
>>>>>>>      *
>>>>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
>>>>>>> + * Root page table is shared with the initial mapping and is declared
>>>>>>> + * separetely. (look at stage1_pgtbl_root)
>>>>>>>      *
>>>>>>> - * It might be needed one more page table in case when Xen load address
>>>>>>> - * isn't 2 MB aligned.
>>>>>>> + * An amount of page tables between root page table and L0 page table
>>>>>>> + * (in the case of Sv39 it covers L1 table):
>>>>>>> + *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
>>>>>>> + *   the same amount are needed for Xen.
>>>>>>>      *
>>>>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
>>>>>>> - * except that the root page table is shared with the initial mapping
>>>>>>> + * An amount of L0 page tables:
>>>>>>> + *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
>>>>>>> + *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
>>>>>>> + *   one L0 is needed for indenity mapping.
>>>>>>> + *
>>>>>>> + *   It might be needed one more page table in case when Xen load
>>>>>>> + *   address isn't 2 MB aligned.
>>>>>> Shouldn't we guarantee that?
>>>>> I think it's sufficient to guarantee 4KB alignment.
>>>>>
>>>>> The only real benefit I see in enforcing larger alignment is that it likely enables
>>>>> the use of superpages for mapping, which would reduce TLB pressure.
>>>>> But perhaps I'm missing something?
>>>> No, it's indeed mainly that.
>>> But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary.
>>> This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the
>>> load address? In that case, might it be difficult for the platform to find a suitable place in
>>> memory to load Xen for some reason? (I don't think so but maybe I'm missing something)
>> Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment
>> matters only once you set up paging?
> 
> Mostly yes, it matters only once during paging set up.
> 
> I was thinking that if, one day, 2MB (or larger) alignment is used and the load address isn't
> properly aligned, some space in a page might be lost.
> (The word "should" above wasn't entirely accurate.)

Actually I think I was wrong with my question. The load address of course matters to
a sufficient degree, especially if at 2Mb boundaries we want to be able to change
what permissions to use (without sacrificing the 2Mb mappings).

Jan

Patch

diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h
index 7141bd9e46..41b8410d10 100644
--- a/xen/arch/riscv/include/asm/config.h
+++ b/xen/arch/riscv/include/asm/config.h
@@ -41,11 +41,11 @@ 
  * Start addr          | End addr         | Slot       | area description
  * ============================================================================
  *                   .....                 L2 511          Unused
- *  0xffffffffc0a00000  0xffffffffc0bfffff L2 511          Fixmap
+ *  0xffffffffc1800000  0xffffffffc1afffff L2 511          Fixmap
  *                   ..... ( 2 MB gap )
- *  0xffffffffc0400000  0xffffffffc07fffff L2 511          FDT
+ *  0xffffffffc1200000  0xffffffffc15fffff L2 511          FDT
  *                   ..... ( 2 MB gap )
- *  0xffffffffc0000000  0xffffffffc01fffff L2 511          Xen
+ *  0xffffffffc0000000  0xffffffffc0ffffff L2 511          Xen
  *                   .....                 L2 510          Unused
  *  0x3200000000        0x7f7fffffff       L2 200-509      Direct map
  *                   .....                 L2 199          Unused
@@ -78,7 +78,7 @@ 
 
 #define GAP_SIZE                MB(2)
 
-#define XEN_VIRT_SIZE           MB(2)
+#define XEN_VIRT_SIZE           MB(16)
 
 #define BOOT_FDT_VIRT_START     (XEN_VIRT_START + XEN_VIRT_SIZE + GAP_SIZE)
 #define BOOT_FDT_VIRT_SIZE      MB(4)
diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 4035cd400a..511e75c6d4 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -43,13 +43,19 @@  static inline void *maddr_to_virt(paddr_t ma)
  */
 static inline unsigned long virt_to_maddr(unsigned long va)
 {
+    const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT;
+    const unsigned long va_vpn = va >> vpn1_shift;
+    const unsigned long xen_virt_start_vpn =
+        _AC(XEN_VIRT_START, UL) >> vpn1_shift;
+    const unsigned long xen_virt_end_vpn =
+        xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1);
+
     if ((va >= DIRECTMAP_VIRT_START) &&
         (va <= DIRECTMAP_VIRT_END))
         return directmapoff_to_maddr(va - directmap_virt_start);
 
-    BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2));
-    ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) ==
-           (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT)));
+    BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1));
+    ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn));
 
     /* phys_offset = load_start - XEN_VIRT_START */
     return phys_offset + va;
diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c
index f2bf279bac..256afdaaa3 100644
--- a/xen/arch/riscv/mm.c
+++ b/xen/arch/riscv/mm.c
@@ -31,20 +31,27 @@  unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */
 #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset)
 
 /*
- * It is expected that Xen won't be more then 2 MB.
+ * It is expected that Xen won't be more then XEN_VIRT_SIZE MB.
  * The check in xen.lds.S guarantees that.
- * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB.
- * One for each page level table with PAGE_SIZE = 4 Kb.
  *
- * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE).
+ * Root page table is shared with the initial mapping and is declared
+ * separetely. (look at stage1_pgtbl_root)
  *
- * It might be needed one more page table in case when Xen load address
- * isn't 2 MB aligned.
+ * An amount of page tables between root page table and L0 page table
+ * (in the case of Sv39 it covers L1 table):
+ *   (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and
+ *   the same amount are needed for Xen.
  *
- * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping,
- * except that the root page table is shared with the initial mapping
+ * An amount of L0 page tables:
+ *   (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1))
+ *   XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and
+ *   one L0 is needed for indenity mapping.
+ *
+ *   It might be needed one more page table in case when Xen load
+ *   address isn't 2 MB aligned.
  */
-#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 1) * 2 + 1)
+#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 2) * 2 + \
+                             (XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1)) + 2)
 
 pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 stage1_pgtbl_root[PAGETABLE_ENTRIES];