Message ID: 808d1b346bc90dde38fd19a6b92ab78d78e42936.1743766932.git.oleksii.kurochko@gmail.com
State: New
Series: [v2] xen/riscv: Increase XEN_VIRT_SIZE
On 04.04.2025 18:04, Oleksii Kurochko wrote: > --- a/xen/arch/riscv/include/asm/config.h > +++ b/xen/arch/riscv/include/asm/config.h > @@ -41,11 +41,11 @@ > * Start addr | End addr | Slot | area description > * ============================================================================ > * ..... L2 511 Unused > - * 0xffffffffc0a00000 0xffffffffc0bfffff L2 511 Fixmap > + * 0xffffffffc1800000 0xffffffffc1afffff L2 511 Fixmap Isn't the upper bound 0xffffffffc19fffff now? > --- a/xen/arch/riscv/include/asm/mm.h > +++ b/xen/arch/riscv/include/asm/mm.h > @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) > */ > static inline unsigned long virt_to_maddr(unsigned long va) > { > + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; > + const unsigned long va_vpn = va >> vpn1_shift; > + const unsigned long xen_virt_start_vpn = > + _AC(XEN_VIRT_START, UL) >> vpn1_shift; > + const unsigned long xen_virt_end_vpn = > + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); > + > if ((va >= DIRECTMAP_VIRT_START) && > (va <= DIRECTMAP_VIRT_END)) > return directmapoff_to_maddr(va - directmap_virt_start); > > - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); > - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == > - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); > + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); > + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); Not all of the range is backed by memory, and for the excess space the translation is therefore (likely) wrong. Which better would be caught by the assertion? > --- a/xen/arch/riscv/mm.c > +++ b/xen/arch/riscv/mm.c > @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ > #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) > > /* > - * It is expected that Xen won't be more then 2 MB. > + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. > * The check in xen.lds.S guarantees that. > - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. > - * One for each page level table with PAGE_SIZE = 4 Kb. > * > - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). > + * Root page table is shared with the initial mapping and is declared > + * separetely. (look at stage1_pgtbl_root) > * > - * It might be needed one more page table in case when Xen load address > - * isn't 2 MB aligned. > + * An amount of page tables between root page table and L0 page table > + * (in the case of Sv39 it covers L1 table): > + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and > + * the same amount are needed for Xen. > * > - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, > - * except that the root page table is shared with the initial mapping > + * An amount of L0 page tables: > + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) > + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and > + * one L0 is needed for indenity mapping. > + * > + * It might be needed one more page table in case when Xen load > + * address isn't 2 MB aligned. Shouldn't we guarantee that? What may require an extra page table is when Xen crosses a 1Gb boundary (unless we also guaranteed that it won't). Jan
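For reference, the end address Jan refers to follows directly from the 2 MB fixmap size in the layout above. A tiny standalone C check, illustrative only and not part of the patch:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long fixmap_start = 0xffffffffc1800000ULL;
        unsigned long long fixmap_size  = 2ULL << 20;   /* fixmap stays 2 MB in the layout */

        /* Prints 0xffffffffc19fffff, the inclusive end address the comment should use. */
        printf("%#llx\n", fixmap_start + fixmap_size - 1);
        return 0;
    }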
On 4/7/25 12:09 PM, Jan Beulich wrote: > On 04.04.2025 18:04, Oleksii Kurochko wrote: >> --- a/xen/arch/riscv/include/asm/config.h >> +++ b/xen/arch/riscv/include/asm/config.h >> @@ -41,11 +41,11 @@ >> * Start addr | End addr | Slot | area description >> * ============================================================================ >> * ..... L2 511 Unused >> - * 0xffffffffc0a00000 0xffffffffc0bfffff L2 511 Fixmap >> + * 0xffffffffc1800000 0xffffffffc1afffff L2 511 Fixmap > Isn't the upper bound 0xffffffffc19fffff now? Yes, it should be updated to 0xffffffffc19fffff. > >> --- a/xen/arch/riscv/include/asm/mm.h >> +++ b/xen/arch/riscv/include/asm/mm.h >> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) >> */ >> static inline unsigned long virt_to_maddr(unsigned long va) >> { >> + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; >> + const unsigned long va_vpn = va >> vpn1_shift; >> + const unsigned long xen_virt_start_vpn = >> + _AC(XEN_VIRT_START, UL) >> vpn1_shift; >> + const unsigned long xen_virt_end_vpn = >> + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); >> + >> if ((va >= DIRECTMAP_VIRT_START) && >> (va <= DIRECTMAP_VIRT_END)) >> return directmapoff_to_maddr(va - directmap_virt_start); >> >> - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); >> - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == >> - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); >> + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); >> + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); > Not all of the range is backed by memory, and for the excess space the > translation is therefore (likely) wrong. Which better would be caught by > the assertion? Backed here means that the memory is actually mapped? IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size] where xen_phys_size=(unsigned long)_end - (unsigned long)_start. Did I understand you correctly? > >> --- a/xen/arch/riscv/mm.c >> +++ b/xen/arch/riscv/mm.c >> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >> >> /* >> - * It is expected that Xen won't be more then 2 MB. >> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >> * The check in xen.lds.S guarantees that. >> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >> - * One for each page level table with PAGE_SIZE = 4 Kb. >> * >> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >> + * Root page table is shared with the initial mapping and is declared >> + * separetely. (look at stage1_pgtbl_root) >> * >> - * It might be needed one more page table in case when Xen load address >> - * isn't 2 MB aligned. >> + * An amount of page tables between root page table and L0 page table >> + * (in the case of Sv39 it covers L1 table): >> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >> + * the same amount are needed for Xen. >> * >> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >> - * except that the root page table is shared with the initial mapping >> + * An amount of L0 page tables: >> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >> + * one L0 is needed for indenity mapping. >> + * >> + * It might be needed one more page table in case when Xen load >> + * address isn't 2 MB aligned. 
> Shouldn't we guarantee that? I think it's sufficient to guarantee 4KB alignment. The only real benefit I see in enforcing larger alignment is that it likely enables the use of superpages for mapping, which would reduce TLB pressure. But perhaps I'm missing something? Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB? Then yes one more page table should be added to PGTBL_INITIAL_COUNT. > What may require an extra page table is when Xen > crosses a 1Gb boundary (unless we also guaranteed that it won't). You're right—I also need to add an extra page table if Xen crosses a 1GB boundary. Thanks! ~ Oleksii
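To make the boundary-crossing case concrete, here is a minimal sketch of the condition under which the identity mapping would need one more intermediate page table; the helper name is made up and nothing like it exists in the series:

    #include <stdbool.h>

    /*
     * True if the physical range [start, start + size) crosses a 1 GB boundary.
     * For Sv39 this is the case where the identity mapping needs one more L1
     * table than an image fully contained in a single 1 GB region would.
     */
    static inline bool crosses_1gb_boundary(unsigned long long start,
                                            unsigned long long size)
    {
        return (start >> 30) != ((start + size - 1) >> 30);
    }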
On 08.04.2025 13:51, Oleksii Kurochko wrote: > On 4/7/25 12:09 PM, Jan Beulich wrote: >> On 04.04.2025 18:04, Oleksii Kurochko wrote: >>> --- a/xen/arch/riscv/include/asm/mm.h >>> +++ b/xen/arch/riscv/include/asm/mm.h >>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) >>> */ >>> static inline unsigned long virt_to_maddr(unsigned long va) >>> { >>> + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; >>> + const unsigned long va_vpn = va >> vpn1_shift; >>> + const unsigned long xen_virt_start_vpn = >>> + _AC(XEN_VIRT_START, UL) >> vpn1_shift; >>> + const unsigned long xen_virt_end_vpn = >>> + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); >>> + >>> if ((va >= DIRECTMAP_VIRT_START) && >>> (va <= DIRECTMAP_VIRT_END)) >>> return directmapoff_to_maddr(va - directmap_virt_start); >>> >>> - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); >>> - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == >>> - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); >>> + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); >>> + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); >> Not all of the range is backed by memory, and for the excess space the >> translation is therefore (likely) wrong. Which better would be caught by >> the assertion? > > Backed here means that the memory is actually mapped? > > IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size] > where xen_phys_size=(unsigned long)_end - (unsigned long)_start. > > Did I understand you correctly? I think so, yes. Depending on what you (intend to) do to .init.* at the end of boot, that range may later also want excluding. >>> --- a/xen/arch/riscv/mm.c >>> +++ b/xen/arch/riscv/mm.c >>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >>> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >>> >>> /* >>> - * It is expected that Xen won't be more then 2 MB. >>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >>> * The check in xen.lds.S guarantees that. >>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >>> - * One for each page level table with PAGE_SIZE = 4 Kb. >>> * >>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >>> + * Root page table is shared with the initial mapping and is declared >>> + * separetely. (look at stage1_pgtbl_root) >>> * >>> - * It might be needed one more page table in case when Xen load address >>> - * isn't 2 MB aligned. >>> + * An amount of page tables between root page table and L0 page table >>> + * (in the case of Sv39 it covers L1 table): >>> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >>> + * the same amount are needed for Xen. >>> * >>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >>> - * except that the root page table is shared with the initial mapping >>> + * An amount of L0 page tables: >>> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >>> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >>> + * one L0 is needed for indenity mapping. >>> + * >>> + * It might be needed one more page table in case when Xen load >>> + * address isn't 2 MB aligned. >> Shouldn't we guarantee that? > > I think it's sufficient to guarantee 4KB alignment. > > The only real benefit I see in enforcing larger alignment is that it likely enables > the use of superpages for mapping, which would reduce TLB pressure. 
> But perhaps I'm missing something? No, it's indeed mainly that. > Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra > page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB? > Then yes one more page table should be added to PGTBL_INITIAL_COUNT. Well, of course - if alignment isn't guaranteed, crossing whatever boundaries of course needs accounting for. Jan
On 4/8/25 2:02 PM, Jan Beulich wrote: > On 08.04.2025 13:51, Oleksii Kurochko wrote: >> On 4/7/25 12:09 PM, Jan Beulich wrote: >>> On 04.04.2025 18:04, Oleksii Kurochko wrote: >>>> --- a/xen/arch/riscv/include/asm/mm.h >>>> +++ b/xen/arch/riscv/include/asm/mm.h >>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) >>>> */ >>>> static inline unsigned long virt_to_maddr(unsigned long va) >>>> { >>>> + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; >>>> + const unsigned long va_vpn = va >> vpn1_shift; >>>> + const unsigned long xen_virt_start_vpn = >>>> + _AC(XEN_VIRT_START, UL) >> vpn1_shift; >>>> + const unsigned long xen_virt_end_vpn = >>>> + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); >>>> + >>>> if ((va >= DIRECTMAP_VIRT_START) && >>>> (va <= DIRECTMAP_VIRT_END)) >>>> return directmapoff_to_maddr(va - directmap_virt_start); >>>> >>>> - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); >>>> - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == >>>> - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); >>>> + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); >>>> + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); >>> Not all of the range is backed by memory, and for the excess space the >>> translation is therefore (likely) wrong. Which better would be caught by >>> the assertion? >> Backed here means that the memory is actually mapped? >> >> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size] >> where xen_phys_size=(unsigned long)_end - (unsigned long)_start. >> >> Did I understand you correctly? > I think so, yes. Depending on what you (intend to) do to .init.* at the > end of boot, that range may later also want excluding. I planned to release everything between __init_begin and __init_end in the following way: destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end); So yes, then I think I have to come up with a new ASSERT, add an is_init_memory_freed variable, and if is_init_memory_freed=true then also check that `va` isn't from the .init.* range. But I'm not quite sure that the mapping for .got* should be destroyed after the end of boot (now it is part of the [__init_begin,__init_end] range). >>>> --- a/xen/arch/riscv/mm.c >>>> +++ b/xen/arch/riscv/mm.c >>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >>>> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >>>> >>>> /* >>>> - * It is expected that Xen won't be more then 2 MB. >>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >>>> * The check in xen.lds.S guarantees that. >>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >>>> - * One for each page level table with PAGE_SIZE = 4 Kb. >>>> * >>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >>>> + * Root page table is shared with the initial mapping and is declared >>>> + * separetely. (look at stage1_pgtbl_root) >>>> * >>>> - * It might be needed one more page table in case when Xen load address >>>> - * isn't 2 MB aligned. >>>> + * An amount of page tables between root page table and L0 page table >>>> + * (in the case of Sv39 it covers L1 table): >>>> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >>>> + * the same amount are needed for Xen. 
>>>> * >>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >>>> - * except that the root page table is shared with the initial mapping >>>> + * An amount of L0 page tables: >>>> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >>>> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >>>> + * one L0 is needed for indenity mapping. >>>> + * >>>> + * It might be needed one more page table in case when Xen load >>>> + * address isn't 2 MB aligned. >>> Shouldn't we guarantee that? >> I think it's sufficient to guarantee 4KB alignment. >> >> The only real benefit I see in enforcing larger alignment is that it likely enables >> the use of superpages for mapping, which would reduce TLB pressure. >> But perhaps I'm missing something? > No, it's indeed mainly that. But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary. This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the load address? In that case, might it be difficult for the platform to find a suitable place in memory to load Xen for some reason? (I don't think so but maybe I'm missing something) These changes should probably be part of a separate patch, as currently setup_initial_mapping() only works with 4KB mappings. Perhaps it would make sense to add a comment around setup_initial_mapping() indicating that if this function is modified, it may require updating PGTBL_INITIAL_COUNT. ~ Oleksii > >> Or did you mean that if 2MB alignment isn't guaranteed, then we might need two extra >> page tables—one if the start address isn't 2MB aligned, and the Xen size is larger than 2MB? >> Then yes one more page table should be added to PGTBL_INITIAL_COUNT. > Well, of course - if alignment isn't guaranteed, crossing whatever boundaries > of course needs accounting for. > > Jan
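As for the narrowed assertion Oleksii mentions above, here is a rough sketch of what it could look like. The is_init_memory_freed flag is hypothetical (an assumption from the discussion, set once destroy_xen_mappings(__init_begin, __init_end) has run); none of this is code from the series:

    extern char _start[], _end[], __init_begin[], __init_end[];
    extern bool is_init_memory_freed;   /* hypothetical, see the discussion above */

    /* Sketch: restrict the check to the linked Xen image, excluding .init.*
     * once the init mappings have been torn down. */
    static inline bool va_is_within_xen_image(unsigned long va)
    {
        if ( va < (unsigned long)_start || va >= (unsigned long)_end )
            return false;

        if ( is_init_memory_freed &&
             va >= (unsigned long)__init_begin &&
             va < (unsigned long)__init_end )
            return false;

        return true;
    }

    /* ... and in virt_to_maddr(): ASSERT(va_is_within_xen_image(va)); */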
On 08.04.2025 15:46, Oleksii Kurochko wrote: > On 4/8/25 2:02 PM, Jan Beulich wrote: >> On 08.04.2025 13:51, Oleksii Kurochko wrote: >>> On 4/7/25 12:09 PM, Jan Beulich wrote: >>>> On 04.04.2025 18:04, Oleksii Kurochko wrote: >>>>> --- a/xen/arch/riscv/include/asm/mm.h >>>>> +++ b/xen/arch/riscv/include/asm/mm.h >>>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) >>>>> */ >>>>> static inline unsigned long virt_to_maddr(unsigned long va) >>>>> { >>>>> + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; >>>>> + const unsigned long va_vpn = va >> vpn1_shift; >>>>> + const unsigned long xen_virt_start_vpn = >>>>> + _AC(XEN_VIRT_START, UL) >> vpn1_shift; >>>>> + const unsigned long xen_virt_end_vpn = >>>>> + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); >>>>> + >>>>> if ((va >= DIRECTMAP_VIRT_START) && >>>>> (va <= DIRECTMAP_VIRT_END)) >>>>> return directmapoff_to_maddr(va - directmap_virt_start); >>>>> >>>>> - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); >>>>> - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == >>>>> - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); >>>>> + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); >>>>> + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); >>>> Not all of the range is backed by memory, and for the excess space the >>>> translation is therefore (likely) wrong. Which better would be caught by >>>> the assertion? >>> Backed here means that the memory is actually mapped? >>> >>> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size] >>> where xen_phys_size=(unsigned long)_end - (unsigned long)_start. >>> >>> Did I understand you correctly? >> I think so, yes. Depending on what you (intend to) do to .init.* at the >> end of boot, that range may later also want excluding. > > I planned to release everything between __init_begin and __init_end in the following way: > destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end); > > So yes, then I think I have to come up with new ASSERT, add is_init_memory_freed variable and > if is_init_memory_freed=true then also check that `va` isn't from .init.* range. > > But I'm not quire sure that mapping for .got* should be destroyed after the end of boot. (now it is > part of [__init_begin,__init_end] range. Isn't this a non-issue considering ASSERT(!SIZEOF(.got), ".got non-empty") ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty") near the bottom of xen.lds.S? >>>>> --- a/xen/arch/riscv/mm.c >>>>> +++ b/xen/arch/riscv/mm.c >>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >>>>> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >>>>> >>>>> /* >>>>> - * It is expected that Xen won't be more then 2 MB. >>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >>>>> * The check in xen.lds.S guarantees that. >>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >>>>> - * One for each page level table with PAGE_SIZE = 4 Kb. >>>>> * >>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >>>>> + * Root page table is shared with the initial mapping and is declared >>>>> + * separetely. (look at stage1_pgtbl_root) >>>>> * >>>>> - * It might be needed one more page table in case when Xen load address >>>>> - * isn't 2 MB aligned. 
>>>>> + * An amount of page tables between root page table and L0 page table >>>>> + * (in the case of Sv39 it covers L1 table): >>>>> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >>>>> + * the same amount are needed for Xen. >>>>> * >>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >>>>> - * except that the root page table is shared with the initial mapping >>>>> + * An amount of L0 page tables: >>>>> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >>>>> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >>>>> + * one L0 is needed for indenity mapping. >>>>> + * >>>>> + * It might be needed one more page table in case when Xen load >>>>> + * address isn't 2 MB aligned. >>>> Shouldn't we guarantee that? >>> I think it's sufficient to guarantee 4KB alignment. >>> >>> The only real benefit I see in enforcing larger alignment is that it likely enables >>> the use of superpages for mapping, which would reduce TLB pressure. >>> But perhaps I'm missing something? >> No, it's indeed mainly that. > > But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary. > This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the > load address? In that case, might it be difficult for the platform to find a suitable place in > memory to load Xen for some reason? (I don't think so but maybe I'm missing something) Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment matters only once you set up paging? > These changes should probably be part of a separate patch, as currently,|setup_initial_mapping() |only works with 4KB mapping. That's fine; it's just that - as said - the calculation of how many page tables you may need has to cover for the worst case. Jan
On 4/8/25 4:04 PM, Jan Beulich wrote: > On 08.04.2025 15:46, Oleksii Kurochko wrote: >> On 4/8/25 2:02 PM, Jan Beulich wrote: >>> On 08.04.2025 13:51, Oleksii Kurochko wrote: >>>> On 4/7/25 12:09 PM, Jan Beulich wrote: >>>>> On 04.04.2025 18:04, Oleksii Kurochko wrote: >>>>>> --- a/xen/arch/riscv/include/asm/mm.h >>>>>> +++ b/xen/arch/riscv/include/asm/mm.h >>>>>> @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) >>>>>> */ >>>>>> static inline unsigned long virt_to_maddr(unsigned long va) >>>>>> { >>>>>> + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; >>>>>> + const unsigned long va_vpn = va >> vpn1_shift; >>>>>> + const unsigned long xen_virt_start_vpn = >>>>>> + _AC(XEN_VIRT_START, UL) >> vpn1_shift; >>>>>> + const unsigned long xen_virt_end_vpn = >>>>>> + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); >>>>>> + >>>>>> if ((va >= DIRECTMAP_VIRT_START) && >>>>>> (va <= DIRECTMAP_VIRT_END)) >>>>>> return directmapoff_to_maddr(va - directmap_virt_start); >>>>>> >>>>>> - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); >>>>>> - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == >>>>>> - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); >>>>>> + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); >>>>>> + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); >>>>> Not all of the range is backed by memory, and for the excess space the >>>>> translation is therefore (likely) wrong. Which better would be caught by >>>>> the assertion? >>>> Backed here means that the memory is actually mapped? >>>> >>>> IIUC it is needed to check only for the range [XEN_VIRT_START, XEN_VIRT_START+xen_phys_size] >>>> where xen_phys_size=(unsigned long)_end - (unsigned long)_start. >>>> >>>> Did I understand you correctly? >>> I think so, yes. Depending on what you (intend to) do to .init.* at the >>> end of boot, that range may later also want excluding. >> I planned to release everything between __init_begin and __init_end in the following way: >> destroy_xen_mappings((unsigned long)__init_begin, (unsigned long)__init_end); >> >> So yes, then I think I have to come up with new ASSERT, add is_init_memory_freed variable and >> if is_init_memory_freed=true then also check that `va` isn't from .init.* range. >> >> But I'm not quire sure that mapping for .got* should be destroyed after the end of boot. (now it is >> part of [__init_begin,__init_end] range. > Isn't this a non-issue considering > > ASSERT(!SIZEOF(.got), ".got non-empty") > ASSERT(!SIZEOF(.got.plt), ".got.plt non-empty") > > near the bottom of xen.lds.S? I forgot about that ASSERT(), so it's expected that .got* isn't used in Xen anyway. Therefore, it shouldn't be an issue to destroy the mapping for the [__init_begin, __init_end] range. > >>>>>> --- a/xen/arch/riscv/mm.c >>>>>> +++ b/xen/arch/riscv/mm.c >>>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >>>>>> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >>>>>> >>>>>> /* >>>>>> - * It is expected that Xen won't be more then 2 MB. >>>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >>>>>> * The check in xen.lds.S guarantees that. >>>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >>>>>> - * One for each page level table with PAGE_SIZE = 4 Kb. >>>>>> * >>>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >>>>>> + * Root page table is shared with the initial mapping and is declared >>>>>> + * separetely. 
(look at stage1_pgtbl_root) >>>>>> * >>>>>> - * It might be needed one more page table in case when Xen load address >>>>>> - * isn't 2 MB aligned. >>>>>> + * An amount of page tables between root page table and L0 page table >>>>>> + * (in the case of Sv39 it covers L1 table): >>>>>> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >>>>>> + * the same amount are needed for Xen. >>>>>> * >>>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >>>>>> - * except that the root page table is shared with the initial mapping >>>>>> + * An amount of L0 page tables: >>>>>> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >>>>>> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >>>>>> + * one L0 is needed for indenity mapping. >>>>>> + * >>>>>> + * It might be needed one more page table in case when Xen load >>>>>> + * address isn't 2 MB aligned. >>>>> Shouldn't we guarantee that? >>>> I think it's sufficient to guarantee 4KB alignment. >>>> >>>> The only real benefit I see in enforcing larger alignment is that it likely enables >>>> the use of superpages for mapping, which would reduce TLB pressure. >>>> But perhaps I'm missing something? >>> No, it's indeed mainly that. >> But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary. >> This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the >> load address? In that case, might it be difficult for the platform to find a suitable place in >> memory to load Xen for some reason? (I don't think so but maybe I'm missing something) > Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment > matters only once you set up paging? Mostly yes, it matters only once during paging set up. I was thinking that if, one day, 2MB (or larger) alignment is used and the load address isn't properly aligned, some space in a page might be lost. (The word "should" above wasn't entirely accurate.) But this likely isn't a big deal and can be safely ignored. ~ Oleksii
On 09.04.2025 11:06, Oleksii Kurochko wrote: > On 4/8/25 4:04 PM, Jan Beulich wrote: >> On 08.04.2025 15:46, Oleksii Kurochko wrote: >>> On 4/8/25 2:02 PM, Jan Beulich wrote: >>>> On 08.04.2025 13:51, Oleksii Kurochko wrote: >>>>> On 4/7/25 12:09 PM, Jan Beulich wrote: >>>>>> On 04.04.2025 18:04, Oleksii Kurochko wrote: >>>>>>> --- a/xen/arch/riscv/mm.c >>>>>>> +++ b/xen/arch/riscv/mm.c >>>>>>> @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ >>>>>>> #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) >>>>>>> >>>>>>> /* >>>>>>> - * It is expected that Xen won't be more then 2 MB. >>>>>>> + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. >>>>>>> * The check in xen.lds.S guarantees that. >>>>>>> - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. >>>>>>> - * One for each page level table with PAGE_SIZE = 4 Kb. >>>>>>> * >>>>>>> - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). >>>>>>> + * Root page table is shared with the initial mapping and is declared >>>>>>> + * separetely. (look at stage1_pgtbl_root) >>>>>>> * >>>>>>> - * It might be needed one more page table in case when Xen load address >>>>>>> - * isn't 2 MB aligned. >>>>>>> + * An amount of page tables between root page table and L0 page table >>>>>>> + * (in the case of Sv39 it covers L1 table): >>>>>>> + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and >>>>>>> + * the same amount are needed for Xen. >>>>>>> * >>>>>>> - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, >>>>>>> - * except that the root page table is shared with the initial mapping >>>>>>> + * An amount of L0 page tables: >>>>>>> + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) >>>>>>> + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and >>>>>>> + * one L0 is needed for indenity mapping. >>>>>>> + * >>>>>>> + * It might be needed one more page table in case when Xen load >>>>>>> + * address isn't 2 MB aligned. >>>>>> Shouldn't we guarantee that? >>>>> I think it's sufficient to guarantee 4KB alignment. >>>>> >>>>> The only real benefit I see in enforcing larger alignment is that it likely enables >>>>> the use of superpages for mapping, which would reduce TLB pressure. >>>>> But perhaps I'm missing something? >>>> No, it's indeed mainly that. >>> But then the linker address and the load address should both be aligned to a 2MB or 1GB boundary. >>> This likely isn't an issue at all, but could it be a problem if we require 1GB alignment for the >>> load address? In that case, might it be difficult for the platform to find a suitable place in >>> memory to load Xen for some reason? (I don't think so but maybe I'm missing something) >> Why would load address need to be 1Gb aligned? That (as well as 2Mb-)alignment >> matters only once you set up paging? > > Mostly yes, it matters only once during paging set up. > > I was thinking that if, one day, 2MB (or larger) alignment is used and the load address isn't > properly aligned, some space in a page might be lost. > (The word "should" above wasn't entirely accurate.) Actually I think I was wrong with my question. Load address of course matters to a sufficient degree, especially if at 2Mb boundaries one wants to be able to change what permissions to use (without sacrificing the 2Mb mappings). Jan
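To illustrate Jan's closing point and the "2M mappings for .text (rx), .rodata (r), .data (rw)" case mentioned in the v2 changelog: per-section permissions with 2 MB superpages would require both the load address and the section boundaries to sit on 2 MB boundaries, roughly along the lines of the following hypothetical xen.lds.S fragment (not what the tree does today):

    . = ALIGN(MB(2));
    .text : { *(.text .text.*) }         /* could then be mapped rx */
    . = ALIGN(MB(2));
    .rodata : { *(.rodata .rodata.*) }   /* r */
    . = ALIGN(MB(2));
    .data : { *(.data .data.*) }         /* rw */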
diff --git a/xen/arch/riscv/include/asm/config.h b/xen/arch/riscv/include/asm/config.h index 7141bd9e46..41b8410d10 100644 --- a/xen/arch/riscv/include/asm/config.h +++ b/xen/arch/riscv/include/asm/config.h @@ -41,11 +41,11 @@ * Start addr | End addr | Slot | area description * ============================================================================ * ..... L2 511 Unused - * 0xffffffffc0a00000 0xffffffffc0bfffff L2 511 Fixmap + * 0xffffffffc1800000 0xffffffffc1afffff L2 511 Fixmap * ..... ( 2 MB gap ) - * 0xffffffffc0400000 0xffffffffc07fffff L2 511 FDT + * 0xffffffffc1200000 0xffffffffc15fffff L2 511 FDT * ..... ( 2 MB gap ) - * 0xffffffffc0000000 0xffffffffc01fffff L2 511 Xen + * 0xffffffffc0000000 0xffffffffc0ffffff L2 511 Xen * ..... L2 510 Unused * 0x3200000000 0x7f7fffffff L2 200-509 Direct map * ..... L2 199 Unused @@ -78,7 +78,7 @@ #define GAP_SIZE MB(2) -#define XEN_VIRT_SIZE MB(2) +#define XEN_VIRT_SIZE MB(16) #define BOOT_FDT_VIRT_START (XEN_VIRT_START + XEN_VIRT_SIZE + GAP_SIZE) #define BOOT_FDT_VIRT_SIZE MB(4) diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h index 4035cd400a..511e75c6d4 100644 --- a/xen/arch/riscv/include/asm/mm.h +++ b/xen/arch/riscv/include/asm/mm.h @@ -43,13 +43,19 @@ static inline void *maddr_to_virt(paddr_t ma) */ static inline unsigned long virt_to_maddr(unsigned long va) { + const unsigned int vpn1_shift = PAGETABLE_ORDER + PAGE_SHIFT; + const unsigned long va_vpn = va >> vpn1_shift; + const unsigned long xen_virt_start_vpn = + _AC(XEN_VIRT_START, UL) >> vpn1_shift; + const unsigned long xen_virt_end_vpn = + xen_virt_start_vpn + ((XEN_VIRT_SIZE >> vpn1_shift) - 1); + if ((va >= DIRECTMAP_VIRT_START) && (va <= DIRECTMAP_VIRT_END)) return directmapoff_to_maddr(va - directmap_virt_start); - BUILD_BUG_ON(XEN_VIRT_SIZE != MB(2)); - ASSERT((va >> (PAGETABLE_ORDER + PAGE_SHIFT)) == - (_AC(XEN_VIRT_START, UL) >> (PAGETABLE_ORDER + PAGE_SHIFT))); + BUILD_BUG_ON(XEN_VIRT_SIZE > GB(1)); + ASSERT((va_vpn >= xen_virt_start_vpn) && (va_vpn <= xen_virt_end_vpn)); /* phys_offset = load_start - XEN_VIRT_START */ return phys_offset + va; diff --git a/xen/arch/riscv/mm.c b/xen/arch/riscv/mm.c index f2bf279bac..256afdaaa3 100644 --- a/xen/arch/riscv/mm.c +++ b/xen/arch/riscv/mm.c @@ -31,20 +31,27 @@ unsigned long __ro_after_init phys_offset; /* = load_start - XEN_VIRT_START */ #define LOAD_TO_LINK(addr) ((unsigned long)(addr) - phys_offset) /* - * It is expected that Xen won't be more then 2 MB. + * It is expected that Xen won't be more then XEN_VIRT_SIZE MB. * The check in xen.lds.S guarantees that. - * At least 3 page tables (in case of Sv39 ) are needed to cover 2 MB. - * One for each page level table with PAGE_SIZE = 4 Kb. * - * One L0 page table can cover 2 MB(512 entries of one page table * PAGE_SIZE). + * Root page table is shared with the initial mapping and is declared + * separetely. (look at stage1_pgtbl_root) * - * It might be needed one more page table in case when Xen load address - * isn't 2 MB aligned. + * An amount of page tables between root page table and L0 page table + * (in the case of Sv39 it covers L1 table): + * (CONFIG_PAGING_LEVELS - 2) are needed for an identity mapping and + * the same amount are needed for Xen. 
* - * CONFIG_PAGING_LEVELS page tables are needed for the identity mapping, - * except that the root page table is shared with the initial mapping + * An amount of L0 page tables: + * (512 entries of one L0 page table covers 2MB == 1<<XEN_PT_LEVEL_SHIFT(1)) + * XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1) are needed for Xen and + * one L0 is needed for indenity mapping. + * + * It might be needed one more page table in case when Xen load + * address isn't 2 MB aligned. */ -#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 1) * 2 + 1) +#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 2) * 2 + \ + (XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1)) + 2) pte_t __section(".bss.page_aligned") __aligned(PAGE_SIZE) stage1_pgtbl_root[PAGETABLE_ENTRIES];
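As a sanity check, plugging the v2 numbers into the new PGTBL_INITIAL_COUNT formula for Sv39 (CONFIG_PAGING_LEVELS = 3, XEN_VIRT_SIZE = 16 MB, XEN_PT_LEVEL_SHIFT(1) = 21); the per-term reading below follows the comment and the thread rather than any authoritative breakdown:

    (CONFIG_PAGING_LEVELS - 2) * 2          = (3 - 2) * 2 = 2   /* L1 tables: identity map + Xen */
    XEN_VIRT_SIZE >> XEN_PT_LEVEL_SHIFT(1)  = 16 MB / 2 MB = 8  /* L0 tables covering Xen        */
    + 2                                                         /* 1 L0 for the identity mapping,
                                                                   1 spare for a non-2MB-aligned load */
    ------------------------------------------------------------
    PGTBL_INITIAL_COUNT                     = 12

As discussed earlier in the thread, one more table may be needed if the Xen image crosses a 1 GB boundary.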
A randconfig job failed with the following issue:

    riscv64-linux-gnu-ld: Xen too large for early-boot assumptions

The reason is that enabling the UBSAN config increased the size of the Xen binary.

Increase XEN_VIRT_SIZE to reserve enough space, allowing both UBSAN and GCOV to be enabled together, with some slack for future growth.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in v2:
- Increase XEN_VIRT_SIZE to 16 Mb to also cover the case when 2M mappings will be used for .text (rx), .rodata (r), and .data (rw).
- Update layout table in config.h.
- s/xen_virt_starn_vpn/xen_virt_start_vpn
- Update BUILD_BUG_ON(... != MB(8)) check to "... > GB(1)".
- Update definition of PGTBL_INITIAL_COUNT and the comment above.
---
 xen/arch/riscv/include/asm/config.h | 8 ++++----
 xen/arch/riscv/include/asm/mm.h | 12 +++++++++---
 xen/arch/riscv/mm.c | 25 ++++++++++++++++---------
 3 files changed, 29 insertions(+), 16 deletions(-)