
[v9,0/5] enable MMU for RISC-V

Message ID: cover.1685027257.git.oleksii.kurochko@gmail.com

Message

Oleksii Kurochko May 25, 2023, 3:28 p.m. UTC
The patch series introduces the following things:
1. Functionality to build the page tables for Xen that map the link-time
   addresses to the load-time (physical) addresses.
2. A check that Xen is less than the page size.
3. A check that the load addresses don't overlap with the linker addresses.
4. Preparation for a proper switch to the virtual memory world.
5. Loading of the built page tables into the SATP register.
6. Enabling of the MMU (see the illustrative sketch of steps 5-6 below).
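
For illustration only, here is a minimal sketch of what steps 5 and 6 boil
down to on RV64 with Sv39. This is not the series' enable_mmu()
implementation; the function and variable names below are made up, and the
constants come from the RISC-V privileged specification.

```c
#define PAGE_SHIFT       12
#define SATP_MODE_SV39   8UL   /* satp.MODE encoding for Sv39 on RV64    */
#define SATP_MODE_SHIFT  60    /* MODE field occupies bits 63:60 of satp */

/* Hypothetical root page table, built and aligned by earlier boot code. */
extern unsigned char root_pgtbl[1UL << PAGE_SHIFT];

/*
 * Point satp at the root page table and switch MODE from Bare to Sv39;
 * the csr write itself is what enables translation.  sfence.vma flushes
 * any stale address-translation caches around the switch.
 */
static inline void load_satp_and_enable_mmu(void)
{
    unsigned long satp = (SATP_MODE_SV39 << SATP_MODE_SHIFT) |
                         ((unsigned long)root_pgtbl >> PAGE_SHIFT);

    asm volatile ( "sfence.vma\n"
                   "csrw satp, %0\n"
                   "sfence.vma" :: "r" (satp) : "memory" );
}
```

In the series itself this work is done by enable_mmu() (cf. xen/arch/riscv/mm.c
in the diffstat), which additionally has to handle the transition from the load
addresses to the link-time virtual addresses discussed in the comments below.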

---
Changes in V9:
  - Update the comment with the VM layout description.
  - Update VPN_MASK: remove the type cast.
  - Remove double blank lines.
  - Add Reviewed-by for patch #2.
---
Changes in V8:
  - Add "#ifdef RV_STAGE1_MODE == SATP_MODE_SV39" instead of "#ifdef SV39"
    in the comment with the VM layout description.
  - Update the upper bound of the direct map area in the VM layout description.
  - Add parentheses around lvl in the pt_index() macro.
  - Introduce the macros paddr_to_pfn() and pfn_to_paddr() and use them inside
    paddr_to_pte()/pte_to_paddr().
  - Remove "__" in sfence_vma() and add blanks inside the parentheses of
    asm volatile.
  - Parenthesize the two & against the || at the start of the
    setup_initial_mapping() function.
  - Code style fixes.
  - Remove ". = ALIGN(PAGE_SIZE);" before "*(.bss.page_aligned)" in xen.lds.S,
    as .bss.page_aligned specifies the proper alignment itself.
---
Changes in V7:
  - Fix the frametable range in the RV64 layout.
  - Add ifdef SV39 to the RV64 layout comment to make it explicit that the
    description is for SV39 mode.
  - Add a missing row to the RV64 layout table.
  - Define CONFIG_PAGING_LEVELS=2 for SATP_MODE_SV32.
  - Update the switch_stack_and_jump() macro: add constraint 'X' for fn and a
    memory clobber, and wrap it into do {} while ( false ).
  - Add noreturn to the definition of enable_mmu().
  - Update pt_index() to "(pt_linear_offset(lvl, (va)) & VPN_MASK)"
    (see the sketch after this list).
  - Expand the pte_to_addr()/addr_to_pte() macros in the paddr_to_pte() and
    pte_to_paddr() functions and drop them afterwards.
  - Remove the inclusion of <asm/config.h>.
  - Update the commit message around the definition of PGTBL_INITIAL_COUNT.
  - Remove PGTBL_ENTRY_AMOUNT and use PAGETABLE_ENTRIES instead.
  - Code style fixes.
  - Remove the permission argument of the setup_initial_mapping() function.
  - Remove calc_pgtbl_lvls_num() as it is no longer needed after the
    definition of CONFIG_PAGING_LEVELS.
  - Introduce sfence_vma().
  - Remove the satp_mode argument from check_pgtbl_mode_support() and use
    RV_STAGE1_MODE directly instead.
  - Change .align to .p2align.
  - Drop the inclusion of <asm/asm-offsets.h> from head.S as it isn't
    necessary for the current patch series.
  - Create a separate patch for xen.lds.S.
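
For reference, a hedged sketch of how the pt_index()/pt_linear_offset() pair
quoted above could be defined. PAGETABLE_ORDER, PAGETABLE_ENTRIES and
VPN_MASK are names taken from the changelog items; the concrete values assume
Sv39 (nine VPN bits per level, 4 KiB pages), and the exact definitions in the
series may differ.

```c
#define PAGE_SHIFT               12
#define PAGETABLE_ORDER          9                        /* 512 entries per table */
#define PAGETABLE_ENTRIES        (1UL << PAGETABLE_ORDER)
#define VPN_MASK                 (PAGETABLE_ENTRIES - 1)  /* 0x1ff                 */

/* Shift of the VPN field translated by page-table level 'lvl'. */
#define XEN_PT_LEVEL_SHIFT(lvl)  (PAGE_SHIFT + (lvl) * PAGETABLE_ORDER)

/* Offset of 'va' measured in level-'lvl' sized granules. */
#define pt_linear_offset(lvl, va)  ((va) >> XEN_PT_LEVEL_SHIFT(lvl))

/* Index into the level-'lvl' page table, as quoted in the V7 item. */
#define pt_index(lvl, va)  (pt_linear_offset(lvl, (va)) & VPN_MASK)
```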
---
Changes in V6:
  - Update the RISC-V VM layout and things related to it.
  - Move PAGE_SHIFT and PADDR_BITS to the top of page-bits.h.
  - Cast the argument x of the pte_to_addr() macro to paddr_t to avoid the
    risk of overflow for RV32.
  - Update the type of num_levels from 'unsigned long' to 'unsigned int'.
  - Define PGTBL_INITIAL_COUNT as ((CONFIG_PAGING_LEVELS - 1) + 1).
  - Update the type of the permission arguments: change it from
    'unsigned long' to 'unsigned int'.
  - Fix code style.
  - Switch the 'while' loop to a 'for' loop.
  - Undef HANDLE_PGTBL.
  - Clean the root page table after the MMU is disabled in the
    check_pgtbl_mode_support() function.
  - Align __bss_start properly.
  - Remove unnecessary const for the paddr_to_pte(), pte_to_paddr() and
    pte_is_valid() functions.
  - Add the switch_stack_and_jump() macro and use it inside enable_mmu()
    before the jump to the cont_after_mmu_is_enabled() function (a sketch of
    such a macro follows this list).
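
A hedged sketch of a switch_stack_and_jump()-style macro along the lines of
the V6/V7 items above (an 'X' constraint for fn, a "memory" clobber, wrapped
in do {} while ( false )); this is illustrative rather than the series' exact
macro, and unreachable() is stubbed out for self-containedness.

```c
#include <stdbool.h>

/* Placeholder for Xen's unreachable(); tells the compiler we never return. */
#define unreachable()  __builtin_unreachable()

/*
 * Switch to a new stack and jump to 'fn'.  The jump target is emitted via
 * stringification; the "X" (fn) input keeps the symbol reference visible to
 * the compiler, and the "memory" clobber stops memory accesses from being
 * reordered across the stack switch.
 */
#define switch_stack_and_jump(stack, fn)                        \
    do {                                                        \
        asm volatile (                                          \
            "mv sp, %0\n"                                       \
            "j " #fn :: "r" (stack), "X" (fn) : "memory" );     \
        unreachable();                                          \
    } while ( false )
```

Per the V6 item above, enable_mmu() uses such a macro to switch stacks and
jump to cont_after_mmu_is_enabled().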

---
Changes in V5:
  * Rebase the patch series on top of the current staging.
  * Update the cover letter: the info about the patches on which the MMU
    patch series was based has been removed, as they were merged to staging.
  * Add a new patch with a description of the VM layout for RISC-V.
  * Indent the fields of the pte_t struct.
  * Rename addr_to_pte() and ppn_to_paddr() to match their content.
---
Changes in V4:
  * Use the GB() macro instead of defining SZ_1G.
  * Hardcode XEN_VIRT_START and add a comment (ADDRESS_SPACE_END + 1 - GB(1)).
  * Remove the unnecessary 'asm' word at the end of the #error.
  * Encapsulate the pte_t definition in a struct (a sketch of the resulting
    pte_t helpers follows this list).
  * Rename addr_to_ppn() to ppn_to_paddr().
  * Change the type of the paddr argument from const unsigned long to paddr_t.
  * Update the prototype of pte_to_paddr().
  * Calculate the size of the Xen binary based on the amount of page tables.
  * Use unsigned int instead of uint32_t, as the use of the latter isn't
    warranted.
  * Remove the extern of bss_{start,end} as they aren't used in mm.c anymore.
  * Fix code style.
  * Add an argument to the HANDLE_PGTBL macro instead of the curr_lvl_num
    variable.
  * Make enable_mmu() noinline to prevent inlining under link-time
    optimization, because of the nature of enable_mmu().
  * Add a function to check that SATP_MODE is supported.
  * Update the commit message.
  * Update setup_initial_pagetables() to set the correct PTE flags in one
    pass instead of calling setup_pte_permissions() after
    setup_initial_pagetables(), as setup_initial_pagetables() isn't used to
    change permission flags.
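
An illustrative sketch of the pte_t struct and the paddr/PTE helpers named
across these changelogs, assuming RV64 and the PTE layout from the RISC-V
privileged spec (flags in bits 0-9, PPN starting at bit 10). The helper names
come from the changelog items; PTE_VALID and the exact types are assumptions,
so this is not the series' actual code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t paddr_t;

#define PAGE_SHIFT     12
#define PTE_PPN_SHIFT  10           /* PPN field starts at bit 10 of a PTE */
#define PTE_VALID      (1UL << 0)   /* V bit                               */

/* PTE value encapsulated in a struct, per the V4 item above. */
typedef struct {
    uint64_t pte;
} pte_t;

#define paddr_to_pfn(pa)   ((unsigned long)((pa) >> PAGE_SHIFT))
#define pfn_to_paddr(pfn)  ((paddr_t)(pfn) << PAGE_SHIFT)

/* Build a PTE from a physical address plus permission/flag bits. */
static inline pte_t paddr_to_pte(paddr_t paddr, unsigned int permissions)
{
    return (pte_t){ .pte = ((uint64_t)paddr_to_pfn(paddr) << PTE_PPN_SHIFT) |
                           permissions };
}

/* Real code would also mask the reserved upper bits of the PTE. */
static inline paddr_t pte_to_paddr(pte_t pte)
{
    return pfn_to_paddr(pte.pte >> PTE_PPN_SHIFT);
}

static inline bool pte_is_valid(pte_t pte)
{
    return pte.pte & PTE_VALID;
}
```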
---
Changes in V3:
  - Update the cover letter message: the patch series doesn't depend on
    [ RISC-V basic exception handling implementation ] as it was decided to
    enable the MMU before the implementation of exception handling. Also, the
    MMU patch series is based on two other patches which weren't merged, [1]
    and [2].
  - Update the commit message for [ [PATCH v3 1/3]
    xen/riscv: introduce setup_initial_pages ].
  - Update the definition of the pte_t structure so that pte_t has a proper
    size in the case of RV32.
  - Update asm/mm.h with new functions and remove unnecessary 'extern'.
  - Remove the LEVEL_* macros as the XEN_PT_LEVEL_* ones are enough.
  - Update paddr_to_pte() to receive permissions as an argument.
  - Add a check that map_start & pa_start are properly aligned.
  - Move the defines PAGETABLE_ORDER, PAGETABLE_ENTRIES and PTE_PPN_SHIFT to
    <asm/page-bits.h>.
  - Rename PTE_SHIFT to PTE_PPN_SHIFT.
  - Refactor setup_initial_pagetables(): map all LINK addresses to LOAD
    addresses and afterwards set up the PTE permissions for the sections;
    update the check that linker and load addresses don't overlap.
  - Refactor setup_initial_mapping(): allocate a pagetable 'dynamically' if
    necessary.
  - Rewrite enable_mmu() in C; add the check that map_start and pa_start are
    aligned on a 4k boundary.
  - Update the comment for the setup_initial_pagetable function.
  - Add RV_STAGE1_MODE to support different MMU modes (a sketch of the idea
    follows this list).
  - Update the commit message to note that the MMU is also enabled here.
  - Set XEN_VIRT_START very high so that it does not overlap with the load
    address range.
  - Align the bss section.
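
A hedged sketch of what an RV_STAGE1_MODE selection could look like, tying it
to the CONFIG_PAGING_LEVELS item in V7. The satp mode encodings are from the
RISC-V privileged specification; the CONFIG_RISCV_64 switch and the exact
plumbing below are assumptions, not the series' actual configuration code.

```c
/* satp.MODE encodings from the RISC-V privileged specification. */
#define SATP_MODE_SV32  1   /* RV32: 2-level paging */
#define SATP_MODE_SV39  8   /* RV64: 3-level paging */
#define SATP_MODE_SV48  9   /* RV64: 4-level paging */

/*
 * Pick the stage-1 translation mode and the matching number of page-table
 * levels.  CONFIG_RISCV_64 is assumed here as the 64-bit build switch.
 */
#ifdef CONFIG_RISCV_64
#define RV_STAGE1_MODE        SATP_MODE_SV39
#define CONFIG_PAGING_LEVELS  3
#else
#define RV_STAGE1_MODE        SATP_MODE_SV32
#define CONFIG_PAGING_LEVELS  2   /* cf. the V7 item above */
#endif
```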
---
Changes in V2:
  * Remove {ZEROETH,FIRST,...}_{SHIFT,MASK,SIZE,...} and introduce
    XEN_PT_LEVEL_*() and LEVEL_* instead.
  * Rework pt_linear_offset() and pt_index() based on XEN_PT_LEVEL_*().
  * Remove the clear_pagetables() function as the pagetables are zeroed
    during .bss initialization.
  * Rename _setup_initial_pagetables() to setup_initial_mapping().
  * Make PTE_DEFAULT equal to RX.
  * Update the prototype of setup_initial_mapping(..., bool writable) ->
    setup_initial_mapping(..., UL flags).
  * Update the calls of setup_initial_mapping() according to the new
    prototype.
  * Remove the unnecessary call of:
    _setup_initial_pagetables(..., load_addr_start, load_addr_end, load_addr_start, ...)
  * Define index* in the loop of setup_initial_mapping().
  * Remove the attribute "__attribute__((section(".entry")))" from
    setup_initial_pagetables() as we don't have such a section.
  * Make the arguments of paddr_to_pte() and pte_is_valid() const.
  * Use <xen/kernel.h> instead of declaring extern unsigned long _stext,
    _etext, _srodata, _erodata.
  * Update 'extern unsigned long __init_begin' to
    'extern unsigned long __init_begin[]'.
  * Use aligned() instead of "__attribute__((__aligned__(PAGE_SIZE)))".
  * Set __section(".bss.page_aligned") for the page table arrays.
  * Fix indentation.
  * Change '__attribute__((section(".entry")))' to '__init'.
  * Remove the alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0); in
    setup_initial_mapping() as they should already be aligned.
  * Update the argument of pte_is_valid() to "const pte_t *p".
  * Remove the patch "[PATCH v1 3/3] automation: update RISC-V smoke test"
    from the patch series, as a simplified approach for the RISC-V smoke test
    was introduced by Andrew Cooper.
  * Add the patch [xen/riscv: remove dummy_bss variable] as there is no sense
    in the dummy_bss variable after the introduction of the initial page
    tables.
---

Oleksii Kurochko (5):
  xen/riscv: add VM space layout
  xen/riscv: introduce setup_initial_pages
  xen/riscv: align __bss_start
  xen/riscv: setup initial pagetables
  xen/riscv: remove dummy_bss variable

 xen/arch/riscv/Makefile                |   1 +
 xen/arch/riscv/include/asm/config.h    |  50 ++++-
 xen/arch/riscv/include/asm/current.h   |  11 +
 xen/arch/riscv/include/asm/mm.h        |  14 ++
 xen/arch/riscv/include/asm/page-bits.h |  10 +
 xen/arch/riscv/include/asm/page.h      |  61 ++++++
 xen/arch/riscv/include/asm/processor.h |   5 +
 xen/arch/riscv/mm.c                    | 276 +++++++++++++++++++++++++
 xen/arch/riscv/setup.c                 |  22 +-
 xen/arch/riscv/xen.lds.S               |   5 +-
 10 files changed, 445 insertions(+), 10 deletions(-)
 create mode 100644 xen/arch/riscv/include/asm/current.h
 create mode 100644 xen/arch/riscv/include/asm/mm.h
 create mode 100644 xen/arch/riscv/include/asm/page.h
 create mode 100644 xen/arch/riscv/mm.c

Comments

Jan Beulich May 30, 2023, 10:23 a.m. UTC | #1
On 25.05.2023 17:28, Oleksii Kurochko wrote:
> Oleksii Kurochko (5):
>   xen/riscv: add VM space layout
>   xen/riscv: introduce setup_initial_pages
>   xen/riscv: align __bss_start
>   xen/riscv: setup initial pagetables
>   xen/riscv: remove dummy_bss variable

While the series is now okay from my perspective, it'll need maintainer
acks. I thought I'd remind you of the respective process aspect: It is
on you to chase them.

Jan
Oleksii Kurochko May 30, 2023, 11:33 a.m. UTC | #2
On Tue, 2023-05-30 at 12:23 +0200, Jan Beulich wrote:
> On 25.05.2023 17:28, Oleksii Kurochko wrote:
> > Oleksii Kurochko (5):
> >   xen/riscv: add VM space layout
> >   xen/riscv: introduce setup_initial_pages
> >   xen/riscv: align __bss_start
> >   xen/riscv: setup initial pagetables
> >   xen/riscv: remove dummy_bss variable
> 
> While the series is now okay from my perspective, it'll need
> maintainer
> acks. I thought I'd remind you of the respective process aspect: It
> is
> on you to chase them.
> 
I appreciate the reminder.

If I don't get an ACK from maintainers in a week, I'll send an
additional e-mail.

~ Oleksii
Andrew Cooper May 31, 2023, 11:45 a.m. UTC | #3
On 25/05/2023 4:28 pm, Oleksii Kurochko wrote:
> Oleksii Kurochko (5):
>   xen/riscv: add VM space layout
>   xen/riscv: introduce setup_initial_pages
>   xen/riscv: align __bss_start
>   xen/riscv: setup initial pagetables
>   xen/riscv: remove dummy_bss variable

These have just been committed.

But as I fed back to early drafts of this series, patch 2 is
sufficiently fragile and unwise as to be unacceptable in this form.

enable_mmu() is unsafe in multiple ways, from the compiler reordering
statements (the label needs to be in the asm statement for that to work
correctly), and because it *depends* on hooking all exceptions and
pagefault.

Any exception other than pagefault, or not taking a pagefault causes it
to malfunction, which means you will fail to boot depending on where Xen
was loaded into memory.  It may not explode inside Qemu right now, but
it will not function reliably in the general case.

Furthermore, a combination of patches 2 and 4 breaks the CI integration of
looking for "All set up" at the end of start_xen().  It's not ok, from a
code quality point of view, to defer 99% of start_xen()'s functionality
into an unrelated function.


Please do not do anything else until you've addressed these issues. 
enable_mmu() needs to return normally, cont_after_mmu_is_enabled() needs
deleting entirely, and there needs to be an identity page for Xen to
land on so it isn't jumping into the void and praying not to explode.
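
To make the identity-page idea concrete, here is a purely illustrative sketch
(not a verbatim rendering of the proposal above, and not code from the
series): root_pgtbl and map_identity_page() are assumed helpers, and the
satp constants come from the RISC-V privileged specification.

```c
#define PAGE_SHIFT       12
#define PAGE_SIZE        (1UL << PAGE_SHIFT)
#define SATP_MODE_SV39   8UL
#define SATP_MODE_SHIFT  60

/* Hypothetical: root page table built by earlier boot code. */
extern unsigned char root_pgtbl[PAGE_SIZE];
/* Hypothetical helper: install a 1:1 mapping for one 4 KiB page. */
void map_identity_page(unsigned long pa);

/*
 * Because the page containing this function is also identity mapped, the
 * instruction fetch immediately after the satp write still hits a valid
 * mapping.  No trap handler is involved: enable_mmu() can return normally
 * to its caller (which must likewise be reachable through the identity
 * range), and the identity mapping is torn down once execution has moved
 * over to the link-time virtual addresses.
 */
void enable_mmu(void)
{
    unsigned long satp = (SATP_MODE_SV39 << SATP_MODE_SHIFT) |
                         ((unsigned long)root_pgtbl >> PAGE_SHIFT);

    map_identity_page((unsigned long)enable_mmu & ~(PAGE_SIZE - 1));

    asm volatile ( "sfence.vma\n"
                   "csrw satp, %0\n"
                   "sfence.vma" :: "r" (satp) : "memory" );

    /* Still reachable via the identity mapping; simply return. */
}
```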

Other minor issues include page.h not having __ASSEMBLY__ guards, mm.c
locally externing cpu0_boot_stack[] from setup.c when the declaration
needs to be in a header file somewhere, and SPDX tags.

~Andrew
Oleksii Kurochko June 2, 2023, 5:19 p.m. UTC | #4
On Wed, 2023-05-31 at 12:45 +0100, Andrew Cooper wrote:
> On 25/05/2023 4:28 pm, Oleksii Kurochko wrote:
> > Oleksii Kurochko (5):
> >   xen/riscv: add VM space layout
> >   xen/riscv: introduce setup_initial_pages
> >   xen/riscv: align __bss_start
> >   xen/riscv: setup initial pagetables
> >   xen/riscv: remove dummy_bss variable
> 
> These have just been committed.
> 
> But as I fed back to early drafts of this series, patch 2 is
> sufficiently fragile and unwise as to be unacceptable in this form.
> 
> enable_mmu() is unsafe in multiple ways, from the compiler reordering
> statements (the label needs to be in the asm statement for that to
> work
> correctly), and because it * depends* on hooking all exceptions and
> pagefault.
> 
> Any exception other than pagefault, or not taking a pagefault causes
> it
> to malfunction, which means you will fail to boot depending on where
> Xen
> was loaded into memory.  It may not explode inside Qemu right now,
> but
> it will not function reliably in the general case.
Linux RISC-V has a similar approach and it doesn't explode, but if it would
be better to use an identity map then I'll rewrite it.

> 
> Furthermore, a combination of patch 2 and 4 breaks the CI integration
> of
> looking for "All set up" at the end of start_xen().  It's not ok,
> from a
> code quality point of view, to defer 99% of start_xen()'s
> functionality
> into unrelated function.
> 
> 
> Please do not do anything else until you've addressed these issues. 
> enable_mmu() needs to return normally, cont_after_mmu_is_enabled()
> needs
> deleting entirely, and there needs to be an identity page for Xen to
> land on so it isn't jumping into the void and praying not to explode.
In this case enable_mmu() should be called before the start_xen() function,
because if start_xen() has local variables then, after jumping to the VA
range and removing the identity page, we will have an issue with the stack
pointer, as it will still contain addresses from where Xen was loaded.
One more thing: enable_mmu() can't have local variables either (as the stack
pointer can be somewhere outside the one page used for the identity mapping),
and that will cause an issue when executing the epilogue of the enable_mmu()
function.

Or am I missing something?

> 
> Other minor issues include page.h not having __ASSEMBLY__ guards,
> mm.c
> locally externing cpu0_boot_stack[] from setup.c when the declaration
> needs to be in a header file somewhere, and SPDX tags.
Thanks. I'll take it into account.

~ Oleksii