Message ID | e78679b00df63bde40eb3a4d97e3ab9d1faf9382.1738933678.git.oleksii.kurochko@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fixes for vmap_to_mfn() and pt_mapping_level | expand |
On 07.02.2025 14:13, Oleksii Kurochko wrote: > --- a/xen/arch/riscv/pt.c > +++ b/xen/arch/riscv/pt.c > @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) > return XEN_TABLE_NORMAL; > } > > +/* > + * _pt_walk() performs software page table walking and returns the pte_t of > + * a leaf node or the leaf-most not-present pte_t if no leaf node is found > + * for further analysis. > + * Additionally, _pt_walk() returns the level of the found pte. That's optional, which I think wants expressing here. > + */ > +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) > +{ > + const mfn_t root = get_root_page(); > + unsigned int level; > + pte_t *table; > + > + DECLARE_OFFSETS(offsets, va); > + > + table = map_table(root); This mapping operation doesn't look to have a counterpart. Aiui ... > + /* > + * Find `table` of an entry which corresponds to `va` by iterating for each > + * page level and checking if the entry points to a next page table or > + * to a page. > + * > + * Two cases are possible: > + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; > + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If > + * pt_next_level() is called for page table level 0, it results in the > + * entry being a pointer to a leaf node, thereby returning > + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. > + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually > + * mapped. > + */ > + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) > + { > + int ret = pt_next_level(false, &table, offsets[level]); ... the mapping may be replaced here, but a new mapping will then still be held by this function and ... > + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) > + break; > + > + ASSERT(level); > + } > + > + if ( pte_level ) > + *pte_level = level; > + > + return table + offsets[level]; > +} ... the final one then be transferred to the caller. > +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) > +{ > + return *_pt_walk(va, pte_level); > +} Hence aiui there needs to be an unmap operation here. Jan
On 2/10/25 5:32 PM, Jan Beulich wrote: > On 07.02.2025 14:13, Oleksii Kurochko wrote: >> --- a/xen/arch/riscv/pt.c >> +++ b/xen/arch/riscv/pt.c >> @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) >> return XEN_TABLE_NORMAL; >> } >> >> +/* >> + * _pt_walk() performs software page table walking and returns the pte_t of >> + * a leaf node or the leaf-most not-present pte_t if no leaf node is found >> + * for further analysis. >> + * Additionally, _pt_walk() returns the level of the found pte. > That's optional, which I think wants expressing here. > >> + */ >> +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) >> +{ >> + const mfn_t root = get_root_page(); >> + unsigned int level; >> + pte_t *table; >> + >> + DECLARE_OFFSETS(offsets, va); >> + >> + table = map_table(root); > This mapping operation doesn't look to have a counterpart. Aiui ... > >> + /* >> + * Find `table` of an entry which corresponds to `va` by iterating for each >> + * page level and checking if the entry points to a next page table or >> + * to a page. >> + * >> + * Two cases are possible: >> + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; >> + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If >> + * pt_next_level() is called for page table level 0, it results in the >> + * entry being a pointer to a leaf node, thereby returning >> + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. >> + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually >> + * mapped. >> + */ >> + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) >> + { >> + int ret = pt_next_level(false, &table, offsets[level]); > ... the mapping may be replaced here, but a new mapping will then still > be held by this function and ... > >> + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) >> + break; >> + >> + ASSERT(level); >> + } >> + >> + if ( pte_level ) >> + *pte_level = level; >> + >> + return table + offsets[level]; >> +} > ... the final one then be transferred to the caller. > >> +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >> +{ >> + return *_pt_walk(va, pte_level); >> +} > Hence aiui there needs to be an unmap operation here. Agree, it should be an unmap here. I will update that in the next patch version. Thanks. ~ Oleksii
On 2/10/25 5:32 PM, Jan Beulich wrote: > On 07.02.2025 14:13, Oleksii Kurochko wrote: >> --- a/xen/arch/riscv/pt.c >> +++ b/xen/arch/riscv/pt.c >> @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) >> return XEN_TABLE_NORMAL; >> } >> >> +/* >> + * _pt_walk() performs software page table walking and returns the pte_t of >> + * a leaf node or the leaf-most not-present pte_t if no leaf node is found >> + * for further analysis. >> + * Additionally, _pt_walk() returns the level of the found pte. > That's optional, which I think wants expressing here. > >> + */ >> +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) >> +{ >> + const mfn_t root = get_root_page(); >> + unsigned int level; >> + pte_t *table; >> + >> + DECLARE_OFFSETS(offsets, va); >> + >> + table = map_table(root); > This mapping operation doesn't look to have a counterpart. Aiui ... > >> + /* >> + * Find `table` of an entry which corresponds to `va` by iterating for each >> + * page level and checking if the entry points to a next page table or >> + * to a page. >> + * >> + * Two cases are possible: >> + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; >> + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If >> + * pt_next_level() is called for page table level 0, it results in the >> + * entry being a pointer to a leaf node, thereby returning >> + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. >> + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually >> + * mapped. >> + */ >> + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) >> + { >> + int ret = pt_next_level(false, &table, offsets[level]); > ... the mapping may be replaced here, but a new mapping will then still > be held by this function and ... > >> + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) >> + break; >> + >> + ASSERT(level); >> + } >> + >> + if ( pte_level ) >> + *pte_level = level; >> + >> + return table + offsets[level]; >> +} > ... the final one then be transferred to the caller. > >> +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >> +{ >> + return *_pt_walk(va, pte_level); >> +} > Hence aiui there needs to be an unmap operation here. As _pt_walk() returns page table entry, it is needed to transform entry to table. Do we have any function in Xen for that? Or the best I can do is: pte_t *table = *_pt_walk(va, pte_level) - TABLE_OFFSET(pt_linear_offset(*pte_level, va) (of course, it is needed to check if pte_level isn't null and do some extra actions if it is NULL) As an option, as all page tables are PAGE_SIZE aligned, we could use PAGE_OFFSET(): pte_t *entry = _pt_walk(va, pte_level); pte_t *table = PAGE_OFFSET(entry); ~ Oleksii
On 12.02.2025 12:13, Oleksii Kurochko wrote: > > On 2/10/25 5:32 PM, Jan Beulich wrote: >> On 07.02.2025 14:13, Oleksii Kurochko wrote: >>> --- a/xen/arch/riscv/pt.c >>> +++ b/xen/arch/riscv/pt.c >>> @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) >>> return XEN_TABLE_NORMAL; >>> } >>> >>> +/* >>> + * _pt_walk() performs software page table walking and returns the pte_t of >>> + * a leaf node or the leaf-most not-present pte_t if no leaf node is found >>> + * for further analysis. >>> + * Additionally, _pt_walk() returns the level of the found pte. >> That's optional, which I think wants expressing here. >> >>> + */ >>> +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) >>> +{ >>> + const mfn_t root = get_root_page(); >>> + unsigned int level; >>> + pte_t *table; >>> + >>> + DECLARE_OFFSETS(offsets, va); >>> + >>> + table = map_table(root); >> This mapping operation doesn't look to have a counterpart. Aiui ... >> >>> + /* >>> + * Find `table` of an entry which corresponds to `va` by iterating for each >>> + * page level and checking if the entry points to a next page table or >>> + * to a page. >>> + * >>> + * Two cases are possible: >>> + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; >>> + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If >>> + * pt_next_level() is called for page table level 0, it results in the >>> + * entry being a pointer to a leaf node, thereby returning >>> + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. >>> + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually >>> + * mapped. >>> + */ >>> + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) >>> + { >>> + int ret = pt_next_level(false, &table, offsets[level]); >> ... the mapping may be replaced here, but a new mapping will then still >> be held by this function and ... >> >>> + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) >>> + break; >>> + >>> + ASSERT(level); >>> + } >>> + >>> + if ( pte_level ) >>> + *pte_level = level; >>> + >>> + return table + offsets[level]; >>> +} >> ... the final one then be transferred to the caller. >> >>> +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >>> +{ >>> + return *_pt_walk(va, pte_level); >>> +} >> Hence aiui there needs to be an unmap operation here. > > As _pt_walk() returns page table entry, it is needed to transform entry to table. > > Do we have any function in Xen for that? I don't understand. Both unmap_domain_page() and pmap_unmap() ignore the low bits of the passed in address. Jan > Or the best I can do is: > pte_t *table = *_pt_walk(va, pte_level) - TABLE_OFFSET(pt_linear_offset(*pte_level, va) > (of course, it is needed to check if pte_level isn't null and do some extra actions if it is NULL) > > As an option, as all page tables are PAGE_SIZE aligned, we could use PAGE_OFFSET(): > pte_t *entry = _pt_walk(va, pte_level); > pte_t *table = PAGE_OFFSET(entry); > > ~ Oleksii > >
On 2/12/25 12:27 PM, Jan Beulich wrote: > On 12.02.2025 12:13, Oleksii Kurochko wrote: >> On 2/10/25 5:32 PM, Jan Beulich wrote: >>> On 07.02.2025 14:13, Oleksii Kurochko wrote: >>>> --- a/xen/arch/riscv/pt.c >>>> +++ b/xen/arch/riscv/pt.c >>>> @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) >>>> return XEN_TABLE_NORMAL; >>>> } >>>> >>>> +/* >>>> + * _pt_walk() performs software page table walking and returns the pte_t of >>>> + * a leaf node or the leaf-most not-present pte_t if no leaf node is found >>>> + * for further analysis. >>>> + * Additionally, _pt_walk() returns the level of the found pte. >>> That's optional, which I think wants expressing here. >>> >>>> + */ >>>> +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) >>>> +{ >>>> + const mfn_t root = get_root_page(); >>>> + unsigned int level; >>>> + pte_t *table; >>>> + >>>> + DECLARE_OFFSETS(offsets, va); >>>> + >>>> + table = map_table(root); >>> This mapping operation doesn't look to have a counterpart. Aiui ... >>> >>>> + /* >>>> + * Find `table` of an entry which corresponds to `va` by iterating for each >>>> + * page level and checking if the entry points to a next page table or >>>> + * to a page. >>>> + * >>>> + * Two cases are possible: >>>> + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; >>>> + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If >>>> + * pt_next_level() is called for page table level 0, it results in the >>>> + * entry being a pointer to a leaf node, thereby returning >>>> + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. >>>> + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually >>>> + * mapped. >>>> + */ >>>> + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) >>>> + { >>>> + int ret = pt_next_level(false, &table, offsets[level]); >>> ... the mapping may be replaced here, but a new mapping will then still >>> be held by this function and ... >>> >>>> + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) >>>> + break; >>>> + >>>> + ASSERT(level); >>>> + } >>>> + >>>> + if ( pte_level ) >>>> + *pte_level = level; >>>> + >>>> + return table + offsets[level]; >>>> +} >>> ... the final one then be transferred to the caller. >>> >>>> +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) >>>> +{ >>>> + return *_pt_walk(va, pte_level); >>>> +} >>> Hence aiui there needs to be an unmap operation here. >> As _pt_walk() returns page table entry, it is needed to transform entry to table. >> >> Do we have any function in Xen for that? > I don't understand. Both unmap_domain_page() and pmap_unmap() ignore > the low bits of the passed in address. Missed that. Then it is safe to use unmap_table() with page table entry. Thanks. ~ Oleksii >> Or the best I can do is: >> pte_t *table = *_pt_walk(va, pte_level) - TABLE_OFFSET(pt_linear_offset(*pte_level, va) >> (of course, it is needed to check if pte_level isn't null and do some extra actions if it is NULL) >> >> As an option, as all page tables are PAGE_SIZE aligned, we could use PAGE_OFFSET(): >> pte_t *entry = _pt_walk(va, pte_level); >> pte_t *table = PAGE_OFFSET(entry); >> >> ~ Oleksii >> >>
diff --git a/xen/arch/riscv/include/asm/page.h b/xen/arch/riscv/include/asm/page.h index 7a6174a109..0439a1a9ee 100644 --- a/xen/arch/riscv/include/asm/page.h +++ b/xen/arch/riscv/include/asm/page.h @@ -208,6 +208,8 @@ static inline pte_t pte_from_mfn(mfn_t mfn, unsigned int flags) return (pte_t){ .pte = pte }; } +pte_t pt_walk(vaddr_t va, unsigned int *pte_level); + #endif /* __ASSEMBLY__ */ #endif /* ASM__RISCV__PAGE_H */ diff --git a/xen/arch/riscv/pt.c b/xen/arch/riscv/pt.c index a703e0f1bd..66cb976b55 100644 --- a/xen/arch/riscv/pt.c +++ b/xen/arch/riscv/pt.c @@ -185,6 +185,57 @@ static int pt_next_level(bool alloc_tbl, pte_t **table, unsigned int offset) return XEN_TABLE_NORMAL; } +/* + * _pt_walk() performs software page table walking and returns the pte_t of + * a leaf node or the leaf-most not-present pte_t if no leaf node is found + * for further analysis. + * Additionally, _pt_walk() returns the level of the found pte. + */ +static pte_t *_pt_walk(vaddr_t va, unsigned int *pte_level) +{ + const mfn_t root = get_root_page(); + unsigned int level; + pte_t *table; + + DECLARE_OFFSETS(offsets, va); + + table = map_table(root); + + /* + * Find `table` of an entry which corresponds to `va` by iterating for each + * page level and checking if the entry points to a next page table or + * to a page. + * + * Two cases are possible: + * - ret == XEN_TABLE_SUPER_PAGE means that the entry was found; + * (Despite the name) XEN_TABLE_SUPER_PAGE also covers 4K mappings. If + * pt_next_level() is called for page table level 0, it results in the + * entry being a pointer to a leaf node, thereby returning + * XEN_TABLE_SUPER_PAGE, despite of the fact this leaf covers 4k mapping. + * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually + * mapped. + */ + for ( level = HYP_PT_ROOT_LEVEL; ; --level ) + { + int ret = pt_next_level(false, &table, offsets[level]); + + if ( ret == XEN_TABLE_MAP_NONE || ret == XEN_TABLE_SUPER_PAGE ) + break; + + ASSERT(level); + } + + if ( pte_level ) + *pte_level = level; + + return table + offsets[level]; +} + +pte_t pt_walk(vaddr_t va, unsigned int *pte_level) +{ + return *_pt_walk(va, pte_level); +} + /* Update an entry at the level @target. */ static int pt_update_entry(mfn_t root, vaddr_t virt, mfn_t mfn, unsigned int target,
RISC-V doesn't have hardware feature to ask MMU to translate virtual address to physical address ( like Arm has, for example ), so software page table walking is implemented. Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> --- Changes in v3: - Remove circular dependency. - Move declaration of pt_walk() to asm/page.h. - Revert other not connected to pt_walk() changes. - Update the commit message. - Drop unnessary anymore local variables of pt_walk(). - Refactor pt_walk() to use for() loop instead of while() loop as it was suggested by Jan B. - Introduce _pt_walk() which returns pte_t * and update prototype of pt_walk() to return pte_t by value. --- Changes in v2: - Update the code of pt_walk() to return pte_t instead of paddr_t. - Fix typos and drop blankets inside parentheses in the comment. - Update the `ret` handling; there is no need for `mfn` calculation anymore as pt_walk() returns or pte_t of a leaf node or non-present pte_t. - Drop the comment before unmap_table(). - Drop local variable `pa` as pt_walk() is going to return pte_t instead of paddr_t. - Add the comment above pt_walk() to explain what it does and returns. - Add additional explanation about possible return values of pt_next_level() used inside while loop in pt_walk(). - Change `pa` to `table` in the comment before while loop in pt_walk() as actually this loop finds a page table where paga table entry for `va` is located. - After including <asm/page.h> in <asm/mm.h>, the following compilation error occurs: ./arch/riscv/include/asm/cmpxchg.h:56:9: error: implicit declaration of function 'GENMASK' To resolve this, <xen/bitops.h> needs to be included at the top of <asm/cmpxchg.h>. - To avoid an issue with the implicit declaration of map_domain_page() and unmap_domain_page() after including <asm/page.h> in <asm/mm.h>, the implementation of flush_page_to_ram() has been moved to mm.c. (look for more detailed explanation in the commit message) As a result drop inclusion of <xen/domain.h> in <asm/page.h>. - Update the commit message. --- xen/arch/riscv/include/asm/page.h | 2 ++ xen/arch/riscv/pt.c | 51 +++++++++++++++++++++++++++++++ 2 files changed, 53 insertions(+)