Message ID | 20170430194838.29932-5-proskurin@sec.in.tum.de (mailing list archive) |
---|---|
State | New, archived |
On 04/30/2017 10:48 PM, Sergej Proskurin wrote:
> The function p2m_mem_access_check_and_get_page in mem_access.c
> translates a gva to an ipa by means of the hardware functionality
> implemented in the function gva_to_ipa. If mem_access is active,
> hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
> guest's translation tables, access to which might be restricted by the
> active VTTBR. To address this issue, we perform the gva to ipa
> translation in software.
>
> Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
> ---
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
> xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 139 insertions(+), 1 deletion(-)

My ARM knowledge is scant to say the least, and I have no way to test
this code, so I'll leave it to Tamas who has done some ARM work in the
past. In any case - to state the obvious - the main acks here I believe
are Julien and Stefano.


Thanks,
Razvan
Hi Razvan,

Sure thing. Thanks anyway :)

Cheers,
~Sergej

On 05/01/2017 04:38 PM, Razvan Cojocaru wrote:
> On 04/30/2017 10:48 PM, Sergej Proskurin wrote:
>> The function p2m_mem_access_check_and_get_page in mem_access.c
>> translates a gva to an ipa by means of the hardware functionality
>> implemented in the function gva_to_ipa. If mem_access is active,
>> hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
>> guest's translation tables, access to which might be restricted by the
>> active VTTBR. To address this issue, we perform the gva to ipa
>> translation in software.
>>
>> Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
>> ---
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Julien Grall <julien.grall@arm.com>
>> ---
>> xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 139 insertions(+), 1 deletion(-)
> My ARM knowledge is scant to say the least, and I have no way to test
> this code, so I'll leave it to Tamas who has done some ARM work in the
> past. In any case - to state the obvious - the main acks here I believe
> are Julien and Stefano.
>
>
> Thanks,
> Razvan
Hi Sergej,

On 30/04/17 20:48, Sergej Proskurin wrote:
> The function p2m_mem_access_check_and_get_page in mem_access.c
> translates a gva to an ipa by means of the hardware functionality
> implemented in the function gva_to_ipa. If mem_access is active,
> hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
> guest's translation tables, access to which might be restricted by the
> active VTTBR. To address this issue, we perform the gva to ipa
> translation in software.
>
> Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
> ---
> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
> Cc: Tamas K Lengyel <tamas@tklengyel.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Julien Grall <julien.grall@arm.com>
> ---
> xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 139 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
> index 04b1506b00..352eb6073f 100644
> --- a/xen/arch/arm/mem_access.c
> +++ b/xen/arch/arm/mem_access.c
> @@ -20,6 +20,7 @@
>  #include <xen/monitor.h>
>  #include <xen/sched.h>
>  #include <xen/vm_event.h>
> +#include <xen/domain_page.h>
>  #include <public/vm_event.h>
>  #include <asm/event.h>
>
> @@ -90,6 +91,129 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
>      return 0;
>  }
>
> +static int
> +p2m_gva_to_ipa(struct p2m_domain *p2m, vaddr_t gva,
> +               paddr_t *ipa, unsigned int flags)

I don't think this function belongs in mem_access.c. It would be better
in p2m.c. Also, the name is a bit confusing; a user would not know the
difference between p2m_gva_to_ipa and gva_to_ipa.

> +{
> +    int level=0, t0_sz, t1_sz;

Coding style: level = 0.

Also please use unsigned int for level.

t0_sz and t1_sz should stay signed.

> +    unsigned long t0_max, t1_min;
> +    lpae_t pte, *table;
> +    mfn_t root_mfn;
> +    uint64_t ttbr;
> +    uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
> +    register_t ttbcr = READ_SYSREG(TCR_EL1);

So you are assuming that the current vCPU is running, right? If so,
this should be clarified in the commit message and in a comment above
the function.

Also s/ttbcr/tcr/

> +    struct domain *d = p2m->domain;
> +
> +    const unsigned int offsets[4] = {
> +#ifdef CONFIG_ARM_64
> +        zeroeth_table_offset(gva),
> +#endif
> +        first_table_offset(gva),
> +        second_table_offset(gva),
> +        third_table_offset(gva)
> +    };

Offsets are based on the granularity of Xen (currently 4KB). However,
guests can use 4KB, 16KB, or 64KB granules. Please handle all of them
correctly.

> +
> +    const paddr_t masks[4] = {
> +#ifdef CONFIG_ARM_64
> +        ZEROETH_SIZE - 1,
> +#endif
> +        FIRST_SIZE - 1,
> +        SECOND_SIZE - 1,
> +        THIRD_SIZE - 1
> +    };
> +
> +    /* If the MMU is disabled, there is no need to translate the gva. */
> +    if ( !(sctlr & SCTLR_M) )
> +    {
> +        *ipa = gva;
> +
> +        return 0;
> +    }
> +
> +    if ( is_32bit_domain(d) )
> +    {
> +        /*
> +         * XXX: We do not support 32-bit domain translation table walks for
> +         * domains using the short-descriptor translation table format, yet.
> +         */

Debian ARM 32-bit uses the short-descriptor translation table format.
So are you suggesting that mem_access will not work correctly with
Debian guests?

> +        if ( !(ttbcr & TTBCR_EAE) )

See my comment on patch #2 about the naming convention.

> +            return -EFAULT;
> +
> +#ifdef CONFIG_ARM_64
> +        level = 1;

This sounds wrong to me. The first lookup level is determined by the
value of T0SZ and T1SZ (see B3-1352 in ARM DDI 0406C.c).

For the AArch64 version see D4.2.5 in ARM DDI 0487A.k_iss10775.

> +#endif
> +    }
> +
> +#ifdef CONFIG_ARM_64
> +    if ( is_64bit_domain(d) )
> +    {
> +        /* Get the max GVA that can be translated by TTBR0. */
> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
> +        t0_max = (1UL << (64 - t0_sz)) - 1;
> +
> +        /* Get the min GVA that can be translated by TTBR1. */
> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
> +        t1_min = ~0UL - (1UL << (64 - t1_sz)) + 1;
> +    }
> +    else
> +#endif
> +    {
> +        /* Get the max GVA that can be translated by TTBR0. */
> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TTBCR_SZ_MASK;

See my comment on patch #2 for the naming convention.

> +        t0_max = (1U << (32 - t0_sz)) - 1;
> +
> +        /* Get the min GVA that can be translated by TTBR1. */
> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TTBCR_SZ_MASK;
> +        t1_min = ~0U - (1U << (32 - t1_sz)) + 1;

1U << (32 - t1_sz) will not fit in an unsigned long if t1_sz = 0.

Also, this code looks wrong to me. Looking at B3.6.4 in DDI 0406C.c,
you don't handle t0_max and t1_min properly when T0SZ or T1SZ is 0.

I think it would be worth having a look at the pseudo-code for
TranslationTableWalk in the ARM ARM (see B3.19.3 in ARM DDI 0406C.c
and J1-5295 in ARM DDI 0487A.k_iss10775), which gives you all the
details for doing a translation table walk properly.

> +    }
> +
> +    if ( t0_max >= gva )
> +        /* Use TTBR0 for GVA to IPA translation. */
> +        ttbr = READ_SYSREG64(TTBR0_EL1);
> +    else if ( t1_min <= gva )
> +        /* Use TTBR1 for GVA to IPA translation. */
> +        ttbr = READ_SYSREG64(TTBR1_EL1);
> +    else
> +        /* GVA out of bounds of TTBR(0|1). */
> +        return -EFAULT;
> +
> +    /* Bits [63..48] might be used by an ASID. */
> +    root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & ((1ULL<<48)-1))), NULL);

Please don't hardcode the mask.

> +
> +    /* Check, whether TTBR holds a valid address. */
> +    if ( mfn_eq(root_mfn, INVALID_MFN) )
> +        return -EFAULT;
> +
> +    table = map_domain_page(root_mfn);

Nothing prevents the guest from giving back the page table whilst you
are browsing it. You may want to have a look at the patch "ARM:
introduce vgic_access_guest_memory()" [1] to generalize that function
and avoid re-inventing the wheel.

> +
> +    for ( ; ; level++ )
> +    {
> +        pte = table[offsets[level]];

You likely want to use ACCESS_ONCE here to prevent the compiler from
reading the pte twice from memory.

> +
> +        if ( level == 3 || !pte.walk.valid || !pte.walk.table )
> +            break;
> +
> +        unmap_domain_page(table);
> +
> +        root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);

No check on root_mfn?

> +        table = map_domain_page(root_mfn);
> +    }
> +
> +    unmap_domain_page(table);
> +
> +    if ( !pte.walk.valid )
> +        return -EFAULT;
> +
> +    /* Make sure the entry holds the requested access attributes. */
> +    if ( ((flags & GV2M_WRITE) == GV2M_WRITE) && pte.pt.ro )
> +        return -EFAULT;

IMHO, this function should return the access attributes of the page and
let the caller decide what to do. This would make the function more
generic.

> +
> +    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
> +
> +    return 0;
> +}
> +
> +
>  /*
>   * If mem_access is in use it might have been the reason why get_page_from_gva
>   * failed to fetch the page, as it uses the MMU for the permission checking.
> @@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>      struct page_info *page = NULL;
>      struct p2m_domain *p2m = &v->domain->arch.p2m;
>
> +    ASSERT(p2m->mem_access_enabled);
> +

Why has this ASSERT been added?

>      rc = gva_to_ipa(gva, &ipa, flag);
> +
> +    /*
> +     * In case mem_access is active, hardware-based gva_to_ipa translation
> +     * might fail. Since gva_to_ipa uses the guest's translation tables, access
> +     * to which might be restricted by the active VTTBR, we perform a gva to
> +     * ipa translation in software.
> +     */
>      if ( rc < 0 )
> -        goto err;
> +        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
> +            /*
> +             * The software gva to ipa translation can still fail, if the the
> +             * gva is not mapped or does not hold the requested access rights.
> +             */
> +            goto err;

Rather than falling back, why don't we do a software page table walk
every time?

>
>      gfn = _gfn(paddr_to_pfn(ipa));
>

Cheers,

[1] https://patchwork.kernel.org/patch/9676291/
Hi Julien,

On 05/02/2017 05:17 PM, Julien Grall wrote:
> Hi Sergej,
>
> On 30/04/17 20:48, Sergej Proskurin wrote:
>> The function p2m_mem_access_check_and_get_page in mem_access.c
>> translates a gva to an ipa by means of the hardware functionality
>> implemented in the function gva_to_ipa. If mem_access is active,
>> hardware-based gva to ipa translation might fail, as gva_to_ipa uses the
>> guest's translation tables, access to which might be restricted by the
>> active VTTBR. To address this issue, we perform the gva to ipa
>> translation in software.
>>
>> Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
>> ---
>> Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
>> Cc: Tamas K Lengyel <tamas@tklengyel.com>
>> Cc: Stefano Stabellini <sstabellini@kernel.org>
>> Cc: Julien Grall <julien.grall@arm.com>
>> ---
>> xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 139 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
>> index 04b1506b00..352eb6073f 100644
>> --- a/xen/arch/arm/mem_access.c
>> +++ b/xen/arch/arm/mem_access.c
>> @@ -20,6 +20,7 @@
>>  #include <xen/monitor.h>
>>  #include <xen/sched.h>
>>  #include <xen/vm_event.h>
>> +#include <xen/domain_page.h>
>>  #include <public/vm_event.h>
>>  #include <asm/event.h>
>>
>> @@ -90,6 +91,129 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
>>      return 0;
>>  }
>>
>> +static int
>> +p2m_gva_to_ipa(struct p2m_domain *p2m, vaddr_t gva,
>> +               paddr_t *ipa, unsigned int flags)
>
> I don't think this function belongs in mem_access.c. It would be
> better in p2m.c. Also, the name is a bit confusing; a user would not
> know the difference between p2m_gva_to_ipa and gva_to_ipa.

Fair enough, thank you.

>
>> +{
>> +    int level=0, t0_sz, t1_sz;
>
> Coding style: level = 0.
>
> Also please use unsigned int for level.
>
> t0_sz and t1_sz should stay signed.
>
>> +    unsigned long t0_max, t1_min;
>> +    lpae_t pte, *table;
>> +    mfn_t root_mfn;
>> +    uint64_t ttbr;
>> +    uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
>> +    register_t ttbcr = READ_SYSREG(TCR_EL1);
>
> So you are assuming that the current vCPU is running, right? If so,
> this should be clarified in the commit message and in a comment above
> the function.

I will state that in a comment, thank you.

>
> Also s/ttbcr/tcr/
>
>> +    struct domain *d = p2m->domain;
>> +
>> +    const unsigned int offsets[4] = {
>> +#ifdef CONFIG_ARM_64
>> +        zeroeth_table_offset(gva),
>> +#endif
>> +        first_table_offset(gva),
>> +        second_table_offset(gva),
>> +        third_table_offset(gva)
>> +    };
>
> Offsets are based on the granularity of Xen (currently 4KB). However,
> guests can use 4KB, 16KB, or 64KB granules. Please handle all of them
> correctly.

True, ARM is quite flexible. Yet, I have not seen an OS implementation
that is supported by Xen and makes use of page sizes other than 4KB and
their supersets (2/4MB, 1GB) (please let me know if you know some),
which is why I doubt that we need that. Please let me know why you think
we need that kind of flexibility in software?

If you should nevertheless insist on that functionality, I would include
it in the next patch and try not to blow up the code too much.

>
>> +
>> +    const paddr_t masks[4] = {
>> +#ifdef CONFIG_ARM_64
>> +        ZEROETH_SIZE - 1,
>> +#endif
>> +        FIRST_SIZE - 1,
>> +        SECOND_SIZE - 1,
>> +        THIRD_SIZE - 1
>> +    };
>> +
>> +    /* If the MMU is disabled, there is no need to translate the gva. */
>> +    if ( !(sctlr & SCTLR_M) )
>> +    {
>> +        *ipa = gva;
>> +
>> +        return 0;
>> +    }
>> +
>> +    if ( is_32bit_domain(d) )
>> +    {
>> +        /*
>> +         * XXX: We do not support 32-bit domain translation table walks for
>> +         * domains using the short-descriptor translation table format, yet.
>> +         */
>
> Debian ARM 32-bit uses the short-descriptor translation table format.
> So are you suggesting that mem_access will not work correctly with
> Debian guests?

Yes, as stated in the comment, the current implementation does not
support the short-descriptor translation table format. As this is an RFC
patch, I wanted to see your opinion on that functionality in general
before implementing all corner cases of the ARM architecture.

As mentioned in my previous reply in patch (patch 2/4), I would prefer
to separate the long-descriptor from the short-descriptor translation in
the future implementation.

>> +        if ( !(ttbcr & TTBCR_EAE) )
>
> See my comment on patch #2 about the naming convention.
>

Done, thanks.

>> +            return -EFAULT;
>> +
>> +#ifdef CONFIG_ARM_64
>> +        level = 1;
>
> This sounds wrong to me. The first lookup level is determined by the
> value of T0SZ and T1SZ (see B3-1352 in ARM DDI 0406C.c).
>
> For the AArch64 version see D4.2.5 in ARM DDI 0487A.k_iss10775.
>

Thank you for the hint.

>> +#endif
>> +    }
>> +
>> +#ifdef CONFIG_ARM_64
>> +    if ( is_64bit_domain(d) )
>> +    {
>> +        /* Get the max GVA that can be translated by TTBR0. */
>> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
>> +        t0_max = (1UL << (64 - t0_sz)) - 1;
>> +
>> +        /* Get the min GVA that can be translated by TTBR1. */
>> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
>> +        t1_min = ~0UL - (1UL << (64 - t1_sz)) + 1;
>> +    }
>> +    else
>> +#endif
>> +    {
>> +        /* Get the max GVA that can be translated by TTBR0. */
>> +        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TTBCR_SZ_MASK;
>
> See my comment on patch #2 for the naming convention.
>
>> +        t0_max = (1U << (32 - t0_sz)) - 1;
>> +
>> +        /* Get the min GVA that can be translated by TTBR1. */
>> +        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TTBCR_SZ_MASK;
>> +        t1_min = ~0U - (1U << (32 - t1_sz)) + 1;
>
> 1U << (32 - t1_sz) will not fit in an unsigned long if t1_sz = 0.

An unsigned long long should fix that (however not in the 64-bit case),
thanks.

>
> Also, this code looks wrong to me. Looking at B3.6.4 in DDI 0406C.c,
> you don't handle t0_max and t1_min properly when T0SZ or T1SZ is 0.
>
> I think it would be worth having a look at the pseudo-code for
> TranslationTableWalk in the ARM ARM (see B3.19.3 in ARM DDI 0406C.c
> and J1-5295 in ARM DDI 0487A.k_iss10775), which gives you all the
> details for doing a translation table walk properly.

I will have a deeper look and adopt parts of the given pseudocode in my
implementation. You are right: there might be an off-by-one issue in the
computation above.

>
>> +    }
>> +
>> +    if ( t0_max >= gva )
>> +        /* Use TTBR0 for GVA to IPA translation. */
>> +        ttbr = READ_SYSREG64(TTBR0_EL1);
>> +    else if ( t1_min <= gva )
>> +        /* Use TTBR1 for GVA to IPA translation. */
>> +        ttbr = READ_SYSREG64(TTBR1_EL1);
>> +    else
>> +        /* GVA out of bounds of TTBR(0|1). */
>> +        return -EFAULT;
>> +
>> +    /* Bits [63..48] might be used by an ASID. */
>> +    root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & ((1ULL<<48)-1))), NULL);
>
> Please don't hardcode the mask.
>
>> +
>> +    /* Check, whether TTBR holds a valid address. */
>> +    if ( mfn_eq(root_mfn, INVALID_MFN) )
>> +        return -EFAULT;
>> +
>> +    table = map_domain_page(root_mfn);
>
> Nothing prevents the guest from giving back the page table whilst you
> are browsing it. You may want to have a look at the patch "ARM:
> introduce vgic_access_guest_memory()" [1] to generalize that function
> and avoid re-inventing the wheel.
>

I will have a look at this commit, thanks.

>> +
>> +    for ( ; ; level++ )
>> +    {
>> +        pte = table[offsets[level]];
>
> You likely want to use ACCESS_ONCE here to prevent the compiler from
> reading the pte twice from memory.

I tried that, however the compiler did not seem to like it. I will try
again and let you know if I got that to work.

>
>> +
>> +        if ( level == 3 || !pte.walk.valid || !pte.walk.table )
>> +            break;
>> +
>> +        unmap_domain_page(table);
>> +
>> +        root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);
>
> No check on root_mfn?

Done, thanks.

>
>> +        table = map_domain_page(root_mfn);
>> +    }
>> +
>> +    unmap_domain_page(table);
>> +
>> +    if ( !pte.walk.valid )
>> +        return -EFAULT;
>> +
>> +    /* Make sure the entry holds the requested access attributes. */
>> +    if ( ((flags & GV2M_WRITE) == GV2M_WRITE) && pte.pt.ro )
>> +        return -EFAULT;
>
> IMHO, this function should return the access attributes of the page and
> let the caller decide what to do. This would make the function more
> generic.
>

Makes sense. This would indeed make the function more generic. Since
there are already multiple gpt-walk stubs throughout the codebase, we
could make use of only one.

>> +
>> +    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
>> +
>> +    return 0;
>> +}
>> +
>> +
>>  /*
>>   * If mem_access is in use it might have been the reason why
>> get_page_from_gva
>>   * failed to fetch the page, as it uses the MMU for the permission
>> checking.
>> @@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
>>      struct page_info *page = NULL;
>>      struct p2m_domain *p2m = &v->domain->arch.p2m;
>>
>> +    ASSERT(p2m->mem_access_enabled);
>> +
>
> Why has this ASSERT been added?

The function p2m_mem_access_check_and_get_page is called only if
get_page_from_gva fails and mem_access is active. I can add a comment
and move this part into an individual commit.

>
>>      rc = gva_to_ipa(gva, &ipa, flag);
>> +
>> +    /*
>> +     * In case mem_access is active, hardware-based gva_to_ipa translation
>> +     * might fail. Since gva_to_ipa uses the guest's translation tables,
>> +     * access to which might be restricted by the active VTTBR, we perform
>> +     * a gva to ipa translation in software.
>> +     */
>>      if ( rc < 0 )
>> -        goto err;
>> +        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
>> +            /*
>> +             * The software gva to ipa translation can still fail, if the
>> +             * gva is not mapped or does not hold the requested access
>> +             * rights.
>> +             */
>> +            goto err;
>
> Rather than falling back, why don't we do a software page table walk
> every time?

A software page table walk would (presumably) take more time to
translate the gva. Do we want that? Also, I am not sure what you meant
by "falling back" at this point. Thank you.

>
>>
>>      gfn = _gfn(paddr_to_pfn(ipa));
>>
>
> Cheers,
>
> [1] https://patchwork.kernel.org/patch/9676291/
>

Cheers,
~Sergej
Hi,

On 08/05/17 10:22, Sergej Proskurin wrote:
> On 05/02/2017 05:17 PM, Julien Grall wrote:
>> On 30/04/17 20:48, Sergej Proskurin wrote:
>> Also s/ttbcr/tcr/
>>
>>> +    struct domain *d = p2m->domain;
>>> +
>>> +    const unsigned int offsets[4] = {
>>> +#ifdef CONFIG_ARM_64
>>> +        zeroeth_table_offset(gva),
>>> +#endif
>>> +        first_table_offset(gva),
>>> +        second_table_offset(gva),
>>> +        third_table_offset(gva)
>>> +    };
>>
>> Offsets are based on the granularity of Xen (currently 4KB). However,
>> guests can use 4KB, 16KB, or 64KB granules. Please handle all of them
>> correctly.
>
> True, ARM is quite flexible. Yet, I have not seen an OS implementation
> that is supported by Xen and makes use of page sizes other than 4KB and
> their supersets (2/4MB, 1GB) (please let me know if you know some),
> which is why I doubt that we need that. Please let me know why you think
> we need that kind of flexibility in software?
>
> If you should nevertheless insist on that functionality, I would include
> it in the next patch and try not to blow up the code too much.

Linux is able to support 4KB, 16KB, and 64KB page granularity for
AArch64. CentOS and Red Hat are only shipped with 64KB page granularity.

>
>>
>>> +
>>> +    const paddr_t masks[4] = {
>>> +#ifdef CONFIG_ARM_64
>>> +        ZEROETH_SIZE - 1,
>>> +#endif
>>> +        FIRST_SIZE - 1,
>>> +        SECOND_SIZE - 1,
>>> +        THIRD_SIZE - 1
>>> +    };
>>> +
>>> +    /* If the MMU is disabled, there is no need to translate the gva. */
>>> +    if ( !(sctlr & SCTLR_M) )
>>> +    {
>>> +        *ipa = gva;
>>> +
>>> +        return 0;
>>> +    }
>>> +
>>> +    if ( is_32bit_domain(d) )
>>> +    {
>>> +        /*
>>> +         * XXX: We do not support 32-bit domain translation table
>>> +         * walks for domains using the short-descriptor translation
>>> +         * table format, yet.
>>> +         */
>>
>> Debian ARM 32-bit uses the short-descriptor translation table format.
>> So are you suggesting that mem_access will not work correctly with
>> Debian guests?
>>
>
> Yes, as stated in the comment, the current implementation does not
> support the short-descriptor translation table format. As this is an RFC
> patch, I wanted to see your opinion on that functionality in general
> before implementing all corner cases of the ARM architecture.
>
> As mentioned in my previous reply in patch (patch 2/4), I would prefer
> to separate the long-descriptor from the short-descriptor translation in
> the future implementation.

I agree here. See my answer on patch #2 about how I would like to see
the implementation.

[...]

>>> +
>>> +    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +
>>>  /*
>>>   * If mem_access is in use it might have been the reason why
>>> get_page_from_gva
>>>   * failed to fetch the page, as it uses the MMU for the permission
>>> checking.
>>> @@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva,
>>> unsigned long flag,
>>>      struct page_info *page = NULL;
>>>      struct p2m_domain *p2m = &v->domain->arch.p2m;
>>>
>>> +    ASSERT(p2m->mem_access_enabled);
>>> +
>>
>> Why has this ASSERT been added?
>
> The function p2m_mem_access_check_and_get_page is called only if
> get_page_from_gva fails and mem_access is active. I can add a comment
> and move this part into an individual commit.

Whilst I agree it is dumb to call this code without mem_access enabled,
this code is able to cope with it. So why do you need this ASSERT?

>
>>
>>>      rc = gva_to_ipa(gva, &ipa, flag);
>>> +
>>> +    /*
>>> +     * In case mem_access is active, hardware-based gva_to_ipa
>>> +     * translation might fail. Since gva_to_ipa uses the guest's
>>> +     * translation tables, access to which might be restricted by the
>>> +     * active VTTBR, we perform a gva to ipa translation in software.
>>> +     */
>>>      if ( rc < 0 )
>>> -        goto err;
>>> +        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
>>> +            /*
>>> +             * The software gva to ipa translation can still fail,
>>> +             * if the gva is not mapped or does not hold the requested
>>> +             * access rights.
>>> +             */
>>> +            goto err;
>>
>> Rather than falling back, why don't we do a software page table walk
>> every time?
>
> A software page table walk would (presumably) take more time to
> translate the gva. Do we want that? Also, I am not sure what you meant
> by "falling back" at this point. Thank you.

What you currently do is try gva_to_ipa and if it does not work you
will call p2m_gva_to_ipa. This sounds a bit pointless to me and a waste
of time if the underlying memory of the stage-1 page table is protected.

Before saying this is taking much more time, I would like to see actual
numbers.

Cheers,
On Mon, May 8, 2017 at 5:44 AM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
> On 08/05/17 10:22, Sergej Proskurin wrote:
>>
>> On 05/02/2017 05:17 PM, Julien Grall wrote:
>>>
>>> On 30/04/17 20:48, Sergej Proskurin wrote:
>>> Also s/ttbcr/tcr/
>>>
>>>> +    struct domain *d = p2m->domain;
>>>> +
>>>> +    const unsigned int offsets[4] = {
>>>> +#ifdef CONFIG_ARM_64
>>>> +        zeroeth_table_offset(gva),
>>>> +#endif
>>>> +        first_table_offset(gva),
>>>> +        second_table_offset(gva),
>>>> +        third_table_offset(gva)
>>>> +    };
>>>
>>> Offsets are based on the granularity of Xen (currently 4KB). However,
>>> guests can use 4KB, 16KB, or 64KB granules. Please handle all of them
>>> correctly.
>>
>> True, ARM is quite flexible. Yet, I have not seen an OS implementation
>> that is supported by Xen and makes use of page sizes other than 4KB and
>> their supersets (2/4MB, 1GB) (please let me know if you know some),
>> which is why I doubt that we need that. Please let me know why you think
>> we need that kind of flexibility in software?
>>
>> If you should nevertheless insist on that functionality, I would include
>> it in the next patch and try not to blow up the code too much.
>
> Linux is able to support 4KB, 16KB, and 64KB page granularity for
> AArch64. CentOS and Red Hat are only shipped with 64KB page granularity.
>
>>
>>>
>>>> +
>>>> +    const paddr_t masks[4] = {
>>>> +#ifdef CONFIG_ARM_64
>>>> +        ZEROETH_SIZE - 1,
>>>> +#endif
>>>> +        FIRST_SIZE - 1,
>>>> +        SECOND_SIZE - 1,
>>>> +        THIRD_SIZE - 1
>>>> +    };
>>>> +
>>>> +    /* If the MMU is disabled, there is no need to translate the gva. */
>>>> +    if ( !(sctlr & SCTLR_M) )
>>>> +    {
>>>> +        *ipa = gva;
>>>> +
>>>> +        return 0;
>>>> +    }
>>>> +
>>>> +    if ( is_32bit_domain(d) )
>>>> +    {
>>>> +        /*
>>>> +         * XXX: We do not support 32-bit domain translation table
>>>> +         * walks for domains using the short-descriptor translation
>>>> +         * table format, yet.
>>>> +         */
>>>
>>> Debian ARM 32-bit uses the short-descriptor translation table format.
>>> So are you suggesting that mem_access will not work correctly with
>>> Debian guests?
>>>
>>
>> Yes, as stated in the comment, the current implementation does not
>> support the short-descriptor translation table format. As this is an RFC
>> patch, I wanted to see your opinion on that functionality in general
>> before implementing all corner cases of the ARM architecture.
>>
>> As mentioned in my previous reply in patch (patch 2/4), I would prefer
>> to separate the long-descriptor from the short-descriptor translation in
>> the future implementation.
>
> I agree here. See my answer on patch #2 about how I would like to see
> the implementation.
>
> [...]
>
>>>> +
>>>> +    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +
>>>>  /*
>>>>   * If mem_access is in use it might have been the reason why
>>>> get_page_from_gva
>>>>   * failed to fetch the page, as it uses the MMU for the permission
>>>> checking.
>>>> @@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva,
>>>> unsigned long flag,
>>>>      struct page_info *page = NULL;
>>>>      struct p2m_domain *p2m = &v->domain->arch.p2m;
>>>>
>>>> +    ASSERT(p2m->mem_access_enabled);
>>>> +
>>>
>>> Why has this ASSERT been added?
>>
>> The function p2m_mem_access_check_and_get_page is called only if
>> get_page_from_gva fails and mem_access is active. I can add a comment
>> and move this part into an individual commit.
>
> Whilst I agree it is dumb to call this code without mem_access enabled,
> this code is able to cope with it. So why do you need this ASSERT?
>
>>
>>>
>>>>      rc = gva_to_ipa(gva, &ipa, flag);
>>>> +
>>>> +    /*
>>>> +     * In case mem_access is active, hardware-based gva_to_ipa
>>>> +     * translation might fail. Since gva_to_ipa uses the guest's
>>>> +     * translation tables, access to which might be restricted by the
>>>> +     * active VTTBR, we perform a gva to ipa translation in software.
>>>> +     */
>>>>      if ( rc < 0 )
>>>> -        goto err;
>>>> +        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
>>>> +            /*
>>>> +             * The software gva to ipa translation can still fail,
>>>> +             * if the gva is not mapped or does not hold the requested
>>>> +             * access rights.
>>>> +             */
>>>> +            goto err;
>>>
>>> Rather than falling back, why don't we do a software page table walk
>>> every time?
>>
>> A software page table walk would (presumably) take more time to
>> translate the gva. Do we want that? Also, I am not sure what you meant
>> by "falling back" at this point. Thank you.
>
> What you currently do is try gva_to_ipa and if it does not work you
> will call p2m_gva_to_ipa. This sounds a bit pointless to me and a waste
> of time if the underlying memory of the stage-1 page table is protected.

But we don't know that the stage-1 page table is protected until the
hardware based lookup fails. I can turn your argument around and say
doing the software based lookup is pointless and a waste of time when
the stage-1 table is not protected. Which is, by the way, what I would
expect to see in most cases.

> Before saying this is taking much more time, I would like to see actual
> numbers.

Whether the performance is measurably different is going to be very
usecase specific. If the TLBs are already loaded with the translation
then the hardware lookup will be a lot faster. Setting up a test-case
for this is just a PITA and I don't really see why you are against
using the hardware lookup to start with.

Tamas
On 05/08/2017 08:42 PM, Tamas K Lengyel wrote:
> On Mon, May 8, 2017 at 5:44 AM, Julien Grall <julien.grall@arm.com> wrote:
> Whether the performance is measurably different is going to be very
> usecase specific. If the TLBs are already loaded with the translation
> then the hardware lookup will be a lot faster. Setting up a test-case
> for this is just a PITA and I don't really see why you are against
> using the hardware lookup to start with.

Well, if you had read my e-mail you would have noticed I am not against
it. I just find it a bit pointless to do both and was asking if you did
some benchmarking before taking this decision.

But it sounds like you are not willing to even consider it. Anyway, I am
not going to argue about that. It is not something I really care about
for now.

Cheers,
On Mon, May 8, 2017 at 2:53 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 05/08/2017 08:42 PM, Tamas K Lengyel wrote:
>>
>> On Mon, May 8, 2017 at 5:44 AM, Julien Grall <julien.grall@arm.com> wrote:
>> Whether the performance is measurably different is going to be very
>> usecase specific. If the TLBs are already loaded with the translation
>> then the hardware lookup will be a lot faster. Setting up a test-case
>> for this is just a PITA and I don't really see why you are against
>> using the hardware lookup to start with.
>
> Well, if you had read my e-mail you would have noticed I am not against
> it. I just find it a bit pointless to do both and was asking if you did
> some benchmarking before taking this decision.
>
> But it sounds like you are not willing to even consider it. Anyway, I am
> not going to argue about that. It is not something I really care about
> for now.

It did sound to me like you were pushing for it. I would prefer doing
hardware translation first and only falling back to software when
needed, unless we have some good incentive to do otherwise.

Tamas
Hi, >> What you currently do is try gva_to_ipa and if it does not work >> you will call p2m_gva_to_ipa. This sounds a bit pointless to me and >> waste of time if the underlying memory of stage-1 page table is >> protected. > > But we don't know that the stage-1 page table is protected until the > hardware based lookup fails. I can turn your argument around and say > doing the software based lookup is pointless and a waste of time > when the stage-1 table is not protected. Which is by the way what I > would expect to see in most cases. I agree with Tamas: I also believe that in most cases the stage-1 translation table won't be protected. So, in my opinion, falling back to software (which will presumably be a rare operation) is the better approach. Cheers, ~Sergej
On 05/09/2017 08:17 AM, Sergej Proskurin wrote: > Hi, > >>> What you currently do is try gva_to_ipa and if it does not work >> you will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>> waste of time if the underlying memory of stage-1 page table is >> > protected. > > But we don't know that the stage-1 page table is > protected until the > hardware based lookup fails. I can turn your > argument around and say > doing the software based lookup is pointless > and a waste of time > when the stage-1 table is not protected. Which is > by the way what I > would expect to see in most cases. > > I agree with Tamas: I also believe that in most cases the stage-1 > translation table won't be protected. So, in my opinion, falling back to > software (which will be presumablly a rare operation) is the better > approach. Well, you both consider that it is better to do the fallback by assuming the TLBs will always cache the intermediate translations (e.g VA -> IPA) and not only the full S1 -> S2 translation (e.g VA -> PA). Cheers,
On Tue, May 9, 2017 at 2:09 AM, Julien Grall <julien.grall@arm.com> wrote: > > > On 05/09/2017 08:17 AM, Sergej Proskurin wrote: >> >> Hi, >> >>>> What you currently do is try gva_to_ipa and if it does not work >> you >>>> will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>>> waste of time if the underlying memory of stage-1 page table is >> >> >> protected. > > But we don't know that the stage-1 page table is >> protected until the > hardware based lookup fails. I can turn your >> argument around and say > doing the software based lookup is pointless >> and a waste of time > when the stage-1 table is not protected. Which is >> by the way what I > would expect to see in most cases. >> >> I agree with Tamas: I also believe that in most cases the stage-1 >> translation table won't be protected. So, in my opinion, falling back to >> software (which will be presumablly a rare operation) is the better >> approach. > > > Well, you both consider that it is better to do the fallback by assuming the > TLBs will always cache the intermediate translations (e.g VA -> IPA) and not > only the full S1 -> S2 translation (e.g VA -> PA). No, I was just pointing out that if the TLB has it cached it is guaranteed to be faster. Whether that is the case is probably usecase dependent. But even if the TLB is not loaded, I would assume the hardware lookup is at least as fast as the software based one, but I would bet it is faster. Without numbers this is just theoretical but I would be very surprised if the hardware lookup would ever be slower than the software based one... And since this is a corner-case that most applications will never encounter, forcing them all to use a slower approach doesn't sound right. Tamas
On 09/05/17 17:04, Tamas K Lengyel wrote: > On Tue, May 9, 2017 at 2:09 AM, Julien Grall <julien.grall@arm.com> wrote: >> >> >> On 05/09/2017 08:17 AM, Sergej Proskurin wrote: >>> >>> Hi, >>> >>>>> What you currently do is try gva_to_ipa and if it does not work >> you >>>>> will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>>>> waste of time if the underlying memory of stage-1 page table is >> >>> >>> protected. > > But we don't know that the stage-1 page table is >>> protected until the > hardware based lookup fails. I can turn your >>> argument around and say > doing the software based lookup is pointless >>> and a waste of time > when the stage-1 table is not protected. Which is >>> by the way what I > would expect to see in most cases. >>> >>> I agree with Tamas: I also believe that in most cases the stage-1 >>> translation table won't be protected. So, in my opinion, falling back to >>> software (which will be presumablly a rare operation) is the better >>> approach. >> >> >> Well, you both consider that it is better to do the fallback by assuming the >> TLBs will always cache the intermediate translations (e.g VA -> IPA) and not >> only the full S1 -> S2 translation (e.g VA -> PA). > > No, I was just pointing out that if the TLB has it cached it is > guaranteed to be faster. Whether that is the case is probably usecase > dependent. But even if the TLB is not loaded, I would assume the > hardware lookup is at least as fast as the software based one, but I > would bet it is faster. Without number this is just theoretical but I > would be very surprised if the hardware lookup would ever be slower > then the software based one... And since this is a corner-case that > most application will never encounter, forcing them all to use a > slower approach doesn't sound right. What you miss in my point is TLB can be designed to never cache intermediate translation. It is not even about whether the TLB are loaded or not. 
I am struggling to understand why this would be a corner case as it would be valid to protect the page table to inspect what the guest is doing... I am not saying translation would be slower in hardware than in software. I am just saying that maybe it would not be a huge performance hit to always do software rather than falling back and having a worse one from time to time (or often?). Also, if you use altp2m (when it gets implemented), you would likely need to either do some context switch to use the right p2m or do a software lookup. Cheers,
On Tue, May 9, 2017 at 10:22 AM, Julien Grall <julien.grall@arm.com> wrote: > > > On 09/05/17 17:04, Tamas K Lengyel wrote: >> >> On Tue, May 9, 2017 at 2:09 AM, Julien Grall <julien.grall@arm.com> wrote: >>> >>> >>> >>> On 05/09/2017 08:17 AM, Sergej Proskurin wrote: >>>> >>>> >>>> Hi, >>>> >>>>>> What you currently do is try gva_to_ipa and if it does not work >> you >>>>>> will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>>>>> waste of time if the underlying memory of stage-1 page table is >> >>>> >>>> >>>> protected. > > But we don't know that the stage-1 page table is >>>> protected until the > hardware based lookup fails. I can turn your >>>> argument around and say > doing the software based lookup is pointless >>>> and a waste of time > when the stage-1 table is not protected. Which is >>>> by the way what I > would expect to see in most cases. >>>> >>>> I agree with Tamas: I also believe that in most cases the stage-1 >>>> translation table won't be protected. So, in my opinion, falling back to >>>> software (which will be presumablly a rare operation) is the better >>>> approach. >>> >>> >>> >>> Well, you both consider that it is better to do the fallback by assuming >>> the >>> TLBs will always cache the intermediate translations (e.g VA -> IPA) and >>> not >>> only the full S1 -> S2 translation (e.g VA -> PA). >> >> >> No, I was just pointing out that if the TLB has it cached it is >> guaranteed to be faster. Whether that is the case is probably usecase >> dependent. But even if the TLB is not loaded, I would assume the >> hardware lookup is at least as fast as the software based one, but I >> would bet it is faster. Without number this is just theoretical but I >> would be very surprised if the hardware lookup would ever be slower >> then the software based one... And since this is a corner-case that >> most application will never encounter, forcing them all to use a >> slower approach doesn't sound right. 
> > > What you miss in my point is TLB can be designed to never cache intermediate > translation. It is not even about whether the TLB are loaded or not. > > I am struggling to understand why this would be a corner case as it would be > valid to protect the page table to inspect what the guest is doing... It is valid, yes, but nothing requires the user to block read access to that specific page. Most applications only monitor a handful of pages at a time. Applications may not even use read-access restrictions at all (doing only write or execute tracing). If we do software based lookups all the time we penalize all of those applications when it is really only a very small subset of cases where this will be needed. > > I am not saying translation would be slower in hardware than in software. I > am just saying that maybe it would not be a huge performance hit to always > do software rather than fallback and having a worse one time to time (or > often?). It probably wouldn't be a big hit, but why take any hit if not necessary? Tamas
On 09/05/2017 17:45, Tamas K Lengyel wrote: > On Tue, May 9, 2017 at 10:22 AM, Julien Grall <julien.grall@arm.com> wrote: >> >> >> On 09/05/17 17:04, Tamas K Lengyel wrote: >>> >>> On Tue, May 9, 2017 at 2:09 AM, Julien Grall <julien.grall@arm.com> wrote: >>>> >>>> >>>> >>>> On 05/09/2017 08:17 AM, Sergej Proskurin wrote: >>>>> >>>>> >>>>> Hi, >>>>> >>>>>>> What you currently do is try gva_to_ipa and if it does not work >> you >>>>>>> will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>>>>>> waste of time if the underlying memory of stage-1 page table is >> >>>>> >>>>> >>>>> protected. > > But we don't know that the stage-1 page table is >>>>> protected until the > hardware based lookup fails. I can turn your >>>>> argument around and say > doing the software based lookup is pointless >>>>> and a waste of time > when the stage-1 table is not protected. Which is >>>>> by the way what I > would expect to see in most cases. >>>>> >>>>> I agree with Tamas: I also believe that in most cases the stage-1 >>>>> translation table won't be protected. So, in my opinion, falling back to >>>>> software (which will be presumablly a rare operation) is the better >>>>> approach. >>>> >>>> >>>> >>>> Well, you both consider that it is better to do the fallback by assuming >>>> the >>>> TLBs will always cache the intermediate translations (e.g VA -> IPA) and >>>> not >>>> only the full S1 -> S2 translation (e.g VA -> PA). >>> >>> >>> No, I was just pointing out that if the TLB has it cached it is >>> guaranteed to be faster. Whether that is the case is probably usecase >>> dependent. But even if the TLB is not loaded, I would assume the >>> hardware lookup is at least as fast as the software based one, but I >>> would bet it is faster. Without number this is just theoretical but I >>> would be very surprised if the hardware lookup would ever be slower >>> then the software based one... 
And since this is a corner-case that >>> most application will never encounter, forcing them all to use a >>> slower approach doesn't sound right. >> >> >> What you miss in my point is TLB can be designed to never cache intermediate >> translation. It is not even about whether the TLB are loaded or not. >> >> I am struggling to understand why this would be a corner case as it would be >> valid to protect the page table to inspect what the guest is doing... > > It is valid, yes, but nothing requires the user to block read access > to that specific page. Most applications only monitor a handful of > pages at a time. Applications may not even use read-access > restrictions at all (doing only write or execute tracing). If we do > software based lookups all the time we penalize all of those > applications when it is really only a very small subset of cases where > this will be needed. > >> >> I am not saying translation would be slower in hardware than in software. I >> am just saying that maybe it would not be a huge performance hit to always >> do software rather than fallback and having a worse one time to time (or >> often?). > > It probably wouldn't be a big hit, but why take any hit if not necessary? You still miss my point... Anyway, it is nothing something I really care for now. So I am not going to argue. Cheers,
On 09/05/2017 19:55, Julien Grall wrote: > > > On 09/05/2017 17:45, Tamas K Lengyel wrote: >> On Tue, May 9, 2017 at 10:22 AM, Julien Grall <julien.grall@arm.com> >> wrote: >>> >>> >>> On 09/05/17 17:04, Tamas K Lengyel wrote: >>>> >>>> On Tue, May 9, 2017 at 2:09 AM, Julien Grall <julien.grall@arm.com> >>>> wrote: >>>>> >>>>> >>>>> >>>>> On 05/09/2017 08:17 AM, Sergej Proskurin wrote: >>>>>> >>>>>> >>>>>> Hi, >>>>>> >>>>>>>> What you currently do is try gva_to_ipa and if it does not work >>>>>>>> >> you >>>>>>>> will call p2m_gva_to_ipa. This sounds a bit pointless to me and >>>>>>>> waste of time if the underlying memory of stage-1 page table is >> >>>>>> >>>>>> >>>>>> protected. > > But we don't know that the stage-1 page table is >>>>>> protected until the > hardware based lookup fails. I can turn your >>>>>> argument around and say > doing the software based lookup is >>>>>> pointless >>>>>> and a waste of time > when the stage-1 table is not protected. >>>>>> Which is >>>>>> by the way what I > would expect to see in most cases. >>>>>> >>>>>> I agree with Tamas: I also believe that in most cases the stage-1 >>>>>> translation table won't be protected. So, in my opinion, falling >>>>>> back to >>>>>> software (which will be presumablly a rare operation) is the better >>>>>> approach. >>>>> >>>>> >>>>> >>>>> Well, you both consider that it is better to do the fallback by >>>>> assuming >>>>> the >>>>> TLBs will always cache the intermediate translations (e.g VA -> >>>>> IPA) and >>>>> not >>>>> only the full S1 -> S2 translation (e.g VA -> PA). >>>> >>>> >>>> No, I was just pointing out that if the TLB has it cached it is >>>> guaranteed to be faster. Whether that is the case is probably usecase >>>> dependent. But even if the TLB is not loaded, I would assume the >>>> hardware lookup is at least as fast as the software based one, but I >>>> would bet it is faster. 
Without number this is just theoretical but I >>>> would be very surprised if the hardware lookup would ever be slower >>>> then the software based one... And since this is a corner-case that >>>> most application will never encounter, forcing them all to use a >>>> slower approach doesn't sound right. >>> >>> >>> What you miss in my point is TLB can be designed to never cache >>> intermediate >>> translation. It is not even about whether the TLB are loaded or not. >>> >>> I am struggling to understand why this would be a corner case as it >>> would be >>> valid to protect the page table to inspect what the guest is doing... >> >> It is valid, yes, but nothing requires the user to block read access >> to that specific page. Most applications only monitor a handful of >> pages at a time. Applications may not even use read-access >> restrictions at all (doing only write or execute tracing). If we do >> software based lookups all the time we penalize all of those >> applications when it is really only a very small subset of cases where >> this will be needed. >> >>> >>> I am not saying translation would be slower in hardware than in >>> software. I >>> am just saying that maybe it would not be a huge performance hit to >>> always >>> do software rather than fallback and having a worse one time to time (or >>> often?). >> >> It probably wouldn't be a big hit, but why take any hit if not necessary? > > You still miss my point... Anyway, it is nothing something I really care * it is not something > for now. So I am not going to argue. > > Cheers, >
diff --git a/xen/arch/arm/mem_access.c b/xen/arch/arm/mem_access.c
index 04b1506b00..352eb6073f 100644
--- a/xen/arch/arm/mem_access.c
+++ b/xen/arch/arm/mem_access.c
@@ -20,6 +20,7 @@
 #include <xen/monitor.h>
 #include <xen/sched.h>
 #include <xen/vm_event.h>
+#include <xen/domain_page.h>
 #include <public/vm_event.h>

 #include <asm/event.h>
@@ -90,6 +91,129 @@ static int __p2m_get_mem_access(struct domain *d, gfn_t gfn,
     return 0;
 }

+static int
+p2m_gva_to_ipa(struct p2m_domain *p2m, vaddr_t gva,
+               paddr_t *ipa, unsigned int flags)
+{
+    int level = 0, t0_sz, t1_sz;
+    unsigned long t0_max, t1_min;
+    lpae_t pte, *table;
+    mfn_t root_mfn;
+    uint64_t ttbr;
+    uint32_t sctlr = READ_SYSREG(SCTLR_EL1);
+    register_t ttbcr = READ_SYSREG(TCR_EL1);
+    struct domain *d = p2m->domain;
+
+    const unsigned int offsets[4] = {
+#ifdef CONFIG_ARM_64
+        zeroeth_table_offset(gva),
+#endif
+        first_table_offset(gva),
+        second_table_offset(gva),
+        third_table_offset(gva)
+    };
+
+    const paddr_t masks[4] = {
+#ifdef CONFIG_ARM_64
+        ZEROETH_SIZE - 1,
+#endif
+        FIRST_SIZE - 1,
+        SECOND_SIZE - 1,
+        THIRD_SIZE - 1
+    };
+
+    /* If the MMU is disabled, there is no need to translate the gva. */
+    if ( !(sctlr & SCTLR_M) )
+    {
+        *ipa = gva;
+
+        return 0;
+    }
+
+    if ( is_32bit_domain(d) )
+    {
+        /*
+         * XXX: We do not support 32-bit domain translation table walks for
+         * domains using the short-descriptor translation table format, yet.
+         */
+        if ( !(ttbcr & TTBCR_EAE) )
+            return -EFAULT;
+
+#ifdef CONFIG_ARM_64
+        level = 1;
+#endif
+    }
+
+#ifdef CONFIG_ARM_64
+    if ( is_64bit_domain(d) )
+    {
+        /* Get the max GVA that can be translated by TTBR0. */
+        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TCR_SZ_MASK;
+        t0_max = (1UL << (64 - t0_sz)) - 1;
+
+        /* Get the min GVA that can be translated by TTBR1. */
+        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TCR_SZ_MASK;
+        t1_min = ~0UL - (1UL << (64 - t1_sz)) + 1;
+    }
+    else
+#endif
+    {
+        /* Get the max GVA that can be translated by TTBR0. */
+        t0_sz = (ttbcr >> TCR_T0SZ_SHIFT) & TTBCR_SZ_MASK;
+        t0_max = (1U << (32 - t0_sz)) - 1;
+
+        /* Get the min GVA that can be translated by TTBR1. */
+        t1_sz = (ttbcr >> TCR_T1SZ_SHIFT) & TTBCR_SZ_MASK;
+        t1_min = ~0U - (1U << (32 - t1_sz)) + 1;
+    }
+
+    if ( t0_max >= gva )
+        /* Use TTBR0 for GVA to IPA translation. */
+        ttbr = READ_SYSREG64(TTBR0_EL1);
+    else if ( t1_min <= gva )
+        /* Use TTBR1 for GVA to IPA translation. */
+        ttbr = READ_SYSREG64(TTBR1_EL1);
+    else
+        /* GVA out of bounds of TTBR(0|1). */
+        return -EFAULT;
+
+    /* Bits [63..48] might be used by an ASID. */
+    root_mfn = p2m_lookup(d, _gfn(paddr_to_pfn(ttbr & ((1ULL << 48) - 1))),
+                          NULL);
+
+    /* Check whether the TTBR holds a valid address. */
+    if ( mfn_eq(root_mfn, INVALID_MFN) )
+        return -EFAULT;
+
+    table = map_domain_page(root_mfn);
+
+    for ( ; ; level++ )
+    {
+        pte = table[offsets[level]];
+
+        if ( level == 3 || !pte.walk.valid || !pte.walk.table )
+            break;
+
+        unmap_domain_page(table);
+
+        root_mfn = p2m_lookup(d, _gfn(pte.walk.base), NULL);
+        table = map_domain_page(root_mfn);
+    }
+
+    unmap_domain_page(table);
+
+    if ( !pte.walk.valid )
+        return -EFAULT;
+
+    /* Make sure the entry holds the requested access attributes. */
+    if ( ((flags & GV2M_WRITE) == GV2M_WRITE) && pte.pt.ro )
+        return -EFAULT;
+
+    *ipa = pfn_to_paddr(pte.walk.base) | (gva & masks[level]);
+
+    return 0;
+}
+
 /*
  * If mem_access is in use it might have been the reason why get_page_from_gva
  * failed to fetch the page, as it uses the MMU for the permission checking.
@@ -109,9 +233,23 @@ p2m_mem_access_check_and_get_page(vaddr_t gva, unsigned long flag,
     struct page_info *page = NULL;
     struct p2m_domain *p2m = &v->domain->arch.p2m;

+    ASSERT(p2m->mem_access_enabled);
+
     rc = gva_to_ipa(gva, &ipa, flag);
+
+    /*
+     * In case mem_access is active, hardware-based gva_to_ipa translation
+     * might fail. Since gva_to_ipa uses the guest's translation tables,
+     * access to which might be restricted by the active VTTBR, we perform
+     * a gva to ipa translation in software.
+     */
     if ( rc < 0 )
-        goto err;
+        if ( p2m_gva_to_ipa(p2m, gva, &ipa, flag) < 0 )
+            /*
+             * The software gva to ipa translation can still fail if the
+             * gva is not mapped or does not hold the requested access
+             * rights.
+             */
+            goto err;

     gfn = _gfn(paddr_to_pfn(ipa));
The function p2m_mem_access_check_and_get_page in mem_access.c translates a gva to an ipa by means of the hardware functionality implemented in the function gva_to_ipa. If mem_access is active, hardware-based gva to ipa translation might fail, as gva_to_ipa uses the guest's translation tables, access to which might be restricted by the active VTTBR. To address this issue, we perform the gva to ipa translation in software.

Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
---
Cc: Razvan Cojocaru <rcojocaru@bitdefender.com>
Cc: Tamas K Lengyel <tamas@tklengyel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
---
 xen/arch/arm/mem_access.c | 140 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 139 insertions(+), 1 deletion(-)