Message ID | 20220824060304.21128-4-gankulkarni@os.amperecomputing.com (mailing list archive)
---|---
State | New, archived
Series | KVM: arm64: nv: Fixes for Nested Virtualization issues
On Wed, 24 Aug 2022 07:03:04 +0100,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>
> In NV case, Shadow stage 2 page table is created using host hypervisor
> page table configuration like page size, block size etc. Also, the shadow
> stage 2 table uses block level mapping if the Guest Hypervisor IPA is
> backed by the THP pages. However, this is resulting in illegal mapping of
> NestedVM IPA to Host Hypervisor PA, when Guest Hypervisor and Host
> hypervisor are configured with different pagesize.
>
> Adding fix to avoid block level mapping in stage 2 mapping if
> max_map_size is smaller than the block size.
>
> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> ---
>  arch/arm64/kvm/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 6caa48da1b2e..3d4b53f153a1 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * backed by a THP and thus use block mapping if possible.
>  	 */
>  	if (vma_pagesize == PAGE_SIZE &&
> -	    !(max_map_size == PAGE_SIZE || device)) {
> +	    !(max_map_size < PMD_SIZE || device)) {
>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>  			vma_pagesize = fault_granule;
>  		else

That's quite a nice catch. I guess this was the main issue with
running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
and I think max_map_size should always represent something that is a
valid size *on the host*, especially when outside of NV-specific code.

How about something like this instead:

@@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * table uses at least as big a mapping.
 	 */
 	max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
+
+	if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
+		max_map_size = PMD_SIZE;
+	else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
+		max_map_size = PAGE_SIZE;
 }

 vma_pagesize = min(vma_pagesize, max_map_size);

Admittedly, this is a lot uglier than your fix. But it keeps the nested
horror localised, and doesn't risk being reverted by accident by
people who would not take NV into account (can't blame them, really).

Can you please give it a go?

Thanks,

	M.
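To see what this clamping does in practice, here is a minimal user-space
sketch (not kernel code; the constants mimic a 4kB host, and
round_down_to_host_granule() is a hypothetical helper name) of how an
arbitrary nested translation size rounds down to the nearest valid host
mapping size:

#include <stdio.h>

/* Host stage-2 mapping sizes for a 4kB-page host (illustrative values). */
#define HOST_PAGE_SIZE 0x1000UL      /* 4kB */
#define HOST_PMD_SIZE  0x200000UL    /* 2MB */
#define HOST_PUD_SIZE  0x40000000UL  /* 1GB */

/*
 * Hypothetical helper mirroring the hunk above: clamp a nested mapping
 * size down to something that is a valid mapping size on the host.
 */
static unsigned long round_down_to_host_granule(unsigned long max_map_size)
{
	if (max_map_size >= HOST_PMD_SIZE && max_map_size < HOST_PUD_SIZE)
		max_map_size = HOST_PMD_SIZE;
	else if (max_map_size >= HOST_PAGE_SIZE && max_map_size < HOST_PMD_SIZE)
		max_map_size = HOST_PAGE_SIZE;
	return max_map_size;
}

int main(void)
{
	/* A 64kB L1 granule is not a valid block size on a 4kB host. */
	printf("64kB  -> %#lx\n", round_down_to_host_granule(0x10000UL));
	/* A 512MB L1 block rounds down to a 2MB host block. */
	printf("512MB -> %#lx\n", round_down_to_host_granule(0x20000000UL));
	/* Sizes that are already valid on the host pass through unchanged. */
	printf("2MB   -> %#lx\n", round_down_to_host_granule(HOST_PMD_SIZE));
	return 0;
}

With these values, a 64kB L1 granule degrades to 4kB host pages, while
anything at or above 2MB can still use a host block mapping.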
On 29-12-2022 11:12 pm, Marc Zyngier wrote:
> On Wed, 24 Aug 2022 07:03:04 +0100,
> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>>
>> In NV case, Shadow stage 2 page table is created using host hypervisor
>> page table configuration like page size, block size etc. Also, the shadow
>> stage 2 table uses block level mapping if the Guest Hypervisor IPA is
>> backed by the THP pages. However, this is resulting in illegal mapping of
>> NestedVM IPA to Host Hypervisor PA, when Guest Hypervisor and Host
>> hypervisor are configured with different pagesize.
>>
>> Adding fix to avoid block level mapping in stage 2 mapping if
>> max_map_size is smaller than the block size.
>>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>> ---
>>  arch/arm64/kvm/mmu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 6caa48da1b2e..3d4b53f153a1 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	 * backed by a THP and thus use block mapping if possible.
>>  	 */
>>  	if (vma_pagesize == PAGE_SIZE &&
>> -	    !(max_map_size == PAGE_SIZE || device)) {
>> +	    !(max_map_size < PMD_SIZE || device)) {
>>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>>  			vma_pagesize = fault_granule;
>>  		else
>
> That's quite a nice catch. I guess this was the main issue with
> running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
> and I think max_map_size should always represent something that is a
> valid size *on the host*, especially when outside of NV-specific code.
>

Thanks Marc, yes this patch was to fix the issue seen with L1 64K and
L0 4K page size.

> How about something like this instead:
>
> @@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * table uses at least as big a mapping.
>  	 */
>  	max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
> +
> +	if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
> +		max_map_size = PMD_SIZE;
> +	else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
> +		max_map_size = PAGE_SIZE;
>  }
>
>  vma_pagesize = min(vma_pagesize, max_map_size);
>
>
> Admittedly, this is a lot uglier than your fix. But it keeps the nested
> horror localised, and doesn't risk being reverted by accident by
> people who would not take NV into account (can't blame them, really).
>
> Can you please give it a go?

Sure, I will try this and update at the earliest.

>
> Thanks,
>
> 	M.
>

Thanks,
Ganapat
On 03-01-2023 09:56 am, Ganapatrao Kulkarni wrote:
>
>
> On 29-12-2022 11:12 pm, Marc Zyngier wrote:
>> On Wed, 24 Aug 2022 07:03:04 +0100,
>> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>>>
>>> In NV case, Shadow stage 2 page table is created using host hypervisor
>>> page table configuration like page size, block size etc. Also, the
>>> shadow stage 2 table uses block level mapping if the Guest Hypervisor
>>> IPA is backed by the THP pages. However, this is resulting in illegal
>>> mapping of NestedVM IPA to Host Hypervisor PA, when Guest Hypervisor
>>> and Host hypervisor are configured with different pagesize.
>>>
>>> Adding fix to avoid block level mapping in stage 2 mapping if
>>> max_map_size is smaller than the block size.
>>>
>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>>> ---
>>>  arch/arm64/kvm/mmu.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>> index 6caa48da1b2e..3d4b53f153a1 100644
>>> --- a/arch/arm64/kvm/mmu.c
>>> +++ b/arch/arm64/kvm/mmu.c
>>> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>  	 * backed by a THP and thus use block mapping if possible.
>>>  	 */
>>>  	if (vma_pagesize == PAGE_SIZE &&
>>> -	    !(max_map_size == PAGE_SIZE || device)) {
>>> +	    !(max_map_size < PMD_SIZE || device)) {
>>>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>>>  			vma_pagesize = fault_granule;
>>>  		else
>>
>> That's quite a nice catch. I guess this was the main issue with
>> running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
>> and I think max_map_size should always represent something that is a
>> valid size *on the host*, especially when outside of NV-specific code.
>>
>
> Thanks Marc, yes this patch was to fix the issue seen with L1 64K and
> L0 4K page size.
>
>> How about something like this instead:
>>
>> @@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	 * table uses at least as big a mapping.
>>  	 */
>>  	max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
>> +

It would be good to add a brief comment about the changes.

>> +	if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
>> +		max_map_size = PMD_SIZE;
>> +	else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
>> +		max_map_size = PAGE_SIZE;
>>  }
>>
>>  vma_pagesize = min(vma_pagesize, max_map_size);
>>
>>
>> Admittedly, this is a lot uglier than your fix. But it keeps the nested
>> horror localised, and doesn't risk being reverted by accident by
>> people who would not take NV into account (can't blame them, really).

I agree, it makes sense to keep the changes specific to NV under the
nested scope.

>>
>> Can you please give it a go?

This diff works as well.

>
> Sure, I will try this and update at the earliest.
>>
>> Thanks,
>>
>> 	M.
>>
>
> Thanks,
> Ganapat

Thanks,
Ganapat
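For reference, a version of Marc's hunk carrying the brief comment
suggested above might look like the following; this is an illustrative
sketch only, not the wording that eventually landed:

	max_map_size = min(kvm_s2_trans_size(nested), max_map_size);

	/*
	 * kvm_s2_trans_size() reflects the guest hypervisor's stage 2
	 * and may return a size that is not a valid mapping size on
	 * the host (e.g. a 64kB L1 granule on a 4kB L0). Round
	 * max_map_size down to the largest granule the host stage 2
	 * can actually install.
	 */
	if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
		max_map_size = PMD_SIZE;
	else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
		max_map_size = PAGE_SIZE;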
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6caa48da1b2e..3d4b53f153a1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * backed by a THP and thus use block mapping if possible.
 	 */
 	if (vma_pagesize == PAGE_SIZE &&
-	    !(max_map_size == PAGE_SIZE || device)) {
+	    !(max_map_size < PMD_SIZE || device)) {
 		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
 			vma_pagesize = fault_granule;
 		else
In the NV case, the shadow stage 2 page table is created using the host
hypervisor's page table configuration (page size, block size, etc.).
The shadow stage 2 table also uses a block-level mapping if the guest
hypervisor's IPA is backed by THP pages. However, this results in an
illegal mapping of the NestedVM IPA to the host hypervisor PA when the
guest hypervisor and the host hypervisor are configured with different
page sizes.

Fix this by avoiding a block-level mapping in the stage 2 mapping if
max_map_size is smaller than the block size.

Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
---
 arch/arm64/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
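As a worked example of the failure mode described above, the following
user-space sketch (hard-coded 4kB-host constants, not kernel code)
contrasts the old and new conditions for a 64kB guest hypervisor on a
4kB host, where the shadow stage 2 limit at the faulting IPA is 64kB:

#include <stdbool.h>
#include <stdio.h>

#define PAGE_SIZE 0x1000UL   /* 4kB host page  */
#define PMD_SIZE  0x200000UL /* 2MB host block */

int main(void)
{
	/* Nested translation size when a 64kB L1 backs the fault. */
	unsigned long max_map_size = 0x10000UL;
	bool device = false;

	/*
	 * Old test: only suppresses block mapping when the limit is
	 * exactly one page, so a 64kB limit slips through and THP
	 * detection can install an (illegal) 2MB block.
	 */
	bool old_allows_block = !(max_map_size == PAGE_SIZE || device);

	/*
	 * New test: any limit below the host block size suppresses
	 * the block mapping.
	 */
	bool new_allows_block = !(max_map_size < PMD_SIZE || device);

	printf("old condition allows 2MB block: %d\n", old_allows_block);
	printf("new condition allows 2MB block: %d\n", new_allows_block);
	return 0;
}

Under these assumptions the old condition prints 1 (block mapping
allowed) and the new condition prints 0, which matches the symptom the
patch fixes for 64K L1 on 4K L0.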