
[3/3] KVM: arm64: nv: Avoid block mapping if max_map_size is smaller than block size.

Message ID 20220824060304.21128-4-gankulkarni@os.amperecomputing.com (mailing list archive)
State New, archived
Series KVM: arm64: nv: Fixes for Nested Virtualization issues

Commit Message

Ganapatrao Kulkarni Aug. 24, 2022, 6:03 a.m. UTC
In the NV case, the shadow stage 2 page table is created using the host
hypervisor's page table configuration (page size, block size, etc.).
The shadow stage 2 table also uses block-level mappings if the guest
hypervisor's IPA is backed by THP pages. However, this results in an
illegal mapping of the NestedVM IPA to the host hypervisor PA when the
guest hypervisor and the host hypervisor are configured with different
page sizes.

Fix this by avoiding block-level mappings at stage 2 when max_map_size
is smaller than the block size.

Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
---
 arch/arm64/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
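
To make the failure concrete: assume (as the discussion below suggests)
an L0 host with a 4K granule, so PAGE_SIZE = 4K and PMD_SIZE = 2M, and
an L1 guest hypervisor with a 64K granule. If the guest hypervisor backs
the faulting IPA with a 64K page, the shadow stage 2 limit becomes
max_map_size = 64K. The old condition only refused block mapping when
the limit was exactly one host page:

	if (vma_pagesize == PAGE_SIZE &&
	    !(max_map_size == PAGE_SIZE || device)) {

Since 64K != 4K, KVM could still install a 2M THP block mapping, which
covers far more of the NestedVM IPA space than the 64K the guest
hypervisor actually mapped. With the fixed condition, 64K < PMD_SIZE
(2M) forces a page-level mapping instead.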

Comments

Marc Zyngier Dec. 29, 2022, 5:42 p.m. UTC | #1
On Wed, 24 Aug 2022 07:03:04 +0100,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
> 
> In the NV case, the shadow stage 2 page table is created using the host
> hypervisor's page table configuration (page size, block size, etc.).
> The shadow stage 2 table also uses block-level mappings if the guest
> hypervisor's IPA is backed by THP pages. However, this results in an
> illegal mapping of the NestedVM IPA to the host hypervisor PA when the
> guest hypervisor and the host hypervisor are configured with different
> page sizes.
> 
> Fix this by avoiding block-level mappings at stage 2 when max_map_size
> is smaller than the block size.
> 
> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> ---
>  arch/arm64/kvm/mmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 6caa48da1b2e..3d4b53f153a1 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	 * backed by a THP and thus use block mapping if possible.
>  	 */
>  	if (vma_pagesize == PAGE_SIZE &&
> -	    !(max_map_size == PAGE_SIZE || device)) {
> +	    !(max_map_size < PMD_SIZE || device)) {
>  		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>  			vma_pagesize = fault_granule;
>  		else

That's quite a nice catch. I guess this was the main issue with
running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
and I think max_map_size should always represent something that is a
valid size *on the host*, especially when outside of NV-specific code.

How about something like this instead:

@@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		 * table uses at least as big a mapping.
 		 */
 		max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
+
+		if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
+			max_map_size = PMD_SIZE;
+		else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
+			max_map_size = PAGE_SIZE;
 	}
 
 	vma_pagesize = min(vma_pagesize, max_map_size);
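
For illustration, here is what this rounding yields for the 64K-on-4K
case discussed above (a standalone sketch, not kernel code; the sizes
assume a 4K host granule and clamp_map_size is a made-up name):

#include <stdio.h>

/* Host stage 2 sizes for an assumed 4K granule. */
#define PAGE_SIZE (4UL << 10)	/* 4K */
#define PMD_SIZE  (2UL << 20)	/* 2M */
#define PUD_SIZE  (1UL << 30)	/* 1G */

/* Same rounding as the hunk above: snap max_map_size down to the
 * largest mapping size the host stage 2 can actually install. */
static unsigned long clamp_map_size(unsigned long max_map_size)
{
	if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
		return PMD_SIZE;
	else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
		return PAGE_SIZE;
	return max_map_size;
}

int main(void)
{
	/* L1 64K page backing the IPA: rounds down to a 4K host page. */
	printf("%lu\n", clamp_map_size(64UL << 10));	/* 4096 */
	/* L1 512M block backing the IPA: rounds down to a 2M host block. */
	printf("%lu\n", clamp_map_size(512UL << 20));	/* 2097152 */
	return 0;
}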


Admittedly, this is a lot uglier than your fix. But it keeps the nested
horror localised, and doesn't risk being reverted by accident by
people who would not take NV into account (can't blame them, really).

Can you please give it a go?

Thanks,

	M.
Ganapatrao Kulkarni Jan. 3, 2023, 4:26 a.m. UTC | #2
On 29-12-2022 11:12 pm, Marc Zyngier wrote:
> On Wed, 24 Aug 2022 07:03:04 +0100,
> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>>
>> In the NV case, the shadow stage 2 page table is created using the host
>> hypervisor's page table configuration (page size, block size, etc.).
>> The shadow stage 2 table also uses block-level mappings if the guest
>> hypervisor's IPA is backed by THP pages. However, this results in an
>> illegal mapping of the NestedVM IPA to the host hypervisor PA when the
>> guest hypervisor and the host hypervisor are configured with different
>> page sizes.
>>
>> Fix this by avoiding block-level mappings at stage 2 when max_map_size
>> is smaller than the block size.
>>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>> ---
>>   arch/arm64/kvm/mmu.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>> index 6caa48da1b2e..3d4b53f153a1 100644
>> --- a/arch/arm64/kvm/mmu.c
>> +++ b/arch/arm64/kvm/mmu.c
>> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>   	 * backed by a THP and thus use block mapping if possible.
>>   	 */
>>   	if (vma_pagesize == PAGE_SIZE &&
>> -	    !(max_map_size == PAGE_SIZE || device)) {
>> +	    !(max_map_size < PMD_SIZE || device)) {
>>   		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>>   			vma_pagesize = fault_granule;
>>   		else
> 
> That's quite a nice catch. I guess this was the main issue with
> running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
> and I think max_map_size should always represent something that is a
> valid size *on the host*, especially when outside of NV-specific code.
> 

Thanks Marc, yes, this patch was to fix the issue seen with L1 64K and
L0 4K page sizes.

> How about something like this instead:
> 
> @@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>   		 * table uses at least as big a mapping.
>   		 */
>   		max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
> +
> +		if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
> +			max_map_size = PMD_SIZE;
> +		else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
> +			max_map_size = PAGE_SIZE;
>   	}
>   
>   	vma_pagesize = min(vma_pagesize, max_map_size);
> 
> 
> Admittedly, this is a lot uglier than your fix. But it keeps the nested
> horror localised, and doesn't risk being reverted by accident by
> people who would not take NV into account (can't blame them, really).
> 
> Can you please give it a go?

Sure, I will try this and update as soon as possible.
> 
> Thanks,
> 
> 	M.
> 

Thanks,
Ganapat
Ganapatrao Kulkarni Jan. 9, 2023, 1:58 p.m. UTC | #3
On 03-01-2023 09:56 am, Ganapatrao Kulkarni wrote:
> 
> 
> On 29-12-2022 11:12 pm, Marc Zyngier wrote:
>> On Wed, 24 Aug 2022 07:03:04 +0100,
>> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
>>>
>>> In the NV case, the shadow stage 2 page table is created using the
>>> host hypervisor's page table configuration (page size, block size,
>>> etc.). The shadow stage 2 table also uses block-level mappings if the
>>> guest hypervisor's IPA is backed by THP pages. However, this results
>>> in an illegal mapping of the NestedVM IPA to the host hypervisor PA
>>> when the guest hypervisor and the host hypervisor are configured with
>>> different page sizes.
>>>
>>> Fix this by avoiding block-level mappings at stage 2 when
>>> max_map_size is smaller than the block size.
>>>
>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>>> ---
>>>   arch/arm64/kvm/mmu.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
>>> index 6caa48da1b2e..3d4b53f153a1 100644
>>> --- a/arch/arm64/kvm/mmu.c
>>> +++ b/arch/arm64/kvm/mmu.c
>>> @@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>>        * backed by a THP and thus use block mapping if possible.
>>>        */
>>>       if (vma_pagesize == PAGE_SIZE &&
>>> -        !(max_map_size == PAGE_SIZE || device)) {
>>> +        !(max_map_size < PMD_SIZE || device)) {
>>>           if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
>>>               vma_pagesize = fault_granule;
>>>           else
>>
>> That's quite a nice catch. I guess this was the main issue with
>> running 64kB L1 on a 4kB L0? Now, I'm not that fond of the fix itself,
>> and I think max_map_size should always represent something that is a
>> valid size *on the host*, especially when outside of NV-specific code.
>>
> 
> Thanks Marc, yes, this patch was to fix the issue seen with L1 64K and
> L0 4K page sizes.
> 
>> How about something like this instead:
>>
>> @@ -1346,6 +1346,11 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>            * table uses at least as big a mapping.
>>            */
>>           max_map_size = min(kvm_s2_trans_size(nested), max_map_size);
>> +

It would be good to add a brief comment about the changes.
>> +        if (max_map_size >= PMD_SIZE && max_map_size < PUD_SIZE)
>> +            max_map_size = PMD_SIZE;
>> +        else if (max_map_size >= PAGE_SIZE && max_map_size < PMD_SIZE)
>> +            max_map_size = PAGE_SIZE;
>>       }
>>       vma_pagesize = min(vma_pagesize, max_map_size);
>>
>>
>> Admittedly, this is a lot uglier than your fix. But it keeps the nested
>> horror localised, and doesn't risk being reverted by accident by
>> people who would not take NV into account (can't blame them, really).

I agree, it makes sense to keep the NV-specific changes under the
nested scope.

>>
>> Can you please give it a go?

This diff works as well.
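
As for the brief comment requested above, something like this might do
(suggested wording only, not taken from the thread):

	/*
	 * Round max_map_size down to the nearest mapping size that is
	 * valid at host stage 2, so we never install a block mapping
	 * larger than what the guest hypervisor's stage 2 backs.
	 */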
> 
> Sure, I will try this and update as soon as possible.
>>
>> Thanks,
>>
>>     M.
>>
> 
> Thanks,
> Ganapat

Thanks,
Ganapat

Patch

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 6caa48da1b2e..3d4b53f153a1 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1304,7 +1304,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	 * backed by a THP and thus use block mapping if possible.
 	 */
 	if (vma_pagesize == PAGE_SIZE &&
-	    !(max_map_size == PAGE_SIZE || device)) {
+	    !(max_map_size < PMD_SIZE || device)) {
 		if (fault_status == FSC_PERM && fault_granule > PAGE_SIZE)
 			vma_pagesize = fault_granule;
 		else