diff mbox series

KVM: arm/arm64: Fix young bit from mmu notifier

Message ID 20200121055659.19560-1-gshan@redhat.com (mailing list archive)
State Mainlined
Commit cf2d23e0bac9f6b5cd1cba8898f5f05ead40e530
Series KVM: arm/arm64: Fix young bit from mmu notifier

Commit Message

Gavin Shan Jan. 21, 2020, 5:56 a.m. UTC
kvm_test_age_hva() is invoked on mmu_notifier_test_young(), but it passes
the wrong address range to handle_hva_to_gpa(): start and end are both hva,
so the range is empty and no young bits are checked in handle_hva_to_gpa().
As a result, mmu_notifier_test_young() always returns zero.

Fix this by passing the correct address range, [hva, hva + PAGE_SIZE), to
the underlying function handle_hva_to_gpa(), so that the hardware young
(access) bit is actually checked.

Cc: stable@vger.kernel.org # v4.1+
Fixes: 35307b9a5f7e ("arm/arm64: KVM: Implement Stage-2 page aging")
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
 virt/kvm/arm/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
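
The effect of passing the same value as start and end can be illustrated with a small user-space model. This is only a sketch, assuming the range is treated as half-open, [start, end), and clamped against each memslot before a handler is called. The names `slot_model`, `overlap_bytes`, `MODEL_PAGE_SHIFT` and `MODEL_PAGE_SIZE` are hypothetical and are not kernel identifiers.

```c
#include <assert.h>

/* Stand-ins for the kernel's PAGE_SHIFT/PAGE_SIZE (assumed 4 KiB pages). */
#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1UL << MODEL_PAGE_SHIFT)

/* Hypothetical stand-in for a memslot: npages pages mapped at userspace_addr. */
struct slot_model {
	unsigned long userspace_addr;
	unsigned long npages;
};

/*
 * Number of bytes of the half-open range [start, end) that fall inside the
 * slot. A result of 0 means a per-range handler would never be invoked.
 */
unsigned long overlap_bytes(const struct slot_model *slot,
			    unsigned long start, unsigned long end)
{
	unsigned long slot_end = slot->userspace_addr +
				 (slot->npages << MODEL_PAGE_SHIFT);
	unsigned long lo = start > slot->userspace_addr ?
			   start : slot->userspace_addr;
	unsigned long hi = end < slot_end ? end : slot_end;

	assert(start <= end);
	return lo < hi ? hi - lo : 0;
}
```

In this model, a call with `start == end == hva` always yields 0 bytes, whereas `end = hva + MODEL_PAGE_SIZE` yields exactly one page for any page-aligned `hva` inside the slot.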

Comments

Marc Zyngier Jan. 21, 2020, 1:28 p.m. UTC | #1
On 2020-01-21 05:56, Gavin Shan wrote:
> kvm_test_age_hva() is called upon mmu_notifier_test_young(), but wrong
> address range has been passed to handle_hva_to_gpa(). With the wrong
> address range, no young bits will be checked in handle_hva_to_gpa().
> It means zero is always returned from mmu_notifier_test_young().
> 
> This fixes the issue by passing correct address range to the underlying
> function handle_hva_to_gpa(), so that the hardware young (access) bit
> will be visited.
> 
> Cc: stable@vger.kernel.org # v4.1+
> Fixes: 35307b9a5f7e ("arm/arm64: KVM: Implement Stage-2 page aging")
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> ---
>  virt/kvm/arm/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> index 0b32a904a1bb..a2777efb558e 100644
> --- a/virt/kvm/arm/mmu.c
> +++ b/virt/kvm/arm/mmu.c
> @@ -2147,7 +2147,8 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
>  	if (!kvm->arch.pgd)
>  		return 0;
>  	trace_kvm_test_age_hva(hva);
> -	return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
> +	return handle_hva_to_gpa(kvm, hva, hva + PAGE_SIZE,
> +				 kvm_test_age_hva_handler, NULL);
>  }
> 
>  void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu)

I knew this start/end thing (instead of start/size) would bite us
one of these days. Terribly embarrassing. On the other hand, who
really wants to swap things out? ;-)

Out of curiosity, how did you find this one?

Thanks,

         M.
Gavin Shan Jan. 21, 2020, 11:07 p.m. UTC | #2
On 1/22/20 12:28 AM, Marc Zyngier wrote:
> On 2020-01-21 05:56, Gavin Shan wrote:
>> kvm_test_age_hva() is called upon mmu_notifier_test_young(), but wrong
>> address range has been passed to handle_hva_to_gpa(). With the wrong
>> address range, no young bits will be checked in handle_hva_to_gpa().
>> It means zero is always returned from mmu_notifier_test_young().
>>
>> This fixes the issue by passing correct address range to the underlying
>> function handle_hva_to_gpa(), so that the hardware young (access) bit
>> will be visited.
>>
>> Cc: stable@vger.kernel.org # v4.1+
>> Fixes: 35307b9a5f7e ("arm/arm64: KVM: Implement Stage-2 page aging")
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> ---
>>  virt/kvm/arm/mmu.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>> index 0b32a904a1bb..a2777efb558e 100644
>> --- a/virt/kvm/arm/mmu.c
>> +++ b/virt/kvm/arm/mmu.c
>> @@ -2147,7 +2147,8 @@ int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
>>      if (!kvm->arch.pgd)
>>          return 0;
>>      trace_kvm_test_age_hva(hva);
>> -    return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
>> +    return handle_hva_to_gpa(kvm, hva, hva + PAGE_SIZE,
>> +                 kvm_test_age_hva_handler, NULL);
>>  }
>>
>>  void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu)
> 
> I knew this start/end thing (instead of start/size) would bite us
> one of these days. Terribly embarrassing. On the other hand, who
> really wants to swap things out? ;-)
> 
> Out of curiosity, how did you find this one?
> 

Well, it's hard to tell who really wants to swap things out. Here is something
I was involved in previously: a user daemon is started to scan the accessed
pages periodically, in order to determine the least accessed pages. These least
accessed anonymous pages are then migrated to low-cost storage (e.g. NVDIMM),
which helps balance performance and cost.

I found it when reading the code. After that, I wrote some code (outlined
below) to double-check:

    (1) Locate the qemu process and the corresponding vma. Because the VM is
        started with "mem-path=/tmp/virtiofs/backup-file", "backup-file" is the
        key used to find the vma.
    (2) Iterate over the virtual address space of the vma with
        mmu_notifier_test_young(). None of the return values was 1 (accessed),
        which did not look correct.

With the patch applied, rerunning the above code makes mmu_notifier_test_young()
return 1 (accessed) on some pages.

Thanks,
Gavin
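
The scan described in step (2) can be modelled in user space to show why every probe reported "not accessed" before the fix. This is a rough sketch and not the code used in the thread: `region_model`, `test_young_range` and `count_young` are hypothetical names, and the per-page boolean array merely stands in for the hardware access bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */
#define MODEL_NPAGES    8

/* Hypothetical model of a mapped region with one access flag per page. */
struct region_model {
	unsigned long base;
	bool accessed[MODEL_NPAGES];
};

/* Returns 1 if any page in the half-open range [start, end) is accessed. */
int test_young_range(const struct region_model *r,
		     unsigned long start, unsigned long end)
{
	unsigned long addr;

	for (addr = start; addr < end; addr += MODEL_PAGE_SIZE) {
		size_t idx = (addr - r->base) / MODEL_PAGE_SIZE;

		if (idx < MODEL_NPAGES && r->accessed[idx])
			return 1;
	}
	return 0;
}

/*
 * Probe every page of the region once. end_offset is 0 to mimic the old
 * call (hva, hva) and MODEL_PAGE_SIZE to mimic the fixed call.
 */
unsigned int count_young(const struct region_model *r,
			 unsigned long end_offset)
{
	unsigned int young = 0;
	size_t i;

	assert(r != NULL);
	for (i = 0; i < MODEL_NPAGES; i++) {
		unsigned long hva = r->base + i * MODEL_PAGE_SIZE;

		young += test_young_range(r, hva, hva + end_offset);
	}
	return young;
}
```

With `end_offset` set to 0 the loop in `test_young_range` never executes, so the count stays at zero no matter which flags are set. With one page as the offset, the count matches the number of accessed pages.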
Patch

diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index 0b32a904a1bb..a2777efb558e 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -2147,7 +2147,8 @@  int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
 	if (!kvm->arch.pgd)
 		return 0;
 	trace_kvm_test_age_hva(hva);
-	return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
+	return handle_hva_to_gpa(kvm, hva, hva + PAGE_SIZE,
+				 kvm_test_age_hva_handler, NULL);
 }
 
 void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu)