
Subject: [RFC PATCH] kvm/x86: Fix 'lpages' kvm stat for TDP MMU

Message ID 1619700409955.15104@amazon.de (mailing list archive)
State New, archived
Series [RFC PATCH] kvm/x86: Fix 'lpages' kvm stat for TDP MMU

Commit Message

Shahin, Md Shahadat Hossain April 29, 2021, 12:46 p.m. UTC
Large pages not being created properly may result in increased memory
access time. The 'lpages' kvm stat is used to keep track of the current
number of large pages in the system, but with the TDP MMU enabled the
stat does not show the correct number.

This patch extends the lpages counter to cover the TDP case.

Signed-off-by: Md Shahadat Hossain Shahin <shahinmd@amazon.de>
Cc: Bartosz Szczepanek <bsz@amazon.de>
---
 arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Paolo Bonzini April 29, 2021, 3:39 p.m. UTC | #1
On 29/04/21 14:46, Shahin, Md Shahadat Hossain wrote:
> Large pages not being created properly may result in increased memory
> access time. The 'lpages' kvm stat is used to keep track of the current
> number of large pages in the system, but with the TDP MMU enabled the
> stat does not show the correct number.
> 
> This patch extends the lpages counter to cover the TDP case.
> 
> Signed-off-by: Md Shahadat Hossain Shahin <shahinmd@amazon.de>
> Cc: Bartosz Szczepanek <bsz@amazon.de>
> ---
>   arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 34207b874886..1e2a3cb33568 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -425,6 +425,12 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>   
>   	if (old_spte == new_spte)
>   		return;
> +	
> +	if (is_large_pte(old_spte))
> +		--kvm->stat.lpages;
> +	
> +	if (is_large_pte(new_spte))
> +		++kvm->stat.lpages;
>   
>   	trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte);
>   
> 

Queued, thanks.

Paolo
Sean Christopherson April 29, 2021, 4:49 p.m. UTC | #2
On Thu, Apr 29, 2021, Shahin, Md Shahadat Hossain wrote:
> Large pages not being created properly may result in increased memory
> access time. The 'lpages' kvm stat is used to keep track of the current
> number of large pages in the system, but with the TDP MMU enabled the
> stat does not show the correct number.
> 
> This patch extends the lpages counter to cover the TDP case.
> 
> Signed-off-by: Md Shahadat Hossain Shahin <shahinmd@amazon.de>
> Cc: Bartosz Szczepanek <bsz@amazon.de>
> ---
>  arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index 34207b874886..1e2a3cb33568 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -425,6 +425,12 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>  
>  	if (old_spte == new_spte)
>  		return;
> +	
> +	if (is_large_pte(old_spte))
> +		--kvm->stat.lpages;
> +	
> +	if (is_large_pte(new_spte))
> +		++kvm->stat.lpages;

Hrm, kvm->stat.lpages could get corrupted when __handle_changed_spte() is called
under read lock, e.g. if multiple vCPUs are faulting in memory.
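
For illustration, the plain increment/decrement is a read-modify-write,
so two vCPUs can interleave like this (hypothetical interleaving, not a
captured trace):

	vCPU0: load  kvm->stat.lpages          /* reads N */
	vCPU1: load  kvm->stat.lpages          /* reads N */
	vCPU0: store kvm->stat.lpages = N + 1
	vCPU1: store kvm->stat.lpages = N + 1  /* one increment lost */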
Paolo Bonzini April 29, 2021, 5:04 p.m. UTC | #3
On 29/04/21 18:49, Sean Christopherson wrote:
> On Thu, Apr 29, 2021, Shahin, Md Shahadat Hossain wrote:
>> Large pages not being created properly may result in increased memory
>> access time. The 'lpages' kvm stat is used to keep track of the current
>> number of large pages in the system, but with the TDP MMU enabled the
>> stat does not show the correct number.
>>
>> This patch extends the lpages counter to cover the TDP case.
>>
>> Signed-off-by: Md Shahadat Hossain Shahin <shahinmd@amazon.de>
>> Cc: Bartosz Szczepanek <bsz@amazon.de>
>> ---
>>   arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
>> index 34207b874886..1e2a3cb33568 100644
>> --- a/arch/x86/kvm/mmu/tdp_mmu.c
>> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
>> @@ -425,6 +425,12 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
>>   
>>   	if (old_spte == new_spte)
>>   		return;
>> +	
>> +	if (is_large_pte(old_spte))
>> +		--kvm->stat.lpages;
>> +	
>> +	if (is_large_pte(new_spte))
>> +		++kvm->stat.lpages;
> 
> Hrm, kvm->stat.lpages could get corrupted when __handle_changed_spte() is called
> under read lock, e.g. if multiple vCPUs are faulting in memory.

Ouch, indeed!

One way to fix it without needing an atomic operation is to make it a 
per-vcpu stat.  It would be a bit weird for the binary stats because we 
would have to hide this one from the vCPU statistics file descriptor 
(and only aggregate it in the VM statistics).

Alternatively, you can do the atomic_add only if is_large_pte(old_spte) 
!= is_large_pte(new_spte), casting &kvm->stat.lpages to an atomic64_t*.
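
Something along these lines (untested sketch of that second option;
atomic64_add()/atomic64_sub() and is_large_pte() are existing kernel
helpers, and the cast assumes stat.lpages is a 64-bit field, which
holds on x86-64):

	if (is_large_pte(old_spte) != is_large_pte(new_spte)) {
		if (is_large_pte(new_spte))
			atomic64_add(1, (atomic64_t *)&kvm->stat.lpages);
		else
			atomic64_sub(1, (atomic64_t *)&kvm->stat.lpages);
	}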

Paolo
Ben Gardon April 29, 2021, 5:25 p.m. UTC | #4
On Thu, Apr 29, 2021 at 10:04 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 29/04/21 18:49, Sean Christopherson wrote:
> > On Thu, Apr 29, 2021, Shahin, Md Shahadat Hossain wrote:
> >> Large pages not being created properly may result in increased memory
> >> access time. The 'lpages' kvm stat is used to keep track of the current
> >> number of large pages in the system, but with the TDP MMU enabled the
> >> stat does not show the correct number.
> >>
> >> This patch extends the lpages counter to cover the TDP case.
> >>
> >> Signed-off-by: Md Shahadat Hossain Shahin <shahinmd@amazon.de>
> >> Cc: Bartosz Szczepanek <bsz@amazon.de>
> >> ---
> >>   arch/x86/kvm/mmu/tdp_mmu.c | 6 ++++++
> >>   1 file changed, 6 insertions(+)
> >>
> >> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> >> index 34207b874886..1e2a3cb33568 100644
> >> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> >> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> >> @@ -425,6 +425,12 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> >>
> >>      if (old_spte == new_spte)
> >>              return;
> >> +
> >> +    if (is_large_pte(old_spte))
> >> +            --kvm->stat.lpages;
> >> +
> >> +    if (is_large_pte(new_spte))
> >> +            ++kvm->stat.lpages;
> >
> > Hrm, kvm->stat.lpages could get corrupted when __handle_changed_spte() is called
> > under read lock, e.g. if multiple vCPUs are faulting in memory.
>
> Ouch, indeed!
>
> One way to fix it without needing an atomic operation is to make it a
> per-vcpu stat.  It would be a bit weird for the binary stats because we
> would have to hide this one from the vCPU statistics file descriptor
> (and only aggregate it in the VM statistics).
>
> Alternatively, you can do the atomic_add only if is_large_pte(old_spte)
> != is_large_pte(new_spte), casting &kvm->stat.lpages to an atomic64_t*.

I forgot that the lpages stat existed upstream. Internally at Google
we also maintain a count for each mapping level and just update it
atomically. The per-vCPU stat approach would work too, but I doubt
it's worth the complexity unless we have other use cases for a
similarly aggregated stat.
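
A hypothetical shape, for illustration only (the names and the level
constant below are made up, not our internal or upstream code):

	/* One atomic counter per mapping level: 4K, 2M, 1G, ... */
	#define NR_PAGE_LEVELS 5

	struct kvm_page_stats {
		atomic64_t pages[NR_PAGE_LEVELS];
	};

	/* delta is +1 when a mapping at @level appears, -1 when it is zapped. */
	static void update_page_stats(struct kvm_page_stats *stats, int level,
				      long delta)
	{
		atomic64_add(delta, &stats->pages[level - 1]);
	}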

>
> Paolo
>

Patch

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 34207b874886..1e2a3cb33568 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -425,6 +425,12 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 
 	if (old_spte == new_spte)
 		return;
+	
+	if (is_large_pte(old_spte))
+		--kvm->stat.lpages;
+	
+	if (is_large_pte(new_spte))
+		++kvm->stat.lpages;
 
 	trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte);