diff mbox series

[v4,2/5] KVM: X86: Bail out of direct yield in case of under-committed scenarios

Message ID 1621339235-11131-2-git-send-email-wanpengli@tencent.com (mailing list archive)
State New, archived
Headers show
Series [v4,1/5] KVM: exit halt polling on need_resched() for both book3s and generic halt-polling | expand

Commit Message

Wanpeng Li May 18, 2021, noon UTC
From: Wanpeng Li <wanpengli@tencent.com>

In case of under-committed scenarios, vCPU can get scheduling easily,
kvm_vcpu_yield_to add extra overhead, we can observe a lot of race
between vcpu->ready is true and yield fails due to p->state is
TASK_RUNNING. Let's bail out in such scenarios by checking the length
of current cpu runqueue, it can be treated as a hint of under-committed
instead of guarantee of accuracy. The directed_yield_successful/attempted
ratio can be improved from 50+% to 80+% in the under-committed scenario.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v2 -> v3:
 * update patch description
v1 -> v2:
 * move the check after attempted counting
 * update patch description

 arch/x86/kvm/x86.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Sean Christopherson May 18, 2021, 7:21 p.m. UTC | #1
On Tue, May 18, 2021, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> In case of under-committed scenarios, vCPU can get scheduling easily,
> kvm_vcpu_yield_to add extra overhead, we can observe a lot of race
> between vcpu->ready is true and yield fails due to p->state is
> TASK_RUNNING. Let's bail out in such scenarios by checking the length
> of current cpu runqueue, it can be treated as a hint of under-committed
> instead of guarantee of accuracy. The directed_yield_successful/attempted
> ratio can be improved from 50+% to 80+% in the under-committed scenario.

The "50+% to 80+%" comment will be a bit confusing for future readers now that
the single_task_running() case counts as an attempt.  I think the new comment
would be something like "30%+ of directed-yield attempts can avoid the expensive
lookups in kvm_sched_yield() in an under-committed scenario."  That would also
provide the real justification, as bumping the success ratio isn't the true goal
of this path.

> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> v2 -> v3:
>  * update patch description
> v1 -> v2:
>  * move the check after attempted counting
>  * update patch description
> 
>  arch/x86/kvm/x86.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9b6bca616929..dfb7c320581f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8360,6 +8360,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
>  
>  	vcpu->stat.directed_yield_attempted++;
>  
> +	if (single_task_running())
> +		goto no_yield;
> +
>  	rcu_read_lock();
>  	map = rcu_dereference(vcpu->kvm->arch.apic_map);
>  
> -- 
> 2.25.1
>
Wanpeng Li May 19, 2021, 2:57 a.m. UTC | #2
On Wed, 19 May 2021 at 03:21, Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, May 18, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > In case of under-committed scenarios, vCPU can get scheduling easily,
> > kvm_vcpu_yield_to add extra overhead, we can observe a lot of race
> > between vcpu->ready is true and yield fails due to p->state is
> > TASK_RUNNING. Let's bail out in such scenarios by checking the length
> > of current cpu runqueue, it can be treated as a hint of under-committed
> > instead of guarantee of accuracy. The directed_yield_successful/attempted
> > ratio can be improved from 50+% to 80+% in the under-committed scenario.
>
> The "50+% to 80+%" comment will be a bit confusing for future readers now that
> the single_task_running() case counts as an attempt.  I think the new comment
> would be something like "30%+ of directed-yield attempts can avoid the expensive
> lookups in kvm_sched_yield() in an under-committed scenario."  That would also
> provide the real justification, as bumping the success ratio isn't the true goal
> of this path.

Looks good. Hope Paolo can update the patch description when applying. :)

"In case of under-committed scenarios, vCPU can get scheduling easily,
kvm_vcpu_yield_to add extra overhead, we can observe a lot of races
between vcpu->ready is true and yield fails due to p->state is
TASK_RUNNING. Let's bail out in such scenarios by checking the length
of current cpu runqueue, it can be treated as a hint of under-committed
instead of guaranteeing accuracy. 30%+ of directed-yield attempts can
avoid the expensive lookups in kvm_sched_yield() in an under-committed
scenario. "
Paolo Bonzini May 24, 2021, 1:42 p.m. UTC | #3
On 19/05/21 04:57, Wanpeng Li wrote:
> Looks good. Hope Paolo can update the patch description when applying.:)
> 
> "In case of under-committed scenarios, vCPU can get scheduling easily,
> kvm_vcpu_yield_to add extra overhead, we can observe a lot of races
> between vcpu->ready is true and yield fails due to p->state is
> TASK_RUNNING. Let's bail out in such scenarios by checking the length
> of current cpu runqueue, it can be treated as a hint of under-committed
> instead of guaranteeing accuracy. 30%+ of directed-yield attempts can
> avoid the expensive lookups in kvm_sched_yield() in an under-committed
> scenario. "
> 

Here is what I used:

     In case of under-committed scenarios, vCPUs can be scheduled easily;
     kvm_vcpu_yield_to adds extra overhead, and it is also common to see
     when vcpu->ready is true but yield later failing due to p->state is
     TASK_RUNNING.
     
     Let's bail out in such scenarios by checking the length of current cpu
     runqueue, which can be treated as a hint of under-committed instead of
     guarantee of accuracy. 30%+ of directed-yield attempts can now avoid
     the expensive lookups in kvm_sched_yield() in an under-committed scenario.

Paolo
Wanpeng Li May 24, 2021, 1:47 p.m. UTC | #4
On Mon, 24 May 2021 at 21:42, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 19/05/21 04:57, Wanpeng Li wrote:
> > Looks good. Hope Paolo can update the patch description when applying.:)
> >
> > "In case of under-committed scenarios, vCPU can get scheduling easily,
> > kvm_vcpu_yield_to add extra overhead, we can observe a lot of races
> > between vcpu->ready is true and yield fails due to p->state is
> > TASK_RUNNING. Let's bail out in such scenarios by checking the length
> > of current cpu runqueue, it can be treated as a hint of under-committed
> > instead of guaranteeing accuracy. 30%+ of directed-yield attempts can
> > avoid the expensive lookups in kvm_sched_yield() in an under-committed
> > scenario. "
> >
>
> Here is what I used:
>
>      In case of under-committed scenarios, vCPUs can be scheduled easily;
>      kvm_vcpu_yield_to adds extra overhead, and it is also common to see
>      when vcpu->ready is true but yield later failing due to p->state is
>      TASK_RUNNING.
>
>      Let's bail out in such scenarios by checking the length of current cpu
>      runqueue, which can be treated as a hint of under-committed instead of
>      guarantee of accuracy. 30%+ of directed-yield attempts can now avoid
>      the expensive lookups in kvm_sched_yield() in an under-committed scenario.

Thanks Paolo! :)
    Wanpeng
diff mbox series

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9b6bca616929..dfb7c320581f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8360,6 +8360,9 @@  static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
 
 	vcpu->stat.directed_yield_attempted++;
 
+	if (single_task_running())
+		goto no_yield;
+
 	rcu_read_lock();
 	map = rcu_dereference(vcpu->kvm->arch.apic_map);