Reply: [PATCH] KVM: X86: set vcpu preempted only if it is preempted

Message ID bb92391dc5de46ac87ff238faf875c7b@baidu.com (mailing list archive)
State New, archived

Commit Message

Li RongQing Jan. 13, 2022, 4:52 a.m. UTC
> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: January 13, 2022 5:31
> To: Sean Christopherson <seanjc@google.com>
> Cc: Li,Rongqing <lirongqing@baidu.com>; pbonzini@redhat.com;
> vkuznets@redhat.com; wanpengli@tencent.com; jmattson@google.com;
> tglx@linutronix.de; bp@alien8.de; x86@kernel.org; kvm@vger.kernel.org;
> joro@8bytes.org
> Subject: Re: [PATCH] KVM: X86: set vcpu preempted only if it is preempted
> 
> On Wed, Jan 12, 2022 at 05:30:47PM +0000, Sean Christopherson wrote:
> > On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > vcpu can schedule out when run halt instruction, and set itself to
> > > > INTERRUPTIBLE and switch to idle thread, vcpu should not be set
> > > > preempted for this condition
> > >
> > > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > > runnable again? Another task could be woken up meanwhile occupying
> > > the real cpu.
> >
> > Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will
> > voluntarily schedule out the vCPU and mark it as preempted from the
> > guest's perspective.  The vast majority, probably all, usage of
> > > steal_time.preempted expects it to truly mean "preempted" as opposed to
> > > "not running".
> 
> No, the original use-case was locking and that really cares about running.
> 
> If the vCPU isn't running, we must not busy-wait for it etc..
> 
> Similar to the scheduler use of it, if the vCPU isn't running, we should not
> consider it so. Getting the vCPU task scheduled back on the CPU can take a 'long'
> time.
> 
> If you have pinned vCPU threads and no overcommit, we have other knobs to
> indicate this I think.
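
The locking use-case described in the quoted text can be illustrated with a small standalone C model. This is only a sketch under stated assumptions, not kernel code: `struct vcpu_state`, `model_vcpu_is_preempted()` and `spin_on_owner()` are hypothetical names, and the loop is single-threaded for clarity. It shows a waiter that keeps spinning only while the lock owner's vCPU is reported as running.

```c
#include <stdbool.h>

/* Hypothetical per-vCPU state as seen by the guest; not a kernel structure. */
struct vcpu_state {
    bool preempted;   /* models the steal_time.preempted flag discussed above */
};

enum spin_result {
    SPIN_LOCK_FREE,          /* the lock was observed free while spinning */
    SPIN_OWNER_NOT_RUNNING,  /* the owner's vCPU is not running: stop spinning */
    SPIN_TIMED_OUT           /* spin budget exhausted */
};

/* Hypothetical stand-in for vcpu_is_preempted() on the owner's vCPU. */
bool model_vcpu_is_preempted(const struct vcpu_state *v)
{
    return v->preempted;
}

/*
 * Spin for at most max_spins iterations on a held lock. Busy-waiting only
 * pays off while the owner is actually running, so bail out as soon as the
 * owner's vCPU is reported as preempted.
 */
enum spin_result spin_on_owner(const bool *locked,
                               const struct vcpu_state *owner,
                               int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (!*locked)
            return SPIN_LOCK_FREE;
        if (model_vcpu_is_preempted(owner))
            return SPIN_OWNER_NOT_RUNNING;
    }
    return SPIN_TIMED_OUT;
}
```

In this model a waiter that gets `SPIN_OWNER_NOT_RUNNING` would block instead of burning cycles, which is why the flag matters for "not running" in general and not only for involuntary preemption.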


Is it possible for a guest to have the KVM_HINTS_REALTIME feature while its HLT instruction is still emulated by KVM?
If so, that configuration suffers a performance degradation, because vcpu_is_preempted is not set to __kvm_vcpu_is_preempted and so returns false.

The same applies when the guest has nopvspin but its HLT instruction is emulated.

Should we adjust where pv_ops.lock.vcpu_is_preempted is set, as below?
With the change below I see a performance improvement when the guest has nopvspin and its HLT instruction is emulated.



-Li

Comments

Peter Zijlstra Jan. 13, 2022, 9:33 a.m. UTC | #1
On Thu, Jan 13, 2022 at 04:52:40AM +0000, Li,Rongqing wrote:

> > > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > > vcpu can schedule out when run halt instruction, and set itself to
> > > > > INTERRUPTIBLE and switch to idle thread, vcpu should not be set
> > > > > preempted for this condition

> Is it possible for a guest to have the KVM_HINTS_REALTIME feature while its HLT instruction is still emulated by KVM?
> If so, that configuration suffers a performance degradation, because vcpu_is_preempted is not set to __kvm_vcpu_is_preempted and so returns false.
> 
> The same applies when the guest has nopvspin but its HLT instruction is emulated.
> 
> Should we adjust where pv_ops.lock.vcpu_is_preempted is set, as below?
> With the change below I see a performance improvement when the guest has nopvspin and its HLT instruction is emulated.

I'm a little confused; the initial patch explicitly avoided setting
preempted on HLT, while the below causes it to be set more.

That said; I don't object to this, but I'm not convinced it's right
either. If you have HINTS_REALTIME (horrible naming aside) this means
you have pinned vCPU and no overcommit, in which case setting preempted
makes no sense.

*confused*

> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 59abbda..b061d17 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -1048,6 +1048,11 @@ void __init kvm_spinlock_init(void)
>                 return;
>         }
> 
> +       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
> +               pv_ops.lock.vcpu_is_preempted =
> +                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
> +       }
> +
>         /*
>          * Disable PV spinlocks and use native qspinlock when dedicated pCPUs
>          * are available.
> @@ -1076,10 +1081,6 @@ void __init kvm_spinlock_init(void)
>         pv_ops.lock.wait = kvm_wait;
>         pv_ops.lock.kick = kvm_kick_cpu;
> 
> -       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
> -               pv_ops.lock.vcpu_is_preempted =
> -                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
> -       }
>         /*
>          * When PV spinlock is enabled which is preferred over
>          * virt_spin_lock(), virt_spin_lock_key's value is meaningless.
> 
> 
> -Li
Li RongQing Jan. 13, 2022, 11:55 a.m. UTC | #2
> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: January 13, 2022 17:34
> To: Li,Rongqing <lirongqing@baidu.com>
> Cc: Sean Christopherson <seanjc@google.com>; pbonzini@redhat.com;
> vkuznets@redhat.com; wanpengli@tencent.com; jmattson@google.com;
> tglx@linutronix.de; bp@alien8.de; x86@kernel.org; kvm@vger.kernel.org;
> joro@8bytes.org; Wang,Guangju <wangguangju@baidu.com>
> Subject: Re: Reply: [PATCH] KVM: X86: set vcpu preempted only if it is preempted
> 
> On Thu, Jan 13, 2022 at 04:52:40AM +0000, Li,Rongqing wrote:
> 
> > > > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > > > vcpu can schedule out when run halt instruction, and set
> > > > > > itself to INTERRUPTIBLE and switch to idle thread, vcpu should
> > > > > > not be set preempted for this condition
> 
> > Is it possible for a guest to have the KVM_HINTS_REALTIME feature while its
> > HLT instruction is still emulated by KVM?
> > If so, that configuration suffers a performance degradation, because
> > vcpu_is_preempted is not set to __kvm_vcpu_is_preempted and so returns false.
> >
> > The same applies when the guest has nopvspin but its HLT instruction is emulated.
> >
> > Should we adjust where pv_ops.lock.vcpu_is_preempted is set, as below?
> > With the change below I see a performance improvement when the guest has
> > nopvspin and its HLT instruction is emulated.
> 
> I'm a little confused; the initial patch explicitly avoided setting preempted on HLT,
> while the below causes it to be set more.
> 
> That said; I don't object to this, but I'm not convinced it's right either. If you have
> HINTS_REALTIME (horrible naming aside) this means you have pinned vCPU and
> no overcommit, in which case setting preempted makes no sense.
> 
> *confused*
> 

Sorry

From code review I first noticed that kvm_vcpu_is_preempted() always returns true, even when the vCPU is idle. I thought that was unreasonable, hence the first patch.

After seeing the feedback I ran some tests and found that the first patch degrades UnixBench pipe performance in single-copy mode. That confirms what you said: kvm_vcpu_is_preempted() returning true nearly always makes the two UnixBench threads sometimes run on the same vCPU, which means fewer wakeups and fewer rescheduling IPIs.

I then saw in kvm_spinlock_init that kvm_vcpu_is_preempted() only takes effect if the guest does not have nopvspin on its kernel command line and does not have the KVM_HINTS_REALTIME feature, hence the new patch.

Thanks

-LI


> > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> > index 59abbda..b061d17 100644
> > --- a/arch/x86/kernel/kvm.c
> > +++ b/arch/x86/kernel/kvm.c
> > @@ -1048,6 +1048,11 @@ void __init kvm_spinlock_init(void)
> >                 return;
> >         }
> >
> > +       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
> > +               pv_ops.lock.vcpu_is_preempted =
> > +                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
> > +       }
> > +
> >         /*
> >          * Disable PV spinlocks and use native qspinlock when dedicated pCPUs
> >          * are available.
> > @@ -1076,10 +1081,6 @@ void __init kvm_spinlock_init(void)
> >         pv_ops.lock.wait = kvm_wait;
> >         pv_ops.lock.kick = kvm_kick_cpu;
> >
> > -       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
> > -               pv_ops.lock.vcpu_is_preempted =
> > -                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
> > -       }
> >         /*
> >          * When PV spinlock is enabled which is preferred over
> >          * virt_spin_lock(), virt_spin_lock_key's value is meaningless.
> >
> >
> > -Li
Wanpeng Li Jan. 13, 2022, 12:48 p.m. UTC | #3
On Thu, 13 Jan 2022 at 18:16, Li,Rongqing <lirongqing@baidu.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Peter Zijlstra <peterz@infradead.org>
> > Sent: January 13, 2022 5:31
> > To: Sean Christopherson <seanjc@google.com>
> > Cc: Li,Rongqing <lirongqing@baidu.com>; pbonzini@redhat.com;
> > vkuznets@redhat.com; wanpengli@tencent.com; jmattson@google.com;
> > tglx@linutronix.de; bp@alien8.de; x86@kernel.org; kvm@vger.kernel.org;
> > joro@8bytes.org
> > Subject: Re: [PATCH] KVM: X86: set vcpu preempted only if it is preempted
> >
> > On Wed, Jan 12, 2022 at 05:30:47PM +0000, Sean Christopherson wrote:
> > > On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> > > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > > vcpu can schedule out when run halt instruction, and set itself to
> > > > > INTERRUPTIBLE and switch to idle thread, vcpu should not be set
> > > > > preempted for this condition
> > > >
> > > > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > > > runnable again? Another task could be woken up meanwhile occupying
> > > > the real cpu.
> > >
> > > Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will
> > > voluntarily schedule out the vCPU and mark it as preempted from the
> > > guest's perspective.  The vast majority, probably all, usage of
> > > > steal_time.preempted expects it to truly mean "preempted" as opposed to
> > > > "not running".
> >
> > No, the original use-case was locking and that really cares about running.
> >
> > If the vCPU isn't running, we must not busy-wait for it etc..
> >
> > Similar to the scheduler use of it, if the vCPU isn't running, we should not
> > consider it so. Getting the vCPU task scheduled back on the CPU can take a 'long'
> > time.
> >
> > If you have pinned vCPU threads and no overcommit, we have other knobs to
> > indicate this I think.
>
>
> Is it possible for a guest to have the KVM_HINTS_REALTIME feature while its HLT instruction is still emulated by KVM?
> If so, that configuration suffers a performance degradation, because vcpu_is_preempted is not set to __kvm_vcpu_is_preempted and so returns false.
>
> The same applies when the guest has nopvspin but its HLT instruction is emulated.

https://lkml.kernel.org/r/<20210526133727.42339-1-m.misono760@gmail.com>

So this is the second time this has come up. For the dedicated
scenario we should tune the setup, e.g. advertise the
KVM_HINTS_REALTIME feature to the guest and, at the same time, not
intercept mwait/hlt/pause, to get the best performance.

    Wanpeng
Li RongQing Jan. 14, 2022, 9:58 a.m. UTC | #4
> So this is the second time this has come up. For the dedicated scenario we
> should tune the setup, e.g. advertise the KVM_HINTS_REALTIME feature to the
> guest and, at the same time, not intercept mwait/hlt/pause, to get the best
> performance.
> 
>     Wanpeng

The same goes for KVM_FEATURE_STEAL_TIME.

It is contradictory to advertise both the KVM_HINTS_REALTIME feature and the KVM_FEATURE_STEAL_TIME feature to the guest at the same time.

-Li

Patch

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 59abbda..b061d17 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -1048,6 +1048,11 @@  void __init kvm_spinlock_init(void)
                return;
        }

+       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
+               pv_ops.lock.vcpu_is_preempted =
+                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
+       }
+
        /*
         * Disable PV spinlocks and use native qspinlock when dedicated pCPUs
         * are available.
@@ -1076,10 +1081,6 @@  void __init kvm_spinlock_init(void)
        pv_ops.lock.wait = kvm_wait;
        pv_ops.lock.kick = kvm_kick_cpu;

-       if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
-               pv_ops.lock.vcpu_is_preempted =
-                       PV_CALLEE_SAVE(__kvm_vcpu_is_preempted);
-       }
        /*
         * When PV spinlock is enabled which is preferred over
         * virt_spin_lock(), virt_spin_lock_key's value is meaningless.
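
The effect of moving the assignment can be sketched as a standalone C model of the control flow, not as kernel code. The struct and both function names are hypothetical. `early_return` stands in for the check guarding the `return;` visible in the first hunk's context, whose condition the diff does not show, and the model covers only the two conditions named in the thread (KVM_HINTS_REALTIME and nopvspin).

```c
#include <stdbool.h>

/* Hypothetical model of the guest's boot-time inputs; not kernel types. */
struct guest_cfg {
    bool early_return;    /* check guarding the `return;` in the hunk context */
    bool steal_time;      /* models kvm_para_has_feature(KVM_FEATURE_STEAL_TIME) */
    bool hints_realtime;  /* models the KVM_HINTS_REALTIME hint */
    bool nopvspin;        /* models the "nopvspin" kernel command-line option */
};

/* Before the patch: the hook is assigned only after PV spinlocks are enabled. */
bool hook_installed_before_patch(const struct guest_cfg *g)
{
    if (g->early_return)
        return false;
    if (g->hints_realtime || g->nopvspin)
        return false;  /* PV spinlocks disabled, assignment never reached */
    return g->steal_time;
}

/* After the patch: the hook is assigned ahead of the PV-spinlock checks. */
bool hook_installed_after_patch(const struct guest_cfg *g)
{
    if (g->early_return)
        return false;
    return g->steal_time;
}
```

For a guest with both the realtime hint and steal time advertised, the first function returns false and the second returns true. That is the configuration questioned in the comments above, where setting preempted is said to make no sense for pinned vCPUs without overcommit.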