Message ID | 1641988921-3507-1-git-send-email-lirongqing@baidu.com (mailing list archive) |
---|---|
State | New, archived |
Series | KVM: X86: set vcpu preempted only if it is preempted |
On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> vcpu can schedule out when run halt instruction, and set itself
> to INTERRUPTIBLE and switch to idle thread, vcpu should not be
> set preempted for this condition

Uhhmm, why not? Who says the vcpu will run the moment it becomes
runnable again? Another task could be woken up meanwhile occupying the
real cpu.

>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> Signed-off-by: Wang GuangJu <wangguangju@baidu.com>
> ---
>  arch/x86/kvm/x86.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9f5dbf7..10d76bf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4407,6 +4407,9 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
>  	if (vcpu->arch.st.preempted)
>  		return;
>
> +	if (!vcpu->preempted)
> +		return;
> +
>  	/* This happens on process exit */
>  	if (unlikely(current->mm != vcpu->kvm->mm))
>  		return;
> --
> 2.9.4
>
On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > vcpu can schedule out when run halt instruction, and set itself
> > to INTERRUPTIBLE and switch to idle thread, vcpu should not be
> > set preempted for this condition
>
> Uhhmm, why not? Who says the vcpu will run the moment it becomes
> runnable again? Another task could be woken up meanwhile occupying the
> real cpu.

Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will voluntarily schedule
out the vCPU and mark it as preempted from the guest's perspective. The vast majority,
probably all, usage of steal_time.preempted expects it to truly mean "preempted" as
opposed to "not running".

The lack of a vcpu->preempted check has confused me for a long time. I assumed
that was intended behavior, but looking at the original commit, I'm not so sure.
The changelog is somewhat contradictory, as the last sentence says "is running
or not", but I suspect that's just imprecise language.

  commit 0b9f6c4615c993d2b552e0d2bd1ade49b56e5beb
  Author: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
  Date:   Wed Nov 2 05:08:35 2016 -0400

    x86/kvm: Support the vCPU preemption check

    Support the vcpu_is_preempted() functionality under KVM. This will
    enhance lock performance on overcommitted hosts (more runnable vCPUs
    than physical CPUs in the system) as doing busy waits for preempted
    vCPUs will hurt system performance far worse than early yielding.

    Use struct kvm_steal_time::preempted to indicate that if a vCPU
    is running or not.

vcpu->preempted will be set if KVM schedules out the vCPU to service _TIF_NEED_RESCHED,
but not in the HLT case because KVM will mark the vCPU as TASK_INTERRUPTIBLE. The
flag also won't be set if KVM puts the vCPU when exiting to userspace to handle I/O
or whatever, which is also desirable from the guest's perspective.

There might be potential for false negatives, but any damage there is likely far
outweighed by getting false positives, especially in the HLT case. So somewhat
tentatively...

Reviewed-by: Sean Christopherson <seanjc@google.com>

> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > Signed-off-by: Wang GuangJu <wangguangju@baidu.com>
> > ---
> >  arch/x86/kvm/x86.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 9f5dbf7..10d76bf 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -4407,6 +4407,9 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
> >  	if (vcpu->arch.st.preempted)
> >  		return;
> >
> > +	if (!vcpu->preempted)
> > +		return;
> > +
> >  	/* This happens on process exit */
> >  	if (unlikely(current->mm != vcpu->kvm->mm))
> >  		return;
> > --
> > 2.9.4
> >
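To make the distinction in this message concrete, here is a minimal userspace sketch of the behavior described above. It is not KVM code: `struct vcpu_model`, `enum sched_out_reason`, `model_sched_out()` and `model_set_preempted()` are hypothetical names invented for illustration, and the model assumes, per the description above, that vcpu->preempted is set only when the vCPU is scheduled out involuntarily.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-vCPU state in the thread. */
struct vcpu_model {
	bool preempted;    /* models vcpu->preempted */
	bool st_preempted; /* models the guest-visible steal_time.preempted */
};

/* Hypothetical reasons for the vCPU task leaving the physical CPU. */
enum sched_out_reason {
	SCHED_OUT_NEED_RESCHED, /* host scheduler preempted the vCPU task */
	SCHED_OUT_HLT,          /* vCPU blocked voluntarily after HLT */
};

/* Assumption: only involuntary preemption sets ->preempted. */
static void model_sched_out(struct vcpu_model *v, enum sched_out_reason why)
{
	v->preempted = (why == SCHED_OUT_NEED_RESCHED);
}

/*
 * Models the decision in kvm_steal_time_set_preempted(), with or without
 * the check proposed by the patch. Everything else the real function does
 * (the mm check, writing guest memory) is omitted.
 */
static void model_set_preempted(struct vcpu_model *v, bool with_patch)
{
	if (v->st_preempted)
		return;

	if (with_patch && !v->preempted)
		return;

	v->st_preempted = true;
}
```

With `with_patch` set, a vCPU that halted is no longer reported to the guest as preempted, while one that was scheduled out for a reschedule still is. Without it, both cases are reported the same way, which is the behavior the rest of the thread debates.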
On 1/12/22 18:30, Sean Christopherson wrote:
>> Uhhmm, why not? Who says the vcpu will run the moment it becomes
>> runnable again? Another task could be woken up meanwhile occupying the
>> real cpu.
> Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will voluntarily schedule
> out the vCPU and mark it as preempted from the guest's perspective. The vast majority,
> probably all, usage of steal_time.preempted expects it to truly mean "preempted" as
> opposed to "not running".

I'm not sure about that. In particular, PV TLB shootdown benefits from
treating a halted vCPU as preempted, because it avoids wakeups of the halted
vCPUs. kvm_smp_send_call_func_ipi might not, though.

Paolo
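The PV TLB shootdown point can be illustrated with a small userspace model. This is only a sketch of the idea, not the guest kernel's implementation: the `MODEL_*` names, `model_preempted[]` and `model_flush_tlb_multi()` are hypothetical, a plain bitmask stands in for a cpumask, and C11 atomics stand in for the kernel's cmpxchg helpers. The flag values are chosen to mirror KVM_VCPU_PREEMPTED and KVM_VCPU_FLUSH_TLB.

```c
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_VCPU_PREEMPTED (1u << 0) /* mirrors KVM_VCPU_PREEMPTED */
#define MODEL_VCPU_FLUSH_TLB (1u << 1) /* mirrors KVM_VCPU_FLUSH_TLB */
#define MODEL_NR_VCPUS 4

/* Hypothetical per-vCPU copy of the steal_time.preempted byte. */
static _Atomic uint8_t model_preempted[MODEL_NR_VCPUS];

/*
 * For every vCPU in 'mask' that is flagged as preempted, request a
 * deferred flush by setting the FLUSH_TLB bit instead of sending an IPI.
 * Returns the subset of 'mask' that still needs a flush IPI.
 */
static unsigned int model_flush_tlb_multi(unsigned int mask)
{
	unsigned int ipi_mask = mask;

	for (int cpu = 0; cpu < MODEL_NR_VCPUS; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;

		uint8_t state = atomic_load(&model_preempted[cpu]);

		/* The flag may change under us, so only skip the IPI if the update lands. */
		if ((state & MODEL_VCPU_PREEMPTED) &&
		    atomic_compare_exchange_strong(&model_preempted[cpu], &state,
						   (uint8_t)(state | MODEL_VCPU_FLUSH_TLB)))
			ipi_mask &= ~(1u << cpu);
	}

	return ipi_mask;
}
```

In this model, a halted vCPU that is flagged as preempted is dropped from the IPI mask and is therefore not woken up just to flush its TLB. If the flag is never set for halted vCPUs, they stay in the mask and receive the IPI.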
On Wed, Jan 12, 2022, Paolo Bonzini wrote:
> On 1/12/22 18:30, Sean Christopherson wrote:
> > > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > > runnable again? Another task could be woken up meanwhile occupying the
> > > real cpu.
> > Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will voluntarily schedule
> > out the vCPU and mark it as preempted from the guest's perspective. The vast majority,
> > probably all, usage of steal_time.preempted expects it to truly mean "preempted" as
> > opposed to "not running".
>
> I'm not sure about that. In particular, PV TLB shootdown benefits from
> treating a halted vCPU as preempted, because it avoids wakeups of the halted
> vCPUs.

Ah, right. But that really should be decoupled from steal_time.preempted.
KVM can technically handle the PV TLB flush any time the vCPU exits, it's just a
question of whether the cost of writing guest memory outweighs the benefits of
potentially avoiding an IPI. E.g. modifying KVM's fastpath exit loop to toggle
a flag and potentially handle PV TLB flushes is probably a bad idea, but setting
a flag immediately before static_call(kvm_x86_handle_exit)() may be a net win.
On Wed, Jan 12, 2022 at 05:30:47PM +0000, Sean Christopherson wrote:
> On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > vcpu can schedule out when run halt instruction, and set itself
> > > to INTERRUPTIBLE and switch to idle thread, vcpu should not be
> > > set preempted for this condition
> >
> > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > runnable again? Another task could be woken up meanwhile occupying the
> > real cpu.
>
> Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will voluntarily schedule
> out the vCPU and mark it as preempted from the guest's perspective. The vast majority,
> probably all, usage of steal_time.preempted expects it to truly mean "preempted" as
> opposed to "not running".

No, the original use-case was locking and that really cares about
running.

If the vCPU isn't running, we must not busy-wait for it etc..

Similar to the scheduler use of it, if the vCPU isn't running, we should
not consider it so. Getting the vCPU task scheduled back on the CPU can
take a 'long' time.

If you have pinned vCPU threads and no overcommit, we have other knobs
to indicate this I think.
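The locking use case can be sketched as a small userspace model. The names here (`model_cpu_preempted[]`, `model_vcpu_is_preempted()`, `struct model_lock`, `model_should_keep_spinning()`) are hypothetical and only illustrate the decision a spinning waiter makes; they are not the kernel's locking code.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU view of the hint the host exposes to the guest. */
static bool model_cpu_preempted[MODEL_NR_CPUS];

/* Stand-in for the guest's vcpu_is_preempted(cpu). */
static bool model_vcpu_is_preempted(int cpu)
{
	return model_cpu_preempted[cpu];
}

/* Hypothetical lock that records which CPU its holder runs on. */
struct model_lock {
	int owner_cpu;
	bool held;
};

/*
 * A waiter spins only while the lock is held and the holder's vCPU is
 * believed to be on a physical CPU. If the holder is not running, it
 * cannot release the lock soon, so the waiter should stop burning
 * cycles and yield or sleep instead.
 */
static bool model_should_keep_spinning(const struct model_lock *lock)
{
	return lock->held && !model_vcpu_is_preempted(lock->owner_cpu);
}
```

From the waiter's point of view, what matters is whether the holder is on a physical CPU right now, not why it left. That is the "running" reading of the flag argued for above.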
On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> On Wed, Jan 12, 2022 at 05:30:47PM +0000, Sean Christopherson wrote:
> > On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > vcpu can schedule out when run halt instruction, and set itself
> > > > to INTERRUPTIBLE and switch to idle thread, vcpu should not be
> > > > set preempted for this condition
> > >
> > > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > > runnable again? Another task could be woken up meanwhile occupying the
> > > real cpu.
> >
> > Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will voluntarily schedule
> > out the vCPU and mark it as preempted from the guest's perspective. The vast majority,
> > probably all, usage of steal_time.preempted expects it to truly mean "preempted" as
> > opposed to "not running".
>
> No, the original use-case was locking and that really cares about
> running.
>
> If the vCPU isn't running, we must not busy-wait for it etc..
>
> Similar to the scheduler use of it, if the vCPU isn't running, we should
> not consider it so. Getting the vCPU task scheduled back on the CPU can
> take a 'long' time.

Ah, thanks. Should have blamed more, commit 247f2f6f3c70 ("sched/core: Don't
schedule threads on pre-empted vCPUs") is quite clear on this front.
> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: January 13, 2022 5:31
> To: Sean Christopherson <seanjc@google.com>
> Cc: Li,Rongqing <lirongqing@baidu.com>; pbonzini@redhat.com;
> vkuznets@redhat.com; wanpengli@tencent.com; jmattson@google.com;
> tglx@linutronix.de; bp@alien8.de; x86@kernel.org; kvm@vger.kernel.org;
> joro@8bytes.org
> Subject: Re: [PATCH] KVM: X86: set vcpu preempted only if it is preempted
>
> On Wed, Jan 12, 2022 at 05:30:47PM +0000, Sean Christopherson wrote:
> > On Wed, Jan 12, 2022, Peter Zijlstra wrote:
> > > On Wed, Jan 12, 2022 at 08:02:01PM +0800, Li RongQing wrote:
> > > > vcpu can schedule out when run halt instruction, and set itself to
> > > > INTERRUPTIBLE and switch to idle thread, vcpu should not be set
> > > > preempted for this condition
> > >
> > > Uhhmm, why not? Who says the vcpu will run the moment it becomes
> > > runnable again? Another task could be woken up meanwhile occupying
> > > the real cpu.
> >
> > Hrm, but when emulating HLT, e.g. for an idling vCPU, KVM will
> > voluntarily schedule out the vCPU and mark it as preempted from the
> > guest's perspective. The vast majority, probably all, usage of
> > steal_time.preempted expects it to truly mean "preempted" as opposed to
> > "not running".
>
> No, the original use-case was locking and that really cares about running.
>
> If the vCPU isn't running, we must not busy-wait for it etc..
>
> Similar to the scheduler use of it, if the vCPU isn't running, we should not
> consider it so. Getting the vCPU task scheduled back on the CPU can take a 'long'
> time.
>
> If you have pinned vCPU threads and no overcommit, we have other knobs to
> indicate this I think.

If a vCPU is idle and is marked as preempted, is that the right behavior in
kvm_smp_send_call_func_ipi()?

static void kvm_smp_send_call_func_ipi(const struct cpumask *mask)
{
	int cpu;

	native_send_call_func_ipi(mask);

	/* Make sure other vCPUs get a chance to run if they need to. */
	for_each_cpu(cpu, mask) {
		if (vcpu_is_preempted(cpu)) {
			kvm_hypercall1(KVM_HC_SCHED_YIELD, per_cpu(x86_cpu_to_apicid, cpu));
			break;
		}
	}
}

-Li
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9f5dbf7..10d76bf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4407,6 +4407,9 @@ static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
 	if (vcpu->arch.st.preempted)
 		return;
 
+	if (!vcpu->preempted)
+		return;
+
 	/* This happens on process exit */
 	if (unlikely(current->mm != vcpu->kvm->mm))
 		return;