Message ID: 1564573198-16219-1-git-send-email-wanpengli@tencent.com (mailing list archive)
State: New, archived
On 31/07/19 13:39, Wanpeng Li wrote:
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index ed061d8..12f2c91 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  			continue;
>  		if (vcpu == me)
>  			continue;
> -		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> +		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
>  			continue;
>  		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
>  		    !kvm_arch_vcpu_in_kernel(vcpu))

This cannot work.  swait_active means you are waiting, so you cannot be involuntarily preempted.

The problem here is simply that kvm_vcpu_has_events is being called without holding the lock.  So kvm_arch_vcpu_runnable is okay; it's the implementation that's wrong.

Just rename the existing function to vcpu_runnable and make a new arch callback kvm_arch_dy_runnable.  kvm_arch_dy_runnable can be conservative and only return true for a subset of events; in particular, for x86 it can check:

- vcpu->arch.pv.pv_unhalted

- KVM_REQ_NMI or KVM_REQ_SMI or KVM_REQ_EVENT

- PIR.ON if APICv is set

Ultimately, all variables accessed in kvm_arch_dy_runnable should be accessed with READ_ONCE or atomic_read.

And for all architectures, kvm_vcpu_on_spin should check list_empty_careful(&vcpu->async_pf.done).

It's okay if your patch renames the function in non-x86 architectures, leaving the fix to maintainers.  So let's CC Marc and Christian, since ARM and s390 have pretty complex kvm_arch_vcpu_runnable as well.

Paolo
On Wed, 31 Jul 2019 at 20:55, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 31/07/19 13:39, Wanpeng Li wrote:
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index ed061d8..12f2c91 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
> >  			continue;
> >  		if (vcpu == me)
> >  			continue;
> > -		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> > +		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
> >  			continue;
> >  		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> >  		    !kvm_arch_vcpu_in_kernel(vcpu))
>
> This cannot work.  swait_active means you are waiting, so you cannot be
> involuntarily preempted.
>
> The problem here is simply that kvm_vcpu_has_events is being called
> without holding the lock.  So kvm_arch_vcpu_runnable is okay, it's the
> implementation that's wrong.
>
> Just rename the existing function to just vcpu_runnable and make a new
> arch callback kvm_arch_dy_runnable.  kvm_arch_dy_runnable can be
> conservative and only returns true for a subset of events, in particular
> for x86 it can check:
>
> - vcpu->arch.pv.pv_unhalted
>
> - KVM_REQ_NMI or KVM_REQ_SMI or KVM_REQ_EVENT
>
> - PIR.ON if APICv is set
>
> Ultimately, all variables accessed in kvm_arch_dy_runnable should be
> accessed with READ_ONCE or atomic_read.
>
> And for all architectures, kvm_vcpu_on_spin should check
> list_empty_careful(&vcpu->async_pf.done)
>
> It's okay if your patch renames the function in non-x86 architectures,
> leaving the fix to maintainers.  So, let's CC Marc and Christian since
> ARM and s390 have pretty complex kvm_arch_vcpu_runnable as well.

Ok, just sent patch to do this.

Regards,
Wanpeng Li
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ed061d8..12f2c91 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2506,7 +2506,7 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
 			continue;
 		if (vcpu == me)
 			continue;
-		if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
+		if (READ_ONCE(vcpu->preempted) && swait_active(&vcpu->wq))
 			continue;
 		if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
 		    !kvm_arch_vcpu_in_kernel(vcpu))