Message ID | 1500886678-5417-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 24/07/2017 10:57, Wanpeng Li wrote: > From: Wanpeng Li <wanpeng.li@hotmail.com> > > Preemption can occur in the preemption timer expiration handler: > > CPU0 CPU1 > > preemption timer vmexit > handle_preemption_timer(vCPU0) > kvm_lapic_expired_hv_timer > hv_timer_is_use == true > sched_out > sched_in > kvm_arch_vcpu_load > kvm_lapic_restart_hv_timer > restart_apic_timer > start_hv_timer > already-expired timer or sw timer triggerd in the window > start_sw_timer > cancel_hv_timer At this point, the timer interrupt is injected, right? If this is correct, kvm_lapic_expired_hv_timer can just do nothing if the timer is not in use, with a comment explaining that the preemption notifier has run start_sw_timer and thus injected the timer interrupt. > /* back in kvm_lapic_expired_hv_timer */ > cancel_hv_timer > WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops > > This can be reproduced if CONFIG_PREEMPT is enabled. > > This patch fixes it by don't cancel preemption timer repeatedly if > the preemption timer has already been cancelled due to preemption > since already-expired timer or sw timer triggered in the window. > > Cc: Paolo Bonzini <pbonzini@redhat.com> > Cc: Radim Krčmář <rkrcmar@redhat.com> > Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> > --- > arch/x86/kvm/lapic.c | 10 +++++++--- > 1 file changed, 7 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index 2819d4c..8341b40 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) > { > struct kvm_lapic *apic = vcpu->arch.apic; > > - WARN_ON(!apic->lapic_timer.hv_timer_in_use); > - WARN_ON(swait_active(&vcpu->wq)); > - cancel_hv_timer(apic); > + preempt_disable(); > + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { Why is the "if" necessary? Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in preemption-disabled regions, which trivially avoids any reentrancy issue with the preempt notifier. Then, cancel_hv_timer can assert that it's called with preemption disabled. Paolo > + WARN_ON(!apic->lapic_timer.hv_timer_in_use); > + WARN_ON(swait_active(&vcpu->wq)); > + cancel_hv_timer(apic); > + } > + preempt_enable(); > apic_timer_expired(apic); > > if (apic_lvtt_period(apic) && apic->lapic_timer.period) { >
2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>: > On 24/07/2017 10:57, Wanpeng Li wrote: >> From: Wanpeng Li <wanpeng.li@hotmail.com> >> >> Preemption can occur in the preemption timer expiration handler: >> >> CPU0 CPU1 >> >> preemption timer vmexit >> handle_preemption_timer(vCPU0) >> kvm_lapic_expired_hv_timer >> hv_timer_is_use == true >> sched_out >> sched_in >> kvm_arch_vcpu_load >> kvm_lapic_restart_hv_timer >> restart_apic_timer >> start_hv_timer >> already-expired timer or sw timer triggerd in the window >> start_sw_timer >> cancel_hv_timer > > At this point, the timer interrupt is injected, right? Do you mean the new one on CPU1? I think we just set the pending timer, we return back to kvm_lapic_expired_hv_timer() after preempt notifier sched_in. > > If this is correct, kvm_lapic_expired_hv_timer can just do nothing if > the timer is not in use, with a comment explaining that the preemption > notifier has run start_sw_timer and thus injected the timer interrupt. > >> /* back in kvm_lapic_expired_hv_timer */ >> cancel_hv_timer >> WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops >> >> This can be reproduced if CONFIG_PREEMPT is enabled. >> >> This patch fixes it by don't cancel preemption timer repeatedly if >> the preemption timer has already been cancelled due to preemption >> since already-expired timer or sw timer triggered in the window. >> >> Cc: Paolo Bonzini <pbonzini@redhat.com> >> Cc: Radim Krčmář <rkrcmar@redhat.com> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> >> --- >> arch/x86/kvm/lapic.c | 10 +++++++--- >> 1 file changed, 7 insertions(+), 3 deletions(-) >> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >> index 2819d4c..8341b40 100644 >> --- a/arch/x86/kvm/lapic.c >> +++ b/arch/x86/kvm/lapic.c >> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) >> { >> struct kvm_lapic *apic = vcpu->arch.apic; >> >> - WARN_ON(!apic->lapic_timer.hv_timer_in_use); >> - WARN_ON(swait_active(&vcpu->wq)); >> - cancel_hv_timer(apic); >> + preempt_disable(); >> + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { > > Why is the "if" necessary? > > Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in > preemption-disabled regions, which trivially avoids any reentrancy issue > with the preempt notifier. Then, cancel_hv_timer can assert that it's > called with preemption disabled. For example: static int handle_preemption_timer(struct kvm_vcpu *vcpu) { --------------------------------------------------> We still can be preempted here, and do one cancel_hv_timer() preempt_disable(); kvm_lapic_expired_hv_timer(vcpu); -----> WARN_ON in cancel_hv_timer() even if we remove the WARN_ON in kvm_lapic_expired_hv_timer() as you mentioned above preempt_enable(); return 1; } Regards, Wanpeng Li
On 24/07/2017 17:08, Wanpeng Li wrote: > 2017-07-24 22:45 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>: >> On 24/07/2017 10:57, Wanpeng Li wrote: >>> From: Wanpeng Li <wanpeng.li@hotmail.com> >>> >>> Preemption can occur in the preemption timer expiration handler: >>> >>> CPU0 CPU1 >>> >>> preemption timer vmexit >>> handle_preemption_timer(vCPU0) >>> kvm_lapic_expired_hv_timer >>> hv_timer_is_use == true >>> sched_out >>> sched_in >>> kvm_arch_vcpu_load >>> kvm_lapic_restart_hv_timer >>> restart_apic_timer >>> start_hv_timer >>> already-expired timer or sw timer triggerd in the window >>> start_sw_timer >>> cancel_hv_timer >> >> At this point, the timer interrupt is injected, right? > > Do you mean the new one on CPU1? I think we just set the pending > timer, we return back to kvm_lapic_expired_hv_timer() after preempt > notifier sched_in. start_sw_timer calls apic_timer_expired after cancel_hv_timer. >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >>> index 2819d4c..8341b40 100644 >>> --- a/arch/x86/kvm/lapic.c >>> +++ b/arch/x86/kvm/lapic.c >>> @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) >>> { >>> struct kvm_lapic *apic = vcpu->arch.apic; >>> >>> - WARN_ON(!apic->lapic_timer.hv_timer_in_use); >>> - WARN_ON(swait_active(&vcpu->wq)); >>> - cancel_hv_timer(apic); >>> + preempt_disable(); >>> + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { >> >> Why is the "if" necessary? >> >> Maybe all of kvm_lapic_expired_hv_timer and start_sw_timer should be in >> preemption-disabled regions, which trivially avoids any reentrancy issue >> with the preempt notifier. Then, cancel_hv_timer can assert that it's >> called with preemption disabled. > > For example: > > static int handle_preemption_timer(struct kvm_vcpu *vcpu) > { > --------------------------------------------------> We still can > be preempted here, and do one cancel_hv_timer() Yes, so that just means you can do preempt_disable(); /* The preempt notifier has called apic_timer_expired already. */ if (!apic->lapic_timer.hv_timer_in_use) goto out; Thinking more about it, even _the caller_ of start_hv_timer and start_sw_timer should be in a preemption-disabled region for simplicity. That means that ultimately restart_apic_timer, kvm_lapic_expired_hv_timer and kvm_lapic_switch_to_sw_timer should call preempt_disable/preempt_enable. Paolo
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 2819d4c..8341b40 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1560,9 +1560,13 @@ void kvm_lapic_expired_hv_timer(struct kvm_vcpu *vcpu) { struct kvm_lapic *apic = vcpu->arch.apic; - WARN_ON(!apic->lapic_timer.hv_timer_in_use); - WARN_ON(swait_active(&vcpu->wq)); - cancel_hv_timer(apic); + preempt_disable(); + if (!(!apic_lvtt_period(apic) && atomic_read(&apic->lapic_timer.pending))) { + WARN_ON(!apic->lapic_timer.hv_timer_in_use); + WARN_ON(swait_active(&vcpu->wq)); + cancel_hv_timer(apic); + } + preempt_enable(); apic_timer_expired(apic); if (apic_lvtt_period(apic) && apic->lapic_timer.period) {