Message ID | 20170109111856.8439-1-christoffer.dall@linaro.org (mailing list archive) |
---|---|
State | New, archived |
On 09/01/17 11:18, Christoffer Dall wrote:
> When a VCPU blocks (WFI) and has programmed the vtimer, we program a
> soft timer to expire in the future to wake up the vcpu thread when
> appropriate. Because such a wake-up involves a vcpu kick, and the
> timer expire function can get called from interrupt context, and the
> kick may sleep, we have to schedule the kick in the work function.
>
> The work function currently has a warning that gets raised if it turns
> out that the timer shouldn't fire when it's run, which was added because
> the idea was that in that case the work should never have been cancelled.
>
> However, it turns out that this whole thing is racy and we can get
> spurious warnings. The problem is that we clear the armed flag in the
> work function, which may run in parallel with the
> kvm_timer_unschedule->timer_disarm() call. This results in a possible
> situation where the timer_disarm() call does not call
> cancel_work_sync(), which effectively synchronizes the completion of the
> work function with running the VCPU. As a result, the VCPU thread
> proceeds before the work function completes, causing changes to the
> timer state such that kvm_timer_should_fire(vcpu) returns false in the
> work function.
>
> All we do in the work function is to kick the VCPU, and an occasional
> rare extra kick never harmed anyone. Since the race above is extremely
> rare, we don't bother checking if the race happens but simply remove the
> check and the clearing of the armed flag from the work function.
>
> Reported-by: Matthias Brugger <mbrugger@suse.com>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index a2dbbcc..a7fe606 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -89,9 +89,6 @@ static void kvm_timer_inject_irq_work(struct work_struct *work)
 	struct kvm_vcpu *vcpu;
 
 	vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
-	vcpu->arch.timer_cpu.armed = false;
-
-	WARN_ON(!kvm_timer_should_fire(vcpu));
 
 	/*
 	 * If the vcpu is blocked we want to wake it up so that it will see
When a VCPU blocks (WFI) and has programmed the vtimer, we program a
soft timer to expire in the future to wake up the vcpu thread when
appropriate. Because such a wake-up involves a vcpu kick, and the
timer expire function can get called from interrupt context, and the
kick may sleep, we have to schedule the kick in the work function.

The work function currently has a warning that gets raised if it turns
out that the timer shouldn't fire when it's run, which was added because
the idea was that in that case the work should never have been cancelled.

However, it turns out that this whole thing is racy and we can get
spurious warnings. The problem is that we clear the armed flag in the
work function, which may run in parallel with the
kvm_timer_unschedule->timer_disarm() call. This results in a possible
situation where the timer_disarm() call does not call
cancel_work_sync(), which effectively synchronizes the completion of the
work function with running the VCPU. As a result, the VCPU thread
proceeds before the work function completes, causing changes to the
timer state such that kvm_timer_should_fire(vcpu) returns false in the
work function.

All we do in the work function is to kick the VCPU, and an occasional
rare extra kick never harmed anyone. Since the race above is extremely
rare, we don't bother checking if the race happens but simply remove the
check and the clearing of the armed flag from the work function.

Reported-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
---
Changes since v1:
 - Don't add a second call to cancel_work_sync, but avoid clearing the
   armed flag to let the timer_disarm() function call cancel_work_sync
   on every unschedule call.
 - Note that I chose to remove the warning, even though it shouldn't
   really happen anymore, because I don't see the value in the warning
   and it does a bit of potentially unnecessary checking in a
   potentially hot path.
 virt/kvm/arm/arch_timer.c | 3 ---
 1 file changed, 3 deletions(-)
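
For readers who want to see the interleaving described in the commit message, here is a minimal user-space sketch in C. It is not the kernel code: `struct model_timer` and every `model_*` function are hypothetical stand-ins, and the sketch assumes (as the commit message implies) that the disarm path only synchronizes with the work function while the armed flag is still set. The interleaving is replayed sequentially rather than with real threads, so it only illustrates the ordering problem.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the timer state; not the kernel struct. */
struct model_timer {
	bool armed;        /* stands in for timer_cpu.armed */
	bool work_running; /* work function has started but not finished */
};

/* Models cancel_work_sync(): returns only once in-flight work is done. */
static void model_cancel_work_sync(struct model_timer *t)
{
	t->work_running = false;
}

/* Models timer_disarm(): assumed to synchronize only if still armed. */
static void model_timer_disarm(struct model_timer *t)
{
	if (t->armed) {
		model_cancel_work_sync(t);
		t->armed = false;
	}
}

/*
 * Models the start of the work function. With clears_armed == true this
 * is the pre-patch behaviour; with false it is the behaviour after the
 * patch, where the flag is left alone.
 */
static void model_work_begin(struct model_timer *t, bool clears_armed)
{
	t->work_running = true;
	if (clears_armed)
		t->armed = false;
}

/*
 * Replays the racy ordering: the work function starts, then the VCPU
 * thread runs the unschedule/disarm path. Returns true if the VCPU
 * would continue while the work function is still in flight.
 */
static bool model_vcpu_resumes_during_work(bool clears_armed)
{
	struct model_timer t = { .armed = true, .work_running = false };

	model_work_begin(&t, clears_armed);
	model_timer_disarm(&t);
	return t.work_running;
}

/* Small self-check of both variants. */
static void model_self_check(void)
{
	assert(model_vcpu_resumes_during_work(true));   /* pre-patch: race */
	assert(!model_vcpu_resumes_during_work(false)); /* patched: synced */
}
```

Calling `model_self_check()` exercises both variants. In the pre-patch variant the disarm path skips the synchronization because the flag was already cleared, which is the window in which the timer state can change under the work function.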