diff mbox

x86/kvm: fix LAPIC timer drift when guest uses periodic mode

Message ID 20180518155546.19879-1-david.vrabel@nutanix.com (mailing list archive)
State New, archived
Headers show

Commit Message

David Vrabel May 18, 2018, 3:55 p.m. UTC
Since 4.10, commit 8003c9ae204e (KVM: LAPIC: add APIC Timer
periodic/oneshot mode VMX preemption timer support), guests using
periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
drift significantly over time.

Differences in the underlying clocks and numerical errors means the
periods of the two timers (hv and sw) are not the same. This
difference will accumulate with every expiry resulting in a large
error between the hv and sw timer.

This means the sw timer may be running slow when compared to the hv
timer. When the timer is switched from hv to sw, the now active sw
timer will expire late. The guest VCPU is reentered and it switches to
using the hv timer. This timer catches up, injecting multiple IRQs
into the guest (of which the guest only sees one as it does not get to
run until the hv timer has caught up) and thus the guest's timer rate
is low (and becomes increasing slower over time as the sw timer lags
further and further behind).

I believe a similar problem would occur if the hv timer is the slower
one, but I have not observed this.

Fix this by synchronizing the deadlines for both timers to the same
time source on every tick. This prevents the errors from accumulating.

Fixes: 8003c9ae204e21204e49816c5ea629357e283b06
Cc: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
---
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
---
 arch/x86/kvm/lapic.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

Comments

Wanpeng Li May 23, 2018, 11:55 a.m. UTC | #1
2018-05-18 23:55 GMT+08:00 David Vrabel <david.vrabel@nutanix.com>:
> Since 4.10, commit 8003c9ae204e (KVM: LAPIC: add APIC Timer
> periodic/oneshot mode VMX preemption timer support), guests using
> periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
> drift significantly over time.
>
> Differences in the underlying clocks and numerical errors means the
> periods of the two timers (hv and sw) are not the same. This
> difference will accumulate with every expiry resulting in a large
> error between the hv and sw timer.
>
> This means the sw timer may be running slow when compared to the hv
> timer. When the timer is switched from hv to sw, the now active sw
> timer will expire late. The guest VCPU is reentered and it switches to
> using the hv timer. This timer catches up, injecting multiple IRQs
> into the guest (of which the guest only sees one as it does not get to
> run until the hv timer has caught up) and thus the guest's timer rate
> is low (and becomes increasing slower over time as the sw timer lags
> further and further behind).
>
> I believe a similar problem would occur if the hv timer is the slower
> one, but I have not observed this.
>
> Fix this by synchronizing the deadlines for both timers to the same
> time source on every tick. This prevents the errors from accumulating.

Thanks for the fix. Reviewed-by: Wanpeng Li <wanpengli@tencent.com>

Regards,
Wanpeng Li

>
> Fixes: 8003c9ae204e21204e49816c5ea629357e283b06
> Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
> ---
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> ---
>  arch/x86/kvm/lapic.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b74c9c1405b9..3773c4625114 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1522,11 +1522,23 @@ static bool set_target_expiration(struct kvm_lapic *apic)
>
>  static void advance_periodic_target_expiration(struct kvm_lapic *apic)
>  {
> -       apic->lapic_timer.tscdeadline +=
> -               nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
> +       ktime_t now = ktime_get();
> +       u64 tscl = rdtsc();
> +       ktime_t delta;
> +
> +       /*
> +        * Synchronize both deadlines to the same time source or
> +        * differences in the periods (caused by differences in the
> +        * underlying clocks or numerical approximation errors) will
> +        * cause the two to drift apart over time as the errors
> +        * accumulate.
> +        */
>         apic->lapic_timer.target_expiration =
>                 ktime_add_ns(apic->lapic_timer.target_expiration,
>                                 apic->lapic_timer.period);
> +       delta = ktime_sub(apic->lapic_timer.target_expiration, now);
> +       apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> +               nsec_to_cycles(apic->vcpu, delta);
>  }
>
>  static void start_sw_period(struct kvm_lapic *apic)
> --
> 2.11.0
>
Radim Krčmář May 24, 2018, 4:54 p.m. UTC | #2
2018-05-18 16:55+0100, David Vrabel:
> Since 4.10, commit 8003c9ae204e (KVM: LAPIC: add APIC Timer
> periodic/oneshot mode VMX preemption timer support), guests using
> periodic LAPIC timers (such as FreeBSD 8.4) would see their timers
> drift significantly over time.
> 
> Differences in the underlying clocks and numerical errors means the
> periods of the two timers (hv and sw) are not the same. This
> difference will accumulate with every expiry resulting in a large
> error between the hv and sw timer.
> 
> This means the sw timer may be running slow when compared to the hv
> timer. When the timer is switched from hv to sw, the now active sw
> timer will expire late. The guest VCPU is reentered and it switches to
> using the hv timer. This timer catches up, injecting multiple IRQs
> into the guest (of which the guest only sees one as it does not get to
> run until the hv timer has caught up) and thus the guest's timer rate
> is low (and becomes increasing slower over time as the sw timer lags
> further and further behind).
> 
> I believe a similar problem would occur if the hv timer is the slower
> one, but I have not observed this.
> 
> Fix this by synchronizing the deadlines for both timers to the same
> time source on every tick. This prevents the errors from accumulating.
> 
> Fixes: 8003c9ae204e21204e49816c5ea629357e283b06
> Cc: Wanpeng Li <wanpeng.li@hotmail.com>
> Signed-off-by: David Vrabel <david.vrabel@nutanix.com>
> ---
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: "Radim Krčmář" <rkrcmar@redhat.com>
> ---

Added Cc stable and applied, thanks.
diff mbox

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b74c9c1405b9..3773c4625114 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1522,11 +1522,23 @@  static bool set_target_expiration(struct kvm_lapic *apic)
 
 static void advance_periodic_target_expiration(struct kvm_lapic *apic)
 {
-	apic->lapic_timer.tscdeadline +=
-		nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
+	ktime_t now = ktime_get();
+	u64 tscl = rdtsc();
+	ktime_t delta;
+
+	/*
+	 * Synchronize both deadlines to the same time source or
+	 * differences in the periods (caused by differences in the
+	 * underlying clocks or numerical approximation errors) will
+	 * cause the two to drift apart over time as the errors
+	 * accumulate.
+	 */
 	apic->lapic_timer.target_expiration =
 		ktime_add_ns(apic->lapic_timer.target_expiration,
 				apic->lapic_timer.period);
+	delta = ktime_sub(apic->lapic_timer.target_expiration, now);
+	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
+		nsec_to_cycles(apic->vcpu, delta);
 }
 
 static void start_sw_period(struct kvm_lapic *apic)