
[3/3] KVM: LAPIC: Fix lapic timer injection delay

Message ID 1498755501-39602-4-git-send-email-pbonzini@redhat.com (mailing list archive)
State New, archived

Commit Message

Paolo Bonzini June 29, 2017, 4:58 p.m. UTC
From: Wanpeng Li <wanpeng.li@hotmail.com>

If the TSC deadline timer is programmed really close to the deadline or
even in the past, the computation in vmx_set_hv_timer will program the
absolute target TSC value into the VMCS preemption timer field with delta == 0.
The next vmentry then causes an immediate VMX preemption timer vmexit, so the
lapic timer injection is delayed by that extra vmentry/vmexit round trip.  The
lapic timer emulated by hrtimer, in contrast, handles this case correctly.

This patch fixes it by firing the lapic timer and injecting a timer interrupt
immediately during the next vmentry if the TSC deadline timer is programmed
really close to the deadline or even in the past. This saves ~1200 cycles on
the tscdeadline_immed test of vmexit.flat.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
[Rebased on top of previous patch. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kvm/lapic.c | 5 ++++-
 arch/x86/kvm/vmx.c   | 3 ++-
 2 files changed, 6 insertions(+), 2 deletions(-)
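
To illustrate the delta == 0 case described in the commit message, here is a
minimal user-space sketch of the return-value change in vmx_set_hv_timer().
It is not the kernel code: model_set_hv_timer() is a hypothetical helper, TSC
scaling and the VMCS write are left out, and the clamping of a past deadline
to a zero delta is an assumption based on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of the tail of vmx_set_hv_timer() (hypothetical helper;
 * TSC scaling and the VMCS write are deliberately left out).
 * Returns 1 if the deadline has already been reached, 0 otherwise.
 */
static int model_set_hv_timer(uint64_t tscl, uint64_t guest_deadline_tsc,
                              uint64_t *hv_deadline_tsc)
{
    uint64_t delta_tsc;

    assert(hv_deadline_tsc != NULL);

    /* Assumption: a deadline in the past is clamped to a zero delta. */
    delta_tsc = guest_deadline_tsc > tscl ? guest_deadline_tsc - tscl : 0;

    /* Mirrors "vmx->hv_deadline_tsc = tscl + delta_tsc" from the patch. */
    *hv_deadline_tsc = tscl + delta_tsc;

    /* Before the patch this was "return 0"; now the caller can tell. */
    return delta_tsc == 0;
}
```

With this model, a deadline of 900 at tscl == 1000 returns 1 and leaves the
stored deadline at 1000, which is the case where the caller can inject the
timer interrupt directly instead of waiting for a preemption timer vmexit.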

Comments

Wanpeng Li July 2, 2017, 1:35 a.m. UTC | #1
2017-06-30 0:58 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> If the TSC deadline timer is programmed really close to the deadline or
> even in the past, the computation in vmx_set_hv_timer will program the
> absolute target tsc value to vmcs preemption timer field w/ delta == 0.
> The next vmentry results in an immediate vmx preemption timer vmexit
> and the lapic timer injection is delayed due to this duration.  Actually
> the lapic timer which is emulated by hrtimer can handle this correctly.
>
> This patch fixes it by firing the lapic timer and injecting a timer interrupt
> immediately during the next vmentry if the TSC deadline timer is programmed
> really close to the deadline or even in the past. This saves ~1200 cycles on
> the tscdeadline_immed test of vmexit.flat.
>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> [Rebased on top of previous patch. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/lapic.c | 5 ++++-
>  arch/x86/kvm/vmx.c   | 3 ++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index a80e5a5d6f2f..2819d4c123eb 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1525,8 +1525,11 @@ static bool start_hv_timer(struct kvm_lapic *apic)
>          * the window.  For periodic timer, leave the hv timer running for
>          * simplicity, and the deadline will be recomputed on the next vmexit.
>          */
> -       if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
> +       if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
> +               if (r)
> +                       apic_timer_expired(apic);
>                 return false;
> +       }

This logic is not the same as in my v4:
http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1434040.html
You return false for the expired timer, so it will actually switch to the
sw timer.

>
>         trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, true);
>         return true;
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index e8b61ad84a8e..92ddea08f999 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -11147,7 +11147,8 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
>         vmx->hv_deadline_tsc = tscl + delta_tsc;
>         vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
>                         PIN_BASED_VMX_PREEMPTION_TIMER);
> -       return 0;
> +
> +       return delta_tsc == 0;
>  }
>
>  static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
> --
> 1.8.3.1
>
Wanpeng Li July 2, 2017, 1:56 a.m. UTC | #2
2017-07-02 9:35 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> 2017-06-30 0:58 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> If the TSC deadline timer is programmed really close to the deadline or
>> even in the past, the computation in vmx_set_hv_timer will program the
>> absolute target tsc value to vmcs preemption timer field w/ delta == 0.
>> The next vmentry results in an immediate vmx preemption timer vmexit
>> and the lapic timer injection is delayed due to this duration.  Actually
>> the lapic timer which is emulated by hrtimer can handle this correctly.
>>
>> This patch fixes it by firing the lapic timer and injecting a timer interrupt
>> immediately during the next vmentry if the TSC deadline timer is programmed
>> really close to the deadline or even in the past. This saves ~1200 cycles on
>> the tscdeadline_immed test of vmexit.flat.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> [Rebased on top of previous patch. - Paolo]
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  arch/x86/kvm/lapic.c | 5 ++++-
>>  arch/x86/kvm/vmx.c   | 3 ++-
>>  2 files changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index a80e5a5d6f2f..2819d4c123eb 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1525,8 +1525,11 @@ static bool start_hv_timer(struct kvm_lapic *apic)
>>          * the window.  For periodic timer, leave the hv timer running for
>>          * simplicity, and the deadline will be recomputed on the next vmexit.
>>          */
>> -       if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
>> +       if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
>> +               if (r)
>> +                       apic_timer_expired(apic);
>>                 return false;
>> +       }
>
> This logic is not the same as in my v4
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1434040.html
> . You return false for the expired timer and actually it will switch
> to sw timer.

Ah, I misread it; the rebase is correct.

Regards,
Wanpeng Li

>
>>
>>         trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, true);
>>         return true;
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index e8b61ad84a8e..92ddea08f999 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -11147,7 +11147,8 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
>>         vmx->hv_deadline_tsc = tscl + delta_tsc;
>>         vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
>>                         PIN_BASED_VMX_PREEMPTION_TIMER);
>> -       return 0;
>> +
>> +       return delta_tsc == 0;
>>  }
>>
>>  static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)
>> --
>> 1.8.3.1
>>
Paolo Bonzini July 3, 2017, 7:30 a.m. UTC | #3
On 02/07/2017 03:56, Wanpeng Li wrote:
>>> -       if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
>>> +       if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
>>> +               if (r)
>>> +                       apic_timer_expired(apic);
>>>                 return false;
>>> +       }
>>
>> This logic is not the same as in my v4
>> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1434040.html
>> . You return false for the expired timer and actually it will switch
>> to sw timer.
>
> Ah, I misread it; the rebase is correct.

Yeah, I'm not entirely satisfied with it but it's working: start_sw
timer will see ktimer->pending and do nothing.

But thinking more about it, maybe the "if (r)" can be omitted
completely?  We need to benchmark it but it can be done.

Paolo
Wanpeng Li July 3, 2017, 8:08 a.m. UTC | #4
2017-07-03 15:30 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 02/07/2017 03:56, Wanpeng Li wrote:
>>>> -       if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
>>>> +       if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
>>>> +               if (r)
>>>> +                       apic_timer_expired(apic);
>>>>                 return false;
>>>> +       }
>>>
>>> This logic is not the same as in my v4
>>> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1434040.html
>>> . You return false for the expired timer and actually it will switch
>>> to sw timer.
>>
>> Ah, I misread it; the rebase is correct.
>
> Yeah, I'm not entirely satisfied with it but it's working: start_sw
> timer will see ktimer->pending and do nothing.
>
> But thinking more about it, maybe the "if (r)" can be omitted
> completely?  We need to benchmark it but it can be done.

"if (r)" makes the code more understandable. In addition, calling
apic_timer_expired() here when the timer is already pending
(ktimer->pending) looks weird.

Regards,
Wanpeng Li
Paolo Bonzini July 3, 2017, 8:16 a.m. UTC | #5
On 03/07/2017 10:08, Wanpeng Li wrote:
>> Yeah, I'm not entirely satisfied with it but it's working: start_sw
>> timer will see ktimer->pending and do nothing.
>>
>> But thinking more about it, maybe the "if (r)" can be omitted
>> completely?  We need to benchmark it but it can be done.
> "if (r)" makes the code more understandable. In addition, calling
> apic_timer_expired() here when the timer is already pending
> (ktimer->pending) looks weird.

We can remove the call to apic_timer_expired too (sorry if I was too
terse). :)  start_sw_period and start_sw_tscdeadline would take care of it.

Paolo
Wanpeng Li July 3, 2017, 8:41 a.m. UTC | #6
2017-07-03 16:16 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>
>
> On 03/07/2017 10:08, Wanpeng Li wrote:
>>> Yeah, I'm not entirely satisfied with it but it's working: start_sw
>>> timer will see ktimer->pending and do nothing.
>>>
>>> But thinking more about it, maybe the "if (r)" can be omitted
>>> completely?  We need to benchmark it but it can be done.
>> "if (r)" makes the code more understandable. In addition, calling
>> apic_timer_expired() here when the timer is already pending
>> (ktimer->pending) looks weird.
>
> We can remove the call to apic_timer_expired too (sorry if I was too
> terse). :)  start_sw_period and start_sw_tscdeadline would take care of it.

Disabling IRQs and calling ktime_get() in start_sw_tscdeadline() are more
expensive, so maybe the current version is the better choice. :)

Regards,
Wanpeng Li
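
The branch debated in this thread can be sketched as a small user-space model.
This is only an illustration of the start_hv_timer() hunk in the patch:
struct model_timer, model_timer_expired() and model_start_hv_timer() are
hypothetical names, the plain int stands in for the atomic ktimer->pending
counter, and the body of model_timer_expired() is an assumption, since
apic_timer_expired() itself is not shown in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of the lapic timer used by the hunk. */
struct model_timer {
    bool periodic;  /* stands in for apic_lvtt_period(apic) */
    int pending;    /* stands in for atomic_read(&ktimer->pending) */
};

/* Assumption: expiring an already-pending timer changes nothing. */
static void model_timer_expired(struct model_timer *t)
{
    if (!t->pending)
        t->pending = 1;
}

/*
 * r is the nonzero "deadline already reached" result of set_hv_timer.
 * Returns true if the hv (preemption) timer stays in use, false if the
 * caller should fall back to the software timer path.
 */
static bool model_start_hv_timer(struct model_timer *t, int r)
{
    assert(t != NULL);

    if (!t->periodic && (r || t->pending)) {
        if (r)
            model_timer_expired(t);  /* fire now, no extra vmexit needed */
        return false;
    }
    return true;
}
```

In this model a one-shot timer with r != 0 ends up pending and returns false,
so a software-timer fallback that checks the pending flag has nothing left to
do. That matches the "start_sw timer will see ktimer->pending and do nothing"
remark in comment #3. Dropping the "if (r)" branch would instead leave the
expiry to the software path.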

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index a80e5a5d6f2f..2819d4c123eb 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1525,8 +1525,11 @@  static bool start_hv_timer(struct kvm_lapic *apic)
 	 * the window.  For periodic timer, leave the hv timer running for
 	 * simplicity, and the deadline will be recomputed on the next vmexit.
 	 */
-	if (!apic_lvtt_period(apic) && atomic_read(&ktimer->pending))
+	if (!apic_lvtt_period(apic) && (r || atomic_read(&ktimer->pending))) {
+		if (r)
+			apic_timer_expired(apic);
 		return false;
+	}
 
 	trace_kvm_hv_timer_state(apic->vcpu->vcpu_id, true);
 	return true;
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e8b61ad84a8e..92ddea08f999 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -11147,7 +11147,8 @@  static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc)
 	vmx->hv_deadline_tsc = tscl + delta_tsc;
 	vmcs_set_bits(PIN_BASED_VM_EXEC_CONTROL,
 			PIN_BASED_VMX_PREEMPTION_TIMER);
-	return 0;
+
+	return delta_tsc == 0;
 }
 
 static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu)