diff mbox

[5/5] KVM: x86: Using TSC deadline may cause multiple interrupts by user writes

Message ID 1412287806-16016-6-git-send-email-namit@cs.technion.ac.il (mailing list archive)
State New, archived
Headers show

Commit Message

Nadav Amit Oct. 2, 2014, 10:10 p.m. UTC
Setting the TSC deadline MSR that are initiated by the host (using ioctl's) may
cause superfluous interrupt.  This occurs in the following case:

1. A TSC deadline timer interrupt is pending.
2. TSC deadline was still not cleared (which happens during vcpu_run).
3. Userspace uses KVM_GET_MSRS/KVM_SET_MSRS to load the same deadline msr.

To solve this situation, ignore host initiated TSC deadline writes that do not
change the deadline value.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/lapic.c | 7 ++++++-
 arch/x86/kvm/lapic.h | 3 ++-
 arch/x86/kvm/x86.c   | 2 +-
 3 files changed, 9 insertions(+), 3 deletions(-)

Comments

Radim Krčmář Oct. 6, 2014, 8:57 p.m. UTC | #1
2014-10-03 01:10+0300, Nadav Amit:
> Setting the TSC deadline MSR that are initiated by the host (using ioctl's) may
> cause superfluous interrupt.  This occurs in the following case:
> 
> 1. A TSC deadline timer interrupt is pending.
> 2. TSC deadline was still not cleared (which happens during vcpu_run).
> 3. Userspace uses KVM_GET_MSRS/KVM_SET_MSRS to load the same deadline msr.
> 
> To solve this situation, ignore host initiated TSC deadline writes that do not
> change the deadline value.

I find this change slightly dubious ...
 - why does the userspace do that?
 - why is host_initiated required?

Thanks.

It seems like an performance improvement, so why shouldn't return when
  'data <= tscdeadline && pending()'
or even
  'data <= now() && pending()'

(Sorry, I ran out of time to search for answers today.)

> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/lapic.c | 7 ++++++-
>  arch/x86/kvm/lapic.h | 3 ++-
>  arch/x86/kvm/x86.c   | 2 +-
>  3 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b8345dd..0bcf2e1 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1346,14 +1346,19 @@ u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
>  	return apic->lapic_timer.tscdeadline;
>  }
>  
> -void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
> +void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu,
> +				   struct msr_data *msr_info)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> +	u64 data = msr_info->data;
>  
>  	if (!kvm_vcpu_has_lapic(vcpu) || apic_lvtt_oneshot(apic) ||
>  			apic_lvtt_period(apic))
>  		return;
>  
> +	if (msr_info->host_initiated && apic->lapic_timer.tscdeadline == data)
> +		return;
> +
>  	hrtimer_cancel(&apic->lapic_timer.timer);
>  	/* Inject here so clearing tscdeadline won't override new value */
>  	if (apic_has_pending_timer(vcpu))
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 6a11845..5bfa3db 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -71,7 +71,8 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
>  int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu);
>  
>  u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu);
> -void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data);
> +void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu,
> +				struct msr_data *msr_info);
>  
>  void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset);
>  void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e903167..8c2745c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2093,7 +2093,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff:
>  		return kvm_x2apic_msr_write(vcpu, msr, data);
>  	case MSR_IA32_TSCDEADLINE:
> -		kvm_set_lapic_tscdeadline_msr(vcpu, data);
> +		kvm_set_lapic_tscdeadline_msr(vcpu, msr_info);
>  		break;
>  	case MSR_IA32_TSC_ADJUST:
>  		if (guest_cpuid_has_tsc_adjust(vcpu)) {
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nadav Amit Oct. 7, 2014, 9:35 a.m. UTC | #2
Thanks for reviewing this patch and the rest of the gang.

On Oct 6, 2014, at 11:57 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:

> 2014-10-03 01:10+0300, Nadav Amit:
>> Setting the TSC deadline MSR that are initiated by the host (using ioctl's) may
>> cause superfluous interrupt.  This occurs in the following case:
>> 
>> 1. A TSC deadline timer interrupt is pending.
>> 2. TSC deadline was still not cleared (which happens during vcpu_run).
>> 3. Userspace uses KVM_GET_MSRS/KVM_SET_MSRS to load the same deadline msr.
>> 
>> To solve this situation, ignore host initiated TSC deadline writes that do not
>> change the deadline value.
> 
> I find this change slightly dubious …
Why? I see similar handling of MSR_TSC_ADJUST.

> - why does the userspace do that?
It seems qemu’s kvm_cpu_exec does so when it calls kvm_arch_put_registers.
It is pretty much done after every exit to userspace.

> - why is host_initiated required?
Since if the guest writes to the MSR, it means it wants to rearm the TSC deadline. Even if the deadline passed, interrupt should be triggered.
If the guest writes the same value on the deadline MSR twice, it might expect two interrupts.


> 
> Thanks.
> 
> It seems like an performance improvement, so why shouldn't return when
>  'data <= tscdeadline && pending()'
> or even
>  'data <= now() && pending()'
> 
> (Sorry, I ran out of time to search for answers today.)
The bug I encountered is not a performance issue, but a superfluous interrupt (functional bug).
As I said, the guest may write a new deadline MSR value which is in the past and expect an interrupt.

Nadav--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Oct. 8, 2014, 9:29 a.m. UTC | #3
Il 03/10/2014 00:10, Nadav Amit ha scritto:
> To solve this situation, ignore host initiated TSC deadline writes that do not
> change the deadline value.
> 
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/lapic.c | 7 ++++++-
>  arch/x86/kvm/lapic.h | 3 ++-
>  arch/x86/kvm/x86.c   | 2 +-
>  3 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b8345dd..0bcf2e1 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1346,14 +1346,19 @@ u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
>  	return apic->lapic_timer.tscdeadline;
>  }
>  
> -void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
> +void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu,
> +				   struct msr_data *msr_info)
>  {
>  	struct kvm_lapic *apic = vcpu->arch.apic;
> +	u64 data = msr_info->data;
>  
>  	if (!kvm_vcpu_has_lapic(vcpu) || apic_lvtt_oneshot(apic) ||
>  			apic_lvtt_period(apic))
>  		return;
>  
> +	if (msr_info->host_initiated && apic->lapic_timer.tscdeadline == data)
> +		return;
> +

Why do we have to check host_initiated?

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b8345dd..0bcf2e1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1346,14 +1346,19 @@  u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu)
 	return apic->lapic_timer.tscdeadline;
 }
 
-void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
+void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu,
+				   struct msr_data *msr_info)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
+	u64 data = msr_info->data;
 
 	if (!kvm_vcpu_has_lapic(vcpu) || apic_lvtt_oneshot(apic) ||
 			apic_lvtt_period(apic))
 		return;
 
+	if (msr_info->host_initiated && apic->lapic_timer.tscdeadline == data)
+		return;
+
 	hrtimer_cancel(&apic->lapic_timer.timer);
 	/* Inject here so clearing tscdeadline won't override new value */
 	if (apic_has_pending_timer(vcpu))
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 6a11845..5bfa3db 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -71,7 +71,8 @@  void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
 int kvm_lapic_find_highest_irr(struct kvm_vcpu *vcpu);
 
 u64 kvm_get_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu);
-void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data);
+void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu,
+				struct msr_data *msr_info);
 
 void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset);
 void kvm_apic_set_eoi_accelerated(struct kvm_vcpu *vcpu, int vector);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e903167..8c2745c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2093,7 +2093,7 @@  int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case APIC_BASE_MSR ... APIC_BASE_MSR + 0x3ff:
 		return kvm_x2apic_msr_write(vcpu, msr, data);
 	case MSR_IA32_TSCDEADLINE:
-		kvm_set_lapic_tscdeadline_msr(vcpu, data);
+		kvm_set_lapic_tscdeadline_msr(vcpu, msr_info);
 		break;
 	case MSR_IA32_TSC_ADJUST:
 		if (guest_cpuid_has_tsc_adjust(vcpu)) {