diff mbox

[-rt,2/2] KVM: lapic: mark LAPIC timer handler as irqsafe

Message ID 20141125172329.580701786@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Marcelo Tosatti Nov. 25, 2014, 5:21 p.m. UTC
Since lapic timer handler only wakes up a simple waitqueue,
it can be executed from hardirq context.

Reduces average cyclictest latency by 3us.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

---
 arch/x86/kvm/lapic.c |   42 +++++++++++++++++++++++++++++++++++++++---
 1 file changed, 39 insertions(+), 3 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Paolo Bonzini Nov. 25, 2014, 5:38 p.m. UTC | #1
On 25/11/2014 18:21, Marcelo Tosatti wrote:
> +
> +	if (r == HRTIMER_RESTART) {
> +		do {
> +			ret = hrtimer_start_expires(data, HRTIMER_MODE_ABS);
> +			if (ret == -ETIME)
> +				hrtimer_add_expires_ns(&ktimer->timer,
> +							ktimer->period);

Is it possible to just compute the time where the next interrupt
happens?  I suspect the printk and WARN_ON below can be easily triggered
by a guest.

Paolo

> +			i++;
> +		} while (ret == -ETIME && i < 10);
> +
> +		if (ret == -ETIME) {
> +			printk(KERN_ERR "%s: failed to reprogram timer\n",
> +			       __func__);
> +			WARN_ON(1);
> +		}
> +	}
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kiszka Nov. 25, 2014, 7:01 p.m. UTC | #2
On 2014-11-25 18:38, Paolo Bonzini wrote:
> 
> 
> On 25/11/2014 18:21, Marcelo Tosatti wrote:
>> +
>> +	if (r == HRTIMER_RESTART) {
>> +		do {
>> +			ret = hrtimer_start_expires(data, HRTIMER_MODE_ABS);
>> +			if (ret == -ETIME)
>> +				hrtimer_add_expires_ns(&ktimer->timer,
>> +							ktimer->period);
> 
> Is it possible to just compute the time where the next interrupt
> happens?  I suspect the printk and WARN_ON below can be easily triggered
> by a guest.

We have a lower bound for the period that a guest can program. Unless
that value is set too low, this should practically not happen if we
avoid disturbances while handling the event and reprogramming the next
one (irqs off?).

Jan
Rik van Riel Nov. 25, 2014, 7:11 p.m. UTC | #3
On 11/25/2014 12:21 PM, Marcelo Tosatti wrote:
> Since lapic timer handler only wakes up a simple waitqueue,
> it can be executed from hardirq context.
> 
> Reduces average cyclictest latency by 3us.

Can this patch be merged in the KVM tree, and go
upstream via Paolo?

> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Acked-by: Rik van Riel <riel@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thomas Gleixner Nov. 25, 2014, 8:24 p.m. UTC | #4
On Tue, 25 Nov 2014, Rik van Riel wrote:
> On 11/25/2014 12:21 PM, Marcelo Tosatti wrote:
> > Since lapic timer handler only wakes up a simple waitqueue,
> > it can be executed from hardirq context.
> > 
> > Reduces average cyclictest latency by 3us.
> 
> Can this patch be merged in the KVM tree, and go
> upstream via Paolo?

Not really as it has RT dependencies ....

Thanks,

	tglx
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti Dec. 4, 2014, 1:43 p.m. UTC | #5
On Tue, Nov 25, 2014 at 06:38:04PM +0100, Paolo Bonzini wrote:
> 
> 
> On 25/11/2014 18:21, Marcelo Tosatti wrote:
> > +
> > +	if (r == HRTIMER_RESTART) {
> > +		do {
> > +			ret = hrtimer_start_expires(data, HRTIMER_MODE_ABS);
> > +			if (ret == -ETIME)
> > +				hrtimer_add_expires_ns(&ktimer->timer,
> > +							ktimer->period);
> 
> Is it possible to just compute the time where the next interrupt
> happens?  

Yes but there is no guarantee hrtimer_start_expires will not fail.

> I suspect the printk and WARN_ON below can be easily triggered
> by a guest.

I'll remove those, thanks.


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini Dec. 4, 2014, 1:53 p.m. UTC | #6
On 25/11/2014 21:24, Thomas Gleixner wrote:
> On Tue, 25 Nov 2014, Rik van Riel wrote:
>> On 11/25/2014 12:21 PM, Marcelo Tosatti wrote:
>>> Since lapic timer handler only wakes up a simple waitqueue,
>>> it can be executed from hardirq context.
>>>
>>> Reduces average cyclictest latency by 3us.
>>
>> Can this patch be merged in the KVM tree, and go
>> upstream via Paolo?
> 
> Not really as it has RT dependencies ....

Can hrtimer_start_expires even return ETIME on a non-RT kernel?  If yes,
I can take patch 2.  If not, it's better to keep both patches in the RT
tree.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcelo Tosatti Dec. 5, 2014, 4:17 p.m. UTC | #7
On Thu, Dec 04, 2014 at 02:53:26PM +0100, Paolo Bonzini wrote:
> 
> 
> On 25/11/2014 21:24, Thomas Gleixner wrote:
> > On Tue, 25 Nov 2014, Rik van Riel wrote:
> >> On 11/25/2014 12:21 PM, Marcelo Tosatti wrote:
> >>> Since lapic timer handler only wakes up a simple waitqueue,
> >>> it can be executed from hardirq context.
> >>>
> >>> Reduces average cyclictest latency by 3us.
> >>
> >> Can this patch be merged in the KVM tree, and go
> >> upstream via Paolo?
> > 
> > Not really as it has RT dependencies ....
> 
> Can hrtimer_start_expires even return ETIME on a non-RT kernel?  If yes,
> I can take patch 2.  If not, it's better to keep both patches in the RT
> tree.
> 
> Paolo

It can't. We're still evaluating the necessity of this patch, will
resubmit accordingly.

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-stable-rt/arch/x86/kvm/lapic.c
===================================================================
--- linux-stable-rt.orig/arch/x86/kvm/lapic.c	2014-11-25 14:14:38.636810068 -0200
+++ linux-stable-rt/arch/x86/kvm/lapic.c	2014-11-25 14:17:28.850567059 -0200
@@ -1031,8 +1031,38 @@ 
 				   apic->divide_count);
 }
 
+
+static enum hrtimer_restart apic_timer_fn(struct hrtimer *data);
+
+static void apic_timer_expired(struct hrtimer *data)
+{
+	int ret, i = 0;
+	enum hrtimer_restart r;
+	struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
+
+	r = apic_timer_fn(data);
+
+	if (r == HRTIMER_RESTART) {
+		do {
+			ret = hrtimer_start_expires(data, HRTIMER_MODE_ABS);
+			if (ret == -ETIME)
+				hrtimer_add_expires_ns(&ktimer->timer,
+							ktimer->period);
+			i++;
+		} while (ret == -ETIME && i < 10);
+
+		if (ret == -ETIME) {
+			printk(KERN_ERR "%s: failed to reprogram timer\n",
+			       __func__);
+			WARN_ON(1);
+		}
+	}
+}
+
+
 static void start_apic_timer(struct kvm_lapic *apic)
 {
+	int ret;
 	ktime_t now;
 	atomic_set(&apic->lapic_timer.pending, 0);
 
@@ -1062,9 +1092,11 @@ 
 			}
 		}
 
-		hrtimer_start(&apic->lapic_timer.timer,
+		ret = hrtimer_start(&apic->lapic_timer.timer,
 			      ktime_add_ns(now, apic->lapic_timer.period),
 			      HRTIMER_MODE_ABS);
+		if (ret == -ETIME)
+			apic_timer_expired(&apic->lapic_timer.timer);
 
 		apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
 			   PRIx64 ", "
@@ -1094,8 +1126,10 @@ 
 			ns = (tscdeadline - guest_tsc) * 1000000ULL;
 			do_div(ns, this_tsc_khz);
 		}
-		hrtimer_start(&apic->lapic_timer.timer,
+		ret = hrtimer_start(&apic->lapic_timer.timer,
 			ktime_add_ns(now, ns), HRTIMER_MODE_ABS);
+		if (ret == -ETIME)
+			apic_timer_expired(&apic->lapic_timer.timer);
 
 		local_irq_restore(flags);
 	}
@@ -1581,6 +1615,7 @@ 
 	hrtimer_init(&apic->lapic_timer.timer, CLOCK_MONOTONIC,
 		     HRTIMER_MODE_ABS);
 	apic->lapic_timer.timer.function = apic_timer_fn;
+	apic->lapic_timer.timer.irqsafe = 1;
 
 	/*
 	 * APIC is created enabled. This will prevent kvm_lapic_set_base from
@@ -1699,7 +1734,8 @@ 
 
 	timer = &vcpu->arch.apic->lapic_timer.timer;
 	if (hrtimer_cancel(timer))
-		hrtimer_start_expires(timer, HRTIMER_MODE_ABS);
+		if (hrtimer_start_expires(timer, HRTIMER_MODE_ABS) == -ETIME)
+			apic_timer_expired(timer);
 }
 
 /*