diff mbox

[v9,16/16] ARM: KVM: Guest wait-for-interrupts (WFI) support

Message ID 20120703090212.27746.39517.stgit@ubuntu (mailing list archive)
State New, archived
Headers show

Commit Message

Christoffer Dall July 3, 2012, 9:02 a.m. UTC
From: Christoffer Dall <cdall@cs.columbia.edu>

When the guest executes a WFI instruction the operation is trapped to
KVM, which emulates the instruction in software. There is no correlation
between a guest executing a WFI instruction and actually putting the
hardware into a low-power mode, since a KVM guest is essentially a
process and the WFI instruction can be seen as 'sleep' call from this
process. Therefore, we flag the VCPU to be in wait_for_interrupts mode
and call the main KVM function kvm_vcpu_block() function. This function
will put the thread on a wait-queue and call schedule.

When an interrupt comes in through KVM_IRQ_LINE (see previous patch) we
signal the VCPU thread and unflag the VCPU to no longer wait for
interrupts. All calls to kvm_arch_vcpu_ioctl_run() result in a call to
kvm_vcpu_block() as long as the VCPU is in wfi-mode.

Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
---
 arch/arm/kvm/arm.c     |   15 ++++++++++++++-
 arch/arm/kvm/emulate.c |   12 ++++++++++++
 arch/arm/kvm/trace.h   |   16 ++++++++++++++++
 3 files changed, 42 insertions(+), 1 deletion(-)


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Avi Kivity July 3, 2012, 1:10 p.m. UTC | #1
On 07/03/2012 12:02 PM, Christoffer Dall wrote:
> From: Christoffer Dall <cdall@cs.columbia.edu>
> 
> When the guest executes a WFI instruction the operation is trapped to
> KVM, which emulates the instruction in software. There is no correlation
> between a guest executing a WFI instruction and actually putting the
> hardware into a low-power mode, since a KVM guest is essentially a
> process and the WFI instruction can be seen as 'sleep' call from this
> process. Therefore, we flag the VCPU to be in wait_for_interrupts mode
> and call the main KVM function kvm_vcpu_block() function. This function
> will put the thread on a wait-queue and call schedule.
> 
> When an interrupt comes in through KVM_IRQ_LINE (see previous patch) we
> signal the VCPU thread and unflag the VCPU to no longer wait for
> interrupts. All calls to kvm_arch_vcpu_ioctl_run() result in a call to
> kvm_vcpu_block() as long as the VCPU is in wfi-mode.
> 
>  
>  int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
> @@ -538,6 +546,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  		 */
>  		cond_resched();
>  
> +		if (vcpu->arch.wait_for_interrupts)
> +			kvm_vcpu_block(vcpu);
> +
>  		update_vttbr(vcpu->kvm);
>  
>  		/*
> @@ -635,6 +646,8 @@ int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
>  	 * trigger a world-switch round on the running physical CPU to set the
>  	 * virtual IRQ/FIQ fields in the HCR appropriately.
>  	 */
> +	if (irq_level->level)
> +		vcpu->arch.wait_for_interrupts = 0;

What, no memory barriers, etc?

Is it actually needed?  We can clear it instead after calling
kvm_vcpu_block() above, so the variable is only accessed from the vcpu
thread.  The savings in pain medication are measurable.

>  	kvm_vcpu_kick(vcpu);
>  
>  	return 0;
> diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c
> index 99432d8..564add2 100644
> +/**
> + * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
> + * @vcpu:	the vcpu pointer
> + * @run:	the kvm_run structure pointer
> + *
> + * Simply sets the wait_for_interrupts flag on the vcpu structure, which will
> + * halt execution of world-switches and schedule other host processes until
> + * there is an incoming IRQ or FIQ to the VM.
> + */
>  int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
>  {
> +	trace_kvm_wfi(vcpu->arch.regs.pc);
>  	vcpu->stat.wfi_exits++;
> +	if (!vcpu->arch.irq_lines)
> +		vcpu->arch.wait_for_interrupts = 1;
>  

Or you could just call kvm_vcpu_block() here without having the
variable.  But eventually you'll need it since you want to expose wfi
state to userspace for live migration.
Peter Maydell July 3, 2012, 1:14 p.m. UTC | #2
On 3 July 2012 14:10, Avi Kivity <avi@redhat.com> wrote:
> Or you could just call kvm_vcpu_block() here without having the
> variable.  But eventually you'll need it since you want to expose wfi
> state to userspace for live migration.

You could just always wake the cpu when migrating: the
architecture allows WFI to return early for any reason
it likes including implementation convenience.

-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 3, 2012, 1:24 p.m. UTC | #3
On 07/03/2012 04:14 PM, Peter Maydell wrote:
> On 3 July 2012 14:10, Avi Kivity <avi@redhat.com> wrote:
>> Or you could just call kvm_vcpu_block() here without having the
>> variable.  But eventually you'll need it since you want to expose wfi
>> state to userspace for live migration.
> 
> You could just always wake the cpu when migrating: the
> architecture allows WFI to return early for any reason
> it likes including implementation convenience.

Seems reasonable.

I imagine wfi works with interrupts disabled, unlike the x86 silliness?
Peter Maydell July 3, 2012, 1:49 p.m. UTC | #4
On 3 July 2012 14:24, Avi Kivity <avi@redhat.com> wrote:
> On 07/03/2012 04:14 PM, Peter Maydell wrote:
>> You could just always wake the cpu when migrating: the
>> architecture allows WFI to return early for any reason
>> it likes including implementation convenience.
>
> Seems reasonable.
>
> I imagine wfi works with interrupts disabled, unlike the x86 silliness?

Not sure exactly which bit of x86 silliness you're referring
to, but WFI will wake up regardless of the interrupt mask
bits in the CPSR. (If you've disabled interrupts in the GIC
that's your own bad lookout I guess.)

-- PMM
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Avi Kivity July 3, 2012, 3:57 p.m. UTC | #5
On 07/03/2012 04:49 PM, Peter Maydell wrote:
> On 3 July 2012 14:24, Avi Kivity <avi@redhat.com> wrote:
>> On 07/03/2012 04:14 PM, Peter Maydell wrote:
>>> You could just always wake the cpu when migrating: the
>>> architecture allows WFI to return early for any reason
>>> it likes including implementation convenience.
>>
>> Seems reasonable.
>>
>> I imagine wfi works with interrupts disabled, unlike the x86 silliness?
> 
> Not sure exactly which bit of x86 silliness you're referring
> to, but WFI will wake up regardless of the interrupt mask
> bits in the CPSR. (If you've disabled interrupts in the GIC
> that's your own bad lookout I guess.)


On x86 HLT respects the interrupt enable flag, so to avoid races, you
have to use a feature of STI (enable interrupts instruction) that only
enables interrupts after another instruction has executed.  So the
sequence STI; HLT atomically enables interrupts and waits for one (as
long as HLT didn't trigger an exception like a page fault).

The problem is that this "interrupt shadow" has to be tracked by
virtualization hardware, live migration, instruction emulation, etc.  It
interacts with non-maskable interrupts as well.  A horrible hack.
Christoffer Dall July 26, 2012, 9:08 p.m. UTC | #6
On Tue, Jul 3, 2012 at 3:49 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 3 July 2012 14:24, Avi Kivity <avi@redhat.com> wrote:
>> On 07/03/2012 04:14 PM, Peter Maydell wrote:
>>> You could just always wake the cpu when migrating: the
>>> architecture allows WFI to return early for any reason
>>> it likes including implementation convenience.
>>
>> Seems reasonable.
>>
>> I imagine wfi works with interrupts disabled, unlike the x86 silliness?
>
> Not sure exactly which bit of x86 silliness you're referring
> to, but WFI will wake up regardless of the interrupt mask
> bits in the CPSR. (If you've disabled interrupts in the GIC
> that's your own bad lookout I guess.)
>

ok, I'll just call kvm_vcpu_block() then and be rid of the variable. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index b18f68f..f3b206a 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -306,9 +306,17 @@  int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu,
 	return -EINVAL;
 }
 
+/**
+ * kvm_arch_vcpu_runnable - determine if the vcpu can be scheduled
+ * @v:		The VCPU pointer
+ *
+ * If the guest CPU is not waiting for interrupts (or waiting and
+ * an interrupt is pending) then it is by definition runnable.
+ */
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 {
-	return 0;
+	return !!v->arch.irq_lines ||
+		!v->arch.wait_for_interrupts;
 }
 
 int kvm_arch_vcpu_in_guest_mode(struct kvm_vcpu *v)
@@ -538,6 +546,9 @@  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		cond_resched();
 
+		if (vcpu->arch.wait_for_interrupts)
+			kvm_vcpu_block(vcpu);
+
 		update_vttbr(vcpu->kvm);
 
 		/*
@@ -635,6 +646,8 @@  int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_level)
 	 * trigger a world-switch round on the running physical CPU to set the
 	 * virtual IRQ/FIQ fields in the HCR appropriately.
 	 */
+	if (irq_level->level)
+		vcpu->arch.wait_for_interrupts = 0;
 	kvm_vcpu_kick(vcpu);
 
 	return 0;
diff --git a/arch/arm/kvm/emulate.c b/arch/arm/kvm/emulate.c
index 99432d8..564add2 100644
--- a/arch/arm/kvm/emulate.c
+++ b/arch/arm/kvm/emulate.c
@@ -505,9 +505,21 @@  int kvm_handle_cp15_32(struct kvm_vcpu *vcpu, struct kvm_run *run)
 	return emulate_cp15(vcpu, &params);
 }
 
+/**
+ * kvm_handle_wfi - handle a wait-for-interrupts instruction executed by a guest
+ * @vcpu:	the vcpu pointer
+ * @run:	the kvm_run structure pointer
+ *
+ * Simply sets the wait_for_interrupts flag on the vcpu structure, which will
+ * halt execution of world-switches and schedule other host processes until
+ * there is an incoming IRQ or FIQ to the VM.
+ */
 int kvm_handle_wfi(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
+	trace_kvm_wfi(vcpu->arch.regs.pc);
 	vcpu->stat.wfi_exits++;
+	if (!vcpu->arch.irq_lines)
+		vcpu->arch.wait_for_interrupts = 1;
 	return 0;
 }
 
diff --git a/arch/arm/kvm/trace.h b/arch/arm/kvm/trace.h
index 325106c..28ed1a1 100644
--- a/arch/arm/kvm/trace.h
+++ b/arch/arm/kvm/trace.h
@@ -90,6 +90,22 @@  TRACE_EVENT(kvm_emulate_cp15_imp,
 			__entry->CRm, __entry->Op2)
 );
 
+TRACE_EVENT(kvm_wfi,
+	TP_PROTO(unsigned long vcpu_pc),
+	TP_ARGS(vcpu_pc),
+
+	TP_STRUCT__entry(
+		__field(	unsigned long,	vcpu_pc		)
+	),
+
+	TP_fast_assign(
+		__entry->vcpu_pc		= vcpu_pc;
+	),
+
+	TP_printk("guest executed wfi at: 0x%08lx", __entry->vcpu_pc)
+);
+
+
 #endif /* _TRACE_KVM_H */
 
 #undef TRACE_INCLUDE_PATH