
[RFC] KVM: optimize the kvm_vcpu_on_spin

Message ID 1501309377-195256-1-git-send-email-longpeng2@huawei.com (mailing list archive)
State New, archived

Commit Message

Longpeng(Mike) July 29, 2017, 6:22 a.m. UTC
We have discussed the idea here:
https://www.spinics.net/lists/kvm/msg140593.html

I think it's also suitable for other architectures.

If the vcpu (me) exits because it is spinning on a usermode spinlock, then
the spinlock holder may have been preempted in either usermode or kernel
mode. But if the vcpu (me) is in kernel mode, then the holder must have
been preempted in kernel mode, so we should choose a vcpu in kernel mode
as the most eligible candidate.

PS: I only implement the x86 arch currently because I'm not familiar
    with the other architectures.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
---
 arch/mips/kvm/mips.c       | 5 +++++
 arch/powerpc/kvm/powerpc.c | 5 +++++
 arch/s390/kvm/kvm-s390.c   | 5 +++++
 arch/x86/kvm/x86.c         | 5 +++++
 include/linux/kvm_host.h   | 4 ++++
 virt/kvm/arm/arm.c         | 5 +++++
 virt/kvm/kvm_main.c        | 9 ++++++++-
 7 files changed, 37 insertions(+), 1 deletion(-)

Comments

David Hildenbrand July 31, 2017, 11:31 a.m. UTC | #1
[no idea if this change makes sense (and especially if it has any bad
side effects), do you have performance numbers? I'll just have a look at
the general structure of the patch in the meanwhile]

> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)

kvm_arch_vcpu_in_kernel() ?

> +{
> +	return kvm_x86_ops->get_cpl(vcpu) == 0;
> +}
> +
>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>  {
>  	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 648b34c..f8f0d74 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>  	} spin_loop;
>  #endif
>  	bool preempted;
> +	/* If vcpu is in kernel-mode when preempted */
> +	bool in_kernmode;
> +

Why do you have to store that ...

[...]> +	me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>  	kvm_vcpu_set_in_spin_loop(me, true);
>  	/*
>  	 * We boost the priority of a VCPU that is runnable but not
> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  				continue;
>  			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>  				continue;
> +			if (me->in_kernmode && !vcpu->in_kernmode)

Wouldn't it be easier to simply have

in_kernel = kvm_arch_vcpu_in_kernel(me);
...
if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
...

> +				continue;
>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>  				continue;
>  
> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>  {
>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>  
> -	if (current->state == TASK_RUNNING)
> +	if (current->state == TASK_RUNNING) {
>  		vcpu->preempted = true;
> +		vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
> +	}
> +

so you don't have to do this change, too.

>  	kvm_arch_vcpu_put(vcpu);
>  }
>  
>
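
A minimal sketch of David's suggestion applied to the directed-yield loop,
querying the arch helper directly instead of caching a per-vcpu flag. This
is illustrative only: it omits the two-pass/last_boosted_vcpu structure of
the real kvm_vcpu_on_spin(), and (as Paolo notes further down) get_cpl
requires vcpu_load on Intel, so querying a remote vcpu is not free there.

void kvm_vcpu_on_spin(struct kvm_vcpu *me)
{
	struct kvm *kvm = me->kvm;
	struct kvm_vcpu *vcpu;
	bool in_kernel = kvm_arch_vcpu_in_kernel(me);
	int i;

	kvm_vcpu_set_in_spin_loop(me, true);
	kvm_for_each_vcpu(i, vcpu, kvm) {
		if (vcpu == me)
			continue;
		if (swait_active(&vcpu->wq) &&
		    !kvm_arch_vcpu_runnable(vcpu))
			continue;
		/* only boost vcpus preempted in kernel mode when we are
		 * spinning in kernel mode ourselves */
		if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
			continue;
		if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
			continue;
		if (kvm_vcpu_yield_to(vcpu) > 0)
			break;
	}
	kvm_vcpu_set_in_spin_loop(me, false);
}
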
Longpeng(Mike) July 31, 2017, 12:08 p.m. UTC | #2
Hi David,

On 2017/7/31 19:31, David Hildenbrand wrote:

> [no idea if this change makes sense (and especially if it has any bad
> side effects), do you have performance numbers? I'll just have a look at
> the general structure of the patch in the meanwhile]
> 

I don't have any test results yet. Could you suggest some suitable
benchmarks?

>> +bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
> 
> kvm_arch_vcpu_in_kernel() ?
> 

Um...yes, I'll correct this.

>> +{
>> +	return kvm_x86_ops->get_cpl(vcpu) == 0;
>> +}
>> +
>>  int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>>  {
>>  	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index 648b34c..f8f0d74 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>>  	} spin_loop;
>>  #endif
>>  	bool preempted;
>> +	/* If vcpu is in kernel-mode when preempted */
>> +	bool in_kernmode;
>> +
> 
> Why do you have to store that ...
> 

> [...]> +	me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>>  	kvm_vcpu_set_in_spin_loop(me, true);
>>  	/*
>>  	 * We boost the priority of a VCPU that is runnable but not
>> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>>  				continue;
>>  			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>>  				continue;
>> +			if (me->in_kernmode && !vcpu->in_kernmode)
> 
> Wouldn't it be easier to simply have
> 
> in_kernel = kvm_arch_vcpu_in_kernel(me);
> ...
> if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
> ...
> 

I'm not sure whether getting the vcpu's privilege level is
expensive on all architectures, so I record it in kvm_sched_out() to
minimize the extra cycles spent in kvm_vcpu_on_spin().

>> +				continue;
>>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>>  				continue;
>>  
>> @@ -4009,8 +4013,11 @@ static void kvm_sched_out(struct preempt_notifier *pn,
>>  {
>>  	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
>>  
>> -	if (current->state == TASK_RUNNING)
>> +	if (current->state == TASK_RUNNING) {
>>  		vcpu->preempted = true;
>> +		vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
>> +	}
>> +
> 
> so you don't have to do this change, too.
> 
>>  	kvm_arch_vcpu_put(vcpu);
>>  }
>>  
>>
> 
>
David Hildenbrand July 31, 2017, 12:27 p.m. UTC | #3
> I'm not sure whether getting the vcpu's privilege level is
> expensive on all architectures, so I record it in kvm_sched_out() to
> minimize the extra cycles spent in kvm_vcpu_on_spin().
> 

as you only care for x86 right now either way, you can directly optimize
here for the good (here: x86) case (keeping changes and therefore
possible bugs minimal).
Cornelia Huck July 31, 2017, 12:31 p.m. UTC | #4
On Mon, 31 Jul 2017 20:08:14 +0800
"Longpeng (Mike)" <longpeng2@huawei.com> wrote:

> Hi David,
> 
> On 2017/7/31 19:31, David Hildenbrand wrote:

> >> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> >> index 648b34c..f8f0d74 100644
> >> --- a/include/linux/kvm_host.h
> >> +++ b/include/linux/kvm_host.h
> >> @@ -272,6 +272,9 @@ struct kvm_vcpu {
> >>  	} spin_loop;
> >>  #endif
> >>  	bool preempted;
> >> +	/* If vcpu is in kernel-mode when preempted */
> >> +	bool in_kernmode;
> >> +  
> > 
> > Why do you have to store that ...
> >   
> 
> > [...]> +	me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
> >>  	kvm_vcpu_set_in_spin_loop(me, true);
> >>  	/*
> >>  	 * We boost the priority of a VCPU that is runnable but not
> >> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
> >>  				continue;
> >>  			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
> >>  				continue;
> >> +			if (me->in_kernmode && !vcpu->in_kernmode)  
> > 
> > Wouldn't it be easier to simply have
> > 
> > in_kernel = kvm_arch_vcpu_in_kernel(me);
> > ...
> > if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
> > ...
> >   
> 
> I'm not sure whether getting the vcpu's privilege level is
> expensive on all architectures, so I record it in kvm_sched_out() to
> minimize the extra cycles spent in kvm_vcpu_on_spin().

As it is now, this handling looks a bit inconsistent. You only update
the field on sched-out via preemption _or_ if kvm_vcpu_on_spin is
called for the vcpu. In most contexts, this field will have stale
content.

Also, would checking for kernel mode be more expensive than the various
other checks already done in this function?

[I like David's suggestion.]

> 
> >> +				continue;
> >>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
> >>  				continue;
> >>
Longpeng(Mike) July 31, 2017, 1:01 p.m. UTC | #5
On 2017/7/31 20:31, Cornelia Huck wrote:

> On Mon, 31 Jul 2017 20:08:14 +0800
> "Longpeng (Mike)" <longpeng2@huawei.com> wrote:
> 
>> Hi David,
>>
>> On 2017/7/31 19:31, David Hildenbrand wrote:
> 
>>>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>>>> index 648b34c..f8f0d74 100644
>>>> --- a/include/linux/kvm_host.h
>>>> +++ b/include/linux/kvm_host.h
>>>> @@ -272,6 +272,9 @@ struct kvm_vcpu {
>>>>  	} spin_loop;
>>>>  #endif
>>>>  	bool preempted;
>>>> +	/* If vcpu is in kernel-mode when preempted */
>>>> +	bool in_kernmode;
>>>> +  
>>>
>>> Why do you have to store that ...
>>>   
>>
>>> [...]> +	me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
>>>>  	kvm_vcpu_set_in_spin_loop(me, true);
>>>>  	/*
>>>>  	 * We boost the priority of a VCPU that is runnable but not
>>>> @@ -2351,6 +2353,8 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>>>>  				continue;
>>>>  			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
>>>>  				continue;
>>>> +			if (me->in_kernmode && !vcpu->in_kernmode)  
>>>
>>> Wouldn't it be easier to simply have
>>>
>>> in_kernel = kvm_arch_vcpu_in_kernel(me);
>>> ...
>>> if (in_kernel && !kvm_arch_vcpu_in_kernel(vcpu))
>>> ...
>>>   
>>
>> I'm not sure whether getting the vcpu's privilege level is
>> expensive on all architectures, so I record it in kvm_sched_out() to
>> minimize the extra cycles spent in kvm_vcpu_on_spin().
> 
> As it is now, this handling looks a bit inconsistent. You only update
> the field on sched-out via preemption _or_ if kvm_vcpu_on_spin is
> called for the vcpu. In most contexts, this field will have stale
> content.
> 
> Also, would checking for kernel mode be more expensive than the various
> other checks already done in this function?
> 

> [I like David's suggestion.]
> 


Hi Cornelia & David,

I'll take your suggestion, thanks :)

>>
>>>> +				continue;
>>>>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>>>>  				continue;
>>>>  
> 
> .
>
Paolo Bonzini July 31, 2017, 1:20 p.m. UTC | #6
On 31/07/2017 14:27, David Hildenbrand wrote:
>> I'm not sure whether getting the vcpu's privilege level is
>> expensive on all architectures, so I record it in kvm_sched_out() to
>> minimize the extra cycles spent in kvm_vcpu_on_spin().
>>
> as you only care for x86 right now either way, you can directly optimize
> here for the good (here: x86) case (keeping changes and therefore
> possible bugs minimal).

I agree with Cornelia that this is inconsistent, so you shouldn't update
me->in_kernmode in kvm_vcpu_on_spin.  However, get_cpl requires
vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
vmx_read_guest_seg_ar -> vmcs_read32).

Alternatively, we can add a new callback kvm_x86_ops->sched_out to x86
KVM, and call vmx_get_cpl from the Intel implementation (vmx_sched_out).
 This will cache the result until the next sched_in, so that
kvm_vcpu_on_spin can use it.

Paolo
Marc Zyngier July 31, 2017, 1:42 p.m. UTC | #7
On 29/07/17 07:22, Longpeng(Mike) wrote:
> We have discussed the idea here:
> https://www.spinics.net/lists/kvm/msg140593.html
> 
> I think it's also suitable for other architectures.
> 
> If the vcpu (me) exits because it is spinning on a usermode spinlock, then
> the spinlock holder may have been preempted in either usermode or kernel
> mode. But if the vcpu (me) is in kernel mode, then the holder must have
> been preempted in kernel mode, so we should choose a vcpu in kernel mode
> as the most eligible candidate.

That seems to preclude any form of locking between userspace and kernel
(which probably wouldn't be Linux). Are you sure that this form of
construct is not used anywhere? I have the feeling this patch could
break this scenario...

Thanks,

	M.
Paolo Bonzini July 31, 2017, 2:23 p.m. UTC | #8
On 31/07/2017 15:42, Marc Zyngier wrote:
>> If the vcpu (me) exits because it is spinning on a usermode spinlock, then
>> the spinlock holder may have been preempted in either usermode or kernel
>> mode. But if the vcpu (me) is in kernel mode, then the holder must have
>> been preempted in kernel mode, so we should choose a vcpu in kernel mode
>> as the most eligible candidate.
>
> That seems to preclude any form of locking between userspace and kernel
> (which probably wouldn't be Linux). Are you sure that this form of
> construct is not used anywhere? I have the feeling this patch could
> break this scenario...

It's just a heuristic; it would only be broken if you overcommit, and it
would be just as broken as if KVM didn't implement directed yield at all.

Paolo
Longpeng(Mike) Aug. 1, 2017, 3:26 a.m. UTC | #9
On 2017/7/31 21:20, Paolo Bonzini wrote:

> On 31/07/2017 14:27, David Hildenbrand wrote:
>>> I'm not sure whether getting the vcpu's privilege level is
>>> expensive on all architectures, so I record it in kvm_sched_out() to
>>> minimize the extra cycles spent in kvm_vcpu_on_spin().
>>>
>> as you only care for x86 right now either way, you can directly optimize
>> here for the good (here: x86) case (keeping changes and therefore
>> possible bugs minimal).
> 
> I agree with Cornelia that this is inconsistent, so you shouldn't update
> me->in_kernmode in kvm_vcpu_on_spin.  However, get_cpl requires
> vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
> vmx_read_guest_seg_ar -> vmcs_read32).
> 

Hi Paolo,

It seems that other architectures (e.g. arm/s390) don't need to cache the result,
but x86 does, so I should move 'in_kernmode' into kvm_vcpu_arch and only add
this field to x86, right?

> Alternatively, we can add a new callback kvm_x86_ops->sched_out to x86
> KVM, and call vmx_get_cpl from the Intel implementation (vmx_sched_out).


In this approach, vmx_sched_out would only call vmx_get_cpl; isn't that a bit
redundant, since we could just call kvm_x86_ops->get_cpl at the right place instead?

>  This will cache the result until the next sched_in, so that


'until the next sched_in' --> Do we need to clear the result in sched_in?

> kvm_vcpu_on_spin can use it.
> 
> Paolo
> 
> .
>
Paolo Bonzini Aug. 1, 2017, 8:32 a.m. UTC | #10
On 01/08/2017 05:26, Longpeng (Mike) wrote:
> 
> 
> On 2017/7/31 21:20, Paolo Bonzini wrote:
> 
>> On 31/07/2017 14:27, David Hildenbrand wrote:
>>>> I'm not sure whether getting the vcpu's privilege level is
>>>> expensive on all architectures, so I record it in kvm_sched_out() to
>>>> minimize the extra cycles spent in kvm_vcpu_on_spin().
>>>>
>>> as you only care for x86 right now either way, you can directly optimize
>>> here for the good (here: x86) case (keeping changes and therefore
>>> possible bugs minimal).
>>
>> I agree with Cornelia that this is inconsistent, so you shouldn't update
>> me->in_kernmode in kvm_vcpu_on_spin.  However, get_cpl requires
>> vcpu_load on Intel x86, so Mike's patch is necessary (vmx_get_cpl ->
>> vmx_read_guest_seg_ar -> vmcs_read32).
>>
> 
> Hi Paolo,
> 
> It seems that other architectures (e.g. arm/s390) don't need to cache the result,
> but x86 does, so I should move 'in_kernmode' into kvm_vcpu_arch and only add
> this field to x86, right?

That's another way to do it, and I like it.

>>  This will cache the result until the next sched_in, so that
> 
> 'until the next sched_in' --> Do we need to clear the result in sched_in?

No, thanks.

Paolo
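
A rough sketch of the direction agreed above: make the cached flag x86-only
by moving it into struct kvm_vcpu_arch, refresh it when the vcpu is scheduled
out (while its state is still loaded, so reading the CPL is cheap), and let
the generic code see it only through kvm_arch_vcpu_in_kernel(). The field
name and the exact update point are illustrative, not taken from the posted
patch.

/* arch/x86/include/asm/kvm_host.h (sketch) */
struct kvm_vcpu_arch {
	/* ... existing fields ... */
	bool preempted_in_kernel;	/* was CPL 0 when last scheduled out */
};

/* arch/x86/kvm/x86.c (sketch) */
void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
{
	/*
	 * The vcpu is still loaded at this point, so get_cpl() does not
	 * need an extra vcpu_load; whether this lives here or in a new
	 * kvm_x86_ops->sched_out hook was left open in the thread.
	 */
	vcpu->arch.preempted_in_kernel = kvm_x86_ops->get_cpl(vcpu) == 0;

	/* ... existing put path (kvm_x86_ops->vcpu_put, etc.) ... */
}

bool kvm_arch_vcpu_in_kernel(struct kvm_vcpu *vcpu)
{
	return vcpu->arch.preempted_in_kernel;
}

The yield loop in kvm_vcpu_on_spin() can then check
kvm_arch_vcpu_in_kernel(vcpu) for each candidate, as in David's suggestion,
without touching the candidate's VMCS.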

Patch

diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index d4b2ad1..2e2701d 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -98,6 +98,11 @@  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 	return !!(vcpu->arch.pending_exceptions);
 }
 
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return 1;
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1a75c0b..2489f64 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -58,6 +58,11 @@  int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 	return !!(v->arch.pending_exceptions) || kvm_request_pending(v);
 }
 
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return 1;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 3f2884e..9d7c42e 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2443,6 +2443,11 @@  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 	return kvm_s390_vcpu_has_irq(vcpu, 0);
 }
 
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 void kvm_s390_vcpu_block(struct kvm_vcpu *vcpu)
 {
 	atomic_or(PROG_BLOCK_SIE, &vcpu->arch.sie_block->prog20);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 82a63c5..b5a2e53 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8435,6 +8435,11 @@  int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
 }
 
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+	return kvm_x86_ops->get_cpl(vcpu) == 0;
+}
+
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 648b34c..f8f0d74 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -272,6 +272,9 @@  struct kvm_vcpu {
 	} spin_loop;
 #endif
 	bool preempted;
+	/* If vcpu is in kernel-mode when preempted */
+	bool in_kernmode;
+
 	struct kvm_vcpu_arch arch;
 	struct dentry *debugfs_dentry;
 };
@@ -797,6 +800,7 @@  int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 void kvm_arch_hardware_unsetup(void);
 void kvm_arch_check_processor_compat(void *rtn);
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu);
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu);
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu);
 
 #ifndef __KVM_HAVE_ARCH_VM_ALLOC
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index a39a1e1..ca6a394 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -416,6 +416,11 @@  int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 		&& !v->arch.power_off && !v->arch.pause);
 }
 
+bool kvm_arch_vcpu_spin_kernmode(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 /* Just ensure a guest exit from a particular CPU */
 static void exit_vm_noop(void *info)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 82987d4..8d83caa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -290,6 +290,7 @@  int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	kvm_vcpu_set_in_spin_loop(vcpu, false);
 	kvm_vcpu_set_dy_eligible(vcpu, false);
 	vcpu->preempted = false;
+	vcpu->in_kernmode = false;
 
 	r = kvm_arch_vcpu_init(vcpu);
 	if (r < 0)
@@ -2330,6 +2331,7 @@  void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 	int pass;
 	int i;
 
+	me->in_kernmode = kvm_arch_vcpu_spin_kernmode(me);
 	kvm_vcpu_set_in_spin_loop(me, true);
 	/*
 	 * We boost the priority of a VCPU that is runnable but not
@@ -2351,6 +2353,8 @@  void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 				continue;
 			if (swait_active(&vcpu->wq) && !kvm_arch_vcpu_runnable(vcpu))
 				continue;
+			if (me->in_kernmode && !vcpu->in_kernmode)
+				continue;
 			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
 				continue;
 
@@ -4009,8 +4013,11 @@  static void kvm_sched_out(struct preempt_notifier *pn,
 {
 	struct kvm_vcpu *vcpu = preempt_notifier_to_vcpu(pn);
 
-	if (current->state == TASK_RUNNING)
+	if (current->state == TASK_RUNNING) {
 		vcpu->preempted = true;
+		vcpu->in_kernmode = kvm_arch_vcpu_spin_kernmode(vcpu);
+	}
+
 	kvm_arch_vcpu_put(vcpu);
 }