[5/5] KVM: PPC: Book3S HV: Check wait conditions before sleeping in kvmppc_vcore_blocked

Message ID 1414990320-6378-6-git-send-email-paulus@samba.org (mailing list archive)
State New, archived

Commit Message

Paul Mackerras Nov. 3, 2014, 4:52 a.m. UTC
From: "Suresh E. Warrier" <warrier@linux.vnet.ibm.com>

The kvmppc_vcore_blocked() code does not check for the wait condition
after putting the process on the wait queue. This means that it is
possible for an external interrupt to become pending, but the vcpu to
remain asleep until the next decrementer interrupt.  The fix is to
make one last check for pending exceptions and ceded state before
calling schedule().

Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/kvm/book3s_hv.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Alexander Graf Nov. 20, 2014, 5:36 p.m. UTC | #1
On 03.11.14 05:52, Paul Mackerras wrote:
> From: "Suresh E. Warrier" <warrier@linux.vnet.ibm.com>
> 
> The kvmppc_vcore_blocked() code does not check for the wait condition
> after putting the process on the wait queue. This means that it is
> possible for an external interrupt to become pending, but the vcpu to
> remain asleep until the next decrementer interrupt.  The fix is to
> make one last check for pending exceptions and ceded state before
> calling schedule().
> 
> Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
> Signed-off-by: Paul Mackerras <paulus@samba.org>

I don't understand the race you're fixing here. Can you please explain it?


Alex

> ---
>  arch/powerpc/kvm/book3s_hv.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index cd7e030..1a7a281 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1828,9 +1828,29 @@ static void kvmppc_wait_for_exec(struct kvm_vcpu *vcpu, int wait_state)
>   */
>  static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>  {
> +	struct kvm_vcpu *vcpu;
> +	int do_sleep = 1;
> +
>  	DEFINE_WAIT(wait);
>  
>  	prepare_to_wait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
> +
> +	/*
> +	 * Check one last time for pending exceptions and ceded state after
> +	 * we put ourselves on the wait queue
> +	 */
> +	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
> +		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
> +			do_sleep = 0;
> +			break;
> +		}
> +	}
> +
> +	if (!do_sleep) {
> +		finish_wait(&vc->wq, &wait);
> +		return;
> +	}
> +
>  	vc->vcore_state = VCORE_SLEEPING;
>  	spin_unlock(&vc->lock);
>  	schedule();
> 
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suresh E. Warrier Nov. 20, 2014, 7:31 p.m. UTC | #2
On 11/20/2014 11:36 AM, Alexander Graf wrote:
> 
> 
> On 03.11.14 05:52, Paul Mackerras wrote:
>> From: "Suresh E. Warrier" <warrier@linux.vnet.ibm.com>
>>
>> The kvmppc_vcore_blocked() code does not check for the wait condition
>> after putting the process on the wait queue. This means that it is
>> possible for an external interrupt to become pending, but the vcpu to
>> remain asleep until the next decrementer interrupt.  The fix is to
>> make one last check for pending exceptions and ceded state before
>> calling schedule().
>>
>> Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
>> Signed-off-by: Paul Mackerras <paulus@samba.org>
> 
> I don't understand the race you're fixing here. Can you please explain it?
> 

When a virtual interrupt needs to be delivered to the guest, and the
virtual ICS state for the interrupt and virtual ICP state for the VCPU
allow for the VCPU to be immediately interrupted, we
1. Set the BOOK3S_INTERRUPT_EXTERNAL_LEVEL bit in pending_exceptions.
2. Call kvmppc_fast_vcpu_kick_hv(), which checks the wait queue at vcpu->wq
   to wake the VCPU up.

The caller of kvmppc_vcore_blocked() does check for pending exceptions, but
that check happens before the VCPU is on the wait queue. If the interrupt
arrives in between, the kick finds no waiter to wake, so we do need to check
again after the VCPU is put on the wait queue.

-suresh
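
For readers following the thread, the window described above can be modelled
as a small single-threaded sketch. This is an illustration only, not kernel
code: struct model, model_kick() and model_lost_wakeup() are hypothetical
names, and the kick is injected at a fixed point (after the caller's check,
before the waiter is queued) to stand in for the concurrent interrupt path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; none of these names are kernel APIs. */
struct model {
	bool pending;   /* stands in for vcpu->arch.pending_exceptions */
	bool on_queue;  /* the waiter has been added to the wait queue */
	bool woken;     /* a wakeup was delivered to a queued waiter */
};

/* Interrupt delivery: set the pending bit, then wake a queued waiter, if any. */
static void model_kick(struct model *m)
{
	m->pending = true;
	if (m->on_queue)
		m->woken = true;
}

/*
 * Returns true if the waiter ends up asleep with work pending (a lost
 * wakeup). The kick lands between the caller's check and the enqueue.
 */
static bool model_lost_wakeup(bool recheck_after_enqueue)
{
	struct model m = { false, false, false };

	if (m.pending)          /* caller's check: nothing pending yet */
		return false;

	assert(!m.on_queue);    /* the waiter is not queued at this point */
	model_kick(&m);         /* interrupt arrives; nobody to wake */

	m.on_queue = true;      /* models prepare_to_wait() */

	if (recheck_after_enqueue && m.pending) {
		m.on_queue = false; /* models finish_wait(); do not sleep */
		return false;
	}

	/* models schedule(): the waiter sleeps unless it was already woken */
	return m.pending && !m.woken;
}
```

In this model, model_lost_wakeup(false) reports the lost wakeup and
model_lost_wakeup(true) does not, which is the effect of re-checking after
the waiter is queued.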

> 
> Alex
> 
>> ---
>>  arch/powerpc/kvm/book3s_hv.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index cd7e030..1a7a281 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -1828,9 +1828,29 @@ static void kvmppc_wait_for_exec(struct kvm_vcpu *vcpu, int wait_state)
>>   */
>>  static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
>>  {
>> +	struct kvm_vcpu *vcpu;
>> +	int do_sleep = 1;
>> +
>>  	DEFINE_WAIT(wait);
>>  
>>  	prepare_to_wait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
>> +
>> +	/*
>> +	 * Check one last time for pending exceptions and ceded state after
>> +	 * we put ourselves on the wait queue
>> +	 */
>> +	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
>> +		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
>> +			do_sleep = 0;
>> +			break;
>> +		}
>> +	}
>> +
>> +	if (!do_sleep) {
>> +		finish_wait(&vc->wq, &wait);
>> +		return;
>> +	}
>> +
>>  	vc->vcore_state = VCORE_SLEEPING;
>>  	spin_unlock(&vc->lock);
>>  	schedule();
>>
> 

Alexander Graf Nov. 23, 2014, 12:41 a.m. UTC | #3
On 20.11.14 20:31, Suresh E. Warrier wrote:
> 
> 
> On 11/20/2014 11:36 AM, Alexander Graf wrote:
>>
>>
>> On 03.11.14 05:52, Paul Mackerras wrote:
>>> From: "Suresh E. Warrier" <warrier@linux.vnet.ibm.com>
>>>
>>> The kvmppc_vcore_blocked() code does not check for the wait condition
>>> after putting the process on the wait queue. This means that it is
>>> possible for an external interrupt to become pending, but the vcpu to
>>> remain asleep until the next decrementer interrupt.  The fix is to
>>> make one last check for pending exceptions and ceded state before
>>> calling schedule().
>>>
>>> Signed-off-by: Suresh Warrier <warrier@linux.vnet.ibm.com>
>>> Signed-off-by: Paul Mackerras <paulus@samba.org>
>>
>> I don't understand the race you're fixing here. Can you please explain it?
>>
> 
> When a virtual interrupt needs to be delivered to the guest, and the
> virtual ICS state for the interrupt and virtual ICP state for the VCPU
> allow for the VCPU to be immediately interrupted, we
> 1. Set the BOOK3S_INTERRUPT_EXTERNAL_LEVEL bit in pending_exceptions.
> 2. Call kvmppc_fast_vcpu_kick_hv(), which checks the wait queue at vcpu->wq
>    to wake the VCPU up.
> 
> The caller of kvmppc_vcore_blocked() does check for pending exceptions, but
> that check happens before the VCPU is on the wait queue. If the interrupt
> arrives in between, the kick finds no waiter to wake, so we do need to check
> again after the VCPU is put on the wait queue.

I see. Thanks, applied to kvm-ppc-queue.


Alex

Patch

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index cd7e030..1a7a281 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1828,9 +1828,29 @@  static void kvmppc_wait_for_exec(struct kvm_vcpu *vcpu, int wait_state)
  */
 static void kvmppc_vcore_blocked(struct kvmppc_vcore *vc)
 {
+	struct kvm_vcpu *vcpu;
+	int do_sleep = 1;
+
 	DEFINE_WAIT(wait);
 
 	prepare_to_wait(&vc->wq, &wait, TASK_INTERRUPTIBLE);
+
+	/*
+	 * Check one last time for pending exceptions and ceded state after
+	 * we put ourselves on the wait queue
+	 */
+	list_for_each_entry(vcpu, &vc->runnable_threads, arch.run_list) {
+		if (vcpu->arch.pending_exceptions || !vcpu->arch.ceded) {
+			do_sleep = 0;
+			break;
+		}
+	}
+
+	if (!do_sleep) {
+		finish_wait(&vc->wq, &wait);
+		return;
+	}
+
 	vc->vcore_state = VCORE_SLEEPING;
 	spin_unlock(&vc->lock);
 	schedule();
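
The loop added by the patch amounts to a predicate over the vcore's runnable
threads: sleep only if every vcpu has ceded and none has a pending exception.
A minimal standalone sketch of that predicate follows. It assumes a plain
array instead of the kernel's list_for_each_entry() walk, and struct
model_vcpu and vcore_may_sleep() are hypothetical names introduced here for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields the patch inspects. */
struct model_vcpu {
	unsigned long pending_exceptions; /* non-zero: something to deliver */
	bool ceded;                       /* true: the vcpu has ceded */
};

/*
 * Returns true only if no vcpu has a pending exception and all have ceded.
 * An empty set returns true, matching do_sleep staying 1 in the patch.
 */
static bool vcore_may_sleep(const struct model_vcpu *vcpus, size_t n)
{
	assert(vcpus != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (vcpus[i].pending_exceptions || !vcpus[i].ceded)
			return false; /* corresponds to do_sleep = 0; break; */
	}
	return true;
}
```

A single vcpu that is still running, or that has any bit set in
pending_exceptions, is enough to keep the whole vcore from sleeping.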