
[3/3] KVM: arm/arm64: vgic: Don't rely on the wrong pending table

Message ID 20191029071919.177-4-yuzenghui@huawei.com (mailing list archive)
State New, archived
Series KVM: arm/arm64: vgic: Some cleanups and fixes

Commit Message

Zenghui Yu Oct. 29, 2019, 7:19 a.m. UTC
It's possible for two LPIs to reside in the same "byte_offset" but
target two different vcpus, so that their pending status is indicated
by two different pending tables.  In such a scenario, the
last_byte_offset optimization leads KVM to rely on the wrong pending
table entry.  Let us use last_ptr instead, which can be treated as a
byte index into a pending table and is also vcpu specific.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
---

If this patch does the right thing, we can even add:

Fixes: 280771252c1b ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")

But to be honest, I'm not clear about what this patch has actually fixed.
Pending tables should contain all zeros before we flush the vgic_irq's
pending status into the guest's RAM (given that the guest should never
write anything into them). So the pending table entry we've read from
guest memory always seems to be zero, and we will do the right thing
even if we rely on the wrong pending table entry.

I think I must have some misunderstanding here... Please fix me.
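
For illustration only (all addresses below are made up): two LPIs can
share intid / BITS_PER_BYTE while living in different vcpus' pending
tables, so comparing byte offsets alone cannot tell them apart:

	#include <stdint.h>
	#include <stdio.h>

	#define BITS_PER_BYTE	8
	typedef uint64_t gpa_t;		/* stand-in for the kernel typedef */

	int main(void)
	{
		/* Two LPIs falling into the same byte of a pending table... */
		uint32_t intid_a = 8192, intid_b = 8197; /* both / 8 == 1024 */
		/* ...but routed to vcpus whose GICR_PENDBASER differ. */
		gpa_t pendbase_a = 0x80000000ULL;	/* vcpu0's table */
		gpa_t pendbase_b = 0x80100000ULL;	/* vcpu1's table */

		int byte_a = intid_a / BITS_PER_BYTE;	/* 1024 */
		int byte_b = intid_b / BITS_PER_BYTE;	/* 1024: collides */
		gpa_t ptr_a = pendbase_a + byte_a;	/* 0x80000400 */
		gpa_t ptr_b = pendbase_b + byte_b;	/* 0x80100400: differs */

		/* last_byte_offset would wrongly skip the reload for the
		 * second LPI; last_ptr would not. */
		printf("byte_offset equal: %d, ptr equal: %d\n",
		       byte_a == byte_b, ptr_a == ptr_b);
		return 0;
	}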

 virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Marc Zyngier Oct. 29, 2019, 9:23 a.m. UTC | #1
On Tue, 29 Oct 2019 07:19:19 +0000,
Zenghui Yu <yuzenghui@huawei.com> wrote:
> 
> It's possible for two LPIs to reside in the same "byte_offset" but
> target two different vcpus, so that their pending status is indicated
> by two different pending tables.  In such a scenario, the
> last_byte_offset optimization leads KVM to rely on the wrong pending
> table entry.  Let us use last_ptr instead, which can be treated as a
> byte index into a pending table and is also vcpu specific.
> 
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> ---
> 
> If this patch does the right thing, we can even add:
> 
> Fixes: 280771252c1b ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
> 
> But to be honest, I'm not clear about what this patch has actually fixed.
> Pending tables should contain all zeros before we flush the vgic_irq's
> pending status into the guest's RAM (given that the guest should never
> write anything into them). So the pending table entry we've read from
> guest memory always seems to be zero, and we will do the right thing
> even if we rely on the wrong pending table entry.
> 
> I think I must have some misunderstanding here... Please fix me.

I think you're spot on, and it is the code that needs fixing, not you!
The problem is that we only read a byte once, irrespective of the vcpu
the interrupt is routed to. If we switch to another vcpu for the same
byte offset, we must reload it.

This can be done by either checking the vcpu, or by tracking the guest
address that we read from (just like you do here).
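
A minimal sketch of the first option, for comparison (last_vcpu is a
hypothetical local here, not something in the posted patch):

	/* Reload whenever the byte offset *or* the target vcpu changes. */
	if (byte_offset != last_byte_offset || vcpu != last_vcpu) {
		ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
		if (ret)
			return ret;
		last_byte_offset = byte_offset;
		last_vcpu = vcpu;
	}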

A small comment below:

>  virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> index 5ef93e5041e1..7cd2e2f81513 100644
> --- a/virt/kvm/arm/vgic/vgic-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-v3.c
> @@ -363,8 +363,8 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>  int vgic_v3_save_pending_tables(struct kvm *kvm)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
> -	int last_byte_offset = -1;
>  	struct vgic_irq *irq;
> +	gpa_t last_ptr = -1;

This should be written as

     gpa_t last_ptr = ~(gpa_t)0;
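
Both forms end up as the same all-ones value, since gpa_t is an
unsigned 64-bit type in the kernel (typedef u64 gpa_t); the explicit
form just states the intent. A quick standalone check:

     typedef uint64_t gpa_t;   /* stand-in for the kernel's u64 typedef */

     gpa_t a = -1;             /* implicit conversion: all bits set */
     gpa_t b = ~(gpa_t)0;      /* explicit: all bits set, intent obvious */
     /* a == b == 0xffffffffffffffff */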

>  	int ret;
>  	u8 val;
>  
> @@ -384,11 +384,11 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
>  		bit_nr = irq->intid % BITS_PER_BYTE;
>  		ptr = pendbase + byte_offset;
>  
> -		if (byte_offset != last_byte_offset) {
> +		if (ptr != last_ptr) {
>  			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
>  			if (ret)
>  				return ret;
> -			last_byte_offset = byte_offset;
> +			last_ptr = ptr;
>  		}
>  
>  		stored = val & (1U << bit_nr);

Otherwise, this looks good to me (no need to respin for the above
nit).

Eric, can I get an Ack from you, since you wrote this code?

Thanks,

	M.
Eric Auger Oct. 29, 2019, 12:17 p.m. UTC | #2
Hi Zenghui, Marc,

On 10/29/19 8:19 AM, Zenghui Yu wrote:
> It's possible for two LPIs to reside in the same "byte_offset" but
> target two different vcpus, so that their pending status is indicated
> by two different pending tables.  In such a scenario, the
> last_byte_offset optimization leads KVM to rely on the wrong pending
> table entry.  Let us use last_ptr instead, which can be treated as a
> byte index into a pending table and is also vcpu specific.
> 
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> ---
> 
> If this patch does the right thing, we can even add:
> 
> Fixes: 280771252c1b ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
> 
> But to be honest, I'm not clear about what this patch has actually fixed.
> Pending tables should contain all zeros before we flush the vgic_irq's
> pending status into the guest's RAM (given that the guest should never
> write anything into them). So the pending table entry we've read from
> guest memory always seems to be zero, and we will do the right thing
> even if we rely on the wrong pending table entry.
> 
> I think I must have some misunderstanding here... Please fix me.
> 
>  virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
> index 5ef93e5041e1..7cd2e2f81513 100644
> --- a/virt/kvm/arm/vgic/vgic-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-v3.c
> @@ -363,8 +363,8 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>  int vgic_v3_save_pending_tables(struct kvm *kvm)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
> -	int last_byte_offset = -1;
>  	struct vgic_irq *irq;
> +	gpa_t last_ptr = -1;
>  	int ret;
>  	u8 val;
>  
> @@ -384,11 +384,11 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
>  		bit_nr = irq->intid % BITS_PER_BYTE;
>  		ptr = pendbase + byte_offset;
>  
> -		if (byte_offset != last_byte_offset) {
> +		if (ptr != last_ptr) {
>  			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
>  			if (ret)
>  				return ret;
> -			last_byte_offset = byte_offset;
> +			last_ptr = ptr;
>  		}
>  
>  		stored = val & (1U << bit_nr);
> 
Acked-by: Eric Auger <eric.auger@redhat.com>

Thanks for fixing this.

Eric
Zenghui Yu Oct. 29, 2019, 12:27 p.m. UTC | #3
Hi Marc,

On 2019/10/29 17:23, Marc Zyngier wrote:
> On Tue, 29 Oct 2019 07:19:19 +0000,
> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>
>> It's possible for two LPIs to reside in the same "byte_offset" but
>> target two different vcpus, so that their pending status is indicated
>> by two different pending tables.  In such a scenario, the
>> last_byte_offset optimization leads KVM to rely on the wrong pending
>> table entry.  Let us use last_ptr instead, which can be treated as a
>> byte index into a pending table and is also vcpu specific.
>>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>
>> If this patch does the right thing, we can even add:
>>
>> Fixes: 280771252c1b ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
>>
>> But to be honest, I'm not clear about what this patch has actually fixed.
>> Pending tables should contain all zeros before we flush the vgic_irq's
>> pending status into the guest's RAM (given that the guest should never
>> write anything into them). So the pending table entry we've read from
>> guest memory always seems to be zero, and we will do the right thing
>> even if we rely on the wrong pending table entry.
>>
>> I think I must have some misunderstanding here... Please fix me.
> 
> I think you're spot on, and it is the code that needs fixing, not you!
> The problem is that we only read a byte once, irrespective of the vcpu
> the interrupt is routed to. If we switch to another vcpu for the same
> byte offset, we must reload it.
> 
> This can be done by either checking the vcpu, or by tracking the guest
> address that we read from (just like you do here).

Okay, the remaining question is about this code in vgic_v3_save_pending_tables():

	stored = val & (1U << bit_nr);
	if (stored == irq->pending_latch)
		continue;

	if (irq->pending_latch)
		val |= 1 << bit_nr;
	else
		val &= ~(1 << bit_nr);

Do we really have a scenario where irq->pending_latch==false and
stored==true (corresponding to the "else" above), where we then clear
the pending status of this LPI in guest memory?
I cannot think of one right now.

> 
> A small comment below:
> 
>>   virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
>> index 5ef93e5041e1..7cd2e2f81513 100644
>> --- a/virt/kvm/arm/vgic/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic/vgic-v3.c
>> @@ -363,8 +363,8 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>>   int vgic_v3_save_pending_tables(struct kvm *kvm)
>>   {
>>   	struct vgic_dist *dist = &kvm->arch.vgic;
>> -	int last_byte_offset = -1;
>>   	struct vgic_irq *irq;
>> +	gpa_t last_ptr = -1;
> 
> This should be written as
> 
>       gpa_t last_ptr = ~(gpa_t)0;

Thanks for pointing it out.

> 
>>   	int ret;
>>   	u8 val;
>>   
>> @@ -384,11 +384,11 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
>>   		bit_nr = irq->intid % BITS_PER_BYTE;
>>   		ptr = pendbase + byte_offset;
>>   
>> -		if (byte_offset != last_byte_offset) {
>> +		if (ptr != last_ptr) {
>>   			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
>>   			if (ret)
>>   				return ret;
>> -			last_byte_offset = byte_offset;
>> +			last_ptr = ptr;
>>   		}
>>   
>>   		stored = val & (1U << bit_nr);
> 
> Otherwise, this looks good to me (no need to respin for the above
> nit).

Thanks,

Zenghui
Zenghui Yu Oct. 29, 2019, 12:30 p.m. UTC | #4
On 2019/10/29 20:17, Auger Eric wrote:
> Hi Zenghui, Marc,
> 
> On 10/29/19 8:19 AM, Zenghui Yu wrote:
>> It's possible for two LPIs to reside in the same "byte_offset" but
>> target two different vcpus, so that their pending status is indicated
>> by two different pending tables.  In such a scenario, the
>> last_byte_offset optimization leads KVM to rely on the wrong pending
>> table entry.  Let us use last_ptr instead, which can be treated as a
>> byte index into a pending table and is also vcpu specific.
>>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>
>> If this patch does the right thing, we can even add:
>>
>> Fixes: 280771252c1b ("KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
>>
>> But to be honest, I'm not clear about what this patch has actually fixed.
>> Pending tables should contain all zeros before we flush the vgic_irq's
>> pending status into the guest's RAM (given that the guest should never
>> write anything into them). So the pending table entry we've read from
>> guest memory always seems to be zero, and we will do the right thing
>> even if we rely on the wrong pending table entry.
>>
>> I think I must have some misunderstanding here... Please fix me.
>>
>>   virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
>> index 5ef93e5041e1..7cd2e2f81513 100644
>> --- a/virt/kvm/arm/vgic/vgic-v3.c
>> +++ b/virt/kvm/arm/vgic/vgic-v3.c
>> @@ -363,8 +363,8 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
>>   int vgic_v3_save_pending_tables(struct kvm *kvm)
>>   {
>>   	struct vgic_dist *dist = &kvm->arch.vgic;
>> -	int last_byte_offset = -1;
>>   	struct vgic_irq *irq;
>> +	gpa_t last_ptr = -1;
>>   	int ret;
>>   	u8 val;
>>   
>> @@ -384,11 +384,11 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
>>   		bit_nr = irq->intid % BITS_PER_BYTE;
>>   		ptr = pendbase + byte_offset;
>>   
>> -		if (byte_offset != last_byte_offset) {
>> +		if (ptr != last_ptr) {
>>   			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
>>   			if (ret)
>>   				return ret;
>> -			last_byte_offset = byte_offset;
>> +			last_ptr = ptr;
>>   		}
>>   
>>   		stored = val & (1U << bit_nr);
>>
> Acked-by: Eric Auger <eric.auger@redhat.com>

Thanks Eric,


Zenghui
Eric Auger Oct. 29, 2019, 12:49 p.m. UTC | #5
Hi Zenghui,

On 10/29/19 1:27 PM, Zenghui Yu wrote:
> Hi Marc,
> 
> On 2019/10/29 17:23, Marc Zyngier wrote:
>> On Tue, 29 Oct 2019 07:19:19 +0000,
>> Zenghui Yu <yuzenghui@huawei.com> wrote:
>>>
>>> It's possible for two LPIs to reside in the same "byte_offset" but
>>> target two different vcpus, so that their pending status is indicated
>>> by two different pending tables.  In such a scenario, the
>>> last_byte_offset optimization leads KVM to rely on the wrong pending
>>> table entry.  Let us use last_ptr instead, which can be treated as a
>>> byte index into a pending table and is also vcpu specific.
>>>
>>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>>> ---
>>>
>>> If this patch does the right thing, we can even add:
>>>
>>> Fixes: 280771252c1b ("KVM: arm64: vgic-v3:
>>> KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES")
>>>
>>> But to be honest, I'm not clear about what this patch has actually
>>> fixed.
>>> Pending tables should contain all zeros before we flush the vgic_irq's
>>> pending status into the guest's RAM (given that the guest should never
>>> write anything into them). So the pending table entry we've read from
>>> guest memory always seems to be zero, and we will do the right thing
>>> even if we rely on the wrong pending table entry.
>>>
>>> I think I must have some misunderstanding here... Please fix me.
>>
>> I think you're spot on, and it is the code that needs fixing, not you!
>> The problem is that we only read a byte once, irrespective of the vcpu
>> the interrupt is routed to. If we switch to another vcpu for the same
>> byte offset, we must reload it.
>>
>> This can be done by either checking the vcpu, or by tracking the guest
>> address that we read from (just like you do here).
> 
> Okay, the remaining question is about this code in vgic_v3_save_pending_tables():
> 
>     stored = val & (1U << bit_nr);
>     if (stored == irq->pending_latch)
>         continue;
> 
>     if (irq->pending_latch)
>         val |= 1 << bit_nr;
>     else
>         val &= ~(1 << bit_nr);
> 
> Do we really have a scenario where irq->pending_latch==false and
> stored==true (corresponding to the "else" above), where we then clear
> the pending status of this LPI in guest memory?
> I cannot think of one right now.

If you save, restore, and save again: on the first save the LPI may be
pending, so it gets stored. On the second save, the LPI may not be
pending anymore?
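
Roughly this sequence (assuming, for the sake of argument, that the
restore path left the stored bit alone):

	/* Hypothetical timeline, if restore did not clear the table:
	 *   save #1:  pending_latch == true   -> bit written as 1
	 *   restore:  pending_latch reloaded from the guest table
	 *   save #2:  pending_latch == false, stored == true
	 *             -> the "else" branch must clear the bit
	 */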

Thanks

Eric
> 
>>
>> A small comment below:
>>
>>>   virt/kvm/arm/vgic/vgic-v3.c | 6 +++---
>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
>>> index 5ef93e5041e1..7cd2e2f81513 100644
>>> --- a/virt/kvm/arm/vgic/vgic-v3.c
>>> +++ b/virt/kvm/arm/vgic/vgic-v3.c
>>> @@ -363,8 +363,8 @@ int vgic_v3_lpi_sync_pending_status(struct kvm
>>> *kvm, struct vgic_irq *irq)
>>>   int vgic_v3_save_pending_tables(struct kvm *kvm)
>>>   {
>>>       struct vgic_dist *dist = &kvm->arch.vgic;
>>> -    int last_byte_offset = -1;
>>>       struct vgic_irq *irq;
>>> +    gpa_t last_ptr = -1;
>>
>> This should be written as
>>
>>       gpa_t last_ptr = ~(gpa_t)0;
> 
> Thanks for pointing it out.
> 
>>
>>>       int ret;
>>>       u8 val;
>>>   @@ -384,11 +384,11 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
>>>           bit_nr = irq->intid % BITS_PER_BYTE;
>>>           ptr = pendbase + byte_offset;
>>>   -        if (byte_offset != last_byte_offset) {
>>> +        if (ptr != last_ptr) {
>>>               ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
>>>               if (ret)
>>>                   return ret;
>>> -            last_byte_offset = byte_offset;
>>> +            last_ptr = ptr;
>>>           }
>>>             stored = val & (1U << bit_nr);
>>
>> Otherwise, this looks good to me (no need to respin for the above
>> nit).
> 
> Thanks,
> 
> Zenghui
Zenghui Yu Oct. 29, 2019, 1:31 p.m. UTC | #6
Hi Eric,

On 2019/10/29 20:49, Auger Eric wrote:
> On 10/29/19 1:27 PM, Zenghui Yu wrote:
>> Okay, the remaining question is about this code in vgic_v3_save_pending_tables():
>>
>>      stored = val & (1U << bit_nr);
>>      if (stored == irq->pending_latch)
>>          continue;
>>
>>      if (irq->pending_latch)
>>          val |= 1 << bit_nr;
>>      else
>>          val &= ~(1 << bit_nr);
>>
>> Do we really have a scenario where irq->pending_latch==false and
>> stored==true (corresponding to the "else" above), where we then clear
>> the pending status of this LPI in guest memory?
>> I cannot think of one right now.
> 
> If you save, restore, and save again: on the first save the LPI may be
> pending, so it gets stored. On the second save, the LPI may not be
> pending anymore?

I assume you mean the restore done by vgic_its_restore_ite().

While restoring an LPI, we sync the pending status from the guest
pending table (into the software pending_latch) and clear the
corresponding bit in guest memory.
See vgic_v3_lpi_sync_pending_status().

So on the second save, if the LPI is not pending, the guest pending
table will also indicate not pending.
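
For reference, an abridged sketch of that clear-on-read behaviour
(locking, retries and the address computation are trimmed from the
actual function):

	/* vgic_v3_lpi_sync_pending_status(), abridged: the pending bit
	 * is consumed from the guest table and cleared in guest memory. */
	ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
	if (ret)
		return ret;

	irq->pending_latch = val & (1 << bit_nr);

	if (irq->pending_latch) {
		/* clear consumed data */
		val &= ~(1 << bit_nr);
		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
		if (ret)
			return ret;
	}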


Thanks,
Zenghui
Eric Auger Oct. 29, 2019, 10:52 p.m. UTC | #7
Hi Zenghui,

On 10/29/19 2:31 PM, Zenghui Yu wrote:
> Hi Eric,
> 
> On 2019/10/29 20:49, Auger Eric wrote:
>> On 10/29/19 1:27 PM, Zenghui Yu wrote:
>>> Okay, the remaining question is about this code in vgic_v3_save_pending_tables():
>>>
>>>      stored = val & (1U << bit_nr);
>>>      if (stored == irq->pending_latch)
>>>          continue;
>>>
>>>      if (irq->pending_latch)
>>>          val |= 1 << bit_nr;
>>>      else
>>>          val &= ~(1 << bit_nr);
>>>
>>> Do we really have a scenario where irq->pending_latch==false and
>>> stored==true (corresponding to the "else" above), where we then clear
>>> the pending status of this LPI in guest memory?
>>> I cannot think of one right now.
>>
>> If you save, restore, and save again: on the first save the LPI may be
>> pending, so it gets stored. On the second save, the LPI may not be
>> pending anymore?
> 
> I assume you mean the restore done by vgic_its_restore_ite().

Yes, that's what I meant.

> 
> While restoring an LPI, we sync the pending status from the guest
> pending table (into the software pending_latch) and clear the
> corresponding bit in guest memory.
> See vgic_v3_lpi_sync_pending_status().
> 
> So on the second save, if the LPI is not pending, the guest pending
> table will also indicate not pending.

You're right; I did not remember that vgic_v3_lpi_sync_pending_status()
(called from vgic_its_restore_ite()/vgic_add_lpi()) "clears the consumed
data" (44de9d683847 "KVM: arm64: vgic-v3: vgic_v3_lpi_sync_pending_status").

So effectively, after restore the pending table is zeroed, and the above
code could be rewritten more simply, i.e. just update the byte when
pending_latch is set.
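
Something along these lines, perhaps (a sketch only, relying on the
table having been zeroed by the restore path):

	/* Simplified sketch: a clear bit never needs to be written back,
	 * so only set bits have to be flushed to guest RAM. */
	if (irq->pending_latch) {
		ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
		if (ret)
			return ret;
		val |= 1 << bit_nr;
		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
		if (ret)
			return ret;
	}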

Nevertheless, your patch indeed fixes an actual bug independently of
this cleanup, i.e. the written byte may be incorrect if LPIs belonging
to the same byte target different RDISTs.

Thanks

Eric
> 
> 
> Thanks,
> Zenghui

Patch

diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 5ef93e5041e1..7cd2e2f81513 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -363,8 +363,8 @@  int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
 int vgic_v3_save_pending_tables(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
-	int last_byte_offset = -1;
 	struct vgic_irq *irq;
+	gpa_t last_ptr = -1;
 	int ret;
 	u8 val;
 
@@ -384,11 +384,11 @@  int vgic_v3_save_pending_tables(struct kvm *kvm)
 		bit_nr = irq->intid % BITS_PER_BYTE;
 		ptr = pendbase + byte_offset;
 
-		if (byte_offset != last_byte_offset) {
+		if (ptr != last_ptr) {
 			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
 			if (ret)
 				return ret;
-			last_byte_offset = byte_offset;
+			last_ptr = ptr;
 		}
 
 		stored = val & (1U << bit_nr);
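
For context, an abridged sketch of the per-IRQ loop in
vgic_v3_save_pending_tables() with this patch applied (simplified from
the kernel source; the surrounding function setup is omitted):

	list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
		int byte_offset, bit_nr;
		struct kvm_vcpu *vcpu;
		gpa_t pendbase, ptr;
		bool stored;

		vcpu = irq->target_vcpu;
		if (!vcpu)
			continue;

		/* Each vcpu has its own pending table base. */
		pendbase = GICR_PENDBASER_ADDRESS(vcpu->arch.vgic_cpu.pendbaser);

		byte_offset = irq->intid / BITS_PER_BYTE;
		bit_nr = irq->intid % BITS_PER_BYTE;
		ptr = pendbase + byte_offset;

		/* Comparing the full guest address (not just byte_offset)
		 * also catches a change of target vcpu. */
		if (ptr != last_ptr) {
			ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
			if (ret)
				return ret;
			last_ptr = ptr;
		}

		stored = val & (1U << bit_nr);
		if (stored == irq->pending_latch)
			continue;

		if (irq->pending_latch)
			val |= 1 << bit_nr;
		else
			val &= ~(1 << bit_nr);

		ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
		if (ret)
			return ret;
	}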