[RFC] KVM: arm64: vgic: Decouple the check of the EnableLPIs bit from the ITS LPI translation

Message ID 20201231062813.714-1-lushenming@huawei.com (mailing list archive)
State New, archived
Series [RFC] KVM: arm64: vgic: Decouple the check of the EnableLPIs bit from the ITS LPI translation

Commit Message

Shenming Lu Dec. 31, 2020, 6:28 a.m. UTC
When the EnableLPIs bit is set to 0, any ITS LPI request arriving at
the Redistributor is ignored. This check is independent of the ITS
LPI translation itself, so it might be better to move the EnableLPIs
check out of the LPI resolving code and also add it to the path that
uses the translation cache. With that change, it also seems
unnecessary to invalidate the translation cache when LPIs are
disabled.

Not sure if I have missed something... Thanks.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
---
 arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
 arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
 2 files changed, 6 insertions(+), 7 deletions(-)

Comments

Marc Zyngier Dec. 31, 2020, 8:57 a.m. UTC | #1
Hi Shenming,

On 2020-12-31 06:28, Shenming Lu wrote:
> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
> Redistributor would be ignored. And this check is independent from
> the ITS LPI translation. So it might be better to move the check
> of the EnableLPIs bit out of the LPI resolving, and also add it
> to the path that uses the translation cache.

But by doing that, you are moving the overhead of checking for
EnableLPIs from the slow path (translation walk) to the fast
path (cache hit), which seems counter-productive.
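
For context, the MSI injection entry point at the time looked roughly
like the sketch below (a simplified rendition of vgic_its_inject_msi()
in arch/arm64/kvm/vgic/vgic-its.c; return-value conventions and error
handling are elided, so treat it as an illustration rather than the
exact code):

/* Rough sketch: try the translation cache first, walk the ITS tables on a miss */
int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
{
        struct vgic_its *its;
        int ret;

        /* Fast path: hit in the translation cache, no ITS table walk */
        if (!vgic_its_inject_cached_translation(kvm, msi))
                return 1;

        /* Slow path: locate the ITS and walk the device/ITT tables */
        its = vgic_msi_to_its(kvm, msi);
        if (IS_ERR(its))
                return PTR_ERR(its);

        mutex_lock(&its->its_lock);
        ret = vgic_its_trigger_msi(kvm, its, msi->devid, msi->data);
        mutex_unlock(&its->its_lock);

        return ret;
}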

> Besides it seems that
> by this the invalidating of the translation cache caused by the LPI
> disabling is unnecessary.
> 
> Not sure if I have missed something... Thanks.

I am certainly missing the purpose of this patch.

The effect of EnableLPIs being zero is to drop the result of any
translation (a new pending bit) on the floor. Given that, it is
immaterial whether this causes a new translation or hits in the
cache, as the result is still to not pend a new interrupt.
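
Concretely, for a software-modelled LPI the pending bit is only set at
the end of vgic_its_trigger_msi(), roughly:

        raw_spin_lock_irqsave(&irq->irq_lock, flags);
        irq->pending_latch = true;
        vgic_queue_irq_unlock(kvm, irq, flags);

        return 0;

With EnableLPIs clear, either the pre-patch check in
vgic_its_resolve_lpi() or the proposed check earlier in
vgic_its_trigger_msi() returns before this point, so no interrupt is
pended in either case.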

I get the feeling that you are trying to optimise for the unusual
case where EnableLPIs is 0 *and* you have a screaming device
injecting tons of interrupts. If that is the case, I don't think
this is worth it.

Thanks,

         M.

> 
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> ---
>  arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
>  2 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c 
> b/arch/arm64/kvm/vgic/vgic-its.c
> index 40cbaca81333..f53446bc154e 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -683,9 +683,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct
> vgic_its *its,
>  	if (!vcpu)
>  		return E_ITS_INT_UNMAPPED_INTERRUPT;
> 
> -	if (!vcpu->arch.vgic_cpu.lpis_enabled)
> -		return -EBUSY;
> -
>  	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
> 
>  	*irq = ite->irq;
> @@ -738,6 +735,9 @@ static int vgic_its_trigger_msi(struct kvm *kvm,
> struct vgic_its *its,
>  	if (err)
>  		return err;
> 
> +	if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
> +		return -EBUSY;
> +
>  	if (irq->hw)
>  		return irq_set_irqchip_state(irq->host_irq,
>  					     IRQCHIP_STATE_PENDING, true);
> @@ -757,7 +757,8 @@ int vgic_its_inject_cached_translation(struct kvm
> *kvm, struct kvm_msi *msi)
> 
>  	db = (u64)msi->address_hi << 32 | msi->address_lo;
>  	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
> -	if (!irq)
> +
> +	if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>  		return -EWOULDBLOCK;
> 
>  	raw_spin_lock_irqsave(&irq->irq_lock, flags);
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 15a6c98ee92f..7b0749f7660d 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -242,10 +242,8 @@ static void vgic_mmio_write_v3r_ctlr(struct 
> kvm_vcpu *vcpu,
> 
>  	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
> 
> -	if (was_enabled && !vgic_cpu->lpis_enabled) {
> +	if (was_enabled && !vgic_cpu->lpis_enabled)
>  		vgic_flush_pending_lpis(vcpu);
> -		vgic_its_invalidate_cache(vcpu->kvm);
> -	}
> 
>  	if (!was_enabled && vgic_cpu->lpis_enabled)
>  		vgic_enable_lpis(vcpu);
Shenming Lu Dec. 31, 2020, 11:58 a.m. UTC | #2
On 2020/12/31 16:57, Marc Zyngier wrote:
> Hi Shemming,
> 
> On 2020-12-31 06:28, Shenming Lu wrote:
>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>> Redistributor would be ignored. And this check is independent from
>> the ITS LPI translation. So it might be better to move the check
>> of the EnableLPIs bit out of the LPI resolving, and also add it
>> to the path that uses the translation cache.
> 
> But by doing that, you are moving the overhead of checking for
> EnableLPIs from the slow path (translation walk) to the fast
> path (cache hit), which seems counter-productive.

Oh, I didn't notice the overhead of the checking, I thought it would
be negligible...

> 
>> Besides it seems that
>> by this the invalidating of the translation cache caused by the LPI
>> disabling is unnecessary.
>>
>> Not sure if I have missed something... Thanks.
> 
> I am certainly missing the purpose of this patch.
> 
> The effect of EnableLPIs being zero is to drop the result of any
> translation (a new pending bit) on the floor. Given that, it is
> immaterial whether this causes a new translation or hits in the
> cache, as the result is still to not pend a new interrupt.
> 
> I get the feeling that you are trying to optimise for the unusual
> case where EnableLPIs is 0 *and* you have a screaming device
> injecting tons of interrupt. If that is the case, I don't think
> this is worth it.

In fact, what prompted this is that if the EnableLPIs bit is 0,
kvm_vgic_v4_set_forwarding() would fail when performing the LPI
translation, even though we don't try to pend any interrupt there...
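
For reference, the path in question looks roughly like this (a
simplified excerpt of kvm_vgic_v4_set_forwarding() in
arch/arm64/kvm/vgic/vgic-v4.c):

        its = vgic_get_its(kvm, irq_entry);
        if (IS_ERR(its))
                return 0;

        mutex_lock(&its->its_lock);

        /*
         * Pre-patch, this returns -EBUSY when the target vcpu has
         * EnableLPIs == 0, so the whole VLPI forwarding setup fails
         * even though nothing is pended here.
         */
        ret = vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid,
                                   irq_entry->msi.data, &irq);
        if (ret)
                goto out;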

By the way, it seems that disabling LPIs would not affect the
injection of VLPIs...

Thanks,
Shenming

> 
> Thanks,
> 
>         M.
> 
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> ---
>>  arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
>>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
>>  2 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 40cbaca81333..f53446bc154e 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -683,9 +683,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct
>> vgic_its *its,
>>      if (!vcpu)
>>          return E_ITS_INT_UNMAPPED_INTERRUPT;
>>
>> -    if (!vcpu->arch.vgic_cpu.lpis_enabled)
>> -        return -EBUSY;
>> -
>>      vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
>>
>>      *irq = ite->irq;
>> @@ -738,6 +735,9 @@ static int vgic_its_trigger_msi(struct kvm *kvm,
>> struct vgic_its *its,
>>      if (err)
>>          return err;
>>
>> +    if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>> +        return -EBUSY;
>> +
>>      if (irq->hw)
>>          return irq_set_irqchip_state(irq->host_irq,
>>                           IRQCHIP_STATE_PENDING, true);
>> @@ -757,7 +757,8 @@ int vgic_its_inject_cached_translation(struct kvm
>> *kvm, struct kvm_msi *msi)
>>
>>      db = (u64)msi->address_hi << 32 | msi->address_lo;
>>      irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
>> -    if (!irq)
>> +
>> +    if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>>          return -EWOULDBLOCK;
>>
>>      raw_spin_lock_irqsave(&irq->irq_lock, flags);
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> index 15a6c98ee92f..7b0749f7660d 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> @@ -242,10 +242,8 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>>
>>      vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
>>
>> -    if (was_enabled && !vgic_cpu->lpis_enabled) {
>> +    if (was_enabled && !vgic_cpu->lpis_enabled)
>>          vgic_flush_pending_lpis(vcpu);
>> -        vgic_its_invalidate_cache(vcpu->kvm);
>> -    }
>>
>>      if (!was_enabled && vgic_cpu->lpis_enabled)
>>          vgic_enable_lpis(vcpu);
>
Marc Zyngier Dec. 31, 2020, 12:22 p.m. UTC | #3
On 2020-12-31 11:58, Shenming Lu wrote:
> On 2020/12/31 16:57, Marc Zyngier wrote:
>> Hi Shemming,
>> 
>> On 2020-12-31 06:28, Shenming Lu wrote:
>>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>>> Redistributor would be ignored. And this check is independent from
>>> the ITS LPI translation. So it might be better to move the check
>>> of the EnableLPIs bit out of the LPI resolving, and also add it
>>> to the path that uses the translation cache.
>> 
>> But by doing that, you are moving the overhead of checking for
>> EnableLPIs from the slow path (translation walk) to the fast
>> path (cache hit), which seems counter-productive.
> 
> Oh, I didn't notice the overhead of the checking, I thought it would
> be negligible...

It probably doesn't show on a modern box, but some of the slower
systems might see it. Overall, this is a design decision to keep
the translation cache as simple and straightforward as possible:
if anything affects the output of the cache, we invalidate it,
and that's it.
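
For reference, each cache entry only captures a (doorbell, devid,
eventid) -> vgic_irq mapping, roughly as below, so anything that could
change the outcome of a translation is handled by throwing the whole
cache away rather than by qualifying individual lookups:

/* Sketch of one LPI translation cache entry (see vgic-its.c) */
struct vgic_translation_cache_entry {
        struct list_head        entry;
        phys_addr_t             db;             /* guest doorbell (GITS_TRANSLATER) address */
        u32                     devid;
        u32                     eventid;
        struct vgic_irq         *irq;           /* cached result of the translation */
};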

> 
>> 
>>> Besides it seems that
>>> by this the invalidating of the translation cache caused by the LPI
>>> disabling is unnecessary.
>>> 
>>> Not sure if I have missed something... Thanks.
>> 
>> I am certainly missing the purpose of this patch.
>> 
>> The effect of EnableLPIs being zero is to drop the result of any
>> translation (a new pending bit) on the floor. Given that, it is
>> immaterial whether this causes a new translation or hits in the
>> cache, as the result is still to not pend a new interrupt.
>> 
>> I get the feeling that you are trying to optimise for the unusual
>> case where EnableLPIs is 0 *and* you have a screaming device
>> injecting tons of interrupt. If that is the case, I don't think
>> this is worth it.
> 
> In fact, I just found (imagining) that if the EnableLPIs bit is 0,
> the kvm_vgic_v4_set_forwarding() would fail when performing the LPI
> translation, but indeed we don't try to pend any interrupts there...
> 
> By the way, it seems that the LPI disabling would not affect the
> injection of VLPIs...

Yes, good point. We could unmap the VPE from all ITS, which would result
in all translations being discarded, but this has the really bad side
effect of *also* preventing the delivery of vSGIs, which isn't what
you'd expect.

Overall, I don't think there is a good way to support this, and maybe
we should just prevent EnableLPIs from being turned off when using direct
injection. After all, the architecture does allow that for GICv3
implementations, which is what we emulate.
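
One possible shape of that, purely as a hypothetical sketch rather than
an actual patch (vgic_supports_direct_msis() and GICR_CTLR_ENABLE_LPIS
are existing helpers/definitions; the write-once behaviour itself is
only an assumption about how such a change could look):

/*
 * Hypothetical sketch: treat EnableLPIs as write-once-to-1 when direct
 * LPI injection (GICv4) is in use, ignoring attempts to clear it.
 */
static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
                                     gpa_t addr, unsigned int len,
                                     unsigned long val)
{
        struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
        bool was_enabled = vgic_cpu->lpis_enabled;

        if (!vgic_has_its(vcpu->kvm))
                return;

        if (was_enabled && !(val & GICR_CTLR_ENABLE_LPIS) &&
            vgic_supports_direct_msis(vcpu->kvm))
                return;         /* silently ignore the attempt to clear EnableLPIs */

        /* ... rest of the existing handler unchanged ... */
}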

Thanks,

         M.
Shenming Lu Jan. 1, 2021, 3:08 a.m. UTC | #4
On 2020/12/31 20:22, Marc Zyngier wrote:
> On 2020-12-31 11:58, Shenming Lu wrote:
>> On 2020/12/31 16:57, Marc Zyngier wrote:
>>> Hi Shemming,
>>>
>>> On 2020-12-31 06:28, Shenming Lu wrote:
>>>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>>>> Redistributor would be ignored. And this check is independent from
>>>> the ITS LPI translation. So it might be better to move the check
>>>> of the EnableLPIs bit out of the LPI resolving, and also add it
>>>> to the path that uses the translation cache.
>>>
>>> But by doing that, you are moving the overhead of checking for
>>> EnableLPIs from the slow path (translation walk) to the fast
>>> path (cache hit), which seems counter-productive.
>>
>> Oh, I didn't notice the overhead of the checking, I thought it would
>> be negligible...
> 
> It probably doesn't show on a modern box, but some of the slower
> systems might see it. Overall, this is a design decision to keep
> the translation cache as simple and straightforward as possible:
> if anything affects the output of the cache, we invalidate it,
> and that's it.

OK, got it.

> 
>>
>>>
>>>> Besides it seems that
>>>> by this the invalidating of the translation cache caused by the LPI
>>>> disabling is unnecessary.
>>>>
>>>> Not sure if I have missed something... Thanks.
>>>
>>> I am certainly missing the purpose of this patch.
>>>
>>> The effect of EnableLPIs being zero is to drop the result of any
>>> translation (a new pending bit) on the floor. Given that, it is
>>> immaterial whether this causes a new translation or hits in the
>>> cache, as the result is still to not pend a new interrupt.
>>>
>>> I get the feeling that you are trying to optimise for the unusual
>>> case where EnableLPIs is 0 *and* you have a screaming device
>>> injecting tons of interrupt. If that is the case, I don't think
>>> this is worth it.
>>
>> In fact, I just found (imagining) that if the EnableLPIs bit is 0,
>> the kvm_vgic_v4_set_forwarding() would fail when performing the LPI
>> translation, but indeed we don't try to pend any interrupts there...
>>
>> By the way, it seems that the LPI disabling would not affect the
>> injection of VLPIs...
> 
> Yes, good point. We could unmap the VPE from all ITS, which would result
> in all translations to be discarded, but this has the really bad side
> effect of *also* preventing the delivery of vSGIs, which isn't what
> you'd expect.
> 
> Overall, I don't think there is a good way to support this, and maybe
> we should just prevent EnableLPIs to be turned off when using direct
> injection. After all, the architecture does allow that for GICv3
> implementations, which is what we emulate.

Agreed. If there is no good way, we could just make clearing EnableLPIs
unsupported...

Thanks (Happy 2021),
Shenming

> 
> Thanks,
> 
>         M.

Patch

diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 40cbaca81333..f53446bc154e 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -683,9 +683,6 @@  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 	if (!vcpu)
 		return E_ITS_INT_UNMAPPED_INTERRUPT;
 
-	if (!vcpu->arch.vgic_cpu.lpis_enabled)
-		return -EBUSY;
-
 	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
 
 	*irq = ite->irq;
@@ -738,6 +735,9 @@  static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
 	if (err)
 		return err;
 
+	if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
+		return -EBUSY;
+
 	if (irq->hw)
 		return irq_set_irqchip_state(irq->host_irq,
 					     IRQCHIP_STATE_PENDING, true);
@@ -757,7 +757,8 @@  int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi)
 
 	db = (u64)msi->address_hi << 32 | msi->address_lo;
 	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
-	if (!irq)
+
+	if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
 		return -EWOULDBLOCK;
 
 	raw_spin_lock_irqsave(&irq->irq_lock, flags);
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 15a6c98ee92f..7b0749f7660d 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -242,10 +242,8 @@  static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
 
 	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
 
-	if (was_enabled && !vgic_cpu->lpis_enabled) {
+	if (was_enabled && !vgic_cpu->lpis_enabled)
 		vgic_flush_pending_lpis(vcpu);
-		vgic_its_invalidate_cache(vcpu->kvm);
-	}
 
 	if (!was_enabled && vgic_cpu->lpis_enabled)
 		vgic_enable_lpis(vcpu);