Message ID | 20201231062813.714-1-lushenming@huawei.com
---|---
State | New, archived
Series | [RFC] KVM: arm64: vgic: Decouple the check of the EnableLPIs bit from the ITS LPI translation
Hi Shenming,

On 2020-12-31 06:28, Shenming Lu wrote:
> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
> Redistributor would be ignored. And this check is independent from
> the ITS LPI translation. So it might be better to move the check
> of the EnableLPIs bit out of the LPI resolving, and also add it
> to the path that uses the translation cache.

But by doing that, you are moving the overhead of checking for
EnableLPIs from the slow path (translation walk) to the fast
path (cache hit), which seems counter-productive.

> Besides it seems that
> by this the invalidating of the translation cache caused by the LPI
> disabling is unnecessary.
>
> Not sure if I have missed something... Thanks.

I am certainly missing the purpose of this patch.

The effect of EnableLPIs being zero is to drop the result of any
translation (a new pending bit) on the floor. Given that, it is
immaterial whether this causes a new translation or hits in the
cache, as the result is still to not pend a new interrupt.

I get the feeling that you are trying to optimise for the unusual
case where EnableLPIs is 0 *and* you have a screaming device
injecting tons of interrupts. If that is the case, I don't think
this is worth it.

Thanks,

M.

>
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> ---
>  arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
>  2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-its.c
> b/arch/arm64/kvm/vgic/vgic-its.c
> index 40cbaca81333..f53446bc154e 100644
> --- a/arch/arm64/kvm/vgic/vgic-its.c
> +++ b/arch/arm64/kvm/vgic/vgic-its.c
> @@ -683,9 +683,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct
> vgic_its *its,
>  	if (!vcpu)
>  		return E_ITS_INT_UNMAPPED_INTERRUPT;
>
> -	if (!vcpu->arch.vgic_cpu.lpis_enabled)
> -		return -EBUSY;
> -
>  	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
>
>  	*irq = ite->irq;
> @@ -738,6 +735,9 @@ static int vgic_its_trigger_msi(struct kvm *kvm,
> struct vgic_its *its,
>  	if (err)
>  		return err;
>
> +	if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
> +		return -EBUSY;
> +
>  	if (irq->hw)
>  		return irq_set_irqchip_state(irq->host_irq,
> IRQCHIP_STATE_PENDING, true);
> @@ -757,7 +757,8 @@ int vgic_its_inject_cached_translation(struct kvm
> *kvm, struct kvm_msi *msi)
>
>  	db = (u64)msi->address_hi << 32 | msi->address_lo;
>  	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
> -	if (!irq)
> +
> +	if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>  		return -EWOULDBLOCK;
>
>  	raw_spin_lock_irqsave(&irq->irq_lock, flags);
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> index 15a6c98ee92f..7b0749f7660d 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
> @@ -242,10 +242,8 @@ static void vgic_mmio_write_v3r_ctlr(struct
> kvm_vcpu *vcpu,
>
>  	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
>
> -	if (was_enabled && !vgic_cpu->lpis_enabled) {
> +	if (was_enabled && !vgic_cpu->lpis_enabled)
>  		vgic_flush_pending_lpis(vcpu);
> -		vgic_its_invalidate_cache(vcpu->kvm);
> -	}
>
>  	if (!was_enabled && vgic_cpu->lpis_enabled)
>  		vgic_enable_lpis(vcpu);
On 2020/12/31 16:57, Marc Zyngier wrote:
> Hi Shenming,
>
> On 2020-12-31 06:28, Shenming Lu wrote:
>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>> Redistributor would be ignored. And this check is independent from
>> the ITS LPI translation. So it might be better to move the check
>> of the EnableLPIs bit out of the LPI resolving, and also add it
>> to the path that uses the translation cache.
>
> But by doing that, you are moving the overhead of checking for
> EnableLPIs from the slow path (translation walk) to the fast
> path (cache hit), which seems counter-productive.

Oh, I didn't notice the overhead of the checking, I thought it would
be negligible...

>
>> Besides it seems that
>> by this the invalidating of the translation cache caused by the LPI
>> disabling is unnecessary.
>>
>> Not sure if I have missed something... Thanks.
>
> I am certainly missing the purpose of this patch.
>
> The effect of EnableLPIs being zero is to drop the result of any
> translation (a new pending bit) on the floor. Given that, it is
> immaterial whether this causes a new translation or hits in the
> cache, as the result is still to not pend a new interrupt.
>
> I get the feeling that you are trying to optimise for the unusual
> case where EnableLPIs is 0 *and* you have a screaming device
> injecting tons of interrupts. If that is the case, I don't think
> this is worth it.

In fact, I just found (imagining) that if the EnableLPIs bit is 0,
the kvm_vgic_v4_set_forwarding() would fail when performing the LPI
translation, but indeed we don't try to pend any interrupts there...

By the way, it seems that the LPI disabling would not affect the
injection of VLPIs...

Thanks,
Shenming

>
> Thanks,
>
> M.
>
>>
>> Signed-off-by: Shenming Lu <lushenming@huawei.com>
>> ---
>>  arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
>>  arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
>>  2 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
>> index 40cbaca81333..f53446bc154e 100644
>> --- a/arch/arm64/kvm/vgic/vgic-its.c
>> +++ b/arch/arm64/kvm/vgic/vgic-its.c
>> @@ -683,9 +683,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct
>> vgic_its *its,
>>  	if (!vcpu)
>>  		return E_ITS_INT_UNMAPPED_INTERRUPT;
>>
>> -	if (!vcpu->arch.vgic_cpu.lpis_enabled)
>> -		return -EBUSY;
>> -
>>  	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
>>
>>  	*irq = ite->irq;
>> @@ -738,6 +735,9 @@ static int vgic_its_trigger_msi(struct kvm *kvm,
>> struct vgic_its *its,
>>  	if (err)
>>  		return err;
>>
>> +	if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>> +		return -EBUSY;
>> +
>>  	if (irq->hw)
>>  		return irq_set_irqchip_state(irq->host_irq,
>> IRQCHIP_STATE_PENDING, true);
>> @@ -757,7 +757,8 @@ int vgic_its_inject_cached_translation(struct kvm
>> *kvm, struct kvm_msi *msi)
>>
>>  	db = (u64)msi->address_hi << 32 | msi->address_lo;
>>  	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
>> -	if (!irq)
>> +
>> +	if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
>>  		return -EWOULDBLOCK;
>>
>>  	raw_spin_lock_irqsave(&irq->irq_lock, flags);
>> diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> index 15a6c98ee92f..7b0749f7660d 100644
>> --- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> +++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
>> @@ -242,10 +242,8 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>>
>>  	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
>>
>> -	if (was_enabled && !vgic_cpu->lpis_enabled) {
>> +	if (was_enabled && !vgic_cpu->lpis_enabled)
>>  		vgic_flush_pending_lpis(vcpu);
>> -		vgic_its_invalidate_cache(vcpu->kvm);
>> -	}
>>
>>  	if (!was_enabled && vgic_cpu->lpis_enabled)
>>  		vgic_enable_lpis(vcpu);
>
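
For context, the forwarding path mentioned above performs its DevID/EventID lookup through vgic_its_resolve_lpi(). The fragment below is a paraphrased sketch of that step, not a verbatim quote of arch/arm64/kvm/vgic/vgic-v4.c; the kvm, its, irq_entry, irq and ret names and the out label are assumed to come from the surrounding function:

	/*
	 * Sketch of the translation step in kvm_vgic_v4_set_forwarding():
	 * the DevID/EventID pair is resolved via vgic_its_resolve_lpi(),
	 * so with EnableLPIs clear the current -EBUSY return aborts the
	 * forwarding setup, even though nothing would be pended here.
	 */
	ret = vgic_its_resolve_lpi(kvm, its, irq_entry->msi.devid,
				   irq_entry->msi.data, &irq);
	if (ret)
		goto out;	/* -EBUSY while the target vcpu has LPIs disabled */

	/* ... otherwise build the VLPI mapping for direct injection ... */
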
On 2020-12-31 11:58, Shenming Lu wrote:
> On 2020/12/31 16:57, Marc Zyngier wrote:
>> Hi Shenming,
>>
>> On 2020-12-31 06:28, Shenming Lu wrote:
>>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>>> Redistributor would be ignored. And this check is independent from
>>> the ITS LPI translation. So it might be better to move the check
>>> of the EnableLPIs bit out of the LPI resolving, and also add it
>>> to the path that uses the translation cache.
>>
>> But by doing that, you are moving the overhead of checking for
>> EnableLPIs from the slow path (translation walk) to the fast
>> path (cache hit), which seems counter-productive.
>
> Oh, I didn't notice the overhead of the checking, I thought it would
> be negligible...

It probably doesn't show on a modern box, but some of the slower
systems might see it. Overall, this is a design decision to keep
the translation cache as simple and straightforward as possible:
if anything affects the output of the cache, we invalidate it,
and that's it.

>
>>
>>> Besides it seems that
>>> by this the invalidating of the translation cache caused by the LPI
>>> disabling is unnecessary.
>>>
>>> Not sure if I have missed something... Thanks.
>>
>> I am certainly missing the purpose of this patch.
>>
>> The effect of EnableLPIs being zero is to drop the result of any
>> translation (a new pending bit) on the floor. Given that, it is
>> immaterial whether this causes a new translation or hits in the
>> cache, as the result is still to not pend a new interrupt.
>>
>> I get the feeling that you are trying to optimise for the unusual
>> case where EnableLPIs is 0 *and* you have a screaming device
>> injecting tons of interrupts. If that is the case, I don't think
>> this is worth it.
>
> In fact, I just found (imagining) that if the EnableLPIs bit is 0,
> the kvm_vgic_v4_set_forwarding() would fail when performing the LPI
> translation, but indeed we don't try to pend any interrupts there...
>
> By the way, it seems that the LPI disabling would not affect the
> injection of VLPIs...

Yes, good point. We could unmap the VPE from all ITSs, which would
result in all translations being discarded, but this has the really
bad side effect of *also* preventing the delivery of vSGIs, which
isn't what you'd expect.

Overall, I don't think there is a good way to support this, and maybe
we should just prevent EnableLPIs from being turned off when using
direct injection. After all, the architecture does allow that for
GICv3 implementations, which is what we emulate.

Thanks,

M.
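
For illustration, the restriction suggested above could take roughly the following shape. This is an untested sketch, not a patch posted in this thread; it assumes the existing vgic_supports_direct_msis() helper and paraphrases the current body of vgic_mmio_write_v3r_ctlr() from the file touched by the patch:

static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
				     gpa_t addr, unsigned int len,
				     unsigned long val)
{
	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
	bool was_enabled = vgic_cpu->lpis_enabled;

	if (!vgic_has_its(vcpu->kvm))
		return;

	/*
	 * Sketch only: once LPIs have been enabled and direct injection
	 * is in use, treat EnableLPIs as sticky and ignore writes that
	 * would clear it.
	 */
	if (was_enabled && !(val & GICR_CTLR_ENABLE_LPIS) &&
	    vgic_supports_direct_msis(vcpu->kvm))
		return;

	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;

	if (was_enabled && !vgic_cpu->lpis_enabled) {
		vgic_flush_pending_lpis(vcpu);
		vgic_its_invalidate_cache(vcpu->kvm);
	}

	if (!was_enabled && vgic_cpu->lpis_enabled)
		vgic_enable_lpis(vcpu);
}

A guest would then simply see its write of 0 to GICR_CTLR.EnableLPIs ignored, which, as noted above, the architecture permits for GICv3 implementations.
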
On 2020/12/31 20:22, Marc Zyngier wrote:
> On 2020-12-31 11:58, Shenming Lu wrote:
>> On 2020/12/31 16:57, Marc Zyngier wrote:
>>> Hi Shenming,
>>>
>>> On 2020-12-31 06:28, Shenming Lu wrote:
>>>> When the EnableLPIs bit is set to 0, any ITS LPI requests in the
>>>> Redistributor would be ignored. And this check is independent from
>>>> the ITS LPI translation. So it might be better to move the check
>>>> of the EnableLPIs bit out of the LPI resolving, and also add it
>>>> to the path that uses the translation cache.
>>>
>>> But by doing that, you are moving the overhead of checking for
>>> EnableLPIs from the slow path (translation walk) to the fast
>>> path (cache hit), which seems counter-productive.
>>
>> Oh, I didn't notice the overhead of the checking, I thought it would
>> be negligible...
>
> It probably doesn't show on a modern box, but some of the slower
> systems might see it. Overall, this is a design decision to keep
> the translation cache as simple and straightforward as possible:
> if anything affects the output of the cache, we invalidate it,
> and that's it.

OK, got it.

>
>>
>>>
>>>> Besides it seems that
>>>> by this the invalidating of the translation cache caused by the LPI
>>>> disabling is unnecessary.
>>>>
>>>> Not sure if I have missed something... Thanks.
>>>
>>> I am certainly missing the purpose of this patch.
>>>
>>> The effect of EnableLPIs being zero is to drop the result of any
>>> translation (a new pending bit) on the floor. Given that, it is
>>> immaterial whether this causes a new translation or hits in the
>>> cache, as the result is still to not pend a new interrupt.
>>>
>>> I get the feeling that you are trying to optimise for the unusual
>>> case where EnableLPIs is 0 *and* you have a screaming device
>>> injecting tons of interrupts. If that is the case, I don't think
>>> this is worth it.
>>
>> In fact, I just found (imagining) that if the EnableLPIs bit is 0,
>> the kvm_vgic_v4_set_forwarding() would fail when performing the LPI
>> translation, but indeed we don't try to pend any interrupts there...
>>
>> By the way, it seems that the LPI disabling would not affect the
>> injection of VLPIs...
>
> Yes, good point. We could unmap the VPE from all ITSs, which would
> result in all translations being discarded, but this has the really
> bad side effect of *also* preventing the delivery of vSGIs, which
> isn't what you'd expect.
>
> Overall, I don't think there is a good way to support this, and maybe
> we should just prevent EnableLPIs from being turned off when using
> direct injection. After all, the architecture does allow that for
> GICv3 implementations, which is what we emulate.

Agreed, if there is no good way, we could just make the EnableLPIs
clearing unsupported...

Thanks (Happy 2021),
Shenming

>
> Thanks,
>
> M.
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 40cbaca81333..f53446bc154e 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -683,9 +683,6 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 	if (!vcpu)
 		return E_ITS_INT_UNMAPPED_INTERRUPT;
 
-	if (!vcpu->arch.vgic_cpu.lpis_enabled)
-		return -EBUSY;
-
 	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
 
 	*irq = ite->irq;
@@ -738,6 +735,9 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
 	if (err)
 		return err;
 
+	if (!irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
+		return -EBUSY;
+
 	if (irq->hw)
 		return irq_set_irqchip_state(irq->host_irq,
 					     IRQCHIP_STATE_PENDING, true);
@@ -757,7 +757,8 @@ int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi)
 
 	db = (u64)msi->address_hi << 32 | msi->address_lo;
 	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
-	if (!irq)
+
+	if (!irq || !irq->target_vcpu->arch.vgic_cpu.lpis_enabled)
 		return -EWOULDBLOCK;
 
 	raw_spin_lock_irqsave(&irq->irq_lock, flags);
diff --git a/arch/arm64/kvm/vgic/vgic-mmio-v3.c b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
index 15a6c98ee92f..7b0749f7660d 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio-v3.c
@@ -242,10 +242,8 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
 
 	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
 
-	if (was_enabled && !vgic_cpu->lpis_enabled) {
+	if (was_enabled && !vgic_cpu->lpis_enabled)
 		vgic_flush_pending_lpis(vcpu);
-		vgic_its_invalidate_cache(vcpu->kvm);
-	}
 
 	if (!was_enabled && vgic_cpu->lpis_enabled)
 		vgic_enable_lpis(vcpu);
When the EnableLPIs bit is set to 0, any ITS LPI requests in the
Redistributor would be ignored. And this check is independent from
the ITS LPI translation. So it might be better to move the check
of the EnableLPIs bit out of the LPI resolving, and also add it
to the path that uses the translation cache. Besides it seems that
by this the invalidating of the translation cache caused by the LPI
disabling is unnecessary.

Not sure if I have missed something... Thanks.

Signed-off-by: Shenming Lu <lushenming@huawei.com>
---
 arch/arm64/kvm/vgic/vgic-its.c     | 9 +++++----
 arch/arm64/kvm/vgic/vgic-mmio-v3.c | 4 +---
 2 files changed, 6 insertions(+), 7 deletions(-)