Message ID | 20201123065410.1915-4-lushenming@huawei.com
---|---
State | New, archived
Series | KVM: arm64: Add VLPI migration support on GICv4.1
On 2020-11-23 06:54, Shenming Lu wrote:
> From: Zenghui Yu <yuzenghui@huawei.com>
>
> When setting the forwarding path of a VLPI, it is more consistent to

I'm not sure it is more consistent. It is a *new* behaviour, because it
only matters for migration, which has been so far unsupported.

> also transfer the pending state from irq->pending_latch to VPT
> (especially in migration, the pending states of VLPIs are restored into
> kvm's vgic first). And we currently send "INT+VSYNC" to trigger a VLPI
> to pending.
>
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> Signed-off-by: Shenming Lu <lushenming@huawei.com>
> ---
>  arch/arm64/kvm/vgic/vgic-v4.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
> index b5fa73c9fd35..cc3ab9cea182 100644
> --- a/arch/arm64/kvm/vgic/vgic-v4.c
> +++ b/arch/arm64/kvm/vgic/vgic-v4.c
> @@ -418,6 +418,18 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
>  	irq->host_irq	= virq;
>  	atomic_inc(&map.vpe->vlpi_count);
>
> +	/* Transfer pending state */
> +	ret = irq_set_irqchip_state(irq->host_irq,
> +				    IRQCHIP_STATE_PENDING,
> +				    irq->pending_latch);
> +	WARN_RATELIMIT(ret, "IRQ %d", irq->host_irq);
> +
> +	/*
> +	 * Let it be pruned from ap_list later and don't bother
> +	 * the List Register.
> +	 */
> +	irq->pending_latch = false;

It occurs to me that calling into irq_set_irqchip_state() for a large
number of interrupts can take a significant amount of time. It is also
odd that you dump the VPT with the VPE unmapped, but rely on the VPE
being mapped for the opposite operation.

Shouldn't these be symmetric, all performed while the VPE is unmapped?
It would also save a lot of ITS traffic.

> +
>  out:
>  	mutex_unlock(&its->its_lock);
>  	return ret;

Thanks,

M.
On 2020/11/23 17:27, Marc Zyngier wrote:
> On 2020-11-23 06:54, Shenming Lu wrote:
>> From: Zenghui Yu <yuzenghui@huawei.com>
>>
>> When setting the forwarding path of a VLPI, it is more consistent to
>
> I'm not sure it is more consistent. It is a *new* behaviour, because it
> only matters for migration, which has been so far unsupported.

Alright, "consistent" may not be accurate... But I still wonder whether
there really is no need to transfer the pending state from kvm's vgic to
the VPT in set_forwarding regardless of migration, and similarly for
unset_forwarding.

[...]

> It occurs to me that calling into irq_set_irqchip_state() for a large
> number of interrupts can take a significant amount of time. It is also
> odd that you dump the VPT with the VPE unmapped, but rely on the VPE
> being mapped for the opposite operation.
>
> Shouldn't these be symmetric, all performed while the VPE is unmapped?
> It would also save a lot of ITS traffic.

My thought was to use the existing interface directly without unmapping...

If you want to unmap the vPE and poke the VPT here, as I said in the cover
letter, set/unset_forwarding might also be called while all devices are
running at normal run time, in which case unmapping the vPE is not
allowed...

Another possible solution is to add a new dedicated interface to QEMU to
transfer these pending states to HW in the GIC VM state change handler,
corresponding to save_pending_tables?
On 2020-11-24 08:10, Shenming Lu wrote:
> On 2020/11/23 17:27, Marc Zyngier wrote:
>> I'm not sure it is more consistent. It is a *new* behaviour, because it
>> only matters for migration, which has been so far unsupported.
>
> Alright, "consistent" may not be accurate... But I still wonder whether
> there really is no need to transfer the pending state from kvm's vgic to
> the VPT in set_forwarding regardless of migration, and similarly for
> unset_forwarding.

If you have to transfer that state outside of a save/restore, it means
that you have missed the programming of the PCI endpoint. This is an
established restriction that the MSI programming must occur *after* the
translation has been established using MAPI/MAPTI (see the large comment
at the beginning of vgic-v4.c).

If you want to revisit this, fair enough. But you will need a lot more
than just opportunistically transferring the pending state.

[...]

> My thought was to use the existing interface directly without unmapping...
>
> If you want to unmap the vPE and poke the VPT here, as I said in the cover
> letter, set/unset_forwarding might also be called while all devices are
> running at normal run time, in which case unmapping the vPE is not
> allowed...

No, I'm suggesting that you don't do anything here, but instead do it as a
by-product of restoring the ITS tables. What goes wrong if you use the
KVM_DEV_ARM_ITS_RESTORE_TABLES backend instead?

> Another possible solution is to add a new dedicated interface to QEMU to
> transfer these pending states to HW in the GIC VM state change handler,
> corresponding to save_pending_tables?

Userspace has no way to know we use GICv4, and I intend to keep it
completely out of the loop. The API is already pretty tortuous, and
I really don't want to add any extra complexity to it.

Thanks,

M.
On 2020/11/24 16:44, Marc Zyngier wrote:
> If you have to transfer that state outside of a save/restore, it means
> that you have missed the programming of the PCI endpoint. This is an
> established restriction that the MSI programming must occur *after* the
> translation has been established using MAPI/MAPTI (see the large comment
> at the beginning of vgic-v4.c).
>
> If you want to revisit this, fair enough. But you will need a lot more
> than just opportunistically transferring the pending state.

Thanks, I will look at what you mentioned.

[...]

>> My thought was to use the existing interface directly without unmapping...
>>
>> If you want to unmap the vPE and poke the VPT here, as I said in the cover
>> letter, set/unset_forwarding might also be called while all devices are
>> running at normal run time, in which case unmapping the vPE is not
>> allowed...
>
> No, I'm suggesting that you don't do anything here, but instead do it as a
> by-product of restoring the ITS tables. What goes wrong if you use the
> KVM_DEV_ARM_ITS_RESTORE_TABLES backend instead?

There is an issue if we do it in the restoring of the ITS tables:
transferring the pending state requires the irq to have been marked as hw
beforehand, which is done by the pass-through device, but configuring the
forwarding path of the VLPI depends on the vgic having been restored
first... It is a circular dependency.
On 2020/11/24 21:12, Shenming Lu wrote:
> On 2020/11/24 16:44, Marc Zyngier wrote:

[...]

>> No, I'm suggesting that you don't do anything here, but instead do it as a
>> by-product of restoring the ITS tables. What goes wrong if you use the
>> KVM_DEV_ARM_ITS_RESTORE_TABLES backend instead?
>
> There is an issue if we do it in the restoring of the ITS tables:
> transferring the pending state requires the irq to have been marked as hw
> beforehand, which is done by the pass-through device, but configuring the
> forwarding path of the VLPI depends on the vgic having been restored
> first... It is a circular dependency.

Hi Marc,

We are pondering over this problem these days, but still don't have a good
solution... Could you give us some advice on this?

Or could we move the restoring of the pending states (including the sync
from guest RAM and the transfer to HW) to the GIC VM state change handler,
which completely corresponds to save_pending_tables (more symmetric?) and
doesn't expose GICv4...

Thanks,
Shenming
On 2020-11-30 07:23, Shenming Lu wrote:

Hi Shenming,

> We are pondering over this problem these days, but still don't have a good
> solution... Could you give us some advice on this?
>
> Or could we move the restoring of the pending states (including the sync
> from guest RAM and the transfer to HW) to the GIC VM state change handler,
> which completely corresponds to save_pending_tables (more symmetric?) and
> doesn't expose GICv4...

What is "the GIC VM state change handler"? Is that a QEMU thing?

We don't really have that concept in KVM, so I'd appreciate it if you could
be a bit more explicit on this.

Thanks,

M.
On 2020/12/1 18:55, Marc Zyngier wrote:
> On 2020-11-30 07:23, Shenming Lu wrote:
>
> Hi Shenming,
>
>> Or could we move the restoring of the pending states (including the sync
>> from guest RAM and the transfer to HW) to the GIC VM state change handler,
>> which completely corresponds to save_pending_tables (more symmetric?) and
>> doesn't expose GICv4...
>
> What is "the GIC VM state change handler"? Is that a QEMU thing?

Yeah, it is a QEMU thing...

> We don't really have that concept in KVM, so I'd appreciate it if you could
> be a bit more explicit on this.

My thought is to add a new interface (to QEMU) for restoring the pending
states, completely corresponding to KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES...
It would be called from the GIC VM state change handler in QEMU, which runs
after the restore (the call to kvm_vgic_v4_set_forwarding()) but before the
VFIO device starts running.

Thanks,
Shenming
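[Editor's note: for concreteness, a minimal sketch of what such a hook could
look like on the QEMU side, modelled on the existing vm_change_state_handler()
in hw/intc/arm_gicv3_kvm.c that issues KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES
when the VM stops. KVM_DEV_ARM_VGIC_RESTORE_PENDING_TABLES is the *assumed*
new attribute being discussed here; it does not exist in the mainline KVM API.]

    /*
     * Sketch only. RESTORE_PENDING_TABLES is a hypothetical attribute
     * mirroring the existing SAVE_PENDING_TABLES one.
     */
    #include "qemu/osdep.h"
    #include "qapi/error.h"
    #include "sysemu/kvm.h"
    #include "sysemu/runstate.h"
    #include "hw/intc/arm_gicv3_common.h"

    static void vm_change_state_restore_handler(void *opaque, int running,
                                                RunState state)
    {
        GICv3State *s = (GICv3State *)opaque;
        Error *err = NULL;

        if (!running) {
            return; /* the existing handler already saves on stop */
        }

        /* Push the pending bits restored into the vgic out to the HW VPTs. */
        kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CTRL,
                          KVM_DEV_ARM_VGIC_RESTORE_PENDING_TABLES, /* assumed */
                          NULL, true, &err);
        if (err) {
            error_report_err(err);
        }
    }

    /*
     * Registered once at realize time, e.g. in kvm_arm_gicv3_realize():
     *     qemu_add_vm_change_state_handler(vm_change_state_restore_handler, s);
     */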
On 2020-12-01 11:40, Shenming Lu wrote:
> On 2020/12/1 18:55, Marc Zyngier wrote:
>> What is "the GIC VM state change handler"? Is that a QEMU thing?
>
> Yeah, it is a QEMU thing...
>
>> We don't really have that concept in KVM, so I'd appreciate it if you
>> could be a bit more explicit on this.
>
> My thought is to add a new interface (to QEMU) for restoring the pending
> states, completely corresponding to
> KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES...
> It would be called from the GIC VM state change handler in QEMU, which
> runs after the restore (the call to kvm_vgic_v4_set_forwarding()) but
> before the VFIO device starts running.

Right, that makes sense. I still wonder how much the GIC save/restore
stuff differs from other architectures that implement similar features,
such as x86 with VT-d.

It is obviously too late to change the userspace interface, but I wonder
whether we missed something at the time.

Thanks,

M.
On 2020/12/1 19:50, Marc Zyngier wrote:
> On 2020-12-01 11:40, Shenming Lu wrote:
>> My thought is to add a new interface (to QEMU) for restoring the pending
>> states, completely corresponding to
>> KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES...
>> It would be called from the GIC VM state change handler in QEMU, which
>> runs after the restore (the call to kvm_vgic_v4_set_forwarding()) but
>> before the VFIO device starts running.
>
> Right, that makes sense. I still wonder how much the GIC save/restore
> stuff differs from other architectures that implement similar features,
> such as x86 with VT-d.

I am not familiar with it...

> It is obviously too late to change the userspace interface, but I wonder
> whether we missed something at the time.

The interface seems to be really asymmetrical?...

Or is there a possibility that we could know which irq is hw before the
VFIO device calls kvm_vgic_v4_set_forwarding()?

Thanks,
Shenming
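[Editor's note: for context, the save side of that asymmetrical interface
looks roughly like this from userspace on an arm64 host -- a sketch based on
the documented KVM device attributes, error handling omitted. There is no
RESTORE counterpart: as noted later in the thread, the pending bits are only
read back implicitly on ITS table restore and on LPI enable.]

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /*
     * Sketch: flush the vgic's LPI pending bits to guest RAM before saving
     * memory. "vgic_fd" is the fd of the KVM_DEV_TYPE_ARM_VGIC_V3 device.
     */
    static int vgic_save_pending_tables(int vgic_fd)
    {
        struct kvm_device_attr attr = {
            .group = KVM_DEV_ARM_VGIC_GRP_CTRL,
            .attr  = KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES,
        };

        /* There is no matching "restore pending tables" attribute today. */
        return ioctl(vgic_fd, KVM_SET_DEVICE_ATTR, &attr);
    }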
On 2020/12/1 20:15, Shenming Lu wrote:

[...]

Hi Marc,

I have been learning about VT-d Posted Interrupts (PI) these days.

As far as I can tell, posted interrupts are first recorded in the Posted
Interrupt Request (*PIR*) field of the Posted Interrupt Descriptor (a
temporary storage area -- a data structure in memory -- specific to PI).
When the vCPU is running, a notification event (host vector) is generated
and sent to the CPU the target vCPU is currently scheduled on, which
causes that CPU to transfer the posted interrupts in the PIR field to the
*Virtual-APIC page* of the vCPU (a data structure in kvm where virtual
interrupts delivered through kvm are put; it is also accessed by the VMX
microcode, as its layout matches the register layout seen by the guest)
and deliver them to the vCPU directly.

So for a migration save they only have to sync the PIR field to the
Virtual-APIC page, and nothing is needed on resume... Besides, on x86 the
IRQ bypass setup is independent of the VM interrupt setup... Not sure if
I have missed something.

In addition, I found that on x86 the vAPIC is enabled at the end of the
migration (just before the VM starts). So I am wondering if we could move
the call to *vgic_enable_lpis()* back and transfer the pending state to
the VPT there if the irq is hw (and I think the semantics of this function
should include that transfer). In fact, this function already depends on
the vgic (lpi_list) having been restored...

After this exploration, there seems to be no perfect place to transfer the
pending states to HW that is compatible with the existing interface and
the current architecture, but we have to choose one solution?

Thanks,
Shenming
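[Editor's note: as a rough illustration of the x86 mechanism described above,
syncing the posted-interrupt state amounts to OR-ing the 256 PIR bits into the
virtual APIC's IRR (in kvm this is done by vmx_sync_pir_to_irr() and
kvm_apic_update_irr()). The following is a simplified, self-contained sketch
of the idea, not the actual kvm code.]

    #include <stdbool.h>
    #include <stdint.h>

    #define NR_VECTORS  256
    #define PIR_WORDS   (NR_VECTORS / 64)

    /*
     * Merge the Posted Interrupt Requests (PIR) into the virtual APIC's
     * IRR, so nothing pending is left behind in the PI descriptor.
     * "pir" and "irr" stand in for the real PI descriptor / vAPIC page.
     */
    static bool sync_pir_to_irr(uint64_t pir[PIR_WORDS], uint64_t irr[PIR_WORDS])
    {
        bool found_pending = false;

        for (int i = 0; i < PIR_WORDS; i++) {
            /* The real implementation claims the bits atomically. */
            uint64_t val = pir[i];

            pir[i] = 0;
            irr[i] |= val;
            found_pending |= (val != 0);
        }

        return found_pending;
    }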
Hi Shenming,

On 12/1/20 1:15 PM, Shenming Lu wrote:

[...]

>> It is obviously too late to change the userspace interface, but I wonder
>> whether we missed something at the time.
>
> The interface seems to be really asymmetrical?...

In qemu commit d5aa0c229a ("hw/intc/arm_gicv3_kvm: Implement pending table
save"), the commit message notes:

"There is no explicit restore as the tables are implicitly sync'ed
on ITS table restore and on LPI enable at redistributor level."

At that time there was no real justification for adding a fellow RESTORE
attribute.

Maybe a stupid question, but isn't it possible to unset the forwarding when
saving and rely on VFIO to automatically restore it when resuming on the
destination?

Thanks

Eric

> Or is there a possibility that we could know which irq is hw before the
> VFIO device calls kvm_vgic_v4_set_forwarding()?
On 2020/12/16 18:35, Auger Eric wrote:
> Hi Shenming,

[...]

> In qemu commit d5aa0c229a ("hw/intc/arm_gicv3_kvm: Implement pending table
> save"), the commit message notes:
>
> "There is no explicit restore as the tables are implicitly sync'ed
> on ITS table restore and on LPI enable at redistributor level."
>
> At that time there was no real justification for adding a fellow RESTORE
> attribute.
>
> Maybe a stupid question, but isn't it possible to unset the forwarding when
> saving and rely on VFIO to automatically restore it when resuming on the
> destination?

It seems that unset_forwarding would not be called when saving; it would
only be called after migration completion...

As for resuming/set_forwarding, I still wonder: is it really improper to
transfer the pending state from the vgic to the VPT in set_forwarding (not
only during migration)?... -_-

Thanks,
Shenming
diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index b5fa73c9fd35..cc3ab9cea182 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -418,6 +418,18 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
 	irq->host_irq	= virq;
 	atomic_inc(&map.vpe->vlpi_count);
 
+	/* Transfer pending state */
+	ret = irq_set_irqchip_state(irq->host_irq,
+				    IRQCHIP_STATE_PENDING,
+				    irq->pending_latch);
+	WARN_RATELIMIT(ret, "IRQ %d", irq->host_irq);
+
+	/*
+	 * Let it be pruned from ap_list later and don't bother
+	 * the List Register.
+	 */
+	irq->pending_latch = false;
+
 out:
 	mutex_unlock(&its->its_lock);
 	return ret;