Message ID | 20211119090718.440793-4-lvivier@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | tests/qtest: add some tests for virtio-net failover | expand |
On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote: > Failover needs to detect the end of the PCI unplug to start migration > after the VFIO card has been unplugged. > > To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in > pcie_unplug_device(). > > But since > 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35") > we have switched to ACPI unplug and these functions are not called anymore > and the flag not set. So failover migration is not able to detect if card > is really unplugged and acts as it's done as soon as it's started. So it > doesn't wait the end of the unplug to start the migration. We don't see any > problem when we test that because ACPI unplug is faster than PCIe native > hotplug and when the migration really starts the unplug operation is > already done. > > See c000a9bd06ea ("pci: mark device having guest unplug request pending") > a99c4da9fc2a ("pci: mark devices partially unplugged") > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > Reviewed-by: Ani Sinha <ani@anisinha.ca> Hmm. I think this one may be needed for this release actually. Isolate from testing changes and repost? > --- > hw/acpi/pcihp.c | 30 +++++++++++++++++++++++++++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c > index f610a25d2ef9..30405b5113d7 100644 > --- a/hw/acpi/pcihp.c > +++ b/hw/acpi/pcihp.c > @@ -222,9 +222,27 @@ static void acpi_pcihp_eject_slot(AcpiPciHpState *s, unsigned bsel, unsigned slo > PCIDevice *dev = PCI_DEVICE(qdev); > if (PCI_SLOT(dev->devfn) == slot) { > if (!acpi_pcihp_pc_no_hotplug(s, dev)) { > - hotplug_ctrl = qdev_get_hotplug_handler(qdev); > - hotplug_handler_unplug(hotplug_ctrl, qdev, &error_abort); > - object_unparent(OBJECT(qdev)); > + /* > + * partially_hotplugged is used by virtio-net failover: > + * failover has asked the guest OS to unplug the device > + * but we need to keep some references to the device > + * to be able to plug it back in case of failure so > + * we don't execute hotplug_handler_unplug(). > + */ > + if (dev->partially_hotplugged) { > + /* > + * pending_deleted_event is set to true when > + * virtio-net failover asks to unplug the device, > + * and set to false here when the operation is done > + * This is used by the migration loop to detect the > + * end of the operation and really start the migration. > + */ > + qdev->pending_deleted_event = false; > + } else { > + hotplug_ctrl = qdev_get_hotplug_handler(qdev); > + hotplug_handler_unplug(hotplug_ctrl, qdev, &error_abort); > + object_unparent(OBJECT(qdev)); > + } > } > } > } > @@ -396,6 +414,12 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev, > return; > } > > + /* > + * pending_deleted_event is used by virtio-net failover to detect the > + * end of the unplug operation, the flag is set to false in > + * acpi_pcihp_eject_slot() when the operation is completed. > + */ > + pdev->qdev.pending_deleted_event = true; > s->acpi_pcihp_pci_status[bsel].down |= (1U << slot); > acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS); > } > -- > 2.33.1
On 08/12/2021 08.36, Michael S. Tsirkin wrote: > On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote: >> Failover needs to detect the end of the PCI unplug to start migration >> after the VFIO card has been unplugged. >> >> To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in >> pcie_unplug_device(). >> >> But since >> 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35") >> we have switched to ACPI unplug and these functions are not called anymore >> and the flag not set. So failover migration is not able to detect if card >> is really unplugged and acts as it's done as soon as it's started. So it >> doesn't wait the end of the unplug to start the migration. We don't see any >> problem when we test that because ACPI unplug is faster than PCIe native >> hotplug and when the migration really starts the unplug operation is >> already done. >> >> See c000a9bd06ea ("pci: mark device having guest unplug request pending") >> a99c4da9fc2a ("pci: mark devices partially unplugged") >> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> >> Reviewed-by: Ani Sinha <ani@anisinha.ca> > > Hmm. I think this one may be needed for this release actually. > Isolate from testing changes and repost? You merged it already here: https://gitlab.com/qemu-project/qemu/-/commit/9323f892b39d133eb6 so we should be fine :-) Thomas
On Wed, Dec 8, 2021 at 1:20 PM Thomas Huth <thuth@redhat.com> wrote: > > On 08/12/2021 08.36, Michael S. Tsirkin wrote: > > On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote: > >> Failover needs to detect the end of the PCI unplug to start migration > >> after the VFIO card has been unplugged. > >> > >> To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in > >> pcie_unplug_device(). > >> > >> But since > >> 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35") > >> we have switched to ACPI unplug and these functions are not called anymore > >> and the flag not set. So failover migration is not able to detect if card > >> is really unplugged and acts as it's done as soon as it's started. So it > >> doesn't wait the end of the unplug to start the migration. We don't see any > >> problem when we test that because ACPI unplug is faster than PCIe native > >> hotplug and when the migration really starts the unplug operation is > >> already done. > >> > >> See c000a9bd06ea ("pci: mark device having guest unplug request pending") > >> a99c4da9fc2a ("pci: mark devices partially unplugged") > >> > >> Signed-off-by: Laurent Vivier <lvivier@redhat.com> > >> Reviewed-by: Ani Sinha <ani@anisinha.ca> > > > > Hmm. I think this one may be needed for this release actually. > > Isolate from testing changes and repost? > > You merged it already here: > > https://gitlab.com/qemu-project/qemu/-/commit/9323f892b39d133eb6 Yes I pointed this out in the other thread. Michael was not CC'd there somehow ... > > so we should be fine :-) > > Thomas >
On Wed, Dec 08, 2021 at 08:50:34AM +0100, Thomas Huth wrote: > On 08/12/2021 08.36, Michael S. Tsirkin wrote: > > On Fri, Nov 19, 2021 at 10:07:17AM +0100, Laurent Vivier wrote: > > > Failover needs to detect the end of the PCI unplug to start migration > > > after the VFIO card has been unplugged. > > > > > > To do that, a flag is set in pcie_cap_slot_unplug_request_cb() and reset in > > > pcie_unplug_device(). > > > > > > But since > > > 17858a169508 ("hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35") > > > we have switched to ACPI unplug and these functions are not called anymore > > > and the flag not set. So failover migration is not able to detect if card > > > is really unplugged and acts as it's done as soon as it's started. So it > > > doesn't wait the end of the unplug to start the migration. We don't see any > > > problem when we test that because ACPI unplug is faster than PCIe native > > > hotplug and when the migration really starts the unplug operation is > > > already done. > > > > > > See c000a9bd06ea ("pci: mark device having guest unplug request pending") > > > a99c4da9fc2a ("pci: mark devices partially unplugged") > > > > > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > > > Reviewed-by: Ani Sinha <ani@anisinha.ca> > > > > Hmm. I think this one may be needed for this release actually. > > Isolate from testing changes and repost? > > You merged it already here: > > https://gitlab.com/qemu-project/qemu/-/commit/9323f892b39d133eb6 Oh, good. Forgot to drop the tag. > so we should be fine :-) > > Thomas
diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c index f610a25d2ef9..30405b5113d7 100644 --- a/hw/acpi/pcihp.c +++ b/hw/acpi/pcihp.c @@ -222,9 +222,27 @@ static void acpi_pcihp_eject_slot(AcpiPciHpState *s, unsigned bsel, unsigned slo PCIDevice *dev = PCI_DEVICE(qdev); if (PCI_SLOT(dev->devfn) == slot) { if (!acpi_pcihp_pc_no_hotplug(s, dev)) { - hotplug_ctrl = qdev_get_hotplug_handler(qdev); - hotplug_handler_unplug(hotplug_ctrl, qdev, &error_abort); - object_unparent(OBJECT(qdev)); + /* + * partially_hotplugged is used by virtio-net failover: + * failover has asked the guest OS to unplug the device + * but we need to keep some references to the device + * to be able to plug it back in case of failure so + * we don't execute hotplug_handler_unplug(). + */ + if (dev->partially_hotplugged) { + /* + * pending_deleted_event is set to true when + * virtio-net failover asks to unplug the device, + * and set to false here when the operation is done + * This is used by the migration loop to detect the + * end of the operation and really start the migration. + */ + qdev->pending_deleted_event = false; + } else { + hotplug_ctrl = qdev_get_hotplug_handler(qdev); + hotplug_handler_unplug(hotplug_ctrl, qdev, &error_abort); + object_unparent(OBJECT(qdev)); + } } } } @@ -396,6 +414,12 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev, return; } + /* + * pending_deleted_event is used by virtio-net failover to detect the + * end of the unplug operation, the flag is set to false in + * acpi_pcihp_eject_slot() when the operation is completed. + */ + pdev->qdev.pending_deleted_event = true; s->acpi_pcihp_pci_status[bsel].down |= (1U << slot); acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS); }