diff mbox series

[PULL,19/20] spapr: Fix handling of unplugged devices during CAS and migration

Message ID 20200221033650.444386-20-david@gibson.dropbear.id.au
State New

Commit Message

David Gibson Feb. 21, 2020, 3:36 a.m. UTC
From: Greg Kurz <groug@kaod.org>

We already detect when a device is being hot plugged, both before CAS (to
trigger a CAS reboot) and during migration (to migrate the state of the
associated DRC). But hot unplugging a device is also an asynchronous operation that
requires the guest to take action. This means that if the guest is migrated
after the hot unplug event was sent but before it could release the device
with RTAS, the destination QEMU doesn't know about the pending unplug
operation and doesn't actually remove the device when the guest finally
releases it.
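
To make the failure mode above concrete, here is a minimal toy model in C.
It is only a sketch: ToyDrc and the toy_* helpers are hypothetical names
invented for illustration and are not QEMU code. The model assumes that only
the DRC state is transferred, so the pending unplug is forgotten on the
destination.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a DRC; this is not QEMU's SpaprDrc. */
typedef struct ToyDrc {
    uint32_t state;          /* transferred by the toy migration below */
    bool unplug_requested;   /* NOT transferred by the toy migration below */
    bool device_present;     /* is a device still attached to the DRC? */
} ToyDrc;

/* Toy migration: only the state crosses the wire, the request is dropped. */
ToyDrc toy_migrate_state_only(const ToyDrc *src)
{
    assert(src);
    ToyDrc dst = {
        .state = src->state,
        .unplug_requested = false,
        .device_present = src->device_present,
    };
    return dst;
}

/* The guest releases the device; removal happens only if it was requested. */
void toy_guest_release(ToyDrc *drc)
{
    assert(drc);
    if (drc->unplug_requested) {
        drc->device_present = false;
        drc->unplug_requested = false;
    }
}

/* Returns true if the device is still present after the guest's release. */
bool toy_device_survives_release(void)
{
    ToyDrc src = { .state = 1, .unplug_requested = true,
                   .device_present = true };
    ToyDrc dst = toy_migrate_state_only(&src);

    toy_guest_release(&dst);
    return dst.device_present;
}
```

In this model the release on the destination is a no-op, because the
destination never learned that an unplug was requested. On the source, the
same release would have removed the device.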

Similarly, if the unplug request is fired before CAS, the guest isn't
notified of the change, just like with hotplug. It ends up booting with
the device still present in the DT and configures it, as if it had
never been removed. Even weirder, since the event is still queued, it will
eventually be processed when some other unrelated event is posted to
the guest.

Enhance spapr_drc_transient() to also return true if an unplug request is
pending. This fixes the issue at CAS by requesting a CAS reboot, and it
causes the DRC state to be migrated. Some extra care is still needed to
inform the destination that an unplug request is pending: migrate the
unplug_requested field of the DRC in an optional subsection. This might
break backwards migration, but this is still better than ending up with
an inconsistent guest.
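
The two parts of the change can be sketched in isolation as follows. This is
a simplified, self-contained model and not the QEMU implementation: ModelDrc,
ModelStream and the model_* functions are hypothetical names, ready_state is
kept in the instance rather than in the class, and the "stream" is a plain
struct instead of a real migration stream. It assumes that
spapr_drc_unplug_requested() reflects the unplug_requested field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a DRC; this is not QEMU's SpaprDrc. */
typedef struct ModelDrc {
    uint32_t state;
    uint32_t ready_state;    /* expected long-term state (assumed per-DRC) */
    bool unplug_requested;
} ModelDrc;

/* Simplified wire format: each part is present only when it is needed. */
typedef struct ModelStream {
    bool has_main;           /* main DRC section was sent */
    uint32_t state;
    bool has_unplug_sub;     /* optional subsection was sent */
    bool unplug_requested;
} ModelStream;

/* Old predicate: a pending unplug in the ready state goes unnoticed. */
bool model_transient_old(const ModelDrc *drc)
{
    return drc->state != drc->ready_state;
}

/* New predicate: a pending unplug request also makes the DRC transient. */
bool model_transient_new(const ModelDrc *drc)
{
    return drc->state != drc->ready_state || drc->unplug_requested;
}

/* Send the DRC only if transient; add the subsection only if needed. */
ModelStream model_save(const ModelDrc *drc)
{
    ModelStream s = { 0 };

    assert(drc);
    if (model_transient_new(drc)) {
        s.has_main = true;
        s.state = drc->state;
        if (drc->unplug_requested) {
            s.has_unplug_sub = true;
            s.unplug_requested = true;
        }
    }
    return s;
}

/* Destination side: fields that were not sent keep their current value. */
void model_load(ModelDrc *drc, const ModelStream *s)
{
    assert(drc && s);
    if (s->has_main) {
        drc->state = s->state;
    }
    if (s->has_unplug_sub) {
        drc->unplug_requested = s->unplug_requested;
    }
}
```

With a DRC that is in its ready state but has an unplug pending,
model_transient_old() returns false while model_transient_new() returns true,
so the DRC is sent and the destination learns about the pending request. A
DRC with nothing pending produces neither the main section nor the
subsection in this model.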

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <158169248798.3465937.1108351365840514270.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
 hw/ppc/spapr_drc.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 4c35ce7c5c..e373d342eb 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -456,6 +456,22 @@ void spapr_drc_reset(SpaprDrc *drc)
     }
 }
 
+static bool spapr_drc_unplug_requested_needed(void *opaque)
+{
+    return spapr_drc_unplug_requested(opaque);
+}
+
+static const VMStateDescription vmstate_spapr_drc_unplug_requested = {
+    .name = "spapr_drc/unplug_requested",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = spapr_drc_unplug_requested_needed,
+    .fields  = (VMStateField []) {
+        VMSTATE_BOOL(unplug_requested, SpaprDrc),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 bool spapr_drc_transient(SpaprDrc *drc)
 {
     SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
@@ -471,9 +487,10 @@ bool spapr_drc_transient(SpaprDrc *drc)
     /*
      * We need to reset the DRC at CAS or to migrate the DRC state if it's
      * not equal to the expected long-term state, which is the same as the
-     * coldplugged initial state.
+     * coldplugged initial state, or if an unplug request is pending.
      */
-    return (drc->state != drck->ready_state);
+    return drc->state != drck->ready_state ||
+        spapr_drc_unplug_requested(drc);
 }
 
 static bool spapr_drc_needed(void *opaque)
@@ -489,6 +506,10 @@ static const VMStateDescription vmstate_spapr_drc = {
     .fields  = (VMStateField []) {
         VMSTATE_UINT32(state, SpaprDrc),
         VMSTATE_END_OF_LIST()
+    },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_spapr_drc_unplug_requested,
+        NULL
     }
 };
 