diff mbox

[2/2] xl: disable events earlier for shutdown event

Message ID 20170202154637.1478-3-wei.liu2@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wei Liu Feb. 2, 2017, 3:46 p.m. UTC
We need to disable event machinery when the guest shuts down.  It
doesn't really matter where we disable it as long as it is within the
branch for shutdown event.

Move the code snippet before handle_domain_death, so that d_config is
not yet updated and we have the correct ->num_disks. And the free diskws
and set it to NULL, so that diskws gets automatically set up again when
xl goes over the domain creation code path.

Reported-by: Fatih Acar <fatih.acar@gandi.net>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Fatih Acar <fatih.acar@gandi.net>

Fatih, please test this series to see if it fixes your issue.

Ian, backport candidate?
---
 tools/libxl/xl_cmdimpl.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Ian Jackson Feb. 2, 2017, 3:52 p.m. UTC | #1
Wei Liu writes ("[PATCH 2/2] xl: disable events earlier for shutdown event"):
> We need to disable event machinery when the guest shuts down.  It
> doesn't really matter where we disable it as long as it is within the
> branch for shutdown event.

I don't think this is necessary.  My intent was that libxl_ctx_free
would automatically disable all event generation, and that it is
therefore not necessary to call _disable.

This isn't documented particularly clearly, but libxl_event.h says:

  * An evgen is associated with the libxl_ctx used for its creation.
  * After libxl_ctx_free, all corresponding evgen handles become
  * invalid and must no longer be passed to evdisable.

This implies it's legal to call libxl_ctx_free with some events
enabled.

But I think I don't really understand what the original bug is.

Ian.
Wei Liu Feb. 2, 2017, 4 p.m. UTC | #2
On Thu, Feb 02, 2017 at 03:52:41PM +0000, Ian Jackson wrote:
> Wei Liu writes ("[PATCH 2/2] xl: disable events earlier for shutdown event"):
> > We need to disable event machinery when the guest shuts down.  It
> > doesn't really matter where we disable it as long as it is within the
> > branch for shutdown event.
> 
> I don't think this is necessary.  My intent was that libxl_ctx_free
> would automatically disable all event generation, and that it is
> therefore not necessary to call _disable.
> 
> This isn't documented particularly clearly, but libxl_event.h says:
> 
>   * An evgen is associated with the libxl_ctx used for its creation.
>   * After libxl_ctx_free, all corresponding evgen handles become
>   * invalid and must no longer be passed to evdisable.
> 
> This implies it's legal to call libxl_ctx_free with some events
> enabled.
> 

That's right. But this is not saying there is bug in the event
machinery.

> But I think I don't really understand what the original bug is.
> 

The original bug is that:

1. boot a guest with no disks, so diskws is NULL, num_disks == 0
2. some removable disks are added, and domain config updated
3. guest reboots, xl gets shutdown event
4. handle_domain_death gets the latest d_config, num_disks != 0 now
5. try to disable disk eject events with evdisable_disk_ejects -> BOOM

So basically 5 needs to happen before 4. Moving that snippet seems to do
the trick, and then freeing diskws + setting it to NULL makes the
reboot path automatically re-register diskws.

Wei.

> Ian.
Ian Jackson Feb. 2, 2017, 4:05 p.m. UTC | #3
Wei Liu writes ("Re: [PATCH 2/2] xl: disable events earlier for shutdown event"):
> On Thu, Feb 02, 2017 at 03:52:41PM +0000, Ian Jackson wrote:
> > But I think I don't really understand what the original bug is.
> 
> The original bug is that:

Ah.

> 1. boot a guest with no disks, so diskws is NULL, num_disks == 0
> 2. some removable disks are added, and domain config updated
> 3. guest reboots, xl gets shutdown event
> 4. handle_domain_death gets the latest d_config, num_disks != 0 now
> 5. try to disable disk eject events with evdisable_disk_ejects -> BOOM
> 
> So basically 5 needs to happen before 4. Moving that snippet seems to do
> the trick, and then freeing diskws + setting it to NULL makes the
> reboot path automatically re-register diskws.

Surely the right answer is to make evdisable_disk_ejects tolerate
whatever it finds in diskws (including diskws==NULL).  It could be
idempotent too.

That way nothing depends on the exact ordering.

Ian.
diff mbox

Patch

diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 9bf10dfcb2..7df6db1ec9 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3137,6 +3137,11 @@  start:
             LOG("Domain %u has shut down, reason code %d 0x%x", domid,
                 event->u.domain_shutdown.shutdown_reason,
                 event->u.domain_shutdown.shutdown_reason);
+            libxl_evdisable_domain_death(ctx, deathw);
+            deathw = NULL;
+            evdisable_disk_ejects(diskws, d_config.num_disks);
+            free(diskws);
+            diskws = NULL;
             switch (handle_domain_death(&domid, event, &d_config)) {
             case DOMAIN_RESTART_SOFT_RESET:
                 domid_soft_reset = domid;
@@ -3154,9 +3159,6 @@  start:
                 /* Otherwise fall through and restart. */
             case DOMAIN_RESTART_NORMAL:
                 libxl_event_free(ctx, event);
-                libxl_evdisable_domain_death(ctx, deathw);
-                deathw = NULL;
-                evdisable_disk_ejects(diskws, d_config.num_disks);
                 /* discard any other events which may have been generated */
                 while (!(ret = libxl_event_check(ctx, &event,
                                                  LIBXL_EVENTMASK_ALL, 0,0))) {