Message ID | 20170202154637.1478-3-wei.liu2@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Wei Liu writes ("[PATCH 2/2] xl: disable events earlier for shutdown event"): > We need to disable event machinery when the guest shuts down. It > doesn't really matter where we disable it as long as it is within the > branch for shutdown event. I don't think this is necessary. My intent was that libxl_ctx_free would automatically disable all event generation, and that it is therefore not necessary to call _disable. This isn't documented particularly clearly, but libxl_event.h says: * An evgen is associated with the libxl_ctx used for its creation. * After libxl_ctx_free, all corresponding evgen handles become * invalid and must no longer be passed to evdisable. This implies it's legal to call libxl_ctx_free with some events enabled. But I think I don't really understand what the original bug is. Ian.
On Thu, Feb 02, 2017 at 03:52:41PM +0000, Ian Jackson wrote: > Wei Liu writes ("[PATCH 2/2] xl: disable events earlier for shutdown event"): > > We need to disable event machinery when the guest shuts down. It > > doesn't really matter where we disable it as long as it is within the > > branch for shutdown event. > > I don't think this is necessary. My intent was that libxl_ctx_free > would automatically disable all event generation, and that it is > therefore not necessary to call _disable. > > This isn't documented particularly clearly, but libxl_event.h says: > > * An evgen is associated with the libxl_ctx used for its creation. > * After libxl_ctx_free, all corresponding evgen handles become > * invalid and must no longer be passed to evdisable. > > This implies it's legal to call libxl_ctx_free with some events > enabled. > That's right. But this is not saying there is bug in the event machinery. > But I think I don't really understand what the original bug is. > The original bug is that: 1. boot a guest with no disks, so diskws is NULL, num_disks == 0 2. some removable disks are added, and domain config updated 3. guest reboots, xl gets shutdown event 4. handle_domain_death gets the latest d_config, num_disks != 0 now 5. try to disable disk eject events with evdisable_disk_ejects -> BOOM So basically 5 needs to happen before 4. Moving that snippet seems to do the trick, and then freeing diskws + setting it to NULL makes the reboot path automatically re-register diskws. Wei. > Ian.
Wei Liu writes ("Re: [PATCH 2/2] xl: disable events earlier for shutdown event"): > On Thu, Feb 02, 2017 at 03:52:41PM +0000, Ian Jackson wrote: > > But I think I don't really understand what the original bug is. > > The original bug is that: Ah. > 1. boot a guest with no disks, so diskws is NULL, num_disks == 0 > 2. some removable disks are added, and domain config updated > 3. guest reboots, xl gets shutdown event > 4. handle_domain_death gets the latest d_config, num_disks != 0 now > 5. try to disable disk eject events with evdisable_disk_ejects -> BOOM > > So basically 5 needs to happen before 4. Moving that snippet seems to do > the trick, and then freeing diskws + setting it to NULL makes the > reboot path automatically re-register diskws. Surely the right answer is to make evdisable_disk_ejects tolerate whatever it finds in diskws (including diskws==NULL). It could be idempotent too. That way nothing depends on the exact ordering. Ian.
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c index 9bf10dfcb2..7df6db1ec9 100644 --- a/tools/libxl/xl_cmdimpl.c +++ b/tools/libxl/xl_cmdimpl.c @@ -3137,6 +3137,11 @@ start: LOG("Domain %u has shut down, reason code %d 0x%x", domid, event->u.domain_shutdown.shutdown_reason, event->u.domain_shutdown.shutdown_reason); + libxl_evdisable_domain_death(ctx, deathw); + deathw = NULL; + evdisable_disk_ejects(diskws, d_config.num_disks); + free(diskws); + diskws = NULL; switch (handle_domain_death(&domid, event, &d_config)) { case DOMAIN_RESTART_SOFT_RESET: domid_soft_reset = domid; @@ -3154,9 +3159,6 @@ start: /* Otherwise fall through and restart. */ case DOMAIN_RESTART_NORMAL: libxl_event_free(ctx, event); - libxl_evdisable_domain_death(ctx, deathw); - deathw = NULL; - evdisable_disk_ejects(diskws, d_config.num_disks); /* discard any other events which may have been generated */ while (!(ret = libxl_event_check(ctx, &event, LIBXL_EVENTMASK_ALL, 0,0))) {
We need to disable event machinery when the guest shuts down. It doesn't really matter where we disable it as long as it is within the branch for shutdown event. Move the code snippet before handle_domain_death, so that d_config is not yet updated and we have the correct ->num_disks. And the free diskws and set it to NULL, so that diskws gets automatically set up again when xl goes over the domain creation code path. Reported-by: Fatih Acar <fatih.acar@gandi.net> Signed-off-by: Wei Liu <wei.liu2@citrix.com> --- Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Fatih Acar <fatih.acar@gandi.net> Fatih, please test this series to see if it fixes your issue. Ian, backport candidate? --- tools/libxl/xl_cmdimpl.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)