Message ID | 20190927161746.25902-1-olaf@aepfle.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] libxl: fix crash in helper_done due to uninitialized data | expand |
Olaf Hering writes ("[PATCH v1] libxl: fix crash in helper_done due to uninitialized data"): > A crash in helper_done, called from libxl_domain_suspend, was reported, > triggered by 'virsh migrate --live xen+ssh://host': ... > This is triggered by a failed poll, the actual error was: > > libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream > > In this case revents in datacopier_writable is POLLHUP|POLLERR|POLLOUT, > which triggers datacopier_callback. In helper_done, > shs->completion_callback is still zero. libxl__xc_domain_save fills > dss.sws.shs. But that function is only called after stream_header_done. > Any error before that will leave dss partly uninitialized. Thanks for the diagnosis. And sorry for the inconvenience of this bug. > Fix this crash by checking if ->completion_callback is valid. But I don't think this fix is right. The bug is that libxl__save_helper_abort is called on a blank shs. (You say "uninitialised" but actually this is all zeroed at some point, so it it's not reading uninitialised memory.) I think it is in fact a bug that this error path if (!stream->rc && rc) { /* First reported failure. Tear everything down. */ stream->rc = rc; stream->sync_teardown = true; libxl__stream_read_abort(egc, stream, rc); libxl__save_helper_abort(egc, &stream->shs); libxl__conversion_helper_abort(egc, &stream->chs, rc); stream->sync_teardown = false; } calls libxl__save_helper_abort when it hasn't ever called anything other than libxl__save_helper_init. Because _abort naturally wants to call the failure callback. I suggest that we add a check for _inuse to this bit of check_all_finished. Ross and Andrew, you wrote much of this stream read stuff, what do you think ? Part of the difficulty is that the possible states and transitions of a libxl__save_helper_state are not formatlly documented. That's my fault - sorry. Thanks, Ian.
Am Fri, 27 Sep 2019 18:17:12 +0100 schrieb Ian Jackson <ian.jackson@citrix.com>: > Olaf Hering writes ("[PATCH v1] libxl: fix crash in helper_done due to uninitialized data"): > > A crash in helper_done, called from libxl_domain_suspend, was reported, > > libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream > Ross and Andrew, you wrote much of this stream read stuff, what do you think ? Did you have a chance to look at this issue? Olaf
diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c index 6452d70036..89a2f6ecf0 100644 --- a/tools/libxl/libxl_save_callout.c +++ b/tools/libxl/libxl_save_callout.c @@ -366,8 +366,9 @@ static void helper_done(libxl__egc *egc, libxl__save_helper_state *shs) assert(!libxl__save_helper_inuse(shs)); shs->egc = egc; - shs->completion_callback(egc, shs->caller_state, - shs->rc, shs->retval, shs->errnoval); + if (shs->completion_callback) + shs->completion_callback(egc, shs->caller_state, + shs->rc, shs->retval, shs->errnoval); shs->egc = 0; }
A crash in helper_done, called from libxl_domain_suspend, was reported, triggered by 'virsh migrate --live xen+ssh://host': #1 helper_done (...) at libxl_save_callout.c:371 helper_failed helper_stop libxl__save_helper_abort #2 check_all_finished (..., rc=-3) at libxl_stream_write.c:671 stream_done stream_complete write_done dc->callback == write_done efd->func == datacopier_writable #3 afterpoll_internal (...) at libxl_event.c:1269 This is triggered by a failed poll, the actual error was: libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream In this case revents in datacopier_writable is POLLHUP|POLLERR|POLLOUT, which triggers datacopier_callback. In helper_done, shs->completion_callback is still zero. libxl__xc_domain_save fills dss.sws.shs. But that function is only called after stream_header_done. Any error before that will leave dss partly uninitialized. Fix this crash by checking if ->completion_callback is valid. Signed-off-by: Olaf Hering <olaf@aepfle.de> --- tools/libxl/libxl_save_callout.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)