diff mbox series

[v1] libxl: fix crash in helper_done due to uninitialized data

Message ID 20190927161746.25902-1-olaf@aepfle.de (mailing list archive)
State New, archived
Headers show
Series [v1] libxl: fix crash in helper_done due to uninitialized data | expand

Commit Message

Olaf Hering Sept. 27, 2019, 4:17 p.m. UTC
A crash in helper_done, called from libxl_domain_suspend, was reported,
triggered by 'virsh migrate --live xen+ssh://host':

 #1 helper_done (...) at libxl_save_callout.c:371
  helper_failed
  helper_stop
  libxl__save_helper_abort
 #2 check_all_finished (..., rc=-3) at libxl_stream_write.c:671
  stream_done
  stream_complete
  write_done
  dc->callback == write_done
  efd->func == datacopier_writable
 #3 afterpoll_internal (...) at libxl_event.c:1269

This is triggered by a failed poll, the actual error was:

libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream

In this case revents in datacopier_writable is POLLHUP|POLLERR|POLLOUT,
which triggers datacopier_callback. In helper_done,
shs->completion_callback is still zero. libxl__xc_domain_save fills
dss.sws.shs. But that function is only called after stream_header_done.
Any error before that will leave dss partly uninitialized.

Fix this crash by checking if ->completion_callback is valid.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
---
 tools/libxl/libxl_save_callout.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Ian Jackson Sept. 27, 2019, 5:17 p.m. UTC | #1
Olaf Hering writes ("[PATCH v1] libxl: fix crash in helper_done due to uninitialized data"):
> A crash in helper_done, called from libxl_domain_suspend, was reported,
> triggered by 'virsh migrate --live xen+ssh://host':
...
> This is triggered by a failed poll, the actual error was:
> 
> libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream
> 
> In this case revents in datacopier_writable is POLLHUP|POLLERR|POLLOUT,
> which triggers datacopier_callback. In helper_done,
> shs->completion_callback is still zero. libxl__xc_domain_save fills
> dss.sws.shs. But that function is only called after stream_header_done.
> Any error before that will leave dss partly uninitialized.

Thanks for the diagnosis.  And sorry for the inconvenience of this
bug.

> Fix this crash by checking if ->completion_callback is valid.

But I don't think this fix is right.  The bug is that
libxl__save_helper_abort is called on a blank shs.  (You say
"uninitialised" but actually this is all zeroed at some point, so it
it's not reading uninitialised memory.)

I think it is in fact a bug that this error path

    if (!stream->rc && rc) {
        /* First reported failure. Tear everything down. */
        stream->rc = rc;
        stream->sync_teardown = true;

        libxl__stream_read_abort(egc, stream, rc);
        libxl__save_helper_abort(egc, &stream->shs);
        libxl__conversion_helper_abort(egc, &stream->chs, rc);

        stream->sync_teardown = false;
    }

calls libxl__save_helper_abort when it hasn't ever called anything
other than libxl__save_helper_init.  Because _abort naturally wants to
call the failure callback.

I suggest that we add a check for _inuse to this bit of
check_all_finished.

Ross and Andrew, you wrote much of this stream read stuff, what do you
think ?

Part of the difficulty is that the possible states and transitions of
a libxl__save_helper_state are not formatlly documented.  That's my
fault - sorry.

Thanks,
Ian.
Olaf Hering Nov. 5, 2019, 10:31 a.m. UTC | #2
Am Fri, 27 Sep 2019 18:17:12 +0100
schrieb Ian Jackson <ian.jackson@citrix.com>:

> Olaf Hering writes ("[PATCH v1] libxl: fix crash in helper_done due to uninitialized data"):
> > A crash in helper_done, called from libxl_domain_suspend, was reported,
> > libxl_aoutils.c:328:datacopier_writable: unexpected poll event 0x1c on fd 37 (should be POLLOUT) writing libxc header during copy of save v2 stream
> Ross and Andrew, you wrote much of this stream read stuff, what do you think ?

Did you have a chance to look at this issue?


Olaf
diff mbox series

Patch

diff --git a/tools/libxl/libxl_save_callout.c b/tools/libxl/libxl_save_callout.c
index 6452d70036..89a2f6ecf0 100644
--- a/tools/libxl/libxl_save_callout.c
+++ b/tools/libxl/libxl_save_callout.c
@@ -366,8 +366,9 @@  static void helper_done(libxl__egc *egc, libxl__save_helper_state *shs)
     assert(!libxl__save_helper_inuse(shs));
 
     shs->egc = egc;
-    shs->completion_callback(egc, shs->caller_state,
-                             shs->rc, shs->retval, shs->errnoval);
+    if (shs->completion_callback)
+        shs->completion_callback(egc, shs->caller_state,
+                                 shs->rc, shs->retval, shs->errnoval);
     shs->egc = 0;
 }