diff mbox series

monitor: Fix order in monitor_cleanup()

Message ID 20201013125027.41003-1-kwolf@redhat.com
State New, archived

Commit Message

Kevin Wolf Oct. 13, 2020, 12:50 p.m. UTC
We can only destroy Monitor objects after we're sure that they are not
in use by the dispatcher coroutine any more. This fixes crashes like the
following where we tried to destroy a monitor mutex while the dispatcher
coroutine still holds it:

 (gdb) bt
 #0  0x00007fe541cf4bc5 in raise () at /lib64/libc.so.6
 #1  0x00007fe541cdd8a4 in abort () at /lib64/libc.so.6
 #2  0x000055c24e965327 in error_exit (err=16, msg=0x55c24eead3a0 <__func__.33> "qemu_mutex_destroy") at ../util/qemu-thread-posix.c:37
 #3  0x000055c24e9654c3 in qemu_mutex_destroy (mutex=0x55c25133e0f0) at ../util/qemu-thread-posix.c:70
 #4  0x000055c24e7cfaf1 in monitor_data_destroy_qmp (mon=0x55c25133dfd0) at ../monitor/qmp.c:439
 #5  0x000055c24e7d23bc in monitor_data_destroy (mon=0x55c25133dfd0) at ../monitor/monitor.c:615
 #6  0x000055c24e7d253a in monitor_cleanup () at ../monitor/monitor.c:644
 #7  0x000055c24e6cb002 in qemu_cleanup () at ../softmmu/vl.c:4549
 #8  0x000055c24e0d259b in main (argc=24, argv=0x7ffff66b0d58, envp=0x7ffff66b0e20) at ../softmmu/main.c:51

Reported-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 monitor/monitor.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

Comments

Ben Widawsky Oct. 13, 2020, 1:32 p.m. UTC | #1
On 20-10-13 14:50:27, Kevin Wolf wrote:
> We can only destroy Monitor objects after we're sure that they are not
> in use by the dispatcher coroutine any more. This fixes crashes like the
> following where we tried to destroy a monitor mutex while the dispatcher
> coroutine still holds it:
> 
>  (gdb) bt
>  #0  0x00007fe541cf4bc5 in raise () at /lib64/libc.so.6
>  #1  0x00007fe541cdd8a4 in abort () at /lib64/libc.so.6
>  #2  0x000055c24e965327 in error_exit (err=16, msg=0x55c24eead3a0 <__func__.33> "qemu_mutex_destroy") at ../util/qemu-thread-posix.c:37
>  #3  0x000055c24e9654c3 in qemu_mutex_destroy (mutex=0x55c25133e0f0) at ../util/qemu-thread-posix.c:70
>  #4  0x000055c24e7cfaf1 in monitor_data_destroy_qmp (mon=0x55c25133dfd0) at ../monitor/qmp.c:439
>  #5  0x000055c24e7d23bc in monitor_data_destroy (mon=0x55c25133dfd0) at ../monitor/monitor.c:615
>  #6  0x000055c24e7d253a in monitor_cleanup () at ../monitor/monitor.c:644
>  #7  0x000055c24e6cb002 in qemu_cleanup () at ../softmmu/vl.c:4549
>  #8  0x000055c24e0d259b in main (argc=24, argv=0x7ffff66b0d58, envp=0x7ffff66b0e20) at ../softmmu/main.c:51
> 
> Reported-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Ben Widawsky <ben.widawsky@intel.com>

[snip]
Alex Bennée Oct. 14, 2020, 5:20 p.m. UTC | #2
Kevin Wolf <kwolf@redhat.com> writes:

> We can only destroy Monitor objects after we're sure that they are not
> in use by the dispatcher coroutine any more. This fixes crashes like the
> following where we tried to destroy a monitor mutex while the dispatcher
> coroutine still holds it:
>
>  (gdb) bt
>  #0  0x00007fe541cf4bc5 in raise () at /lib64/libc.so.6
>  #1  0x00007fe541cdd8a4 in abort () at /lib64/libc.so.6
>  #2  0x000055c24e965327 in error_exit (err=16, msg=0x55c24eead3a0 <__func__.33> "qemu_mutex_destroy") at ../util/qemu-thread-posix.c:37
>  #3  0x000055c24e9654c3 in qemu_mutex_destroy (mutex=0x55c25133e0f0) at ../util/qemu-thread-posix.c:70
>  #4  0x000055c24e7cfaf1 in monitor_data_destroy_qmp (mon=0x55c25133dfd0) at ../monitor/qmp.c:439
>  #5  0x000055c24e7d23bc in monitor_data_destroy (mon=0x55c25133dfd0) at ../monitor/monitor.c:615
>  #6  0x000055c24e7d253a in monitor_cleanup () at ../monitor/monitor.c:644
>  #7  0x000055c24e6cb002 in qemu_cleanup () at ../softmmu/vl.c:4549
>  #8  0x000055c24e0d259b in main (argc=24, argv=0x7ffff66b0d58, envp=0x7ffff66b0e20) at ../softmmu/main.c:51
>
> Reported-by: Alex Bennée <alex.bennee@linaro.org>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>

LGTM:

Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

Whose tree is going to take it?
Kevin Wolf Oct. 15, 2020, 7:46 a.m. UTC | #3
On 14.10.2020 at 19:20, Alex Bennée wrote:
> 
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > We can only destroy Monitor objects after we're sure that they are not
> > in use by the dispatcher coroutine any more. This fixes crashes like the
> > following where we tried to destroy a monitor mutex while the dispatcher
> > coroutine still holds it:
> >
> >  (gdb) bt
> >  #0  0x00007fe541cf4bc5 in raise () at /lib64/libc.so.6
> >  #1  0x00007fe541cdd8a4 in abort () at /lib64/libc.so.6
> >  #2  0x000055c24e965327 in error_exit (err=16, msg=0x55c24eead3a0 <__func__.33> "qemu_mutex_destroy") at ../util/qemu-thread-posix.c:37
> >  #3  0x000055c24e9654c3 in qemu_mutex_destroy (mutex=0x55c25133e0f0) at ../util/qemu-thread-posix.c:70
> >  #4  0x000055c24e7cfaf1 in monitor_data_destroy_qmp (mon=0x55c25133dfd0) at ../monitor/qmp.c:439
> >  #5  0x000055c24e7d23bc in monitor_data_destroy (mon=0x55c25133dfd0) at ../monitor/monitor.c:615
> >  #6  0x000055c24e7d253a in monitor_cleanup () at ../monitor/monitor.c:644
> >  #7  0x000055c24e6cb002 in qemu_cleanup () at ../softmmu/vl.c:4549
> >  #8  0x000055c24e0d259b in main (argc=24, argv=0x7ffff66b0d58, envp=0x7ffff66b0e20) at ../softmmu/main.c:51
> >
> > Reported-by: Alex Bennée <alex.bennee@linaro.org>
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> 
> LGTM:
> 
> Tested-by: Alex Bennée <alex.bennee@linaro.org>
> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
> 
> Whose tree is going to take it?

In theory Markus, but he's on vacation this week. As this seems to
affect multiple people and we want to unblock testing quickly, I'll just
take it through mine.

Kevin
Markus Armbruster Oct. 19, 2020, 9:19 a.m. UTC | #4
Kevin Wolf <kwolf@redhat.com> writes:

> On 14.10.2020 at 19:20, Alex Bennée wrote:
>> 
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > We can only destroy Monitor objects after we're sure that they are not
>> > in use by the dispatcher coroutine any more. This fixes crashes like the
>> > following where we tried to destroy a monitor mutex while the dispatcher
>> > coroutine still holds it:
>> >
>> >  (gdb) bt
>> >  #0  0x00007fe541cf4bc5 in raise () at /lib64/libc.so.6
>> >  #1  0x00007fe541cdd8a4 in abort () at /lib64/libc.so.6
>> >  #2  0x000055c24e965327 in error_exit (err=16, msg=0x55c24eead3a0 <__func__.33> "qemu_mutex_destroy") at ../util/qemu-thread-posix.c:37
>> >  #3  0x000055c24e9654c3 in qemu_mutex_destroy (mutex=0x55c25133e0f0) at ../util/qemu-thread-posix.c:70
>> >  #4  0x000055c24e7cfaf1 in monitor_data_destroy_qmp (mon=0x55c25133dfd0) at ../monitor/qmp.c:439
>> >  #5  0x000055c24e7d23bc in monitor_data_destroy (mon=0x55c25133dfd0) at ../monitor/monitor.c:615
>> >  #6  0x000055c24e7d253a in monitor_cleanup () at ../monitor/monitor.c:644
>> >  #7  0x000055c24e6cb002 in qemu_cleanup () at ../softmmu/vl.c:4549
>> >  #8  0x000055c24e0d259b in main (argc=24, argv=0x7ffff66b0d58, envp=0x7ffff66b0e20) at ../softmmu/main.c:51
>> >
>> > Reported-by: Alex Bennée <alex.bennee@linaro.org>
>> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> 
>> LGTM:
>> 
>> Tested-by: Alex Bennée <alex.bennee@linaro.org>
>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
>> 
>> Whose tree is going to take it?
>
> In theory Markus, but he's on vacation this week. As this seems to
> affect multiple people and we want to unblock testing quickly, I'll just
> take it through mine.

Thanks!
Patch

diff --git a/monitor/monitor.c b/monitor/monitor.c
index ceffe1a83b..84222cd130 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -632,23 +632,9 @@  void monitor_cleanup(void)
         iothread_stop(mon_iothread);
     }
 
-    /* Flush output buffers and destroy monitors */
-    qemu_mutex_lock(&monitor_lock);
-    monitor_destroyed = true;
-    while (!QTAILQ_EMPTY(&mon_list)) {
-        Monitor *mon = QTAILQ_FIRST(&mon_list);
-        QTAILQ_REMOVE(&mon_list, mon, entry);
-        /* Permit QAPI event emission from character frontend release */
-        qemu_mutex_unlock(&monitor_lock);
-        monitor_flush(mon);
-        monitor_data_destroy(mon);
-        qemu_mutex_lock(&monitor_lock);
-        g_free(mon);
-    }
-    qemu_mutex_unlock(&monitor_lock);
-
     /*
-     * The dispatcher needs to stop before destroying the I/O thread.
+     * The dispatcher needs to stop before destroying the monitor and
+     * the I/O thread.
      *
      * We need to poll both qemu_aio_context and iohandler_ctx to make
      * sure that the dispatcher coroutine keeps making progress and
@@ -665,6 +651,21 @@  void monitor_cleanup(void)
                    (aio_poll(iohandler_get_aio_context(), false),
                     qatomic_mb_read(&qmp_dispatcher_co_busy)));
 
+    /* Flush output buffers and destroy monitors */
+    qemu_mutex_lock(&monitor_lock);
+    monitor_destroyed = true;
+    while (!QTAILQ_EMPTY(&mon_list)) {
+        Monitor *mon = QTAILQ_FIRST(&mon_list);
+        QTAILQ_REMOVE(&mon_list, mon, entry);
+        /* Permit QAPI event emission from character frontend release */
+        qemu_mutex_unlock(&monitor_lock);
+        monitor_flush(mon);
+        monitor_data_destroy(mon);
+        qemu_mutex_lock(&monitor_lock);
+        g_free(mon);
+    }
+    qemu_mutex_unlock(&monitor_lock);
+
     if (mon_iothread) {
         iothread_destroy(mon_iothread);
         mon_iothread = NULL;