diff mbox

qemu_opt_foreach: Fix crasher

Message ID 50710af43689d251448f6b2f8d5606956758c998.1471360024.git.mprivozn@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Michal Prívozník Aug. 16, 2016, 3:17 p.m. UTC
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
617         QTAILQ_FOREACH(opt, &opts->head, next) {
[Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
(gdb) bt
#0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
#1  0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314
#2  0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at net/vhost-user.c:360
#3  0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250, is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051
#4  0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0, is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108
#5  0x000055baf696083f in netdev_add (opts=0x55baf776e7e0, errp=0x7ffc51368f00) at net/net.c:1186
#6  0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205
#7  0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590, tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978
#8  0x000055baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at qobject/json-streamer.c:105
#9  0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 '}', flush=false) at qobject/json-lexer.c:319
#10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369
#11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124
#12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, buf=0x7ffc51369170 "}R\204\367\272U", size=1) at /path/to/qemu.git/monitor.c:3994
#13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387
#14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399
#15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, opaque=0x55baf7610a40) at qemu-char.c:2927
#16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>, user_data=0x55baf7610a40) at io/channel-watch.c:84
#17 0x00007f1d3e80cbbd in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213
#19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at main-loop.c:258
#20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506
#21 0x000055baf676587b in main_loop () at vl.c:1908
#22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, envp=0x7ffc5136a9f8) at vl.c:4604
(gdb) p opts
$1 = (QemuOpts *) 0x0

The crash occurred when I tried to attach vhost-user net:

{"execute":"chardev-add","arguments":{"id":"charnet2","backend":{"type":"socket","data":{"addr":{"type":"unix","data":{"path":"/var/run/openvswitch/vhost-user1"}},"wait":false,"server":false}}},"id":"libvirt-19"}
{"return": {}, "id": "libvirt-19"}
{"execute":"netdev_add","arguments":{"type":"vhost-user","chardev":"charnet2","id":"hostnet2"},"id":"libvirt-20"}

2016-08-16 14:52:18.274+0000: 6456: error : qemuMonitorIORead:586 : Unable to read from monitor: Connection reset by peer


The solution is to teach qemu_opt_foreach() to take a shortcut if
@opts is NULL.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---

Even after this patch I'm still unable to attach vhost-user:

{"id": "libvirt-20", "error": {"class": "GenericError", "desc": "chardev \"charnet2\" is not a unix socket"}}

But at least, qemu does not crash anymore.

 util/qemu-option.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Peter Maydell Aug. 16, 2016, 4:25 p.m. UTC | #1
On 16 August 2016 at 16:17, Michal Privoznik <mprivozn@redhat.com> wrote:
> The solution is to teach qemu_opt_foreach() to take a shortcut if
> @opts is NULL.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>
> Even after this patch I'm still unable to attach vhost-user:
>
> {"id": "libvirt-20", "error": {"class": "GenericError", "desc": "chardev \"charnet2\" is not a unix socket"}}
>
> But at least, qemu does not crash anymore.
>
>  util/qemu-option.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/util/qemu-option.c b/util/qemu-option.c
> index 3467dc2..78be7e1 100644
> --- a/util/qemu-option.c
> +++ b/util/qemu-option.c
> @@ -614,6 +614,11 @@ int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
>      QemuOpt *opt;
>      int rc;
>
> +    if (!opts) {
> +        /* Done, trivially. */
> +        return 0;
> +    }
> +
>      QTAILQ_FOREACH(opt, &opts->head, next) {
>          rc = func(opaque, opt->name, opt->str, errp);
>          if (rc) {
> --
> 2.8.4

This seems plausible, but I don't understand our option
code very well, and we seem to have a mix of "check for
NULL" and "caller had better not pass NULL" in the various
functions in util/qemu-option.c.

Markus: how is this supposed to work?

In any case something is clearly still busted in the
vhost-user code, because it's expecting to get a non-NULL
opts so it can properly parse the chardev, so that seems
like the thing we really need to fix.

thanks
-- PMM
Markus Armbruster Aug. 17, 2016, 7:26 a.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On 16 August 2016 at 16:17, Michal Privoznik <mprivozn@redhat.com> wrote:
>> The solution is to teach qemu_opt_foreach() to take a shortcut if
>> @opts is NULL.

Please provide a reproducer.  A stack backtrace wouldn't hurt.

>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>
>> Even after this patch I'm still unable to attach vhost-user:
>>
>> {"id": "libvirt-20", "error": {"class": "GenericError", "desc": "chardev \"charnet2\" is not a unix socket"}}
>>
>> But at least, qemu does not crash anymore.
>>
>>  util/qemu-option.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/util/qemu-option.c b/util/qemu-option.c
>> index 3467dc2..78be7e1 100644
>> --- a/util/qemu-option.c
>> +++ b/util/qemu-option.c
>> @@ -614,6 +614,11 @@ int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
>>      QemuOpt *opt;
>>      int rc;
>>
>> +    if (!opts) {
>> +        /* Done, trivially. */
>> +        return 0;
>> +    }
>> +
>>      QTAILQ_FOREACH(opt, &opts->head, next) {
>>          rc = func(opaque, opt->name, opt->str, errp);
>>          if (rc) {
>> --
>> 2.8.4
>
> This seems plausible, but I don't understand our option
> code very well, and we seem to have a mix of "check for
> NULL" and "caller had better not pass NULL" in the various
> functions in util/qemu-option.c.
>
> Markus: how is this supposed to work?

I wouldn't say this is "supposed to work" in some specific way.
"Happens to work" would be closer to the truth.

If you want me to interpret some sense into the mess after the fact,
here's my best guess: we generally require non-null opts, except for
qemu_opts_del() and the qemu_opt_get_FOO().

Makes obvious sense for qemu_opts_del(), since when a failing
constructor returns null, the destructor should accept null.

The qemu_opt_get_FOO() feel like a (possibly misguided) attempt at
convenience to me.

> In any case something is clearly still busted in the
> vhost-user code, because it's expecting to get a non-NULL
> opts so it can properly parse the chardev, so that seems
> like the thing we really need to fix.

Probably.  If I had a reproducer or at least a stack backtrace, I'd even
know where to look.
Marc-André Lureau Aug. 17, 2016, 7:35 a.m. UTC | #3
Hi

On Tue, Aug 16, 2016 at 7:18 PM Michal Privoznik <mprivozn@redhat.com>
wrote:

> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650
> <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at
> util/qemu-option.c:617
> 617         QTAILQ_FOREACH(opt, &opts->head, next) {
> [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
> (gdb) bt
> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650
> <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at
> util/qemu-option.c:617
> #1  0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260,
> errp=0x7ffc51368e48) at net/vhost-user.c:314
> #2  0x000055baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250,
> name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at
> net/vhost-user.c:360
> #3  0x000055baf6960216 in net_client_init1 (object=0x55baf8ff9250,
> is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051
> #4  0x000055baf6960518 in net_client_init (opts=0x55baf776e7e0,
> is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108
> #5  0x000055baf696083f in netdev_add (opts=0x55baf776e7e0,
> errp=0x7ffc51368f00) at net/net.c:1186
> #6  0x000055baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60,
> ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205
> #7  0x000055baf6622135 in handle_qmp_command (parser=0x55baf77fb590,
> tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978
> #8  0x000055baf6a9d099 in json_message_process_token
> (lexer=0x55baf77fb598, input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19)
> at qobject/json-streamer.c:105
> #9  0x000055baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598,
> ch=125 '}', flush=false) at qobject/json-lexer.c:319
> #10 0x000055baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598,
> buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369
> #11 0x000055baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590,
> buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at
> qobject/json-streamer.c:124
> #12 0x000055baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530,
> buf=0x7ffc51369170 "}R\204\367\272U", size=1) at
> /path/to/qemu.git/monitor.c:3994
> #13 0x000055baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40,
> buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387
> #14 0x000055baf6757076 in qemu_chr_be_write (s=0x55baf7610a40,
> buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399
> #15 0x000055baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN,
> opaque=0x55baf7610a40) at qemu-char.c:2927
> #16 0x000055baf6a5d655 in qio_channel_fd_source_dispatch
> (source=0x55baf7610df0, callback=0x55baf675b25a <tcp_chr_read>,
> user_data=0x55baf7610a40) at io/channel-watch.c:84
> #17 0x00007f1d3e80cbbd in g_main_context_dispatch () from
> /usr/lib64/libglib-2.0.so.0
> #18 0x000055baf69d3720 in glib_pollfds_poll () at main-loop.c:213
> #19 0x000055baf69d37fd in os_host_main_loop_wait (timeout=126000000) at
> main-loop.c:258
> #20 0x000055baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506
> #21 0x000055baf676587b in main_loop () at vl.c:1908
> #22 0x000055baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8,
> envp=0x7ffc5136a9f8) at vl.c:4604
> (gdb) p opts
> $1 = (QemuOpts *) 0x0
>
> The crash occurred when I tried to attach vhost-user net:
>
>
> {"execute":"chardev-add","arguments":{"id":"charnet2","backend":{"type":"socket","data":{"addr":{"type":"unix","data":{"path":"/var/run/openvswitch/vhost-user1"}},"wait":false,"server":false}}},"id":"libvirt-19"}
> {"return": {}, "id": "libvirt-19"}
>
> {"execute":"netdev_add","arguments":{"type":"vhost-user","chardev":"charnet2","id":"hostnet2"},"id":"libvirt-20"}
>
> 2016-08-16 14:52:18.274+0000: 6456: error : qemuMonitorIORead:586 : Unable
> to read from monitor: Connection reset by peer
>
>
fwiw, this crash can be reproduced with 2.5: it's not a regression. Imho,
it shouldn't prevent or delay 2.7.


>
> The solution is to teach qemu_opt_foreach() to take a shortcut if
> @opts is NULL.
>
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>
> Even after this patch I'm still unable to attach vhost-user:
>
> {"id": "libvirt-20", "error": {"class": "GenericError", "desc": "chardev
> \"charnet2\" is not a unix socket"}}
>
> But at least, qemu does not crash anymore.
>
>  util/qemu-option.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/util/qemu-option.c b/util/qemu-option.c
> index 3467dc2..78be7e1 100644
> --- a/util/qemu-option.c
> +++ b/util/qemu-option.c
> @@ -614,6 +614,11 @@ int qemu_opt_foreach(QemuOpts *opts,
> qemu_opt_loopfunc func, void *opaque,
>      QemuOpt *opt;
>      int rc;
>
> +    if (!opts) {
> +        /* Done, trivially. */
> +        return 0;
> +    }
> +
>      QTAILQ_FOREACH(opt, &opts->head, next) {
>          rc = func(opaque, opt->name, opt->str, errp);
>          if (rc) {
> --
> 2.8.4
>
>
> --
Marc-André Lureau
Markus Armbruster Aug. 17, 2016, 8:03 a.m. UTC | #4
Michal Privoznik <mprivozn@redhat.com> writes:

> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
> 617         QTAILQ_FOREACH(opt, &opts->head, next) {
> [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
> (gdb) bt
> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
> #1  0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314

This is where the null opts come from:

    CharDriverState *chr = qemu_chr_find(opts->chardev);
    VhostUserChardevProps props;

    if (chr == NULL) {
        error_setg(errp, "chardev \"%s\" not found", opts->chardev);
        return NULL;
    }

    /* inspect chardev opts */
    memset(&props, 0, sizeof(props));
    if (qemu_opt_foreach(chr->opts, net_vhost_chardev_opts, &props, errp)) {
        return NULL;
    }

Can CharDriverState member opts be legitimately null?  If yes, then its
definition needs a comment.  But I suspect the answer is no.

[...]
Markus Armbruster Aug. 17, 2016, 8:11 a.m. UTC | #5
Markus Armbruster <armbru@redhat.com> writes:

> Michal Privoznik <mprivozn@redhat.com> writes:
>
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
>> 617         QTAILQ_FOREACH(opt, &opts->head, next) {
>> [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
>> (gdb) bt
>> #0  0x000055baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 <net_vhost_chardev_opts>, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at util/qemu-option.c:617
>> #1  0x000055baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, errp=0x7ffc51368e48) at net/vhost-user.c:314
>
> This is where the null opts come from:
>
>     CharDriverState *chr = qemu_chr_find(opts->chardev);
>     VhostUserChardevProps props;
>
>     if (chr == NULL) {
>         error_setg(errp, "chardev \"%s\" not found", opts->chardev);
>         return NULL;
>     }
>
>     /* inspect chardev opts */
>     memset(&props, 0, sizeof(props));
>     if (qemu_opt_foreach(chr->opts, net_vhost_chardev_opts, &props, errp)) {
>         return NULL;
>     }
>
> Can CharDriverState member opts be legitimately null?  If yes, then its
> definition needs a comment.  But I suspect the answer is no.
>
> [...]

Looks like the answer is yes: CharDriverState member opts is non-null
when the CharDriverState got created the old-fashioned way, via
qemu_chr_new_from_opts(), and null when it got created by
qmp_chardev_add().  We need to check all uses of member opts.  We should
also document this with a comment next to the definition of
CharDriverState member opts.
Peter Maydell Aug. 18, 2016, 9:57 a.m. UTC | #6
On 17 August 2016 at 08:35, Marc-André Lureau
<marcandre.lureau@gmail.com> wrote:
> fwiw, this crash can be reproduced with 2.5: it's not a regression. Imho, it
> shouldn't prevent or delay 2.7.

Thanks for checking. OK, let's not try to squeeze a fix into 2.7;
better to investigate more carefully and try to line up our
QemuOpts semantics into something that makes sense for 2.8 :-)

thanks
-- PMM
diff mbox

Patch

diff --git a/util/qemu-option.c b/util/qemu-option.c
index 3467dc2..78be7e1 100644
--- a/util/qemu-option.c
+++ b/util/qemu-option.c
@@ -614,6 +614,11 @@  int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque,
     QemuOpt *opt;
     int rc;
 
+    if (!opts) {
+        /* Done, trivially. */
+        return 0;
+    }
+
     QTAILQ_FOREACH(opt, &opts->head, next) {
         rc = func(opaque, opt->name, opt->str, errp);
         if (rc) {