diff mbox series

[v4,for-4.0,1/7] char-socket: Enable "nowait" option on client sockets

Message ID 20190109112728.9214-2-xieyongji@baidu.com (mailing list archive)
State New, archived
Headers show
Series vhost-user-blk: Add support for backend reconnecting | expand

Commit Message

Yongji Xie Jan. 9, 2019, 11:27 a.m. UTC
From: Xie Yongji <xieyongji@baidu.com>

Enable "nowait" option to make QEMU not do a connect
on client sockets during initialization of the chardev.
Then we can use qemu_chr_fe_wait_connected() to connect
when necessary. Now it would be used for unix domain
socket of vhost-user-blk device to support reconnect.

Signed-off-by: Xie Yongji <xieyongji@baidu.com>
Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
---
 chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
 qapi/char.json        |  3 +--
 qemu-options.hx       |  9 ++++---
 3 files changed, 35 insertions(+), 33 deletions(-)

Comments

Daniel P. Berrangé Jan. 10, 2019, 12:49 p.m. UTC | #1
On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> From: Xie Yongji <xieyongji@baidu.com>
> 
> Enable "nowait" option to make QEMU not do a connect
> on client sockets during initialization of the chardev.
> Then we can use qemu_chr_fe_wait_connected() to connect
> when necessary. Now it would be used for unix domain
> socket of vhost-user-blk device to support reconnect.
> 
> Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> ---
>  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
>  qapi/char.json        |  3 +--
>  qemu-options.hx       |  9 ++++---
>  3 files changed, 35 insertions(+), 33 deletions(-)
> 
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index eaa8e8b68f..f803f4f7d3 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
>          s->reconnect_time = reconnect;
>      }
>  
> -    if (s->reconnect_time) {
> -        tcp_chr_connect_async(chr);
> -    } else {
> -        if (s->is_listen) {
> -            char *name;
> -            s->listener = qio_net_listener_new();
> +    if (s->is_listen) {
> +        char *name;
> +        s->listener = qio_net_listener_new();
>  
> -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> -            qio_net_listener_set_name(s->listener, name);
> -            g_free(name);
> +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> +        qio_net_listener_set_name(s->listener, name);
> +        g_free(name);
>  
> -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> -                object_unref(OBJECT(s->listener));
> -                s->listener = NULL;
> -                goto error;
> -            }
> +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> +            object_unref(OBJECT(s->listener));
> +            s->listener = NULL;
> +            goto error;
> +        }
>  
> -            qapi_free_SocketAddress(s->addr);
> -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> -            update_disconnected_filename(s);
> +        qapi_free_SocketAddress(s->addr);
> +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> +        update_disconnected_filename(s);
>  
> -            if (is_waitconnect &&
> -                qemu_chr_wait_connected(chr, errp) < 0) {
> -                return;
> -            }
> -            if (!s->ioc) {
> -                qio_net_listener_set_client_func_full(s->listener,
> -                                                      tcp_chr_accept,
> -                                                      chr, NULL,
> -                                                      chr->gcontext);
> -            }
> +        if (is_waitconnect &&
> +            qemu_chr_wait_connected(chr, errp) < 0) {
> +            return;
> +        }
> +        if (!s->ioc) {
> +            qio_net_listener_set_client_func_full(s->listener,
> +                                                  tcp_chr_accept,
> +                                                  chr, NULL,
> +                                                  chr->gcontext);
> +        }
> +    } else if (is_waitconnect) {
> +        if (s->reconnect_time) {
> +            tcp_chr_connect_async(chr);
>          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
>              goto error;
>          }

This skips everything when 'is_waitconnect' is false.

This combines with a bug in tests/libqtest.c which adds the 'nowait'
flag to the -chardevs it cteates. This mistake was previously ignored
because the chardevs were socket clients, but now we honour it.

We shoul remove 'nowait' from the qtest chardevs, but separately
from that this code should also still attempt a non-blocking
connect when is_waitconnect is false.

ie

    } else if (is_waitconnect) {
        if (s->reconnect_time || !is_waitconnect) {
            tcp_chr_connect_async(chr);
        } else if (qemu_chr_wait_connected(chr, errp) < 0) {
            goto error;
        }
    }


Regards,
Daniel
Yongji Xie Jan. 10, 2019, 1:19 p.m. UTC | #2
On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > From: Xie Yongji <xieyongji@baidu.com>
> >
> > Enable "nowait" option to make QEMU not do a connect
> > on client sockets during initialization of the chardev.
> > Then we can use qemu_chr_fe_wait_connected() to connect
> > when necessary. Now it would be used for unix domain
> > socket of vhost-user-blk device to support reconnect.
> >
> > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > ---
> >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> >  qapi/char.json        |  3 +--
> >  qemu-options.hx       |  9 ++++---
> >  3 files changed, 35 insertions(+), 33 deletions(-)
> >
> > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > index eaa8e8b68f..f803f4f7d3 100644
> > --- a/chardev/char-socket.c
> > +++ b/chardev/char-socket.c
> > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> >          s->reconnect_time = reconnect;
> >      }
> >
> > -    if (s->reconnect_time) {
> > -        tcp_chr_connect_async(chr);
> > -    } else {
> > -        if (s->is_listen) {
> > -            char *name;
> > -            s->listener = qio_net_listener_new();
> > +    if (s->is_listen) {
> > +        char *name;
> > +        s->listener = qio_net_listener_new();
> >
> > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > -            qio_net_listener_set_name(s->listener, name);
> > -            g_free(name);
> > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > +        qio_net_listener_set_name(s->listener, name);
> > +        g_free(name);
> >
> > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > -                object_unref(OBJECT(s->listener));
> > -                s->listener = NULL;
> > -                goto error;
> > -            }
> > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > +            object_unref(OBJECT(s->listener));
> > +            s->listener = NULL;
> > +            goto error;
> > +        }
> >
> > -            qapi_free_SocketAddress(s->addr);
> > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > -            update_disconnected_filename(s);
> > +        qapi_free_SocketAddress(s->addr);
> > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > +        update_disconnected_filename(s);
> >
> > -            if (is_waitconnect &&
> > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > -                return;
> > -            }
> > -            if (!s->ioc) {
> > -                qio_net_listener_set_client_func_full(s->listener,
> > -                                                      tcp_chr_accept,
> > -                                                      chr, NULL,
> > -                                                      chr->gcontext);
> > -            }
> > +        if (is_waitconnect &&
> > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > +            return;
> > +        }
> > +        if (!s->ioc) {
> > +            qio_net_listener_set_client_func_full(s->listener,
> > +                                                  tcp_chr_accept,
> > +                                                  chr, NULL,
> > +                                                  chr->gcontext);
> > +        }
> > +    } else if (is_waitconnect) {
> > +        if (s->reconnect_time) {
> > +            tcp_chr_connect_async(chr);
> >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> >              goto error;
> >          }
>
> This skips everything when 'is_waitconnect' is false.
>
> This combines with a bug in tests/libqtest.c which adds the 'nowait'
> flag to the -chardevs it cteates. This mistake was previously ignored
> because the chardevs were socket clients, but now we honour it.
>
> We shoul remove 'nowait' from the qtest chardevs, but separately
> from that this code should also still attempt a non-blocking
> connect when is_waitconnect is false.
>

Do you mean we still need to connect server in background with
"nowait" option? But my purpose is not to connect server until we
manually call qemu_chr_fe_wait_connected() in other place.

> ie
>
>     } else if (is_waitconnect) {
>         if (s->reconnect_time || !is_waitconnect) {

"!is_waitconnect" should always be false here?

Thanks,
Yongji
Daniel P. Berrangé Jan. 10, 2019, 1:24 p.m. UTC | #3
On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > From: Xie Yongji <xieyongji@baidu.com>
> > >
> > > Enable "nowait" option to make QEMU not do a connect
> > > on client sockets during initialization of the chardev.
> > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > when necessary. Now it would be used for unix domain
> > > socket of vhost-user-blk device to support reconnect.
> > >
> > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > ---
> > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > >  qapi/char.json        |  3 +--
> > >  qemu-options.hx       |  9 ++++---
> > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > >
> > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > index eaa8e8b68f..f803f4f7d3 100644
> > > --- a/chardev/char-socket.c
> > > +++ b/chardev/char-socket.c
> > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > >          s->reconnect_time = reconnect;
> > >      }
> > >
> > > -    if (s->reconnect_time) {
> > > -        tcp_chr_connect_async(chr);
> > > -    } else {
> > > -        if (s->is_listen) {
> > > -            char *name;
> > > -            s->listener = qio_net_listener_new();
> > > +    if (s->is_listen) {
> > > +        char *name;
> > > +        s->listener = qio_net_listener_new();
> > >
> > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > -            qio_net_listener_set_name(s->listener, name);
> > > -            g_free(name);
> > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > +        qio_net_listener_set_name(s->listener, name);
> > > +        g_free(name);
> > >
> > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > -                object_unref(OBJECT(s->listener));
> > > -                s->listener = NULL;
> > > -                goto error;
> > > -            }
> > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > +            object_unref(OBJECT(s->listener));
> > > +            s->listener = NULL;
> > > +            goto error;
> > > +        }
> > >
> > > -            qapi_free_SocketAddress(s->addr);
> > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > -            update_disconnected_filename(s);
> > > +        qapi_free_SocketAddress(s->addr);
> > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > +        update_disconnected_filename(s);
> > >
> > > -            if (is_waitconnect &&
> > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > -                return;
> > > -            }
> > > -            if (!s->ioc) {
> > > -                qio_net_listener_set_client_func_full(s->listener,
> > > -                                                      tcp_chr_accept,
> > > -                                                      chr, NULL,
> > > -                                                      chr->gcontext);
> > > -            }
> > > +        if (is_waitconnect &&
> > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > +            return;
> > > +        }
> > > +        if (!s->ioc) {
> > > +            qio_net_listener_set_client_func_full(s->listener,
> > > +                                                  tcp_chr_accept,
> > > +                                                  chr, NULL,
> > > +                                                  chr->gcontext);
> > > +        }
> > > +    } else if (is_waitconnect) {
> > > +        if (s->reconnect_time) {
> > > +            tcp_chr_connect_async(chr);
> > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > >              goto error;
> > >          }
> >
> > This skips everything when 'is_waitconnect' is false.
> >
> > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > flag to the -chardevs it cteates. This mistake was previously ignored
> > because the chardevs were socket clients, but now we honour it.
> >
> > We shoul remove 'nowait' from the qtest chardevs, but separately
> > from that this code should also still attempt a non-blocking
> > connect when is_waitconnect is false.
> >
> 
> Do you mean we still need to connect server in background with
> "nowait" option? But my purpose is not to connect server until we
> manually call qemu_chr_fe_wait_connected() in other place.

I don't see a need to delay the connect. We can start a
background connect right away. The later code you have
merely needs to wait for that background connect  to
finish, which qemu_chr_fe_wait_connected still accomplishes.
This keeps the chardev code clearer only having 2 distinct
code paths to worry about - blocking or non-blocking connect.

> 
> > ie
> >
> >     } else if (is_waitconnect) {
> >         if (s->reconnect_time || !is_waitconnect) {
> 
> "!is_waitconnect" should always be false here?

Opps, I meant

     } else  {
         if (s->reconnect_time || !is_waitconnect) {

Regards,
Daniel
Yongji Xie Jan. 10, 2019, 2:08 p.m. UTC | #4
On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > From: Xie Yongji <xieyongji@baidu.com>
> > > >
> > > > Enable "nowait" option to make QEMU not do a connect
> > > > on client sockets during initialization of the chardev.
> > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > when necessary. Now it would be used for unix domain
> > > > socket of vhost-user-blk device to support reconnect.
> > > >
> > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > ---
> > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > >  qapi/char.json        |  3 +--
> > > >  qemu-options.hx       |  9 ++++---
> > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > >
> > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > --- a/chardev/char-socket.c
> > > > +++ b/chardev/char-socket.c
> > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > >          s->reconnect_time = reconnect;
> > > >      }
> > > >
> > > > -    if (s->reconnect_time) {
> > > > -        tcp_chr_connect_async(chr);
> > > > -    } else {
> > > > -        if (s->is_listen) {
> > > > -            char *name;
> > > > -            s->listener = qio_net_listener_new();
> > > > +    if (s->is_listen) {
> > > > +        char *name;
> > > > +        s->listener = qio_net_listener_new();
> > > >
> > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > -            qio_net_listener_set_name(s->listener, name);
> > > > -            g_free(name);
> > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > +        qio_net_listener_set_name(s->listener, name);
> > > > +        g_free(name);
> > > >
> > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > -                object_unref(OBJECT(s->listener));
> > > > -                s->listener = NULL;
> > > > -                goto error;
> > > > -            }
> > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > +            object_unref(OBJECT(s->listener));
> > > > +            s->listener = NULL;
> > > > +            goto error;
> > > > +        }
> > > >
> > > > -            qapi_free_SocketAddress(s->addr);
> > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > -            update_disconnected_filename(s);
> > > > +        qapi_free_SocketAddress(s->addr);
> > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > +        update_disconnected_filename(s);
> > > >
> > > > -            if (is_waitconnect &&
> > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > -                return;
> > > > -            }
> > > > -            if (!s->ioc) {
> > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > -                                                      tcp_chr_accept,
> > > > -                                                      chr, NULL,
> > > > -                                                      chr->gcontext);
> > > > -            }
> > > > +        if (is_waitconnect &&
> > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > +            return;
> > > > +        }
> > > > +        if (!s->ioc) {
> > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > +                                                  tcp_chr_accept,
> > > > +                                                  chr, NULL,
> > > > +                                                  chr->gcontext);
> > > > +        }
> > > > +    } else if (is_waitconnect) {
> > > > +        if (s->reconnect_time) {
> > > > +            tcp_chr_connect_async(chr);
> > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > >              goto error;
> > > >          }
> > >
> > > This skips everything when 'is_waitconnect' is false.
> > >
> > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > because the chardevs were socket clients, but now we honour it.
> > >
> > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > from that this code should also still attempt a non-blocking
> > > connect when is_waitconnect is false.
> > >
> >
> > Do you mean we still need to connect server in background with
> > "nowait" option? But my purpose is not to connect server until we
> > manually call qemu_chr_fe_wait_connected() in other place.
>
> I don't see a need to delay the connect. We can start a
> background connect right away. The later code you have
> merely needs to wait for that background connect  to
> finish, which qemu_chr_fe_wait_connected still accomplishes.
> This keeps the chardev code clearer only having 2 distinct
> code paths to worry about - blocking or non-blocking connect.
>

Now the problem is that we have a server that only accept one
connection. And we want to read something from it during device
initializtion.

If background connect it before we call qemu_chr_fe_wait_connected()
during device initializtion, qemu_chr_fe_wait_connected() will
accomplish but we can't read anything. And we have no way to release
the background connection. So what I want to do in this patch is to
disable background connect.

Thanks,
Yongji
Daniel P. Berrangé Jan. 10, 2019, 2:11 p.m. UTC | #5
On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > >
> > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > on client sockets during initialization of the chardev.
> > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > when necessary. Now it would be used for unix domain
> > > > > socket of vhost-user-blk device to support reconnect.
> > > > >
> > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > ---
> > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > >  qapi/char.json        |  3 +--
> > > > >  qemu-options.hx       |  9 ++++---
> > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > >
> > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > --- a/chardev/char-socket.c
> > > > > +++ b/chardev/char-socket.c
> > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > >          s->reconnect_time = reconnect;
> > > > >      }
> > > > >
> > > > > -    if (s->reconnect_time) {
> > > > > -        tcp_chr_connect_async(chr);
> > > > > -    } else {
> > > > > -        if (s->is_listen) {
> > > > > -            char *name;
> > > > > -            s->listener = qio_net_listener_new();
> > > > > +    if (s->is_listen) {
> > > > > +        char *name;
> > > > > +        s->listener = qio_net_listener_new();
> > > > >
> > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > -            g_free(name);
> > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > +        g_free(name);
> > > > >
> > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > -                object_unref(OBJECT(s->listener));
> > > > > -                s->listener = NULL;
> > > > > -                goto error;
> > > > > -            }
> > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > +            object_unref(OBJECT(s->listener));
> > > > > +            s->listener = NULL;
> > > > > +            goto error;
> > > > > +        }
> > > > >
> > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > -            update_disconnected_filename(s);
> > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > +        update_disconnected_filename(s);
> > > > >
> > > > > -            if (is_waitconnect &&
> > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > -                return;
> > > > > -            }
> > > > > -            if (!s->ioc) {
> > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > -                                                      tcp_chr_accept,
> > > > > -                                                      chr, NULL,
> > > > > -                                                      chr->gcontext);
> > > > > -            }
> > > > > +        if (is_waitconnect &&
> > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > +            return;
> > > > > +        }
> > > > > +        if (!s->ioc) {
> > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > +                                                  tcp_chr_accept,
> > > > > +                                                  chr, NULL,
> > > > > +                                                  chr->gcontext);
> > > > > +        }
> > > > > +    } else if (is_waitconnect) {
> > > > > +        if (s->reconnect_time) {
> > > > > +            tcp_chr_connect_async(chr);
> > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > >              goto error;
> > > > >          }
> > > >
> > > > This skips everything when 'is_waitconnect' is false.
> > > >
> > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > because the chardevs were socket clients, but now we honour it.
> > > >
> > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > from that this code should also still attempt a non-blocking
> > > > connect when is_waitconnect is false.
> > > >
> > >
> > > Do you mean we still need to connect server in background with
> > > "nowait" option? But my purpose is not to connect server until we
> > > manually call qemu_chr_fe_wait_connected() in other place.
> >
> > I don't see a need to delay the connect. We can start a
> > background connect right away. The later code you have
> > merely needs to wait for that background connect  to
> > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > This keeps the chardev code clearer only having 2 distinct
> > code paths to worry about - blocking or non-blocking connect.
> >
> 
> Now the problem is that we have a server that only accept one
> connection. And we want to read something from it during device
> initializtion.
> 
> If background connect it before we call qemu_chr_fe_wait_connected()
> during device initializtion, qemu_chr_fe_wait_connected() will
> accomplish but we can't read anything. And we have no way to release
> the background connection. So what I want to do in this patch is to
> disable background connect.

I'm not seeing the problem here. What I proposed results in 

  1. chardev starts connect()
  2. vhost backend waits for connect() to complete
  3. vhost backend reads from chardev

vs the flow

  1. vhost backend starts connect()
  2. vhost backend waits for connect() to complete
  3. vhost backend reads from chardev

in both cases there's only a single connection established to the
server

Regards,
Daniel
Yongji Xie Jan. 10, 2019, 2:29 p.m. UTC | #6
On Thu, 10 Jan 2019 at 22:11, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> > On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > > >
> > > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > > on client sockets during initialization of the chardev.
> > > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > > when necessary. Now it would be used for unix domain
> > > > > > socket of vhost-user-blk device to support reconnect.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > > ---
> > > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > > >  qapi/char.json        |  3 +--
> > > > > >  qemu-options.hx       |  9 ++++---
> > > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > > >
> > > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > > --- a/chardev/char-socket.c
> > > > > > +++ b/chardev/char-socket.c
> > > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > > >          s->reconnect_time = reconnect;
> > > > > >      }
> > > > > >
> > > > > > -    if (s->reconnect_time) {
> > > > > > -        tcp_chr_connect_async(chr);
> > > > > > -    } else {
> > > > > > -        if (s->is_listen) {
> > > > > > -            char *name;
> > > > > > -            s->listener = qio_net_listener_new();
> > > > > > +    if (s->is_listen) {
> > > > > > +        char *name;
> > > > > > +        s->listener = qio_net_listener_new();
> > > > > >
> > > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > > -            g_free(name);
> > > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > > +        g_free(name);
> > > > > >
> > > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > -                object_unref(OBJECT(s->listener));
> > > > > > -                s->listener = NULL;
> > > > > > -                goto error;
> > > > > > -            }
> > > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > +            object_unref(OBJECT(s->listener));
> > > > > > +            s->listener = NULL;
> > > > > > +            goto error;
> > > > > > +        }
> > > > > >
> > > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > -            update_disconnected_filename(s);
> > > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > +        update_disconnected_filename(s);
> > > > > >
> > > > > > -            if (is_waitconnect &&
> > > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > -                return;
> > > > > > -            }
> > > > > > -            if (!s->ioc) {
> > > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > > -                                                      tcp_chr_accept,
> > > > > > -                                                      chr, NULL,
> > > > > > -                                                      chr->gcontext);
> > > > > > -            }
> > > > > > +        if (is_waitconnect &&
> > > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > +            return;
> > > > > > +        }
> > > > > > +        if (!s->ioc) {
> > > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > > +                                                  tcp_chr_accept,
> > > > > > +                                                  chr, NULL,
> > > > > > +                                                  chr->gcontext);
> > > > > > +        }
> > > > > > +    } else if (is_waitconnect) {
> > > > > > +        if (s->reconnect_time) {
> > > > > > +            tcp_chr_connect_async(chr);
> > > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > >              goto error;
> > > > > >          }
> > > > >
> > > > > This skips everything when 'is_waitconnect' is false.
> > > > >
> > > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > > because the chardevs were socket clients, but now we honour it.
> > > > >
> > > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > > from that this code should also still attempt a non-blocking
> > > > > connect when is_waitconnect is false.
> > > > >
> > > >
> > > > Do you mean we still need to connect server in background with
> > > > "nowait" option? But my purpose is not to connect server until we
> > > > manually call qemu_chr_fe_wait_connected() in other place.
> > >
> > > I don't see a need to delay the connect. We can start a
> > > background connect right away. The later code you have
> > > merely needs to wait for that background connect  to
> > > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > > This keeps the chardev code clearer only having 2 distinct
> > > code paths to worry about - blocking or non-blocking connect.
> > >
> >
> > Now the problem is that we have a server that only accept one
> > connection. And we want to read something from it during device
> > initializtion.
> >
> > If background connect it before we call qemu_chr_fe_wait_connected()
> > during device initializtion, qemu_chr_fe_wait_connected() will
> > accomplish but we can't read anything. And we have no way to release
> > the background connection. So what I want to do in this patch is to
> > disable background connect.
>
> I'm not seeing the problem here. What I proposed results in
>
>   1. chardev starts connect()

This should be asynchronous with "reconnect" option. Another thread
may connect before vhost backend?

>   2. vhost backend waits for connect() to complete

Sorry, I'm not sure I get your point here. Do you mean vhost backend
call qemu_chr_fe_wait_connected()? Seems like
qemu_chr_fe_wait_connected() will connect directly rather than wait
for connect() to complete?

Thanks,
Yongji
Daniel P. Berrangé Jan. 10, 2019, 4:41 p.m. UTC | #7
On Thu, Jan 10, 2019 at 10:29:20PM +0800, Yongji Xie wrote:
> On Thu, 10 Jan 2019 at 22:11, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> > > On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > > > >
> > > > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > > > on client sockets during initialization of the chardev.
> > > > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > > > when necessary. Now it would be used for unix domain
> > > > > > > socket of vhost-user-blk device to support reconnect.
> > > > > > >
> > > > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > > > ---
> > > > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > > > >  qapi/char.json        |  3 +--
> > > > > > >  qemu-options.hx       |  9 ++++---
> > > > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > > > >
> > > > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > > > --- a/chardev/char-socket.c
> > > > > > > +++ b/chardev/char-socket.c
> > > > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > > > >          s->reconnect_time = reconnect;
> > > > > > >      }
> > > > > > >
> > > > > > > -    if (s->reconnect_time) {
> > > > > > > -        tcp_chr_connect_async(chr);
> > > > > > > -    } else {
> > > > > > > -        if (s->is_listen) {
> > > > > > > -            char *name;
> > > > > > > -            s->listener = qio_net_listener_new();
> > > > > > > +    if (s->is_listen) {
> > > > > > > +        char *name;
> > > > > > > +        s->listener = qio_net_listener_new();
> > > > > > >
> > > > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > > > -            g_free(name);
> > > > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > > > +        g_free(name);
> > > > > > >
> > > > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > -                object_unref(OBJECT(s->listener));
> > > > > > > -                s->listener = NULL;
> > > > > > > -                goto error;
> > > > > > > -            }
> > > > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > +            object_unref(OBJECT(s->listener));
> > > > > > > +            s->listener = NULL;
> > > > > > > +            goto error;
> > > > > > > +        }
> > > > > > >
> > > > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > -            update_disconnected_filename(s);
> > > > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > +        update_disconnected_filename(s);
> > > > > > >
> > > > > > > -            if (is_waitconnect &&
> > > > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > -                return;
> > > > > > > -            }
> > > > > > > -            if (!s->ioc) {
> > > > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > > > -                                                      tcp_chr_accept,
> > > > > > > -                                                      chr, NULL,
> > > > > > > -                                                      chr->gcontext);
> > > > > > > -            }
> > > > > > > +        if (is_waitconnect &&
> > > > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > +            return;
> > > > > > > +        }
> > > > > > > +        if (!s->ioc) {
> > > > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > > > +                                                  tcp_chr_accept,
> > > > > > > +                                                  chr, NULL,
> > > > > > > +                                                  chr->gcontext);
> > > > > > > +        }
> > > > > > > +    } else if (is_waitconnect) {
> > > > > > > +        if (s->reconnect_time) {
> > > > > > > +            tcp_chr_connect_async(chr);
> > > > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > >              goto error;
> > > > > > >          }
> > > > > >
> > > > > > This skips everything when 'is_waitconnect' is false.
> > > > > >
> > > > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > > > because the chardevs were socket clients, but now we honour it.
> > > > > >
> > > > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > > > from that this code should also still attempt a non-blocking
> > > > > > connect when is_waitconnect is false.
> > > > > >
> > > > >
> > > > > Do you mean we still need to connect server in background with
> > > > > "nowait" option? But my purpose is not to connect server until we
> > > > > manually call qemu_chr_fe_wait_connected() in other place.
> > > >
> > > > I don't see a need to delay the connect. We can start a
> > > > background connect right away. The later code you have
> > > > merely needs to wait for that background connect  to
> > > > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > > > This keeps the chardev code clearer only having 2 distinct
> > > > code paths to worry about - blocking or non-blocking connect.
> > > >
> > >
> > > Now the problem is that we have a server that only accept one
> > > connection. And we want to read something from it during device
> > > initializtion.
> > >
> > > If background connect it before we call qemu_chr_fe_wait_connected()
> > > during device initializtion, qemu_chr_fe_wait_connected() will
> > > accomplish but we can't read anything. And we have no way to release
> > > the background connection. So what I want to do in this patch is to
> > > disable background connect.
> >
> > I'm not seeing the problem here. What I proposed results in
> >
> >   1. chardev starts connect()
> 
> This should be asynchronous with "reconnect" option. Another thread
> may connect before vhost backend?
> 
> >   2. vhost backend waits for connect() to complete
> 
> Sorry, I'm not sure I get your point here. Do you mean vhost backend
> call qemu_chr_fe_wait_connected()? Seems like
> qemu_chr_fe_wait_connected() will connect directly rather than wait
> for connect() to complete?

Ahhhh, yes, you are right.

qemu_chr_fe_wait_connected will potentially cause a second connection to
be established

Looking at it the qemu_chr_fe_wait_connected() I believe it is seriously
broken even before this patch series.

The intended usage is that a device can all qemu_chr_fe_wait_connected
to wait for a new connection to be established, and then do I/O on the
chardev.  This does not in fact work if TLS, websock or telnet modes
are enabled for the socket, due to a mistake introduced when we previously
tried to fix this:

  commit 1dc8a6695c731abb7461c637b2512c3670d82be4
  Author: Marc-André Lureau <marcandre.lureau@redhat.com>
  Date:   Tue Aug 16 12:33:32 2016 +0400

    char: fix waiting for TLS and telnet connection

That commit fixed the problem where we continued to accept() new sockets
when TLS/telnet was enabled, because the 's->connected' flag isn't set
immediately.

Unfortunately what this means is that when qemu_chr_fe_wait_connected
returns, the chardev is *not* ready to read/write data. The TLS/telnet
handshake has not been run, and is still pending in the background.

So we'll end up with device backend trying todo I/O on the chardev
at the same time as it is trying todo the TLS/telnet handshake.

We need to fix qemu_chr_fe_wait_connected so that it does explicit
synchronization wrt to any ongoing background connection process.
It must only return once all TLS/telnet/websock handshakes have
completed.  If we fix that correctly, then I believe it will  also
solve the problem you're trying to address.

Regards,
Daniel
Yongji Xie Jan. 11, 2019, 7:50 a.m. UTC | #8
On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Jan 10, 2019 at 10:29:20PM +0800, Yongji Xie wrote:
> > On Thu, 10 Jan 2019 at 22:11, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> > > > On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > > > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > > > > >
> > > > > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > > > > on client sockets during initialization of the chardev.
> > > > > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > > > > when necessary. Now it would be used for unix domain
> > > > > > > > socket of vhost-user-blk device to support reconnect.
> > > > > > > >
> > > > > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > > > > ---
> > > > > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > > > > >  qapi/char.json        |  3 +--
> > > > > > > >  qemu-options.hx       |  9 ++++---
> > > > > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > > > > --- a/chardev/char-socket.c
> > > > > > > > +++ b/chardev/char-socket.c
> > > > > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > > > > >          s->reconnect_time = reconnect;
> > > > > > > >      }
> > > > > > > >
> > > > > > > > -    if (s->reconnect_time) {
> > > > > > > > -        tcp_chr_connect_async(chr);
> > > > > > > > -    } else {
> > > > > > > > -        if (s->is_listen) {
> > > > > > > > -            char *name;
> > > > > > > > -            s->listener = qio_net_listener_new();
> > > > > > > > +    if (s->is_listen) {
> > > > > > > > +        char *name;
> > > > > > > > +        s->listener = qio_net_listener_new();
> > > > > > > >
> > > > > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > > > > -            g_free(name);
> > > > > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > > > > +        g_free(name);
> > > > > > > >
> > > > > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > -                object_unref(OBJECT(s->listener));
> > > > > > > > -                s->listener = NULL;
> > > > > > > > -                goto error;
> > > > > > > > -            }
> > > > > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > +            object_unref(OBJECT(s->listener));
> > > > > > > > +            s->listener = NULL;
> > > > > > > > +            goto error;
> > > > > > > > +        }
> > > > > > > >
> > > > > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > -            update_disconnected_filename(s);
> > > > > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > +        update_disconnected_filename(s);
> > > > > > > >
> > > > > > > > -            if (is_waitconnect &&
> > > > > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > -                return;
> > > > > > > > -            }
> > > > > > > > -            if (!s->ioc) {
> > > > > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > -                                                      tcp_chr_accept,
> > > > > > > > -                                                      chr, NULL,
> > > > > > > > -                                                      chr->gcontext);
> > > > > > > > -            }
> > > > > > > > +        if (is_waitconnect &&
> > > > > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > +            return;
> > > > > > > > +        }
> > > > > > > > +        if (!s->ioc) {
> > > > > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > +                                                  tcp_chr_accept,
> > > > > > > > +                                                  chr, NULL,
> > > > > > > > +                                                  chr->gcontext);
> > > > > > > > +        }
> > > > > > > > +    } else if (is_waitconnect) {
> > > > > > > > +        if (s->reconnect_time) {
> > > > > > > > +            tcp_chr_connect_async(chr);
> > > > > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > >              goto error;
> > > > > > > >          }
> > > > > > >
> > > > > > > This skips everything when 'is_waitconnect' is false.
> > > > > > >
> > > > > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > > > > because the chardevs were socket clients, but now we honour it.
> > > > > > >
> > > > > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > > > > from that this code should also still attempt a non-blocking
> > > > > > > connect when is_waitconnect is false.
> > > > > > >
> > > > > >
> > > > > > Do you mean we still need to connect server in background with
> > > > > > "nowait" option? But my purpose is not to connect server until we
> > > > > > manually call qemu_chr_fe_wait_connected() in other place.
> > > > >
> > > > > I don't see a need to delay the connect. We can start a
> > > > > background connect right away. The later code you have
> > > > > merely needs to wait for that background connect  to
> > > > > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > > > > This keeps the chardev code clearer only having 2 distinct
> > > > > code paths to worry about - blocking or non-blocking connect.
> > > > >
> > > >
> > > > Now the problem is that we have a server that only accept one
> > > > connection. And we want to read something from it during device
> > > > initializtion.
> > > >
> > > > If background connect it before we call qemu_chr_fe_wait_connected()
> > > > during device initializtion, qemu_chr_fe_wait_connected() will
> > > > accomplish but we can't read anything. And we have no way to release
> > > > the background connection. So what I want to do in this patch is to
> > > > disable background connect.
> > >
> > > I'm not seeing the problem here. What I proposed results in
> > >
> > >   1. chardev starts connect()
> >
> > This should be asynchronous with "reconnect" option. Another thread
> > may connect before vhost backend?
> >
> > >   2. vhost backend waits for connect() to complete
> >
> > Sorry, I'm not sure I get your point here. Do you mean vhost backend
> > call qemu_chr_fe_wait_connected()? Seems like
> > qemu_chr_fe_wait_connected() will connect directly rather than wait
> > for connect() to complete?
>
> Ahhhh, yes, you are right.
>
> qemu_chr_fe_wait_connected will potentially cause a second connection to
> be established
>
> Looking at it the qemu_chr_fe_wait_connected() I believe it is seriously
> broken even before this patch series.
>
> The intended usage is that a device can all qemu_chr_fe_wait_connected
> to wait for a new connection to be established, and then do I/O on the
> chardev.  This does not in fact work if TLS, websock or telnet modes
> are enabled for the socket, due to a mistake introduced when we previously
> tried to fix this:
>
>   commit 1dc8a6695c731abb7461c637b2512c3670d82be4
>   Author: Marc-André Lureau <marcandre.lureau@redhat.com>
>   Date:   Tue Aug 16 12:33:32 2016 +0400
>
>     char: fix waiting for TLS and telnet connection
>
> That commit fixed the problem where we continued to accept() new sockets
> when TLS/telnet was enabled, because the 's->connected' flag isn't set
> immediately.
>
> Unfortunately what this means is that when qemu_chr_fe_wait_connected
> returns, the chardev is *not* ready to read/write data. The TLS/telnet
> handshake has not been run, and is still pending in the background.
>
> So we'll end up with device backend trying todo I/O on the chardev
> at the same time as it is trying todo the TLS/telnet handshake.
>
> We need to fix qemu_chr_fe_wait_connected so that it does explicit
> synchronization wrt to any ongoing background connection process.
> It must only return once all TLS/telnet/websock handshakes have
> completed.  If we fix that correctly, then I believe it will  also
> solve the problem you're trying to address.
>

Yes, I think this should be the right way to go. To fix it, my thought
is to track the async QIOChannelSocket in SocketChardev. Then we can
easily get the connection progress in qemu_chr_fe_wait_connected(). Do
you have any suggestion?

Thanks,
Yongji
Daniel P. Berrangé Jan. 11, 2019, 8:32 a.m. UTC | #9
On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
> On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, Jan 10, 2019 at 10:29:20PM +0800, Yongji Xie wrote:
> > > On Thu, 10 Jan 2019 at 22:11, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> > > > > On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > >
> > > > > > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > > > > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > > > > > >
> > > > > > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > > > > > on client sockets during initialization of the chardev.
> > > > > > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > > > > > when necessary. Now it would be used for unix domain
> > > > > > > > > socket of vhost-user-blk device to support reconnect.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > > > > > ---
> > > > > > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > > > > > >  qapi/char.json        |  3 +--
> > > > > > > > >  qemu-options.hx       |  9 ++++---
> > > > > > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > > > > > --- a/chardev/char-socket.c
> > > > > > > > > +++ b/chardev/char-socket.c
> > > > > > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > > > > > >          s->reconnect_time = reconnect;
> > > > > > > > >      }
> > > > > > > > >
> > > > > > > > > -    if (s->reconnect_time) {
> > > > > > > > > -        tcp_chr_connect_async(chr);
> > > > > > > > > -    } else {
> > > > > > > > > -        if (s->is_listen) {
> > > > > > > > > -            char *name;
> > > > > > > > > -            s->listener = qio_net_listener_new();
> > > > > > > > > +    if (s->is_listen) {
> > > > > > > > > +        char *name;
> > > > > > > > > +        s->listener = qio_net_listener_new();
> > > > > > > > >
> > > > > > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > > > > > -            g_free(name);
> > > > > > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > > > > > +        g_free(name);
> > > > > > > > >
> > > > > > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > > -                object_unref(OBJECT(s->listener));
> > > > > > > > > -                s->listener = NULL;
> > > > > > > > > -                goto error;
> > > > > > > > > -            }
> > > > > > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > > +            object_unref(OBJECT(s->listener));
> > > > > > > > > +            s->listener = NULL;
> > > > > > > > > +            goto error;
> > > > > > > > > +        }
> > > > > > > > >
> > > > > > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > > -            update_disconnected_filename(s);
> > > > > > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > > +        update_disconnected_filename(s);
> > > > > > > > >
> > > > > > > > > -            if (is_waitconnect &&
> > > > > > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > > -                return;
> > > > > > > > > -            }
> > > > > > > > > -            if (!s->ioc) {
> > > > > > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > > -                                                      tcp_chr_accept,
> > > > > > > > > -                                                      chr, NULL,
> > > > > > > > > -                                                      chr->gcontext);
> > > > > > > > > -            }
> > > > > > > > > +        if (is_waitconnect &&
> > > > > > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > > +            return;
> > > > > > > > > +        }
> > > > > > > > > +        if (!s->ioc) {
> > > > > > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > > +                                                  tcp_chr_accept,
> > > > > > > > > +                                                  chr, NULL,
> > > > > > > > > +                                                  chr->gcontext);
> > > > > > > > > +        }
> > > > > > > > > +    } else if (is_waitconnect) {
> > > > > > > > > +        if (s->reconnect_time) {
> > > > > > > > > +            tcp_chr_connect_async(chr);
> > > > > > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > >              goto error;
> > > > > > > > >          }
> > > > > > > >
> > > > > > > > This skips everything when 'is_waitconnect' is false.
> > > > > > > >
> > > > > > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > > > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > > > > > because the chardevs were socket clients, but now we honour it.
> > > > > > > >
> > > > > > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > > > > > from that this code should also still attempt a non-blocking
> > > > > > > > connect when is_waitconnect is false.
> > > > > > > >
> > > > > > >
> > > > > > > Do you mean we still need to connect server in background with
> > > > > > > "nowait" option? But my purpose is not to connect server until we
> > > > > > > manually call qemu_chr_fe_wait_connected() in other place.
> > > > > >
> > > > > > I don't see a need to delay the connect. We can start a
> > > > > > background connect right away. The later code you have
> > > > > > merely needs to wait for that background connect  to
> > > > > > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > > > > > This keeps the chardev code clearer only having 2 distinct
> > > > > > code paths to worry about - blocking or non-blocking connect.
> > > > > >
> > > > >
> > > > > Now the problem is that we have a server that only accept one
> > > > > connection. And we want to read something from it during device
> > > > > initializtion.
> > > > >
> > > > > If background connect it before we call qemu_chr_fe_wait_connected()
> > > > > during device initializtion, qemu_chr_fe_wait_connected() will
> > > > > accomplish but we can't read anything. And we have no way to release
> > > > > the background connection. So what I want to do in this patch is to
> > > > > disable background connect.
> > > >
> > > > I'm not seeing the problem here. What I proposed results in
> > > >
> > > >   1. chardev starts connect()
> > >
> > > This should be asynchronous with "reconnect" option. Another thread
> > > may connect before vhost backend?
> > >
> > > >   2. vhost backend waits for connect() to complete
> > >
> > > Sorry, I'm not sure I get your point here. Do you mean vhost backend
> > > call qemu_chr_fe_wait_connected()? Seems like
> > > qemu_chr_fe_wait_connected() will connect directly rather than wait
> > > for connect() to complete?
> >
> > Ahhhh, yes, you are right.
> >
> > qemu_chr_fe_wait_connected will potentially cause a second connection to
> > be established
> >
> > Looking at it the qemu_chr_fe_wait_connected() I believe it is seriously
> > broken even before this patch series.
> >
> > The intended usage is that a device can all qemu_chr_fe_wait_connected
> > to wait for a new connection to be established, and then do I/O on the
> > chardev.  This does not in fact work if TLS, websock or telnet modes
> > are enabled for the socket, due to a mistake introduced when we previously
> > tried to fix this:
> >
> >   commit 1dc8a6695c731abb7461c637b2512c3670d82be4
> >   Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> >   Date:   Tue Aug 16 12:33:32 2016 +0400
> >
> >     char: fix waiting for TLS and telnet connection
> >
> > That commit fixed the problem where we continued to accept() new sockets
> > when TLS/telnet was enabled, because the 's->connected' flag isn't set
> > immediately.
> >
> > Unfortunately what this means is that when qemu_chr_fe_wait_connected
> > returns, the chardev is *not* ready to read/write data. The TLS/telnet
> > handshake has not been run, and is still pending in the background.
> >
> > So we'll end up with device backend trying todo I/O on the chardev
> > at the same time as it is trying todo the TLS/telnet handshake.
> >
> > We need to fix qemu_chr_fe_wait_connected so that it does explicit
> > synchronization wrt to any ongoing background connection process.
> > It must only return once all TLS/telnet/websock handshakes have
> > completed.  If we fix that correctly, then I believe it will  also
> > solve the problem you're trying to address.
> >
> 
> Yes, I think this should be the right way to go. To fix it, my thought
> is to track the async QIOChannelSocket in SocketChardev. Then we can
> easily get the connection progress in qemu_chr_fe_wait_connected(). Do
> you have any suggestion?

I've got a few patches that refactor the code to fix this. I'll send them
today and CC you on them.

Regards,
Daniel
Yongji Xie Jan. 11, 2019, 8:36 a.m. UTC | #10
On Fri, 11 Jan 2019 at 16:32, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
> > On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Thu, Jan 10, 2019 at 10:29:20PM +0800, Yongji Xie wrote:
> > > > On Thu, 10 Jan 2019 at 22:11, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > On Thu, Jan 10, 2019 at 10:08:54PM +0800, Yongji Xie wrote:
> > > > > > On Thu, 10 Jan 2019 at 21:24, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > >
> > > > > > > On Thu, Jan 10, 2019 at 09:19:41PM +0800, Yongji Xie wrote:
> > > > > > > > On Thu, 10 Jan 2019 at 20:50, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Jan 09, 2019 at 07:27:22PM +0800, elohimes@gmail.com wrote:
> > > > > > > > > > From: Xie Yongji <xieyongji@baidu.com>
> > > > > > > > > >
> > > > > > > > > > Enable "nowait" option to make QEMU not do a connect
> > > > > > > > > > on client sockets during initialization of the chardev.
> > > > > > > > > > Then we can use qemu_chr_fe_wait_connected() to connect
> > > > > > > > > > when necessary. Now it would be used for unix domain
> > > > > > > > > > socket of vhost-user-blk device to support reconnect.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Xie Yongji <xieyongji@baidu.com>
> > > > > > > > > > Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
> > > > > > > > > > ---
> > > > > > > > > >  chardev/char-socket.c | 56 +++++++++++++++++++++----------------------
> > > > > > > > > >  qapi/char.json        |  3 +--
> > > > > > > > > >  qemu-options.hx       |  9 ++++---
> > > > > > > > > >  3 files changed, 35 insertions(+), 33 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> > > > > > > > > > index eaa8e8b68f..f803f4f7d3 100644
> > > > > > > > > > --- a/chardev/char-socket.c
> > > > > > > > > > +++ b/chardev/char-socket.c
> > > > > > > > > > @@ -1072,37 +1072,37 @@ static void qmp_chardev_open_socket(Chardev *chr,
> > > > > > > > > >          s->reconnect_time = reconnect;
> > > > > > > > > >      }
> > > > > > > > > >
> > > > > > > > > > -    if (s->reconnect_time) {
> > > > > > > > > > -        tcp_chr_connect_async(chr);
> > > > > > > > > > -    } else {
> > > > > > > > > > -        if (s->is_listen) {
> > > > > > > > > > -            char *name;
> > > > > > > > > > -            s->listener = qio_net_listener_new();
> > > > > > > > > > +    if (s->is_listen) {
> > > > > > > > > > +        char *name;
> > > > > > > > > > +        s->listener = qio_net_listener_new();
> > > > > > > > > >
> > > > > > > > > > -            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > > > -            qio_net_listener_set_name(s->listener, name);
> > > > > > > > > > -            g_free(name);
> > > > > > > > > > +        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
> > > > > > > > > > +        qio_net_listener_set_name(s->listener, name);
> > > > > > > > > > +        g_free(name);
> > > > > > > > > >
> > > > > > > > > > -            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > > > -                object_unref(OBJECT(s->listener));
> > > > > > > > > > -                s->listener = NULL;
> > > > > > > > > > -                goto error;
> > > > > > > > > > -            }
> > > > > > > > > > +        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
> > > > > > > > > > +            object_unref(OBJECT(s->listener));
> > > > > > > > > > +            s->listener = NULL;
> > > > > > > > > > +            goto error;
> > > > > > > > > > +        }
> > > > > > > > > >
> > > > > > > > > > -            qapi_free_SocketAddress(s->addr);
> > > > > > > > > > -            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > > > -            update_disconnected_filename(s);
> > > > > > > > > > +        qapi_free_SocketAddress(s->addr);
> > > > > > > > > > +        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
> > > > > > > > > > +        update_disconnected_filename(s);
> > > > > > > > > >
> > > > > > > > > > -            if (is_waitconnect &&
> > > > > > > > > > -                qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > > > -                return;
> > > > > > > > > > -            }
> > > > > > > > > > -            if (!s->ioc) {
> > > > > > > > > > -                qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > > > -                                                      tcp_chr_accept,
> > > > > > > > > > -                                                      chr, NULL,
> > > > > > > > > > -                                                      chr->gcontext);
> > > > > > > > > > -            }
> > > > > > > > > > +        if (is_waitconnect &&
> > > > > > > > > > +            qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > > > +            return;
> > > > > > > > > > +        }
> > > > > > > > > > +        if (!s->ioc) {
> > > > > > > > > > +            qio_net_listener_set_client_func_full(s->listener,
> > > > > > > > > > +                                                  tcp_chr_accept,
> > > > > > > > > > +                                                  chr, NULL,
> > > > > > > > > > +                                                  chr->gcontext);
> > > > > > > > > > +        }
> > > > > > > > > > +    } else if (is_waitconnect) {
> > > > > > > > > > +        if (s->reconnect_time) {
> > > > > > > > > > +            tcp_chr_connect_async(chr);
> > > > > > > > > >          } else if (qemu_chr_wait_connected(chr, errp) < 0) {
> > > > > > > > > >              goto error;
> > > > > > > > > >          }
> > > > > > > > >
> > > > > > > > > This skips everything when 'is_waitconnect' is false.
> > > > > > > > >
> > > > > > > > > This combines with a bug in tests/libqtest.c which adds the 'nowait'
> > > > > > > > > flag to the -chardevs it cteates. This mistake was previously ignored
> > > > > > > > > because the chardevs were socket clients, but now we honour it.
> > > > > > > > >
> > > > > > > > > We shoul remove 'nowait' from the qtest chardevs, but separately
> > > > > > > > > from that this code should also still attempt a non-blocking
> > > > > > > > > connect when is_waitconnect is false.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Do you mean we still need to connect server in background with
> > > > > > > > "nowait" option? But my purpose is not to connect server until we
> > > > > > > > manually call qemu_chr_fe_wait_connected() in other place.
> > > > > > >
> > > > > > > I don't see a need to delay the connect. We can start a
> > > > > > > background connect right away. The later code you have
> > > > > > > merely needs to wait for that background connect  to
> > > > > > > finish, which qemu_chr_fe_wait_connected still accomplishes.
> > > > > > > This keeps the chardev code clearer only having 2 distinct
> > > > > > > code paths to worry about - blocking or non-blocking connect.
> > > > > > >
> > > > > >
> > > > > > Now the problem is that we have a server that only accept one
> > > > > > connection. And we want to read something from it during device
> > > > > > initializtion.
> > > > > >
> > > > > > If background connect it before we call qemu_chr_fe_wait_connected()
> > > > > > during device initializtion, qemu_chr_fe_wait_connected() will
> > > > > > accomplish but we can't read anything. And we have no way to release
> > > > > > the background connection. So what I want to do in this patch is to
> > > > > > disable background connect.
> > > > >
> > > > > I'm not seeing the problem here. What I proposed results in
> > > > >
> > > > >   1. chardev starts connect()
> > > >
> > > > This should be asynchronous with "reconnect" option. Another thread
> > > > may connect before vhost backend?
> > > >
> > > > >   2. vhost backend waits for connect() to complete
> > > >
> > > > Sorry, I'm not sure I get your point here. Do you mean vhost backend
> > > > call qemu_chr_fe_wait_connected()? Seems like
> > > > qemu_chr_fe_wait_connected() will connect directly rather than wait
> > > > for connect() to complete?
> > >
> > > Ahhhh, yes, you are right.
> > >
> > > qemu_chr_fe_wait_connected will potentially cause a second connection to
> > > be established
> > >
> > > Looking at it the qemu_chr_fe_wait_connected() I believe it is seriously
> > > broken even before this patch series.
> > >
> > > The intended usage is that a device can all qemu_chr_fe_wait_connected
> > > to wait for a new connection to be established, and then do I/O on the
> > > chardev.  This does not in fact work if TLS, websock or telnet modes
> > > are enabled for the socket, due to a mistake introduced when we previously
> > > tried to fix this:
> > >
> > >   commit 1dc8a6695c731abb7461c637b2512c3670d82be4
> > >   Author: Marc-André Lureau <marcandre.lureau@redhat.com>
> > >   Date:   Tue Aug 16 12:33:32 2016 +0400
> > >
> > >     char: fix waiting for TLS and telnet connection
> > >
> > > That commit fixed the problem where we continued to accept() new sockets
> > > when TLS/telnet was enabled, because the 's->connected' flag isn't set
> > > immediately.
> > >
> > > Unfortunately what this means is that when qemu_chr_fe_wait_connected
> > > returns, the chardev is *not* ready to read/write data. The TLS/telnet
> > > handshake has not been run, and is still pending in the background.
> > >
> > > So we'll end up with device backend trying todo I/O on the chardev
> > > at the same time as it is trying todo the TLS/telnet handshake.
> > >
> > > We need to fix qemu_chr_fe_wait_connected so that it does explicit
> > > synchronization wrt to any ongoing background connection process.
> > > It must only return once all TLS/telnet/websock handshakes have
> > > completed.  If we fix that correctly, then I believe it will  also
> > > solve the problem you're trying to address.
> > >
> >
> > Yes, I think this should be the right way to go. To fix it, my thought
> > is to track the async QIOChannelSocket in SocketChardev. Then we can
> > easily get the connection progress in qemu_chr_fe_wait_connected(). Do
> > you have any suggestion?
>
> I've got a few patches that refactor the code to fix this. I'll send them
> today and CC you on them.
>

That would be great! Thank you.

Thanks,
Yongji
Daniel P. Berrangé Jan. 15, 2019, 3:39 p.m. UTC | #11
On Fri, Jan 11, 2019 at 04:36:11PM +0800, Yongji Xie wrote:
> On Fri, 11 Jan 2019 at 16:32, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
> > > On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > >
> > > > We need to fix qemu_chr_fe_wait_connected so that it does explicit
> > > > synchronization wrt to any ongoing background connection process.
> > > > It must only return once all TLS/telnet/websock handshakes have
> > > > completed.  If we fix that correctly, then I believe it will  also
> > > > solve the problem you're trying to address.
> > > >
> > >
> > > Yes, I think this should be the right way to go. To fix it, my thought
> > > is to track the async QIOChannelSocket in SocketChardev. Then we can
> > > easily get the connection progress in qemu_chr_fe_wait_connected(). Do
> > > you have any suggestion?
> >
> > I've got a few patches that refactor the code to fix this. I'll send them
> > today and CC you on them.
> >
> 
> That would be great! Thank you.

It took me rather longer than expected to fully debug all scenarios, but
I've finally sent patches:

https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg03344.html

Regards,
Daniel
Yury Kotov Jan. 15, 2019, 4:53 p.m. UTC | #12
15.01.2019, 18:39, "Daniel P. Berrangé" <berrange@redhat.com>:
> On Fri, Jan 11, 2019 at 04:36:11PM +0800, Yongji Xie wrote:
>>  On Fri, 11 Jan 2019 at 16:32, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>  >
>>  > On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
>>  > > On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>  > > >
>>  > > > We need to fix qemu_chr_fe_wait_connected so that it does explicit
>>  > > > synchronization wrt to any ongoing background connection process.
>>  > > > It must only return once all TLS/telnet/websock handshakes have
>>  > > > completed. If we fix that correctly, then I believe it will also
>>  > > > solve the problem you're trying to address.
>>  > > >
>>  > >
>>  > > Yes, I think this should be the right way to go. To fix it, my thought
>>  > > is to track the async QIOChannelSocket in SocketChardev. Then we can
>>  > > easily get the connection progress in qemu_chr_fe_wait_connected(). Do
>>  > > you have any suggestion?
>>  >
>>  > I've got a few patches that refactor the code to fix this. I'll send them
>>  > today and CC you on them.
>>  >
>>
>>  That would be great! Thank you.
>
> It took me rather longer than expected to fully debug all scenarios, but
> I've finally sent patches:
>
> https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg03344.html
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

I didn't do a deep review and may be wrong, but I think the race is still
possible in the hotplug case.

Example:
1. User adds a chardev (qmp: chardev-add) with 'reconnect' option;
2. Some main-loop iterations...
3. Reconnect timer triggers and starts connection thread;
4. User adds a vhost-user-blk (qmp: device-add): device realize -> wait_connected.

Here, there is a chance that we are in the wait_connected and the connection
thread is still running.

Regards,
Yury
Daniel P. Berrangé Jan. 15, 2019, 5:15 p.m. UTC | #13
On Tue, Jan 15, 2019 at 07:53:51PM +0300, Yury Kotov wrote:
> 15.01.2019, 18:39, "Daniel P. Berrangé" <berrange@redhat.com>:
> > On Fri, Jan 11, 2019 at 04:36:11PM +0800, Yongji Xie wrote:
> >>  On Fri, 11 Jan 2019 at 16:32, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>  >
> >>  > On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
> >>  > > On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>  > > >
> >>  > > > We need to fix qemu_chr_fe_wait_connected so that it does explicit
> >>  > > > synchronization wrt to any ongoing background connection process.
> >>  > > > It must only return once all TLS/telnet/websock handshakes have
> >>  > > > completed. If we fix that correctly, then I believe it will also
> >>  > > > solve the problem you're trying to address.
> >>  > > >
> >>  > >
> >>  > > Yes, I think this should be the right way to go. To fix it, my thought
> >>  > > is to track the async QIOChannelSocket in SocketChardev. Then we can
> >>  > > easily get the connection progress in qemu_chr_fe_wait_connected(). Do
> >>  > > you have any suggestion?
> >>  >
> >>  > I've got a few patches that refactor the code to fix this. I'll send them
> >>  > today and CC you on them.
> >>  >
> >>
> >>  That would be great! Thank you.
> >
> > It took me rather longer than expected to fully debug all scenarios, but
> > I've finally sent patches:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg03344.html
> 
> I didn't do a deep review and may be wrong, but I think the race is still
> possible in the hotplug case.
> 
> Example:
> 1. User adds a chardev (qmp: chardev-add) with 'reconnect' option;
> 2. Some main-loop iterations...
> 3. Reconnect timer triggers and starts connection thread;
> 4. User adds a vhost-user-blk (qmp: device-add): device realize -> wait_connected.
> 
> Here, there is a chance that we are in the wait_connected and the connection
> thread is still running.

Hmm, dealing with device-add is tricky. tcp_chr_wait_connect rather
assumes no main loop is running. I'll have to think about how we could
possibly handle hotplug safely.

Regards,
Daniel
Yongji Xie Jan. 16, 2019, 5:39 a.m. UTC | #14
On Tue, 15 Jan 2019 at 23:39, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Jan 11, 2019 at 04:36:11PM +0800, Yongji Xie wrote:
> > On Fri, 11 Jan 2019 at 16:32, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Fri, Jan 11, 2019 at 03:50:40PM +0800, Yongji Xie wrote:
> > > > On Fri, 11 Jan 2019 at 00:41, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > >
> > > > > We need to fix qemu_chr_fe_wait_connected so that it does explicit
> > > > > synchronization wrt to any ongoing background connection process.
> > > > > It must only return once all TLS/telnet/websock handshakes have
> > > > > completed.  If we fix that correctly, then I believe it will  also
> > > > > solve the problem you're trying to address.
> > > > >
> > > >
> > > > Yes, I think this should be the right way to go. To fix it, my thought
> > > > is to track the async QIOChannelSocket in SocketChardev. Then we can
> > > > easily get the connection progress in qemu_chr_fe_wait_connected(). Do
> > > > you have any suggestion?
> > >
> > > I've got a few patches that refactor the code to fix this. I'll send them
> > > today and CC you on them.
> > >
> >
> > That would be great! Thank you.
>
> It took me rather longer than expected to fully debug all scenarios, but
> I've finally sent patches:
>
> https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg03344.html
>

I will test my series based on this. Thank you.

Thanks,
Yongji
diff mbox series

Patch

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index eaa8e8b68f..f803f4f7d3 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -1072,37 +1072,37 @@  static void qmp_chardev_open_socket(Chardev *chr,
         s->reconnect_time = reconnect;
     }
 
-    if (s->reconnect_time) {
-        tcp_chr_connect_async(chr);
-    } else {
-        if (s->is_listen) {
-            char *name;
-            s->listener = qio_net_listener_new();
+    if (s->is_listen) {
+        char *name;
+        s->listener = qio_net_listener_new();
 
-            name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
-            qio_net_listener_set_name(s->listener, name);
-            g_free(name);
+        name = g_strdup_printf("chardev-tcp-listener-%s", chr->label);
+        qio_net_listener_set_name(s->listener, name);
+        g_free(name);
 
-            if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
-                object_unref(OBJECT(s->listener));
-                s->listener = NULL;
-                goto error;
-            }
+        if (qio_net_listener_open_sync(s->listener, s->addr, errp) < 0) {
+            object_unref(OBJECT(s->listener));
+            s->listener = NULL;
+            goto error;
+        }
 
-            qapi_free_SocketAddress(s->addr);
-            s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
-            update_disconnected_filename(s);
+        qapi_free_SocketAddress(s->addr);
+        s->addr = socket_local_address(s->listener->sioc[0]->fd, errp);
+        update_disconnected_filename(s);
 
-            if (is_waitconnect &&
-                qemu_chr_wait_connected(chr, errp) < 0) {
-                return;
-            }
-            if (!s->ioc) {
-                qio_net_listener_set_client_func_full(s->listener,
-                                                      tcp_chr_accept,
-                                                      chr, NULL,
-                                                      chr->gcontext);
-            }
+        if (is_waitconnect &&
+            qemu_chr_wait_connected(chr, errp) < 0) {
+            return;
+        }
+        if (!s->ioc) {
+            qio_net_listener_set_client_func_full(s->listener,
+                                                  tcp_chr_accept,
+                                                  chr, NULL,
+                                                  chr->gcontext);
+        }
+    } else if (is_waitconnect) {
+        if (s->reconnect_time) {
+            tcp_chr_connect_async(chr);
         } else if (qemu_chr_wait_connected(chr, errp) < 0) {
             goto error;
         }
@@ -1120,7 +1120,7 @@  static void qemu_chr_parse_socket(QemuOpts *opts, ChardevBackend *backend,
                                   Error **errp)
 {
     bool is_listen      = qemu_opt_get_bool(opts, "server", false);
-    bool is_waitconnect = is_listen && qemu_opt_get_bool(opts, "wait", true);
+    bool is_waitconnect = qemu_opt_get_bool(opts, "wait", true);
     bool is_telnet      = qemu_opt_get_bool(opts, "telnet", false);
     bool is_tn3270      = qemu_opt_get_bool(opts, "tn3270", false);
     bool is_websock     = qemu_opt_get_bool(opts, "websocket", false);
diff --git a/qapi/char.json b/qapi/char.json
index 77ed847972..6a3b5bcd71 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -249,8 +249,7 @@ 
 #        or connect to (server=false)
 # @tls-creds: the ID of the TLS credentials object (since 2.6)
 # @server: create server socket (default: true)
-# @wait: wait for incoming connection on server
-#        sockets (default: false).
+# @wait: wait for being connected or connecting to (default: false)
 # @nodelay: set TCP_NODELAY socket option (default: false)
 # @telnet: enable telnet protocol on server
 #          sockets (default: false)
diff --git a/qemu-options.hx b/qemu-options.hx
index d4f3564b78..ebd11220c4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2556,8 +2556,9 @@  undefined if TCP options are specified for a unix socket.
 
 @option{server} specifies that the socket shall be a listening socket.
 
-@option{nowait} specifies that QEMU should not block waiting for a client to
-connect to a listening socket.
+@option{nowait} specifies that QEMU should not wait for being connected on
+server sockets or try to do a sync/async connect on client sockets during
+initialization of the chardev.
 
 @option{telnet} specifies that traffic on the socket should interpret telnet
 escape sequences.
@@ -3093,7 +3094,9 @@  I/O to a location or wait for a connection from a location.  By default
 the TCP Net Console is sent to @var{host} at the @var{port}.  If you use
 the @var{server} option QEMU will wait for a client socket application
 to connect to the port before continuing, unless the @code{nowait}
-option was specified.  The @code{nodelay} option disables the Nagle buffering
+option was specified. And the @code{nowait} option could also be
+used when @var{noserver} is set to disallow QEMU to connect during
+initialization.  The @code{nodelay} option disables the Nagle buffering
 algorithm.  The @code{reconnect} option only applies if @var{noserver} is
 set, if the connection goes down it will attempt to reconnect at the
 given interval.  If @var{host} is omitted, 0.0.0.0 is assumed. Only