[0/3] vhost-user reconnect

Message ID 1534433563-30865-1-git-send-email-yury-kotov@yandex-team.ru (mailing list archive)

Message

Yury Kotov Aug. 16, 2018, 3:32 p.m. UTC
We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
devices. One of our use cases is to restart SPDK without restarting the VM
(for example, to apply updates). We tried to use the 'reconnect' option for
the '-chardev' device:
  -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem0 \
  -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
  -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4

After this, vhost-user-blk initialization fails with the error below:
  qemu-system-x86_64: -device ...: Failed to set msg fds.
  qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
                                   Operation not permitted

We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).

We investigated and found two issues:

1. The reconnect option postpones the first connection until the machine
   init done event. But we need this connection during vhost-user-blk device
   initialization, which happens before machine init done handling.

2. If the connection is forced, reconnection succeeds after an SPDK restart,
   but the virtual queues do not start. The reason is that the virtual queue
   initialization commands must be resent:
   * VHOST_USER_SET_FEATURES
   * VHOST_USER_SET_MEM_TABLE
   * VHOST_USER_SET_VRING_NUM
   * VHOST_USER_SET_VRING_BASE
   * VHOST_USER_SET_VRING_ADDR
   * VHOST_USER_SET_VRING_KICK
   * VHOST_USER_SET_VRING_CALL

The patch set resolves both of these issues.
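
For illustration only (this is not code from the series), the replay described in issue 2 can be modelled as a small standalone C sketch. The enum, the `ReplayStep` type and `build_replay_plan()` are names invented here, not QEMU identifiers, and the ordering simply follows the list above: device-wide state first, then the per-ring requests for each queue. The real patches may order or group the messages differently.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the vhost-user requests named in the list above.
 * Illustrative only: this is not QEMU's request enum. */
typedef enum {
    REQ_SET_FEATURES,
    REQ_SET_MEM_TABLE,
    REQ_SET_VRING_NUM,
    REQ_SET_VRING_BASE,
    REQ_SET_VRING_ADDR,
    REQ_SET_VRING_KICK,
    REQ_SET_VRING_CALL,
} ReplayReq;

typedef struct {
    ReplayReq req;
    int queue;          /* -1 for device-wide requests */
} ReplayStep;

/* Fill 'out' with the requests to resend after a reconnect.
 * Returns the number of steps written, or 0 if 'cap' is too small. */
size_t build_replay_plan(int num_queues, ReplayStep *out, size_t cap)
{
    static const ReplayReq per_ring[] = {
        REQ_SET_VRING_NUM, REQ_SET_VRING_BASE, REQ_SET_VRING_ADDR,
        REQ_SET_VRING_KICK, REQ_SET_VRING_CALL,
    };
    const size_t per_ring_count = sizeof(per_ring) / sizeof(per_ring[0]);
    size_t n = 0;

    assert(num_queues >= 0);
    size_t needed = 2 + (size_t)num_queues * per_ring_count;
    if (cap < needed) {
        return 0;
    }

    /* Device-wide state comes first. */
    out[n++] = (ReplayStep){ REQ_SET_FEATURES, -1 };
    out[n++] = (ReplayStep){ REQ_SET_MEM_TABLE, -1 };

    /* Then every ring is set up again, one queue at a time. */
    for (int q = 0; q < num_queues; q++) {
        for (size_t i = 0; i < per_ring_count; i++) {
            out[n++] = (ReplayStep){ per_ring[i], q };
        }
    }
    return n;
}
```

With num-queues=4, as in the command line above, this model yields 2 + 4 * 5 = 22 requests per reconnect.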

Test case:

1. Start fio process (inside VM):
     fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
         --rw=randrw --direct=1 --sync=1 --verify=md5 \
         --size=64M --filename=/dev/vda --loops=100

2. Restart SPDK many times.
   We expect fio to pause during each SPDK restart and to continue
   working once the restart completes.

3. The fio process completed successfully without any errors.

Yury Kotov (3):
  chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
  vhost: refactor vhost_dev_start and vhost_virtqueue_start
  vhost-user: add reconnect support for vhost-user

 chardev/char-socket.c     |   5 +-
 hw/virtio/vhost-user.c    |  65 ++++++++++++--
 hw/virtio/vhost.c         | 223 +++++++++++++++++++++++++++++++---------------
 include/hw/virtio/vhost.h |   2 +
 4 files changed, 215 insertions(+), 80 deletions(-)

Comments

Marc-André Lureau Aug. 16, 2018, 3:36 p.m. UTC | #1
On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
> We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
> devices. One of our cases it to restart SPDK without restarting VM (in case
> of some updates or smth like it). We tried to use the 'reconnect' option for
> the '-chardev' device:
>   -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>   -numa node,memdev=mem0 \
>   -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>   -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>
> After this, vhost-user-blk initialization fails with an error below:
>   qemu-system-x86_64: -device ...: Failed to set msg fds.
>   qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>                                    Operation not permitted
>
> We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>
> We made some investigations and found out that there are several issues:
>
> 1. Reconnect option postpones the first connection till machine init done event.
>    But we need this connection during vhost blk device initialization which
>    happens before the machine init done handling.
>
> 2. If the connection is forced, then the reconnection will be successful
>    after SPDK restart. The problem is that virtual queue will not start.
>    The reason for it is that virtual queue initialization commands
>    should be resent:
>    * VHOST_USER_SET_FEATURES
>    * VHOST_USER_SET_MEM_TABLE
>    * VHOST_USER_SET_VRING_NUM
>    * VHOST_USER_SET_VRING_BASE
>    * VHOST_USER_SET_VRING_ADDR
>    * VHOST_USER_SET_VRING_KICK
>    * VHOST_USER_SET_VRING_CALL
>
> The patch set resolves both of these issues.
>
> Test case:
>
> 1. Start fio process (inside VM):
>      fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>          --rw=randrw --direct=1 --sync=1 --verify=md5 \
>          --size=64M --filename=/dev/vda --loops=100
>
> 2. Restart SPDK many times.
>    We are expecting that during SPDK restart fio will pause and fio should
>    continue to work after restart completion.
>
> 3. fio process completed successfully without any error.

Can you write a test case in vhost-user-test.c? (perhaps under
QTEST_VHOST_USER_FIXME scope...)

>
> Yury Kotov (3):
>   chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>   vhost: refactor vhost_dev_start and vhost_virtqueue_start
>   vhost-user: add reconnect support for vhost-user
>
>  chardev/char-socket.c     |   5 +-
>  hw/virtio/vhost-user.c    |  65 ++++++++++++--
>  hw/virtio/vhost.c         | 223 +++++++++++++++++++++++++++++++---------------
>  include/hw/virtio/vhost.h |   2 +
>  4 files changed, 215 insertions(+), 80 deletions(-)
>
> --
> 2.7.4
>
Marc-André Lureau Aug. 16, 2018, 3:41 p.m. UTC | #2
Hi

On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
> Usually chardev connects to specified address during the initialization.
> But if reconnect_time is specified then connection will be postponed until
> machine done event.
>
> Thus if reconnect is specified and some device forces connection during
> initialization, tcp_chr_machine_done_hook will do useless connection attempt.
>
> So add a check to prevent it.
>
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>

It would be better with a test case in tests/test-char.c, but lgtm,

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

> ---
>  chardev/char-socket.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index efbad6e..116dcc4 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -1165,7 +1165,10 @@ static int tcp_chr_machine_done_hook(Chardev *chr)
>  {
>      SocketChardev *s = SOCKET_CHARDEV(chr);
>
> -    if (s->reconnect_time) {
> +    /* It's possible that connection was established during the device
> +     * initialization. So check if the socket is already connected to
> +     * prevent extra connection attempt. */
> +    if (!s->connected && s->reconnect_time) {
>          tcp_chr_connect_async(chr);
>      }
>
> --
> 2.7.4
>
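The effect of the added check can be shown with a small standalone model. This is a sketch under stated assumptions, not QEMU code: `SocketState` and `machine_done_should_connect()` are names invented here, standing in for the two `SocketChardev` fields that the condition in the diff reads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two fields the patched condition reads. */
typedef struct {
    bool connected;      /* a device already forced the connection */
    int reconnect_time;  /* 0 means the reconnect option is not set */
} SocketState;

/* Mirrors the patched condition: only start the deferred connection at
 * machine-done time if reconnect is enabled and nothing connected earlier. */
bool machine_done_should_connect(const SocketState *s)
{
    return !s->connected && s->reconnect_time != 0;
}
```

Before the patch, the equivalent condition was `reconnect_time != 0` alone, so a chardev that a device had already connected during initialization would get a second, useless connection attempt.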
Marc-André Lureau Aug. 16, 2018, 3:46 p.m. UTC | #3
Hi

On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
> We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
> devices. One of our cases it to restart SPDK without restarting VM (in case
> of some updates or smth like it). We tried to use the 'reconnect' option for
> the '-chardev' device:
>   -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>   -numa node,memdev=mem0 \
>   -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>   -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>
> After this, vhost-user-blk initialization fails with an error below:
>   qemu-system-x86_64: -device ...: Failed to set msg fds.
>   qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>                                    Operation not permitted
>
> We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).

Why not set up the QEMU socket chardev in server mode? This is the only
vhost-user reconnect setup that is supported at the moment (see
vhost-user-test.c). That way you avoid having a reconnect loop.

>
> We made some investigations and found out that there are several issues:
>
> 1. Reconnect option postpones the first connection till machine init done event.
>    But we need this connection during vhost blk device initialization which
>    happens before the machine init done handling.
>
> 2. If the connection is forced, then the reconnection will be successful
>    after SPDK restart. The problem is that virtual queue will not start.
>    The reason for it is that virtual queue initialization commands
>    should be resent:
>    * VHOST_USER_SET_FEATURES
>    * VHOST_USER_SET_MEM_TABLE
>    * VHOST_USER_SET_VRING_NUM
>    * VHOST_USER_SET_VRING_BASE
>    * VHOST_USER_SET_VRING_ADDR
>    * VHOST_USER_SET_VRING_KICK
>    * VHOST_USER_SET_VRING_CALL
>
> The patch set resolves both of these issues.
>
> Test case:
>
> 1. Start fio process (inside VM):
>      fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>          --rw=randrw --direct=1 --sync=1 --verify=md5 \
>          --size=64M --filename=/dev/vda --loops=100
>
> 2. Restart SPDK many times.
>    We are expecting that during SPDK restart fio will pause and fio should
>    continue to work after restart completion.
>
> 3. fio process completed successfully without any error.
>
> Yury Kotov (3):
>   chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>   vhost: refactor vhost_dev_start and vhost_virtqueue_start
>   vhost-user: add reconnect support for vhost-user
>
>  chardev/char-socket.c     |   5 +-
>  hw/virtio/vhost-user.c    |  65 ++++++++++++--
>  hw/virtio/vhost.c         | 223 +++++++++++++++++++++++++++++++---------------
>  include/hw/virtio/vhost.h |   2 +
>  4 files changed, 215 insertions(+), 80 deletions(-)
>
> --
> 2.7.4
>
Yury Kotov Aug. 20, 2018, 12:51 p.m. UTC | #4
16.08.2018, 18:36, "Marc-André Lureau" <marcandre.lureau@redhat.com>:
> On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>>  We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
>>  devices. One of our cases it to restart SPDK without restarting VM (in case
>>  of some updates or smth like it). We tried to use the 'reconnect' option for
>>  the '-chardev' device:
>>    -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>    -numa node,memdev=mem0 \
>>    -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>    -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>
>>  After this, vhost-user-blk initialization fails with an error below:
>>    qemu-system-x86_64: -device ...: Failed to set msg fds.
>>    qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>>                                     Operation not permitted
>>
>>  We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>>
>>  We made some investigations and found out that there are several issues:
>>
>>  1. Reconnect option postpones the first connection till machine init done event.
>>     But we need this connection during vhost blk device initialization which
>>     happens before the machine init done handling.
>>
>>  2. If the connection is forced, then the reconnection will be successful
>>     after SPDK restart. The problem is that virtual queue will not start.
>>     The reason for it is that virtual queue initialization commands
>>     should be resent:
>>     * VHOST_USER_SET_FEATURES
>>     * VHOST_USER_SET_MEM_TABLE
>>     * VHOST_USER_SET_VRING_NUM
>>     * VHOST_USER_SET_VRING_BASE
>>     * VHOST_USER_SET_VRING_ADDR
>>     * VHOST_USER_SET_VRING_KICK
>>     * VHOST_USER_SET_VRING_CALL
>>
>>  The patch set resolves both of these issues.
>>
>>  Test case:
>>
>>  1. Start fio process (inside VM):
>>       fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>           --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>           --size=64M --filename=/dev/vda --loops=100
>>
>>  2. Restart SPDK many times.
>>     We are expecting that during SPDK restart fio will pause and fio should
>>     continue to work after restart completion.
>>
>>  3. fio process completed successfully without any error.
>
> Can you write a test case in vhost-user-test.c ? (perhaps under
> QTEST_VHOST_USER_FIXME scope...)
>

This is a great idea, and we were definitely going to do that in the coming
couple of weeks. We thought we could send a follow-up commit with the
necessary tests a bit later, though: we first need to figure out the state
of the vhost-user tests in general before we can add anything new, and that
will take some time. So far we have stress-tested these fixes manually.

Do you suggest we hold this series until all the tests are ready? Or should
we proceed now and send a follow-up series with vhost-user tests later, as
we suggested?

>>  Yury Kotov (3):
>>    chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>    vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>    vhost-user: add reconnect support for vhost-user
>>
>>   chardev/char-socket.c | 5 +-
>>   hw/virtio/vhost-user.c | 65 ++++++++++++--
>>   hw/virtio/vhost.c | 223 +++++++++++++++++++++++++++++++---------------
>>   include/hw/virtio/vhost.h | 2 +
>>   4 files changed, 215 insertions(+), 80 deletions(-)
>>
>>  --
>>  2.7.4
Yury Kotov Aug. 20, 2018, 12:52 p.m. UTC | #5
16.08.2018, 19:12, "Marc-André Lureau" <marcandre.lureau@redhat.com>:
> Hi
>
> On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>>  We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
>>  devices. One of our cases it to restart SPDK without restarting VM (in case
>>  of some updates or smth like it). We tried to use the 'reconnect' option for
>>  the '-chardev' device:
>>    -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>    -numa node,memdev=mem0 \
>>    -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>    -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>
>>  After this, vhost-user-blk initialization fails with an error below:
>>    qemu-system-x86_64: -device ...: Failed to set msg fds.
>>    qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>>                                     Operation not permitted
>>
>>  We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>
> Why not setup qemu socket chardev in server mode? This is the only
> vhost-user reconnect setup that is supported atm (see
> vhost-user-test.c). You avoid having a reconnect loop that way.
>

Yes, that will work. But client mode should also work, and we want to support both.

>>  We made some investigations and found out that there are several issues:
>>
>>  1. Reconnect option postpones the first connection till machine init done event.
>>     But we need this connection during vhost blk device initialization which
>>     happens before the machine init done handling.
>>
>>  2. If the connection is forced, then the reconnection will be successful
>>     after SPDK restart. The problem is that virtual queue will not start.
>>     The reason for it is that virtual queue initialization commands
>>     should be resent:
>>     * VHOST_USER_SET_FEATURES
>>     * VHOST_USER_SET_MEM_TABLE
>>     * VHOST_USER_SET_VRING_NUM
>>     * VHOST_USER_SET_VRING_BASE
>>     * VHOST_USER_SET_VRING_ADDR
>>     * VHOST_USER_SET_VRING_KICK
>>     * VHOST_USER_SET_VRING_CALL
>>
>>  The patch set resolves both of these issues.
>>
>>  Test case:
>>
>>  1. Start fio process (inside VM):
>>       fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>           --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>           --size=64M --filename=/dev/vda --loops=100
>>
>>  2. Restart SPDK many times.
>>     We are expecting that during SPDK restart fio will pause and fio should
>>     continue to work after restart completion.
>>
>>  3. fio process completed successfully without any error.
>>
>>  Yury Kotov (3):
>>    chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>    vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>    vhost-user: add reconnect support for vhost-user
>>
>>   chardev/char-socket.c | 5 +-
>>   hw/virtio/vhost-user.c | 65 ++++++++++++--
>>   hw/virtio/vhost.c | 223 +++++++++++++++++++++++++++++++---------------
>>   include/hw/virtio/vhost.h | 2 +
>>   4 files changed, 215 insertions(+), 80 deletions(-)
>>
>>  --
>>  2.7.4
Marc-André Lureau Aug. 20, 2018, 1:11 p.m. UTC | #6
Hi

On Mon, Aug 20, 2018 at 2:51 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
> 16.08.2018, 18:36, "Marc-André Lureau" <marcandre.lureau@redhat.com>:
>> On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>>>  We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
>>>  devices. One of our cases it to restart SPDK without restarting VM (in case
>>>  of some updates or smth like it). We tried to use the 'reconnect' option for
>>>  the '-chardev' device:
>>>    -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>>    -numa node,memdev=mem0 \
>>>    -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>>    -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>>
>>>  After this, vhost-user-blk initialization fails with an error below:
>>>    qemu-system-x86_64: -device ...: Failed to set msg fds.
>>>    qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>>>                                     Operation not permitted
>>>
>>>  We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>>>
>>>  We made some investigations and found out that there are several issues:
>>>
>>>  1. Reconnect option postpones the first connection till machine init done event.
>>>     But we need this connection during vhost blk device initialization which
>>>     happens before the machine init done handling.
>>>
>>>  2. If the connection is forced, then the reconnection will be successful
>>>     after SPDK restart. The problem is that virtual queue will not start.
>>>     The reason for it is that virtual queue initialization commands
>>>     should be resent:
>>>     * VHOST_USER_SET_FEATURES
>>>     * VHOST_USER_SET_MEM_TABLE
>>>     * VHOST_USER_SET_VRING_NUM
>>>     * VHOST_USER_SET_VRING_BASE
>>>     * VHOST_USER_SET_VRING_ADDR
>>>     * VHOST_USER_SET_VRING_KICK
>>>     * VHOST_USER_SET_VRING_CALL
>>>
>>>  The patch set resolves both of these issues.
>>>
>>>  Test case:
>>>
>>>  1. Start fio process (inside VM):
>>>       fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>>           --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>>           --size=64M --filename=/dev/vda --loops=100
>>>
>>>  2. Restart SPDK many times.
>>>     We are expecting that during SPDK restart fio will pause and fio should
>>>     continue to work after restart completion.
>>>
>>>  3. fio process completed successfully without any error.
>>
>> Can you write a test case in vhost-user-test.c ? (perhaps under
>> QTEST_VHOST_USER_FIXME scope...)
>>
>
> This is a great idea and we were definitely going to do that during coming couple of weeks. We thought that we could make a follow up commit with necessary tests added a bit later though, since currently we need to figure out the state of vhost-user tests in general, before we can try to add any new stuff, and that will take some time. So far we have stress-tested these fixes manually.

Yes, some vhost-user tests are disabled by default (sadly for Travis
CI reasons, not because of a real bug), and it's easy to introduce regressions.

I sent a related series "[PATCH 0/4] Fix socket chardev regression" to
make it work again.

> Do you suggest we wait with this series as well until we have all tests ready? Or do we proceed now and make a follow up series with vhost user tests later like we suggested?

I would rather have the tests with the series.

>
>>>  Yury Kotov (3):
>>>    chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>>    vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>>    vhost-user: add reconnect support for vhost-user
>>>
>>>   chardev/char-socket.c | 5 +-
>>>   hw/virtio/vhost-user.c | 65 ++++++++++++--
>>>   hw/virtio/vhost.c | 223 +++++++++++++++++++++++++++++++---------------
>>>   include/hw/virtio/vhost.h | 2 +
>>>   4 files changed, 215 insertions(+), 80 deletions(-)
>>>
>>>  --
>>>  2.7.4
Yury Kotov Aug. 20, 2018, 1:39 p.m. UTC | #7
20.08.2018, 16:11, "Marc-André Lureau" <marcandre.lureau@redhat.com>:
> Hi
>
> On Mon, Aug 20, 2018 at 2:51 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>>  16.08.2018, 18:36, "Marc-André Lureau" <marcandre.lureau@redhat.com>:
>>>  On Thu, Aug 16, 2018 at 5:32 PM, Yury Kotov <yury-kotov@yandex-team.ru> wrote:
>>>>   We are using QEMU (2.12.0) with SPDK (18.04.1) over vhost-user to emulate block
>>>>   devices. One of our cases it to restart SPDK without restarting VM (in case
>>>>   of some updates or smth like it). We tried to use the 'reconnect' option for
>>>>   the '-chardev' device:
>>>>     -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
>>>>     -numa node,memdev=mem0 \
>>>>     -chardev socket,id=spdk_vhost_blk1,path=/var/tmp/vhost.1,reconnect=10 \
>>>>     -device vhost-user-blk-pci,chardev=spdk_vhost_blk1,num-queues=4
>>>>
>>>>   After this, vhost-user-blk initialization fails with an error below:
>>>>     qemu-system-x86_64: -device ...: Failed to set msg fds.
>>>>     qemu-system-x86_64: -device ...: vhost-user-blk: vhost initialization failed:
>>>>                                      Operation not permitted
>>>>
>>>>   We got the same error with the latest QEMU (c542a9f9794ec8e0bc3f).
>>>>
>>>>   We made some investigations and found out that there are several issues:
>>>>
>>>>   1. Reconnect option postpones the first connection till machine init done event.
>>>>      But we need this connection during vhost blk device initialization which
>>>>      happens before the machine init done handling.
>>>>
>>>>   2. If the connection is forced, then the reconnection will be successful
>>>>      after SPDK restart. The problem is that virtual queue will not start.
>>>>      The reason for it is that virtual queue initialization commands
>>>>      should be resent:
>>>>      * VHOST_USER_SET_FEATURES
>>>>      * VHOST_USER_SET_MEM_TABLE
>>>>      * VHOST_USER_SET_VRING_NUM
>>>>      * VHOST_USER_SET_VRING_BASE
>>>>      * VHOST_USER_SET_VRING_ADDR
>>>>      * VHOST_USER_SET_VRING_KICK
>>>>      * VHOST_USER_SET_VRING_CALL
>>>>
>>>>   The patch set resolves both of these issues.
>>>>
>>>>   Test case:
>>>>
>>>>   1. Start fio process (inside VM):
>>>>        fio --name test --ioengine=libaio --iodepth=64 --bs=4096 \
>>>>            --rw=randrw --direct=1 --sync=1 --verify=md5 \
>>>>            --size=64M --filename=/dev/vda --loops=100
>>>>
>>>>   2. Restart SPDK many times.
>>>>      We are expecting that during SPDK restart fio will pause and fio should
>>>>      continue to work after restart completion.
>>>>
>>>>   3. fio process completed successfully without any error.
>>>
>>>  Can you write a test case in vhost-user-test.c ? (perhaps under
>>>  QTEST_VHOST_USER_FIXME scope...)
>>
>>  This is a great idea and we were definitely going to do that during coming couple of weeks. We thought that we could make a follow up commit with necessary tests added a bit later though, since currently we need to figure out the state of vhost-user tests in general, before we can try to add any new stuff, and that will take some time. So far we have stress-tested these fixes manually.
>
> Yes, some vhost-user tests are disabled by default (sadly for travis
> CI reason - not a really bug), and it's easy to introduce regressions.
>
> I sent a related series "[PATCH 0/4] Fix socket chardev regression" to
> make it work again.
>
>>  Do you suggest we wait with this series as well until we have all tests ready? Or do we proceed now and make a follow up series with vhost user tests later like we suggested?
>
> I would rather have the tests with the series.
>

Sounds good. We will resend v2 with tests. However, while we do that, we would
be grateful for more comments on the current implementation as well, since it
at least passes our internal functional tests.

>>>>   Yury Kotov (3):
>>>>     chardev: prevent extra connection attempt in tcp_chr_machine_done_hook
>>>>     vhost: refactor vhost_dev_start and vhost_virtqueue_start
>>>>     vhost-user: add reconnect support for vhost-user
>>>>
>>>>    chardev/char-socket.c | 5 +-
>>>>    hw/virtio/vhost-user.c | 65 ++++++++++++--
>>>>    hw/virtio/vhost.c | 223 +++++++++++++++++++++++++++++++---------------
>>>>    include/hw/virtio/vhost.h | 2 +
>>>>    4 files changed, 215 insertions(+), 80 deletions(-)
>>>>
>>>>   --
>>>>   2.7.4