mbox series

[0/2] vhost-user race condition on shutdown

Message ID 20190416184624.15397-1-dan.streetman@canonical.com (mailing list archive)
Headers show
Series vhost-user race condition on shutdown | expand

Message

Dan Streetman April 16, 2019, 6:46 p.m. UTC
From: Dan Streetman <ddstreet@canonical.com>

Buglink: https://launchpad.net/bugs/1823458

This is a race condition between the normal shutdown of a guest
and the handling of its vhost-user net being externally closed.
It's explained in more detail at the bug link; the short version
is that there are 2 problems, fixed by the 2 patches.  The first
patch fixes the race condition where multiple threads call
vhost_net_stop(), and the second patch prevents vhost-user from
calling vhost_net_cleanup() on CHR_EVENT_CLOSED, because it will
be cleaned up later and its fields will be accessed when
vhost_net_stop() is called later.

As explained in the bug report, this requires a rather complicated
setup to reproduce, and I'm not able to create a setup to reproduce
it myself.  However this has been reported to me/Canonical, and the
reporter is able to reproduce it consistently, so I've used them for
debug and testing.  This reproduction was done with the older 2.5
qemu, from Ubuntu Xenial; but the problem does still appear to exist
in upstream qemu, based on review of the code, which is why I'm sending
these patches.

Dan Streetman (2):
  add VirtIONet vhost_stopped flag to prevent multiple stops
  do not call vhost_net_cleanup() on running net from char user event

 hw/net/virtio-net.c            | 3 ++-
 include/hw/virtio/virtio-net.h | 1 +
 net/vhost-user.c               | 1 -
 3 files changed, 3 insertions(+), 2 deletions(-)

Comments

Michael S. Tsirkin April 19, 2019, 11:12 p.m. UTC | #1
On Tue, Apr 16, 2019 at 02:46:22PM -0400, Dan Streetman wrote:
> From: Dan Streetman <ddstreet@canonical.com>
> 
> Buglink: https://launchpad.net/bugs/1823458

Cc Maxime.


> This is a race condition between the normal shutdown of a guest
> and the handling of its vhost-user net being externally closed.
> It's explained in more detail at the bug link; the short version
> is that there are 2 problems, fixed by the 2 patches.  The first
> patch fixes the race condition where multiple threads call
> vhost_net_stop(), and the second patch prevents vhost-user from
> calling vhost_net_cleanup() on CHR_EVENT_CLOSED, because it will
> be cleaned up later and its fields will be accessed when
> vhost_net_stop() is called later.
> 
> As explained in the bug report, this requires a rather complicated
> setup to reproduce, and I'm not able to create a setup to reproduce
> it myself.  However this has been reported to me/Canonical, and the
> reporter is able to reproduce it consistently, so I've used them for
> debug and testing.  This reproduction was done with the older 2.5
> qemu, from Ubuntu Xenial; but the problem does still appear to exist
> in upstream qemu, based on review of the code, which is why I'm sending
> these patches.
> 
> Dan Streetman (2):
>   add VirtIONet vhost_stopped flag to prevent multiple stops
>   do not call vhost_net_cleanup() on running net from char user event
> 
>  hw/net/virtio-net.c            | 3 ++-
>  include/hw/virtio/virtio-net.h | 1 +
>  net/vhost-user.c               | 1 -
>  3 files changed, 3 insertions(+), 2 deletions(-)
> 
> -- 
> 2.20.1