
[PATCH for-2.11] virtio-net: don't touch virtqueue if vm is stopped

Message ID 1511408266-4139-1-git-send-email-jasowang@redhat.com (mailing list archive)
State New, archived

Commit Message

Jason Wang Nov. 23, 2017, 3:37 a.m. UTC
Guest state should not be touched if the VM is stopped. Unfortunately, we
didn't check the running state and tried to drain the tx queue
unconditionally in virtio_net_set_status(). A crash was then observed on a
migration destination when the user typed quit after the virtqueue state
was loaded but before the region cache was initialized. In this case,
virtio_net_drop_tx_queue_data() tries to access the uninitialized
region cache.

Fix this by only dropping tx queue data when the VM is running.
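For reference, the guarded path in virtio_net_set_status() ends up looking
roughly like this (abridged sketch; see the full diff at the bottom of this
page for the exact change):

  if ((n->status & VIRTIO_NET_S_LINK_UP) == 0 &&
      (queue_status & VIRTIO_CONFIG_S_DRIVER_OK) &&
      vdev->vm_running) {
      /* The VM is running, so the virtqueue and its region cache are
       * fully set up and it is safe to drain pending tx data. */
      q->tx_waiting = 0;
      /* ... */
      virtio_net_drop_tx_queue_data(vdev, q->tx_vq);
  }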

Fixes: 283e2c2adcb80 ("net: virtio-net discards TX data after link down")
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/virtio-net.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Stefan Hajnoczi Nov. 23, 2017, 10:59 a.m. UTC | #1
On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
> Guest state should not be touched if VM is stopped, unfortunately we
> didn't check running state and tried to drain tx queue unconditionally
> in virtio_net_set_status(). A crash was then noticed as a migration
> destination when user type quit after virtqueue state is loaded but
> before region cache is initialized. In this case,
> virtio_net_drop_tx_queue_data() tries to access the uninitialized
> region cache.
> 
> Fix this by only dropping tx queue data when vm is running.

hw/virtio/virtio.c:virtio_load() does the following:

  for (i = 0; i < num; i++) {
      if (vdev->vq[i].vring.desc) {
          uint16_t nheads;

          /*
           * VIRTIO-1 devices migrate desc, used, and avail ring addresses so
           * only the region cache needs to be set up.  Legacy devices need
           * to calculate used and avail ring addresses based on the desc
           * address.
           */
          if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
              virtio_init_region_cache(vdev, i);
          } else {
              virtio_queue_update_rings(vdev, i);
          }

So the region caches should be initialized after virtqueue state is
loaded.

It's unclear to me which code path triggers this issue.  Can you add a
backtrace or an explanation?

Thanks,
Stefan
Jason Wang Nov. 24, 2017, 2:57 a.m. UTC | #2
On 2017年11月23日 18:59, Stefan Hajnoczi wrote:
> On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
>> Guest state should not be touched if VM is stopped, unfortunately we
>> didn't check running state and tried to drain tx queue unconditionally
>> in virtio_net_set_status(). A crash was then noticed as a migration
>> destination when user type quit after virtqueue state is loaded but
>> before region cache is initialized. In this case,
>> virtio_net_drop_tx_queue_data() tries to access the uninitialized
>> region cache.
>>
>> Fix this by only dropping tx queue data when vm is running.
> hw/virtio/virtio.c:virtio_load() does the following:
>
>    for (i = 0; i < num; i++) {
>        if (vdev->vq[i].vring.desc) {
>            uint16_t nheads;
>
>            /*
>             * VIRTIO-1 devices migrate desc, used, and avail ring addresses so
>             * only the region cache needs to be set up.  Legacy devices need
>             * to calculate used and avail ring addresses based on the desc
>             * address.
>             */
>            if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
>                virtio_init_region_cache(vdev, i);
>            } else {
>                virtio_queue_update_rings(vdev, i);
>            }
>
> So the region caches should be initialized after virtqueue state is
> loaded.
>
> It's unclear to me which code path triggers this issue.  Can you add a
> backtrace or an explanation?
>
> Thanks,
> Stefan

The migration coroutine yielded before the region cache was initialized.
The backtrace looks like:

#0  qio_channel_yield (ioc=0x55555758d000, condition=G_IO_IN) at io/channel.c:432
#1  0x0000555555b209c3 in channel_get_buffer (opaque=0x55555758d000, buf=0x555556f7c048 "", pos=1657701385, size=32768) at migration/qemu-file-channel.c:83
#2  0x0000555555b1f4a3 in qemu_fill_buffer (f=0x555556f7c010) at migration/qemu-file.c:293
#3  0x0000555555b1fd42 in qemu_peek_byte (f=0x555556f7c010, offset=0) at migration/qemu-file.c:553
#4  0x0000555555b1fd94 in qemu_get_byte (f=0x555556f7c010) at migration/qemu-file.c:566
#5  0x0000555555b1ffef in qemu_get_be32 (f=0x555556f7c010) at migration/qemu-file.c:646
#6  0x0000555555b1d002 in qemu_get_be32s (f=0x555556f7c010, pv=0x555557d6578c) at /home/devel/git/qemu/include/migration/qemu-file-types.h:78
#7  0x0000555555b1da0d in get_uint32 (f=0x555556f7c010, pv=0x555557d6578c, size=4, field=0x555556503ea0 <__compound_literal.0+416>) at migration/vmstate-types.c:241
#8  0x0000555555b1c0b5 in vmstate_load_state (f=0x555556f7c010, vmsd=0x555556375b60 <vmstate_virtio_pci_modern_queue_state>, opaque=0x555557d6577c, version_id=1) at migration/vmstate.c:140
#9  0x0000555555b1c090 in vmstate_load_state (f=0x555556f7c010, vmsd=0x555556375bc0 <vmstate_virtio_pci_modern_state_sub>, opaque=0x555557d604a0, version_id=1) at migration/vmstate.c:137
#10 0x0000555555b1cce5 in vmstate_subsection_load (f=0x555556f7c010, vmsd=0x555556375c20 <vmstate_virtio_pci>, opaque=0x555557d604a0) at migration/vmstate.c:453
#11 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010, vmsd=0x555556375c20 <vmstate_virtio_pci>, opaque=0x555557d604a0, version_id=1) at migration/vmstate.c:160
#12 0x0000555555b00aa0 in virtio_pci_load_extra_state (d=0x555557d604a0, f=0x555556f7c010) at hw/virtio/virtio-pci.c:161
#13 0x00005555558651ab in get_extra_state (f=0x555556f7c010, pv=0x555557d68610, size=0, field=0x5555563cd0e0 <__compound_literal.4>) at /home/devel/git/qemu/hw/virtio/virtio.c:1808
#14 0x0000555555b1c0b5 in vmstate_load_state (f=0x555556f7c010, vmsd=0x5555562b7fe0 <vmstate_virtio_extra_state>, opaque=0x555557d68610, version_id=1) at migration/vmstate.c:140
#15 0x0000555555b1cce5 in vmstate_subsection_load (f=0x555556f7c010, vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610) at migration/vmstate.c:453
#16 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010, vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610, version_id=1) at migration/vmstate.c:160
#17 0x0000555555865cc3 in virtio_load (vdev=0x555557d68610, f=0x555556f7c010, version_id=11) at /home/devel/git/qemu/hw/virtio/virtio.c:2110
#18 0x0000555555865705 in virtio_device_get (f=0x555556f7c010, opaque=0x555557d68610, size=0, field=0x5555563ca4a0 <__compound_literal.5>) at /home/devel/git/qemu/hw/virtio/virtio.c:1974
#19 0x0000555555b1c0b5 in vmstate_load_state (f=0x555556f7c010, vmsd=0x5555562b6ae0 <vmstate_virtio_net>, opaque=0x555557d68610, version_id=11) at migration/vmstate.c:140
#20 0x0000555555b15d20 in vmstate_load (f=0x555556f7c010, se=0x555557e81400) at migration/savevm.c:748
#21 0x0000555555b1827b in qemu_loadvm_section_start_full (f=0x555556f7c010, mis=0x555556564a40 <mis_current>) at migration/savevm.c:1903
#22 0x0000555555b185d5 in qemu_loadvm_state_main (f=0x555556f7c010, mis=0x555556564a40 <mis_current>) at migration/savevm.c:1998
#23 0x0000555555b187fc in qemu_loadvm_state (f=0x555556f7c010) at migration/savevm.c:2077
#24 0x0000555555b0c0af in process_incoming_migration_co (opaque=0x0) at migration/migration.c:327
#25 0x0000555555cc1d06 in coroutine_trampoline (i0=1469933216, i1=21845) at util/coroutine-ucontext.c:79
#26 0x00007ffff39d95d0 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#27 0x00007fffffffd2f0 in ?? ()
#28 0x0000000000000000 in ?? ()
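In short, the sequence is (a simplified outline based on the backtrace and
the commit message; the exact path from the monitor to the status callback
is not spelled out in the thread):

  /* Incoming migration coroutine (process_incoming_migration_co):
   *   virtio_load()
   *     -> vmstate_load_state() / vmstate_subsection_load()
   *       -> qemu_get_be32() -> qemu_fill_buffer()
   *         -> qio_channel_yield()            <-- coroutine yields here
   *   virtio_init_region_cache() would only run later, near the end
   *   of virtio_load().
   *
   * Main loop, while the coroutine is still yielded:
   *   user types "quit" -> device status update
   *     -> virtio_net_set_status()
   *       -> virtio_net_drop_tx_queue_data()
   *         -> touches the not-yet-initialized region cache -> crash
   */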
Stefan Hajnoczi Nov. 24, 2017, 10:44 a.m. UTC | #3
On Fri, Nov 24, 2017 at 10:57:11AM +0800, Jason Wang wrote:
> On 2017年11月23日 18:59, Stefan Hajnoczi wrote:
> > On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
> > > Guest state should not be touched if VM is stopped, unfortunately we
> > > didn't check running state and tried to drain tx queue unconditionally
> > > in virtio_net_set_status(). A crash was then noticed as a migration
> > > destination when user type quit after virtqueue state is loaded but
> > > before region cache is initialized. In this case,
> > > virtio_net_drop_tx_queue_data() tries to access the uninitialized
> > > region cache.
> > > 
> > > Fix this by only dropping tx queue data when vm is running.
> > hw/virtio/virtio.c:virtio_load() does the following:
> > 
> >    for (i = 0; i < num; i++) {
> >        if (vdev->vq[i].vring.desc) {
> >            uint16_t nheads;
> > 
> >            /*
> >             * VIRTIO-1 devices migrate desc, used, and avail ring addresses so
> >             * only the region cache needs to be set up.  Legacy devices need
> >             * to calculate used and avail ring addresses based on the desc
> >             * address.
> >             */
> >            if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> >                virtio_init_region_cache(vdev, i);
> >            } else {
> >                virtio_queue_update_rings(vdev, i);
> >            }
> > 
> > So the region caches should be initialized after virtqueue state is
> > loaded.
> > 
> > It's unclear to me which code path triggers this issue.  Can you add a
> > backtrace or an explanation?
> > 
> > Thanks,
> > Stefan
> 
> Migration coroutine was yield before region cache was initialized. The
> backtrace looks like:
[...]
> #16 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010,
> vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610, version_id=1)
>     at migration/vmstate.c:160
> #17 0x0000555555865cc3 in virtio_load (vdev=0x555557d68610,
> f=0x555556f7c010, version_id=11) at
> /home/devel/git/qemu/hw/virtio/virtio.c:2110

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Thanks for the backtrace!  Your patch is fine but I have a larger
concern:

The backtrace shows that the virtio code is re-entrant during savevm
load.  That's probably a bad thing because set_status() and other APIs
are probably not intended to run while we are half-way through savevm
load.  The virtqueue is only partially set up at this point :(.  I
wonder if a more general cleanup is necessary to avoid problems like
this in the future...

Stefan
Jason Wang Nov. 27, 2017, 3:26 a.m. UTC | #4
On 2017年11月24日 18:44, Stefan Hajnoczi wrote:
> On Fri, Nov 24, 2017 at 10:57:11AM +0800, Jason Wang wrote:
>> On 2017年11月23日 18:59, Stefan Hajnoczi wrote:
>>> On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
>>>> Guest state should not be touched if VM is stopped, unfortunately we
>>>> didn't check running state and tried to drain tx queue unconditionally
>>>> in virtio_net_set_status(). A crash was then noticed as a migration
>>>> destination when user type quit after virtqueue state is loaded but
>>>> before region cache is initialized. In this case,
>>>> virtio_net_drop_tx_queue_data() tries to access the uninitialized
>>>> region cache.
>>>>
>>>> Fix this by only dropping tx queue data when vm is running.
>>> hw/virtio/virtio.c:virtio_load() does the following:
>>>
>>>     for (i = 0; i < num; i++) {
>>>         if (vdev->vq[i].vring.desc) {
>>>             uint16_t nheads;
>>>
>>>             /*
>>>              * VIRTIO-1 devices migrate desc, used, and avail ring addresses so
>>>              * only the region cache needs to be set up.  Legacy devices need
>>>              * to calculate used and avail ring addresses based on the desc
>>>              * address.
>>>              */
>>>             if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
>>>                 virtio_init_region_cache(vdev, i);
>>>             } else {
>>>                 virtio_queue_update_rings(vdev, i);
>>>             }
>>>
>>> So the region caches should be initialized after virtqueue state is
>>> loaded.
>>>
>>> It's unclear to me which code path triggers this issue.  Can you add a
>>> backtrace or an explanation?
>>>
>>> Thanks,
>>> Stefan
>> Migration coroutine was yield before region cache was initialized. The
>> backtrace looks like:
> [...]
>> #16 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010,
>> vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610, version_id=1)
>>      at migration/vmstate.c:160
>> #17 0x0000555555865cc3 in virtio_load (vdev=0x555557d68610,
>> f=0x555556f7c010, version_id=11) at
>> /home/devel/git/qemu/hw/virtio/virtio.c:2110
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> Thanks for the backtrace!  Your patch is fine but I have a larger
> concern:
>
> The backtrace shows that the virtio code is re-entrant during savevm
> load.  That's probably a bad thing because set_status() and other APIs
> are probably not intended to run while we are half-way through savevm
> load.  The virtqueue is only partially set up at this point :(.  I
> wonder if a more general cleanup is necessary to avoid problems like
> this in the future...
>
> Stefan

Yes, this needs some thought. One idea is to guarantee the atomicity of 
the virtio state and not expose partial state, but that looks like it 
needs a lot of changes.
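A purely hypothetical sketch of that direction (the 'loading' flag and the
deferral below are invented for illustration and are not existing QEMU
fields or APIs): hold back guest-visible status side effects while a load
is in flight and apply them only once the loaded state is complete.

  /* Hypothetical illustration only; 'loading', 'deferred_status' and
   * 'has_deferred_status' are invented names, not part of QEMU. */
  void virtio_set_status_checked(VirtIODevice *vdev, uint8_t status)
  {
      if (vdev->loading) {
          /* Migration load in progress: the virtqueue state may be
           * partial, so remember the request instead of acting on it. */
          vdev->deferred_status = status;
          vdev->has_deferred_status = true;
          return;
      }
      virtio_set_status(vdev, status);   /* normal path */
  }

  /* At the end of virtio_load(), once the region caches are set up: */
  void virtio_load_done(VirtIODevice *vdev)
  {
      vdev->loading = false;
      if (vdev->has_deferred_status) {
          virtio_set_status(vdev, vdev->deferred_status);
          vdev->has_deferred_status = false;
      }
  }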

Anyway, I will apply this patch first.

Thanks
Stefan Hajnoczi Nov. 27, 2017, 11:19 a.m. UTC | #5
On Mon, Nov 27, 2017 at 11:26:34AM +0800, Jason Wang wrote:
> 
> 
> On 2017年11月24日 18:44, Stefan Hajnoczi wrote:
> > On Fri, Nov 24, 2017 at 10:57:11AM +0800, Jason Wang wrote:
> > > On 2017年11月23日 18:59, Stefan Hajnoczi wrote:
> > > > On Thu, Nov 23, 2017 at 11:37:46AM +0800, Jason Wang wrote:
> > > > > Guest state should not be touched if VM is stopped, unfortunately we
> > > > > didn't check running state and tried to drain tx queue unconditionally
> > > > > in virtio_net_set_status(). A crash was then noticed as a migration
> > > > > destination when user type quit after virtqueue state is loaded but
> > > > > before region cache is initialized. In this case,
> > > > > virtio_net_drop_tx_queue_data() tries to access the uninitialized
> > > > > region cache.
> > > > > 
> > > > > Fix this by only dropping tx queue data when vm is running.
> > > > hw/virtio/virtio.c:virtio_load() does the following:
> > > > 
> > > >     for (i = 0; i < num; i++) {
> > > >         if (vdev->vq[i].vring.desc) {
> > > >             uint16_t nheads;
> > > > 
> > > >             /*
> > > >              * VIRTIO-1 devices migrate desc, used, and avail ring addresses so
> > > >              * only the region cache needs to be set up.  Legacy devices need
> > > >              * to calculate used and avail ring addresses based on the desc
> > > >              * address.
> > > >              */
> > > >             if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
> > > >                 virtio_init_region_cache(vdev, i);
> > > >             } else {
> > > >                 virtio_queue_update_rings(vdev, i);
> > > >             }
> > > > 
> > > > So the region caches should be initialized after virtqueue state is
> > > > loaded.
> > > > 
> > > > It's unclear to me which code path triggers this issue.  Can you add a
> > > > backtrace or an explanation?
> > > > 
> > > > Thanks,
> > > > Stefan
> > > Migration coroutine was yield before region cache was initialized. The
> > > backtrace looks like:
> > [...]
> > > #16 0x0000555555b1c199 in vmstate_load_state (f=0x555556f7c010,
> > > vmsd=0x5555562b8160 <vmstate_virtio>, opaque=0x555557d68610, version_id=1)
> > >      at migration/vmstate.c:160
> > > #17 0x0000555555865cc3 in virtio_load (vdev=0x555557d68610,
> > > f=0x555556f7c010, version_id=11) at
> > > /home/devel/git/qemu/hw/virtio/virtio.c:2110
> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> > 
> > Thanks for the backtrace!  Your patch is fine but I have a larger
> > concern:
> > 
> > The backtrace shows that the virtio code is re-entrant during savevm
> > load.  That's probably a bad thing because set_status() and other APIs
> > are probably not intended to run while we are half-way through savevm
> > load.  The virtqueue is only partially set up at this point :(.  I
> > wonder if a more general cleanup is necessary to avoid problems like
> > this in the future...
> > 
> > Stefan
> 
> Yes, this needs some thought. An idea is to guarantee the atomicity of the
> virtio state and don't expose partial state. But looks like this needs lots
> of changes.
> 
> Anyway, I will apply this patch first.

Sounds good, thanks!

Stefan

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 150fd07..38674b0 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -288,7 +288,8 @@  static void virtio_net_set_status(struct VirtIODevice *vdev, uint8_t status)
                 qemu_bh_cancel(q->tx_bh);
             }
             if ((n->status & VIRTIO_NET_S_LINK_UP) == 0 &&
-                (queue_status & VIRTIO_CONFIG_S_DRIVER_OK)) {
+                (queue_status & VIRTIO_CONFIG_S_DRIVER_OK) &&
+                vdev->vm_running) {
                 /* if tx is waiting we are likely have some packets in tx queue
                  * and disabled notification */
                 q->tx_waiting = 0;