Message ID | 20230128071724.33677-3-xuanzhuo@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Series | virtio: fix for assertion failure: virtio_net_get_subqueue(nc)->async_tx.elem failed |
On Sat, Jan 28, 2023 at 03:17:23PM +0800, Xuan Zhuo wrote:
> In the current design, we stop the device from operating on the vring
> during per-queue reset by resetting the structure VirtQueue.
>
> But before the reset operation, when recycling some resources, we should
> stop referencing new vring resources. For example, when recycling
> virtio-net's asynchronous sending resources, virtio-net should be able
> to perceive that the current queue is in the per-queue reset state, and
> stop sending new packets from the tx queue.
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  hw/virtio/virtio.c         | 8 ++++++++
>  include/hw/virtio/virtio.h | 3 +++
>  2 files changed, 11 insertions(+)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 03077b2ecf..907d5b8bde 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -2030,6 +2030,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
>  {
>      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
>
> +    /*
> +     * Mark this queue is per-queue reset status. The device should release the
> +     * references of the vring, and not refer more new vring item.
> +     */
> +    vdev->vq[queue_index].reset = true;
> +
>      if (k->queue_reset) {
>          k->queue_reset(vdev, queue_index);
>      }
> @@ -2053,6 +2059,8 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
>      }
>      */
>
> +    vdev->vq[queue_index].reset = false;
> +
>      if (k->queue_enable) {
>          k->queue_enable(vdev, queue_index);
>      }
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index 1c0d77c670..b888538d09 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -251,6 +251,9 @@ struct VirtQueue {
>      /* Notification enabled? */
>      bool notification;
>
> +    /* Per-Queue Reset status */
> +    bool reset;
> +
>      uint16_t queue_index;
>

Reset state makes no sense. It seems to imply queue_reset
in the spec. And for extra fun there's "reset" in the pci
proxy which means "virtio_queue_reset is in progress" - I have no
idea what uses it though - it is not guest visible. First what is it?
It actually means "queue has been reset and has not been enabled since".
So disabled_by_reset maybe?

Second this hack helps make the change minimal
so it's helpful for stable, but it's ugly in that it
duplicates the reverse of enabled value - we don't really
care what disabled it in practice.

With the fixups above I can apply so it's easier to backport, but later
a patch on top should clean it all up, perhaps by adding
"enabled" in VirtQueue. We should also get rid of "reset" in the proxy
unless there's some way it's useful which I don't currently see.

>      unsigned int inuse;
> --
> 2.32.0.3.g01195cf9f
On Sat, 28 Jan 2023 05:22:05 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Sat, Jan 28, 2023 at 03:17:23PM +0800, Xuan Zhuo wrote:
> > In the current design, we stop the device from operating on the vring
> > during per-queue reset by resetting the structure VirtQueue.
> >
> > But before the reset operation, when recycling some resources, we should
> > stop referencing new vring resources. For example, when recycling
> > virtio-net's asynchronous sending resources, virtio-net should be able
> > to perceive that the current queue is in the per-queue reset state, and
> > stop sending new packets from the tx queue.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > ---
> >  hw/virtio/virtio.c         | 8 ++++++++
> >  include/hw/virtio/virtio.h | 3 +++
> >  2 files changed, 11 insertions(+)
> >
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 03077b2ecf..907d5b8bde 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -2030,6 +2030,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> >  {
> >      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> >
> > +    /*
> > +     * Mark this queue is per-queue reset status. The device should release the
> > +     * references of the vring, and not refer more new vring item.
> > +     */
> > +    vdev->vq[queue_index].reset = true;
> > +
> >      if (k->queue_reset) {
> >          k->queue_reset(vdev, queue_index);
> >      }
> > @@ -2053,6 +2059,8 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
> >      }
> >      */
> >
> > +    vdev->vq[queue_index].reset = false;
> > +
> >      if (k->queue_enable) {
> >          k->queue_enable(vdev, queue_index);
> >      }
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index 1c0d77c670..b888538d09 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -251,6 +251,9 @@ struct VirtQueue {
> >      /* Notification enabled? */
> >      bool notification;
> >
> > +    /* Per-Queue Reset status */
> > +    bool reset;
> > +
> >      uint16_t queue_index;
> >
>
> Reset state makes no sense. It seems to imply queue_reset
> in the spec. And for extra fun there's "reset" in the pci
> proxy which means "virtio_queue_reset is in progress" - I have no
> idea what uses it though - it is not guest visible. First what is it?
> It actually means "queue has been reset and has not been enabled since".
> So disabled_by_reset maybe?

In fact, when reading this, the queue has not been reset,
so prepare_for_reset?

>
> Second this hack helps make the change minimal
> so it's helpful for stable, but it's ugly in that it
> duplicates the reverse of enabled value - we don't really
> care what disabled it in practice.
>
> With the fixups above I can apply so it's easier to backport, but later
> a patch on top should clean it all up, perhaps by adding
> "enabled" in VirtQueue. We should also get rid of "reset" in the proxy
> unless there's some way it's useful which I don't currently see.
>

I have some confusion, I don't understand what you mean.

Why did we remove the "reset" in the proxy?

I agree to rename the "reset".

Thanks.

> >
> >      unsigned int inuse;
> > --
> > 2.32.0.3.g01195cf9f
>
On Sat, Jan 28, 2023 at 06:41:09PM +0800, Xuan Zhuo wrote:
> On Sat, 28 Jan 2023 05:22:05 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Sat, Jan 28, 2023 at 03:17:23PM +0800, Xuan Zhuo wrote:
> > > In the current design, we stop the device from operating on the vring
> > > during per-queue reset by resetting the structure VirtQueue.
> > >
> > > But before the reset operation, when recycling some resources, we should
> > > stop referencing new vring resources. For example, when recycling
> > > virtio-net's asynchronous sending resources, virtio-net should be able
> > > to perceive that the current queue is in the per-queue reset state, and
> > > stop sending new packets from the tx queue.
> > >
> > > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > > ---
> > >  hw/virtio/virtio.c         | 8 ++++++++
> > >  include/hw/virtio/virtio.h | 3 +++
> > >  2 files changed, 11 insertions(+)
> > >
> > > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > > index 03077b2ecf..907d5b8bde 100644
> > > --- a/hw/virtio/virtio.c
> > > +++ b/hw/virtio/virtio.c
> > > @@ -2030,6 +2030,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
> > >  {
> > >      VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
> > >
> > > +    /*
> > > +     * Mark this queue is per-queue reset status. The device should release the
> > > +     * references of the vring, and not refer more new vring item.
> > > +     */
> > > +    vdev->vq[queue_index].reset = true;
> > > +
> > >      if (k->queue_reset) {
> > >          k->queue_reset(vdev, queue_index);
> > >      }
> > > @@ -2053,6 +2059,8 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
> > >      }
> > >      */
> > >
> > > +    vdev->vq[queue_index].reset = false;
> > > +
> > >      if (k->queue_enable) {
> > >          k->queue_enable(vdev, queue_index);
> > >      }
> > > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > > index 1c0d77c670..b888538d09 100644
> > > --- a/include/hw/virtio/virtio.h
> > > +++ b/include/hw/virtio/virtio.h
> > > @@ -251,6 +251,9 @@ struct VirtQueue {
> > >      /* Notification enabled? */
> > >      bool notification;
> > >
> > > +    /* Per-Queue Reset status */
> > > +    bool reset;
> > > +
> > >      uint16_t queue_index;
> > >
> >
> > Reset state makes no sense. It seems to imply queue_reset
> > in the spec. And for extra fun there's "reset" in the pci
> > proxy which means "virtio_queue_reset is in progress" - I have no
> > idea what uses it though - it is not guest visible. First what is it?
> > It actually means "queue has been reset and has not been enabled since".
> > So disabled_by_reset maybe?
>
>
> In fact, when reading this, the queue has not been reset,
> so prepare_for_reset?

Makes it sound like it's some kind of temporary state where it is not -
it will stay like this until enabled.
As this makes no practical difference that it is set too early,
just set it later for consistency.

> >
> > Second this hack helps make the change minimal
> > so it's helpful for stable, but it's ugly in that it
> > duplicates the reverse of enabled value - we don't really
> > care what disabled it in practice.
> >
> > With the fixups above I can apply so it's easier to backport, but later
> > a patch on top should clean it all up, perhaps by adding
> > "enabled" in VirtQueue. We should also get rid of "reset" in the proxy
> > unless there's some way it's useful which I don't currently see.
> >
>
> I have some confusion, I don't understand what you mean.
>
> Why did we remove the "reset" in the proxy?

We did not but we should.
Why should we remove "reset" in the proxy?
Because guest can never read it as != 0:

    case VIRTIO_PCI_COMMON_Q_RESET:
        if (val == 1) {
            proxy->vqs[vdev->queue_sel].reset = 1;

            virtio_queue_reset(vdev, vdev->queue_sel);

            proxy->vqs[vdev->queue_sel].reset = 0;
            proxy->vqs[vdev->queue_sel].enabled = 0;
        }
        break;

from guest's POV reset is atomic and so does not need
a variable to track state.

> I agree to rename the "reset".
>
> Thanks.
>
> > >
> > >      unsigned int inuse;
> > > --
> > > 2.32.0.3.g01195cf9f
> >
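The point about the proxy field can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not QEMU code: every proxy_model_* name is hypothetical, the read accessor is assumed to simply return the stored field, and guest accesses are assumed to be handled one at a time (no read can be served while a write handler is still running).

```c
#include <assert.h>
#include <stdint.h>

#define PROXY_MODEL_QUEUE_MAX 4 /* hypothetical queue count for the model */

/* Hypothetical stand-in for the per-queue state kept in the PCI proxy. */
struct proxy_model_vq {
    uint16_t enabled;
    uint16_t reset;
};

struct proxy_model {
    struct proxy_model_vq vqs[PROXY_MODEL_QUEUE_MAX];
    uint16_t queue_sel;   /* queue currently selected by the guest */
    unsigned reset_calls; /* counts how often the core reset ran */
};

/* Stand-in for virtio_queue_reset(): runs while the proxy flag is 1. */
static void proxy_model_queue_reset(struct proxy_model *p, uint16_t idx)
{
    assert(idx < PROXY_MODEL_QUEUE_MAX);
    /* The flag is visible here, inside the handler, but not to the guest. */
    assert(p->vqs[idx].reset == 1);
    p->reset_calls++;
}

/* Mirrors the shape of the VIRTIO_PCI_COMMON_Q_RESET write case quoted above. */
static void proxy_model_write_q_reset(struct proxy_model *p, uint32_t val)
{
    if (val == 1) {
        p->vqs[p->queue_sel].reset = 1;
        proxy_model_queue_reset(p, p->queue_sel);
        p->vqs[p->queue_sel].reset = 0;
        p->vqs[p->queue_sel].enabled = 0;
    }
}

/* Assumed read side: returns whatever is stored in the proxy field. */
static uint32_t proxy_model_read_q_reset(const struct proxy_model *p)
{
    return p->vqs[p->queue_sel].reset;
}
```

In this model the flag is set and cleared inside a single write handler, so any read issued after the write returns 0. That is the sense in which the reset is atomic from the guest's point of view.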
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 03077b2ecf..907d5b8bde 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2030,6 +2030,12 @@ void virtio_queue_reset(VirtIODevice *vdev, uint32_t queue_index)
 {
     VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
 
+    /*
+     * Mark this queue is per-queue reset status. The device should release the
+     * references of the vring, and not refer more new vring item.
+     */
+    vdev->vq[queue_index].reset = true;
+
     if (k->queue_reset) {
         k->queue_reset(vdev, queue_index);
     }
@@ -2053,6 +2059,8 @@ void virtio_queue_enable(VirtIODevice *vdev, uint32_t queue_index)
     }
     */
 
+    vdev->vq[queue_index].reset = false;
+
     if (k->queue_enable) {
         k->queue_enable(vdev, queue_index);
     }
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 1c0d77c670..b888538d09 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -251,6 +251,9 @@ struct VirtQueue {
     /* Notification enabled? */
     bool notification;
 
+    /* Per-Queue Reset status */
+    bool reset;
+
     uint16_t queue_index;
 
     unsigned int inuse;
In the current design, we stop the device from operating on the vring
during per-queue reset by resetting the structure VirtQueue.

But before the reset operation, when recycling some resources, we should
stop referencing new vring resources. For example, when recycling
virtio-net's asynchronous sending resources, virtio-net should be able
to perceive that the current queue is in the per-queue reset state, and
stop sending new packets from the tx queue.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 hw/virtio/virtio.c         | 8 ++++++++
 include/hw/virtio/virtio.h | 3 +++
 2 files changed, 11 insertions(+)
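
The behaviour described in the commit message can be sketched as a standalone model. This is an illustration under stated assumptions, not the virtio-net implementation: the vq_model_* names, the pending/sent counters and the flush routine are all hypothetical. Only the idea of a per-queue flag that is set before the device's reset callback runs and cleared on enable comes from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a tx queue; only "reset" mirrors the patch. */
struct vq_model {
    bool reset;       /* set for the duration of per-queue reset, until enable */
    unsigned pending; /* packets still waiting in the ring (model only) */
    unsigned sent;    /* packets handed to the backend (model only) */
};

/*
 * Hypothetical tx flush: returns the number of packets sent.
 * While the queue is marked as being reset it must not pick up
 * any new ring entries, so it returns without touching the ring.
 */
static unsigned vq_model_tx_flush(struct vq_model *vq)
{
    unsigned n = 0;

    if (vq->reset) {
        return 0;
    }
    while (vq->pending > 0) {
        vq->pending--;
        vq->sent++;
        n++;
    }
    return n;
}

/* Models the order in the patch: mark first, then let the device recycle. */
static void vq_model_queue_reset(struct vq_model *vq)
{
    assert(vq != 0);
    vq->reset = true;
    /*
     * Device callback stand-in: recycling async tx resources would
     * normally try to flush more packets; the flag makes that a no-op.
     */
    (void)vq_model_tx_flush(vq);
    /* Finally the ring state itself is discarded. */
    vq->pending = 0;
}

static void vq_model_queue_enable(struct vq_model *vq)
{
    vq->reset = false;
}
```

The check is placed at the top of the flush routine so that every caller, including completion paths that run during recycling, sees the same answer. Whether the real device should test this flag or a general "enabled" state is exactly the naming question raised in the review.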