diff mbox series

hw/char/virtio-serial-bus: fix: Unpop throttled VirtQueueElement to queue before discard vq data

Message ID 918AA6D7-F6C6-4A0E-8E65-6C7ECF148304@kingsoft.com (mailing list archive)
State New, archived
Headers show
Series hw/char/virtio-serial-bus: fix: Unpop throttled VirtQueueElement to queue before discard vq data | expand

Commit Message

Ports enter a "throttled" state when writing to the chardev would block.
The current output VirtQueueElement is kept around until the chardev
becomes writable again.

Because closing the virtio serial device does not reset the queue, we cannot
directly discard this element,  otherwise the control variables of the front
and back ends of the queue are inconsistent such as used_index. We should unpop the
VirtQueueElement to queue, let discard_vq_data process it.

The test environment:
kernel: linux-5.12
Qemu command:
Qemu-system-x86 -machine pc,accel=kvm \
    -cpu host,host-phys-bits \
    -smp 4 \
    -m 4G \
    -kernel ./kernel \
    -display none \
    -nodefaults \
    -serial mon:stdio \
    -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
    -drive id=os,file=./disk,if=none \
    -device virtio-blk-pci,drive=os \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
    -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0

full up virtio queue after VM started:
Cat /large-file > /dev/vport1p1

Host side:
Open and close character device sockets repeatedly

After awhile we can’t write any request to /dev/vport1p1 at VM side, VM kernel soft lockup at drivers/char/virtio_console.c: __send_to_port


Signed-off-by: Arafatms <aierpatijiang1@kingsoft.com>

Comments

Stefan Hajnoczi Aug. 2, 2021, 10 a.m. UTC | #1
On Fri, Jul 30, 2021 at 01:58:41AM +0000, AIERPATIJIANG1 [艾尔帕提江·阿布都赛买提] wrote:
> Ports enter a "throttled" state when writing to the chardev would block.
> The current output VirtQueueElement is kept around until the chardev
> becomes writable again.
> 
> Because closing the virtio serial device does not reset the queue, we cannot
> directly discard this element,  otherwise the control variables of the front
> and back ends of the queue are inconsistent such as used_index. We should unpop the
> VirtQueueElement to queue, let discard_vq_data process it.

Laurent: Ping

> 
> The test environment:
> kernel: linux-5.12
> Qemu command:
> Qemu-system-x86 -machine pc,accel=kvm \
>     -cpu host,host-phys-bits \
>     -smp 4 \
>     -m 4G \
>     -kernel ./kernel \
>     -display none \
>     -nodefaults \
>     -serial mon:stdio \
>     -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
>     -drive id=os,file=./disk,if=none \
>     -device virtio-blk-pci,drive=os \
>     -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
>     -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
>   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> 
> full up virtio queue after VM started:
> Cat /large-file > /dev/vport1p1
> 
> Host side:
> Open and close character device sockets repeatedly
> 
> After awhile we can’t write any request to /dev/vport1p1 at VM side, VM kernel soft lockup at drivers/char/virtio_console.c: __send_to_port
> 
> 
> Signed-off-by: Arafatms <aierpatijiang1@kingsoft.com>
> 
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index dd6bc27b3b..36236defdf 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -150,8 +150,12 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
> 
> static void discard_throttle_data(VirtIOSerialPort *port)
> {
> +    if (!virtio_queue_ready(port->ovq)) {
> +        return;
> +    }
> +
>      if (port->elem) {
> -        virtqueue_detach_element(port->ovq, port->elem, 0);
> +        virtqueue_unpop(port->ovq, port->elem, 0);
>          g_free(port->elem);
>          port->elem = NULL;
>      }
> 
> 
>
Laurent Vivier Aug. 2, 2021, 3:01 p.m. UTC | #2
On 30/07/2021 03:58, AIERPATIJIANG1 [艾尔帕提江·阿布都赛买提] wrote:
> Ports enter a "throttled" state when writing to the chardev would block.
> The current output VirtQueueElement is kept around until the chardev
> becomes writable again.
> 
> Because closing the virtio serial device does not reset the queue, we cannot
> directly discard this element,  otherwise the control variables of the front
> and back ends of the queue are inconsistent such as used_index. We should unpop the
> VirtQueueElement to queue, let discard_vq_data process it.
> 
> The test environment:
> kernel: linux-5.12
> Qemu command:
> Qemu-system-x86 -machine pc,accel=kvm \
>     -cpu host,host-phys-bits \
>     -smp 4 \
>     -m 4G \
>     -kernel ./kernel \
>     -display none \
>     -nodefaults \
>     -serial mon:stdio \
>     -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
>     -drive id=os,file=./disk,if=none \
>     -device virtio-blk-pci,drive=os \
>     -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
>     -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
>   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> 
> full up virtio queue after VM started:
> Cat /large-file > /dev/vport1p1
> 
> Host side:
> Open and close character device sockets repeatedly
> 
> After awhile we can’t write any request to /dev/vport1p1 at VM side, VM kernel soft lockup at drivers/char/virtio_console.c: __send_to_port
> 
> 
> Signed-off-by: Arafatms <aierpatijiang1@kingsoft.com>
> 
> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> index dd6bc27b3b..36236defdf 100644
> --- a/hw/char/virtio-serial-bus.c
> +++ b/hw/char/virtio-serial-bus.c
> @@ -150,8 +150,12 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
> 
> static void discard_throttle_data(VirtIOSerialPort *port)
> {
> +    if (!virtio_queue_ready(port->ovq)) {
> +        return;
> +    }

Why do we need to check virtio_queue_ready() now?

> +
>      if (port->elem) {
> -        virtqueue_detach_element(port->ovq, port->elem, 0);
> +        virtqueue_unpop(port->ovq, port->elem, 0);
>          g_free(port->elem);
>          port->elem = NULL;
>      }
> 

It seems the problem you report can only happen when the port is closed from the host side
(because from the guest side I guess the cleanup is done by the guest driver).

Stefan, you have introduced the function discard_throttle_data() in:

  d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() call")

do you remember why you prefered to use virtqueue_detach_element() rather than
virtqueue_unpop() (or virtqueue_discard() at this time, see 27e57efe32f5 ("virtio: rename
virtqueue_discard to virtqueue_unpop"))

Thanks,
Laurent
在 2021/8/2 下午11:01,“Laurent Vivier”<lvivier@redhat.com> 写入:

    On 30/07/2021 03:58, AIERPATIJIANG1 [艾尔帕提江·阿布都赛买提] wrote:
    >> Ports enter a "throttled" state when writing to the chardev would block.
    >> The current output VirtQueueElement is kept around until the chardev
    >> becomes writable again.
    >> 
    >> Because closing the virtio serial device does not reset the queue, we cannot
    >> directly discard this element,  otherwise the control variables of the front
    >> and back ends of the queue are inconsistent such as used_index. We should unpop the
    >> VirtQueueElement to queue, let discard_vq_data process it.
    >> 
    >> The test environment:
    >> kernel: linux-5.12
    >> Qemu command:
    >> Qemu-system-x86 -machine pc,accel=kvm \
    >>     -cpu host,host-phys-bits \
    >>     -smp 4 \
    >>     -m 4G \
    >>     -kernel ./kernel \
    >>     -display none \
    >>     -nodefaults \
    >>     -serial mon:stdio \
    >>     -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
    >>     -drive id=os,file=./disk,if=none \
    >>     -device virtio-blk-pci,drive=os \
    >>     -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
    >>     -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
    >>   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
    >> 
    >> full up virtio queue after VM started:
    >> Cat /large-file > /dev/vport1p1
    >> 
    >> Host side:
    >> Open and close character device sockets repeatedly
    >> 
    >> After awhile we can’t write any request to /dev/vport1p1 at VM side, VM kernel soft lockup at drivers/char/virtio_console.c: __send_to_port
    >> 
    >> 
    >> Signed-off-by: Arafatms <aierpatijiang1@kingsoft.com>
    >> 
    >> diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
    >> index dd6bc27b3b..36236defdf 100644
    >> --- a/hw/char/virtio-serial-bus.c
    >> +++ b/hw/char/virtio-serial-bus.c
    >> @@ -150,8 +150,12 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
    >> 
    >> static void discard_throttle_data(VirtIOSerialPort *port)
    >> {
    >> +    if (!virtio_queue_ready(port->ovq)) {
    >> +        return;
    >> +    }

    >Why do we need to check virtio_queue_ready() now?

	I think we should check virtio_queue_ready before every operation of virt_queues.

    >> +
    >>      if (port->elem) {
    >> -        virtqueue_detach_element(port->ovq, port->elem, 0);
    >> +        virtqueue_unpop(port->ovq, port->elem, 0);
    >>          g_free(port->elem);
    >>          port->elem = NULL;
    >>      }
    >> 

    > It seems the problem you report can only happen when the port is closed from the host side
    > (because from the guest side I guess the cleanup is done by the guest driver).

	Yes, this problem can only happen when the port is closed from the host side. 

    > Stefan, you have introduced the function discard_throttle_data() in:

    >  d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() call")

    > do you remember why you prefered to use virtqueue_detach_element() rather than
    > virtqueue_unpop() (or virtqueue_discard() at this time, see 27e57efe32f5 ("virtio: rename
    > virtqueue_discard to virtqueue_unpop"))

    > Thanks,
    > Laurent

	I observed all the situations where discard_throttle_data is called, The result only points to one situation, that is, the virtio device is
	damaged(driver will reset all queues). About virtio-serial device, close devise from host side doesn't mean the driver reset all queues,
	so we can't just detach the throttled elem.

	Thanks,
	Arafatms
Stefan Hajnoczi Aug. 4, 2021, 3:34 p.m. UTC | #4
On Mon, Aug 02, 2021 at 05:01:14PM +0200, Laurent Vivier wrote:
> On 30/07/2021 03:58, AIERPATIJIANG1 [艾尔帕提江·阿布都赛买提] wrote:
> > Ports enter a "throttled" state when writing to the chardev would block.
> > The current output VirtQueueElement is kept around until the chardev
> > becomes writable again.
> > 
> > Because closing the virtio serial device does not reset the queue, we cannot
> > directly discard this element,  otherwise the control variables of the front
> > and back ends of the queue are inconsistent such as used_index. We should unpop the
> > VirtQueueElement to queue, let discard_vq_data process it.
> > 
> > The test environment:
> > kernel: linux-5.12
> > Qemu command:
> > Qemu-system-x86 -machine pc,accel=kvm \
> >     -cpu host,host-phys-bits \
> >     -smp 4 \
> >     -m 4G \
> >     -kernel ./kernel \
> >     -display none \
> >     -nodefaults \
> >     -serial mon:stdio \
> >     -append "panic=1 no_timer_check noreplace-smp rootflags=data=ordered rootfstype=ext4 console=ttyS0 reboot=k root=/dev/vda1 rw" \
> >     -drive id=os,file=./disk,if=none \
> >     -device virtio-blk-pci,drive=os \
> >     -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 \
> >     -chardev socket,id=charchannel0,path=/tmp/char-dev-test,server,nowait \
> >   -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
> > 
> > full up virtio queue after VM started:
> > Cat /large-file > /dev/vport1p1
> > 
> > Host side:
> > Open and close character device sockets repeatedly
> > 
> > After awhile we can’t write any request to /dev/vport1p1 at VM side, VM kernel soft lockup at drivers/char/virtio_console.c: __send_to_port
> > 
> > 
> > Signed-off-by: Arafatms <aierpatijiang1@kingsoft.com>
> > 
> > diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
> > index dd6bc27b3b..36236defdf 100644
> > --- a/hw/char/virtio-serial-bus.c
> > +++ b/hw/char/virtio-serial-bus.c
> > @@ -150,8 +150,12 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
> > 
> > static void discard_throttle_data(VirtIOSerialPort *port)
> > {
> > +    if (!virtio_queue_ready(port->ovq)) {
> > +        return;
> > +    }
> 
> Why do we need to check virtio_queue_ready() now?

I'm not sure why this is necessary either. We don't need the vring to be
functional, we're just trying to clean up our internal state.

> > +
> >      if (port->elem) {
> > -        virtqueue_detach_element(port->ovq, port->elem, 0);
> > +        virtqueue_unpop(port->ovq, port->elem, 0);
> >          g_free(port->elem);
> >          port->elem = NULL;
> >      }
> > 
> 
> It seems the problem you report can only happen when the port is closed from the host side
> (because from the guest side I guess the cleanup is done by the guest driver).
> 
> Stefan, you have introduced the function discard_throttle_data() in:
> 
>   d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() call")
> 
> do you remember why you prefered to use virtqueue_detach_element() rather than
> virtqueue_unpop() (or virtqueue_discard() at this time, see 27e57efe32f5 ("virtio: rename
> virtqueue_discard to virtqueue_unpop"))

I don't remember. virtqueue_unpop() cannot handle out-of-order element
processing, but it seems safe to use here since port->elem is the last
element we pop off ovq.

It's probably just a bug in my commit d4c19cdeeb2f. I may have thought
the virtqueue is never touched again after reset (true), close (false),
and remove port (false?).

Please add:

  Fixes: d4c19cdeeb2f ("virtio-serial: add missing virtio_detach_element() call")

Stefan
diff mbox series

Patch

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index dd6bc27b3b..36236defdf 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -150,8 +150,12 @@  static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)

static void discard_throttle_data(VirtIOSerialPort *port)
{
+    if (!virtio_queue_ready(port->ovq)) {
+        return;
+    }
+
     if (port->elem) {
-        virtqueue_detach_element(port->ovq, port->elem, 0);
+        virtqueue_unpop(port->ovq, port->elem, 0);
         g_free(port->elem);
         port->elem = NULL;
     }