[v4,0/6] virtio,vhost: Add VIRTIO_F_IN_ORDER support

Message ID: 20240710125522.4168043-1-jonah.palmer@oracle.com

Jonah Palmer July 10, 2024, 12:55 p.m. UTC
The goal of these patches is to add support for the VIRTIO_F_IN_ORDER
transport feature to a variety of virtio and vhost devices. This feature
indicates that all buffers are used by the device in the same order in
which they were made available by the driver.
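
For reference, the virtio specification assigns this feature bit number
35. A minimal sketch of the definition (the exact header it lives in
within QEMU is an assumption here):

/* In-order use of buffers: the device uses buffers in the same order
 * in which they were made available (virtio spec, feature bit 35). */
#define VIRTIO_F_IN_ORDER 35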

These patches attempt to implement a generalized, non-device-specific
solution to support this feature.

The core of this solution is a buffering mechanism built around each
VirtQueue's used_elems VirtQueueElement array. Devices that always use
buffers in order by default incur only minimal overhead from it. Devices
that may use buffers out of order will likely see a performance hit, the
size of which depends on how frequently elements are completed
out-of-order.

A VirtQueue whose device uses this feature stores used
VirtQueueElements in its used_elems array. Each used element is placed
at the index in used_elems that matches its in-order position on the
used/descriptor ring. In other words, used elements are staged at their
in-order locations in used_elems and are only written to the
used/descriptor ring once every element ahead of them has also been
used. For example, if elements A, B, and C are made available in that
order and the device completes C first, C is staged in used_elems but
is not written to the ring until A and B have also been used.

To differentiate between a "used" and an "unused" element in the
used_elems array (a "used" element being one that has returned from
processing and an "unused" element being one that has not yet been
processed), we added a boolean 'in_order_filled' member to the
VirtQueueElement struct. This flag is set to true when the element comes
back from processing (virtqueue_ordered_fill) and set back to false once
the element has been written to the used/descriptor ring
(virtqueue_ordered_flush).
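
To illustrate the mechanism, here is a minimal, self-contained sketch
(not the actual QEMU code; the types, names, and ring handling below
are simplified assumptions):

#include <stdbool.h>
#include <stdio.h>

#define QUEUE_SIZE 4

/* Simplified stand-in for VirtQueueElement. */
typedef struct {
    unsigned int index;    /* in-order position assigned at pop time */
    bool in_order_filled;  /* completed, but not yet written to the ring */
} Elem;

static Elem used_elems[QUEUE_SIZE]; /* in-order staging area */
static unsigned int used_idx;       /* next in-order slot to flush */

/* Fill step (cf. virtqueue_ordered_fill): stage a completed element
 * at its in-order slot instead of writing the used ring directly. */
static void ordered_fill(unsigned int in_order_idx)
{
    Elem *e = &used_elems[in_order_idx % QUEUE_SIZE];
    e->index = in_order_idx;
    e->in_order_filled = true;
}

/* Flush step (cf. virtqueue_ordered_flush): write out every
 * consecutive staged element, stopping at the first gap. */
static void ordered_flush(void)
{
    while (used_elems[used_idx % QUEUE_SIZE].in_order_filled) {
        printf("write element %u to the used ring\n",
               used_elems[used_idx % QUEUE_SIZE].index);
        used_elems[used_idx % QUEUE_SIZE].in_order_filled = false;
        used_idx++;
    }
}

int main(void)
{
    ordered_fill(1);  /* element 1 completes out of order */
    ordered_flush();  /* flushes nothing: element 0 is still pending */
    ordered_fill(0);  /* element 0 completes */
    ordered_flush();  /* now flushes elements 0 and 1, in order */
    return 0;
}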

Testing:
========

Testing was done using the dpdk-testpmd application in the guest under
the following configurations. A bridge and two TAP devices were
configured on the host to create a loopback environment where packets
transmitted from one interface are received on the other, and vice
versa. After starting the dpdk-testpmd application in the guest, the
testpmd command 'start tx_first' was executed to begin network traffic.
Traffic ran for 30 seconds before the 'stop' command was executed.

Relevant QEMU args:
-------------------
Note: both 'packed=true' and 'packed=false' were tested.

-netdev tap,id=net1,vhost=off,ifname=tap1
-netdev tap,id=net2,vhost=off,ifname=tap2
-device virtio-net-pci,in_order=true,packed=true,netdev=net1,addr=0x2
-device virtio-net-pci,in_order=true,packed=true,netdev=net2,addr=0x3

Loopback environment on host:
-----------------------------
BRIDGE=virbrDPDK
ip link add name $BRIDGE type bridge
ip link set dev $BRIDGE up
ip link add dev tap1 type tap
ip link set dev tap1 up
ip link set dev tap1 master $BRIDGE
ip link add dev tap2 type tap
ip link set dev tap2 up
ip link set dev tap2 master $BRIDGE

dpdk-testpmd command (guest):
-----------------------------
dpdk-testpmd -l 0-3 -n 4 -a 0000:00:02.0 -a 0000:00:03.0 -- -i
--port-topology=chained --forward-mode=io --stats-period 1 --burst=1

Results:
--------
After running 'start tx_first' and then, 30 seconds later, 'stop' at
the testpmd command line:

split-VQ in-order:

---------------------- Forward statistics for port 0  ---------------
RX-packets: 408154         RX-dropped: 0             RX-total: 408154
TX-packets: 408174         TX-dropped: 0             TX-total: 408174
---------------------------------------------------------------------

---------------------- Forward statistics for port 1  ---------------
RX-packets: 408173         RX-dropped: 0             RX-total: 408173
TX-packets: 408155         TX-dropped: 0             TX-total: 408155
---------------------------------------------------------------------

+++++++++++++++ Accumulated forward statistics for all ports+++++++++
RX-packets: 816327         RX-dropped: 0             RX-total: 816327
TX-packets: 816329         TX-dropped: 0             TX-total: 816329
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

packed-VQ in-order:

---------------------- Forward statistics for port 0  ---------------
RX-packets: 414808         RX-dropped: 0             RX-total: 414808
TX-packets: 414822         TX-dropped: 0             TX-total: 414822
---------------------------------------------------------------------

---------------------- Forward statistics for port 1  ---------------
RX-packets: 414821         RX-dropped: 0             RX-total: 414821
TX-packets: 414809         TX-dropped: 0             TX-total: 414809
---------------------------------------------------------------------

+++++++++++++++ Accumulated forward statistics for all ports+++++++++
RX-packets: 829629         RX-dropped: 0             RX-total: 829629
TX-packets: 829631         TX-dropped: 0             TX-total: 829631
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

---
v4: Prevent used_elems overflow in virtqueue_split_pop.
    Don't keep used_idx bound between 0 and vring.num-1 for split VQs.
    Fix incrementing used_elems index 'i' in virtqueue_ordered_flush.
    Ensure all previous write ops to buffers are completed before
    updating the used_idx, via smp_wmb() (see the sketch after this
    changelog entry).
    Use virtio-net instead of vhost-user devices for testing.
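
The smp_wmb() item above follows the usual virtio publish pattern: the
used-ring writes must be visible before the driver can observe the new
used index. A minimal sketch (the helper names and simplified ring
layout are assumptions; __atomic_thread_fence stands in for QEMU's
smp_wmb()):

#include <stdint.h>

struct used_ring {
    uint16_t idx;      /* index published to the driver */
    uint16_t ring[4];  /* simplified used-element slots */
};

static inline void wmb_sketch(void)
{
    __atomic_thread_fence(__ATOMIC_RELEASE); /* stand-in for smp_wmb() */
}

static void publish_used(struct used_ring *used, uint16_t slot,
                         uint16_t id, uint16_t new_idx)
{
    used->ring[slot] = id; /* write the used element first */
    wmb_sketch();          /* order: element write before index write */
    used->idx = new_idx;   /* the driver may now consume the element */
}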

v3: Drop Tested-by tags until patches are re-tested.
    Replace 'prev_avail_idx' with 'vq->last_avail_idx - 1' in
    virtqueue_split_pop.
    Remove redundant '+vq->vring.num' in 'max_steps' calculation in
    virtqueue_ordered_fill.
    Add test results to the cover letter.

v2: Make 'in_order_filled' more descriptive.
    Change 'j' to more descriptive var name in virtqueue_split_pop.
    Use more definitive search conditional in virtqueue_ordered_fill.
    Avoid code duplication in virtqueue_ordered_flush.

v1: Move series from RFC to PATCH for submission.

Jonah Palmer (6):
  virtio: Add bool to VirtQueueElement
  virtio: virtqueue_pop - VIRTIO_F_IN_ORDER support
  virtio: virtqueue_ordered_fill - VIRTIO_F_IN_ORDER support
  virtio: virtqueue_ordered_flush - VIRTIO_F_IN_ORDER support
  vhost,vhost-user: Add VIRTIO_F_IN_ORDER to vhost feature bits
  virtio: Add VIRTIO_F_IN_ORDER property definition

 hw/block/vhost-user-blk.c    |   1 +
 hw/net/vhost_net.c           |   2 +
 hw/scsi/vhost-scsi.c         |   1 +
 hw/scsi/vhost-user-scsi.c    |   1 +
 hw/virtio/vhost-user-fs.c    |   1 +
 hw/virtio/vhost-user-vsock.c |   1 +
 hw/virtio/virtio.c           | 130 ++++++++++++++++++++++++++++++++++-
 include/hw/virtio/virtio.h   |   6 +-
 net/vhost-vdpa.c             |   1 +
 9 files changed, 140 insertions(+), 4 deletions(-)