[vhost,00/22] virtio-net: support AF_XDP zero copy

Message ID 20231011092728.105904-1-xuanzhuo@linux.alibaba.com (mailing list archive)

Message

Xuan Zhuo Oct. 11, 2023, 9:27 a.m. UTC
## AF_XDP

XDP socket (AF_XDP) is an excellent kernel-bypass network framework. The zero
copy feature of xsk (XDP socket) needs to be supported by the driver. The
performance of zero copy is very good. mlx5 and Intel ixgbe already support
this feature. This patch set allows virtio-net to support xsk's zerocopy xmit
feature.

At present, we have completed some preparation:

1. vq-reset (virtio spec and kernel code)
2. virtio-core premapped dma
3. virtio-net xdp refactor

So it is time for virtio-net to complete its support for XDP socket
zerocopy.

Virtio-net cannot increase the number of queues at will, so xsk shares the
queues with the kernel.

On the other hand, virtio-net does not support generating an interrupt manually
from the driver, so we use a trick to wake up TX xmit. If the CPU that last ran
TX NAPI is a different CPU, we use an IPI to wake up NAPI on that remote CPU. If
it is the local CPU, we wake up NAPI directly.
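
The wakeup decision described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions: the names
`tx_queue_model`, `last_napi_cpu`, and `choose_wake_action` are hypothetical
and are not taken from the patches, and the real driver would schedule NAPI or
send an IPI instead of returning an enum.

```c
#include <assert.h>

/* Hypothetical model of the TX wakeup choice; not the driver's code. */
enum wake_action {
    WAKE_LOCAL_NAPI, /* wake up NAPI directly on the current CPU */
    WAKE_REMOTE_IPI, /* send an IPI to the CPU that last ran TX NAPI */
};

/* Assumed per-queue state: the CPU that most recently ran TX NAPI. */
struct tx_queue_model {
    int last_napi_cpu;
};

/* Pick the wakeup path for a TX queue, given the CPU requesting the wakeup. */
static enum wake_action choose_wake_action(const struct tx_queue_model *sq,
                                           int current_cpu)
{
    assert(sq != 0);
    assert(current_cpu >= 0);

    /* TX NAPI last ran elsewhere: wake it up there through an IPI. */
    if (sq->last_napi_cpu != current_cpu)
        return WAKE_REMOTE_IPI;

    /* TX NAPI last ran here: no IPI is needed. */
    return WAKE_LOCAL_NAPI;
}
```

In this model, a queue whose TX NAPI last ran on CPU 2 yields `WAKE_LOCAL_NAPI`
when the wakeup is requested from CPU 2 and `WAKE_REMOTE_IPI` when it is
requested from any other CPU.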

This patch set includes some refactoring of virtio-net to let it support
AF_XDP.

## performance

ENV: QEMU with vhost-user (polling mode).

Sockperf: https://github.com/Mellanox/sockperf
I use this tool to send UDP packets via kernel syscalls.

xmit command: sockperf tp -i 10.0.3.1 -t 1000

I wrote a tool that sends or receives UDP packets via AF_XDP.

                  | Guest APP CPU | Guest Softirq CPU | UDP PPS
------------------|---------------|-------------------|-----------
xmit by syscall   | 100%          |                   |   676,915
xmit by xsk       | 59.1%         | 100%              | 5,447,168
recv by syscall   | 60%           | 100%              |   932,288
recv by xsk       | 35.7%         | 100%              | 3,343,168
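
To put the PPS columns in relative terms, the ratio of xsk throughput to
syscall throughput can be computed as below. This is a minimal sketch: the
constants are copied from the table above, and the helper name `pps_ratio` is
hypothetical.

```c
#include <assert.h>

/* UDP PPS figures copied from the table above. */
#define XMIT_SYSCALL_PPS 676915.0
#define XMIT_XSK_PPS     5447168.0
#define RECV_SYSCALL_PPS 932288.0
#define RECV_XSK_PPS     3343168.0

/* Ratio of xsk throughput to syscall throughput (hypothetical helper). */
static double pps_ratio(double syscall_pps, double xsk_pps)
{
    assert(syscall_pps > 0.0); /* guard against dividing by zero */
    return xsk_pps / syscall_pps;
}
```

With these figures, `pps_ratio(XMIT_SYSCALL_PPS, XMIT_XSK_PPS)` is roughly 8.0
and `pps_ratio(RECV_SYSCALL_PPS, RECV_XSK_PPS)` is roughly 3.6. The ratio only
compares packet rates; it does not account for the differing CPU usage shown in
the other columns.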

## maintain

I am currently a reviewer for virtio-net. I commit to maintaining AF_XDP support
in virtio-net.

Please review.

Thanks.
Xuan Zhuo (22):
  virtio_ring: virtqueue_set_dma_premapped support disable
  virtio_ring: introduce virtqueue_dma_[un]map_page_attrs
  virtio_net: rename free_old_xmit_skbs to free_old_xmit
  virtio_net: unify the code for recycling the xmit ptr
  virtio_net: independent directory
  virtio_net: move to virtio_net.h
  virtio_net: add prefix virtnet to all struct/api inside virtio_net.h
  virtio_net: virtnet_poll_tx support rescheduled
  virtio_net: separate virtnet_rx_resize()
  virtio_net: separate virtnet_tx_resize()
  virtio_net: sq support premapped mode
  virtio_net: xsk: bind/unbind xsk
  virtio_net: xsk: prevent disable tx napi
  virtio_net: xsk: tx: support tx
  virtio_net: xsk: tx: support wakeup
  virtio_net: xsk: tx: virtnet_free_old_xmit() distinguishes xsk buffer
  virtio_net: xsk: tx: virtnet_sq_free_unused_buf() check xsk buffer
  virtio_net: xsk: rx: introduce add_recvbuf_xsk()
  virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer
  virtio_net: xsk: rx: virtnet_rq_free_unused_buf() check xsk buffer
  virtio_net: update tx timeout record
  virtio_net: xdp_features add NETDEV_XDP_ACT_XSK_ZEROCOPY

 MAINTAINERS                                 |   2 +-
 drivers/net/Kconfig                         |   8 +-
 drivers/net/Makefile                        |   2 +-
 drivers/net/virtio/Kconfig                  |  13 +
 drivers/net/virtio/Makefile                 |   8 +
 drivers/net/{virtio_net.c => virtio/main.c} | 644 +++++++++-----------
 drivers/net/virtio/virtio_net.h             | 360 +++++++++++
 drivers/net/virtio/xsk.c                    | 545 +++++++++++++++++
 drivers/net/virtio/xsk.h                    |  32 +
 drivers/virtio/virtio_ring.c                |  63 +-
 include/linux/virtio.h                      |   9 +-
 11 files changed, 1315 insertions(+), 371 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
 create mode 100644 drivers/net/virtio/virtio_net.h
 create mode 100644 drivers/net/virtio/xsk.c
 create mode 100644 drivers/net/virtio/xsk.h

--
2.32.0.3.g01195cf9f

Comments

Jakub Kicinski Oct. 11, 2023, 5 p.m. UTC | #1
On Wed, 11 Oct 2023 17:27:06 +0800 Xuan Zhuo wrote:
> ## AF_XDP
> 
> XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> copy feature of xsk (XDP socket) needs to be supported by the driver. The
> performance of zero copy is very good. mlx5 and intel ixgbe already support
> this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> feature.

You're moving the driver and adding a major feature.
This really needs to go via net or bpf.
If you have dependencies in other trees please wait for
after the merge window.
Xuan Zhuo Oct. 12, 2023, 1:53 a.m. UTC | #2
On Wed, 11 Oct 2023 10:00:57 -0700, Jakub Kicinski <kuba@kernel.org> wrote:
> On Wed, 11 Oct 2023 17:27:06 +0800 Xuan Zhuo wrote:
> > ## AF_XDP
> >
> > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > feature.
>
> You're moving the driver and adding a major feature.
> This really needs to go via net or bpf.
> If you have dependencies in other trees please wait for
> after the merge window.


If so, I can remove the first two commits.

Then, the sq uses the premapped mode by default.
And we can use the API virtqueue_dma_map_single_attrs to replace
virtqueue_dma_map_page_attrs.

And then I will fix that on top.

Hi Michael and Jason, is that ok for you?

Thanks.
Jason Wang Oct. 12, 2023, 7:50 a.m. UTC | #3
On Thu, Oct 12, 2023 at 9:58 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
>
> On Wed, 11 Oct 2023 10:00:57 -0700, Jakub Kicinski <kuba@kernel.org> wrote:
> > On Wed, 11 Oct 2023 17:27:06 +0800 Xuan Zhuo wrote:
> > > ## AF_XDP
> > >
> > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > feature.
> >
> > You're moving the driver and adding a major feature.
> > This really needs to go via net or bpf.
> > If you have dependencies in other trees please wait for
> > after the merge window.
>
>
> If so, I can remove the first two commits.
>
> Then, the sq uses the premapped mode by default.
> And we can use the api virtqueue_dma_map_single_attrs to replace the
> virtqueue_dma_map_page_attrs.
>
> And then I will fix that on the top.
>
> Hi Michael and Jason, is that ok for you?

I would go with what looks easy for you but I think Jakub wants the
series to go with net-next (this is what we did in the past for
networking-specific features that are done in virtio-net). So we need
to tweak the prefix to use net-next instead of vhost.

Thanks


>
> Thanks.
>
Xuan Zhuo Oct. 12, 2023, 8:32 a.m. UTC | #4
On Thu, 12 Oct 2023 15:50:13 +0800, Jason Wang <jasowang@redhat.com> wrote:
> On Thu, Oct 12, 2023 at 9:58 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> >
> > On Wed, 11 Oct 2023 10:00:57 -0700, Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Wed, 11 Oct 2023 17:27:06 +0800 Xuan Zhuo wrote:
> > > > ## AF_XDP
> > > >
> > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > feature.
> > >
> > > You're moving the driver and adding a major feature.
> > > This really needs to go via net or bpf.
> > > If you have dependencies in other trees please wait for
> > > after the merge window.
> >
> >
> > If so, I can remove the first two commits.
> >
> > Then, the sq uses the premapped mode by default.
> > And we can use the api virtqueue_dma_map_single_attrs to replace the
> > virtqueue_dma_map_page_attrs.
> >
> > And then I will fix that on the top.
> >
> > Hi Michael and Jason, is that ok for you?
>
> I would go with what looks easy for you but I think Jakub wants the
> > series to go with net-next (this is what we did in the past for
> networking specific features that is done in virtio-net). So we need
> to tweak the prefix to use net-next instead of vhost.

OK.

I will fix that in the next version.

Thanks.

>
> Thanks
>
>
> >
> > Thanks.
> >
>
Michael S. Tsirkin Oct. 12, 2023, 2:50 p.m. UTC | #5
On Thu, Oct 12, 2023 at 04:32:40PM +0800, Xuan Zhuo wrote:
> On Thu, 12 Oct 2023 15:50:13 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > On Thu, Oct 12, 2023 at 9:58 AM Xuan Zhuo <xuanzhuo@linux.alibaba.com> wrote:
> > >
> > > On Wed, 11 Oct 2023 10:00:57 -0700, Jakub Kicinski <kuba@kernel.org> wrote:
> > > > On Wed, 11 Oct 2023 17:27:06 +0800 Xuan Zhuo wrote:
> > > > > ## AF_XDP
> > > > >
> > > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > > feature.
> > > >
> > > > You're moving the driver and adding a major feature.
> > > > This really needs to go via net or bpf.
> > > > If you have dependencies in other trees please wait for
> > > > after the merge window.
> > >
> > >
> > > If so, I can remove the first two commits.
> > >
> > > Then, the sq uses the premapped mode by default.
> > > And we can use the api virtqueue_dma_map_single_attrs to replace the
> > > virtqueue_dma_map_page_attrs.
> > >
> > > And then I will fix that on the top.
> > >
> > > Hi Michael and Jason, is that ok for you?
> >
> > I would go with what looks easy for you but I think Jakub wants the
> > > series to go with net-next (this is what we did in the past for
> > networking specific features that is done in virtio-net). So we need
> > to tweak the prefix to use net-next instead of vhost.
> 
> OK.
> 
> I will fix that in next version.
> 
> Thanks.

Scaling scope back as far as possible is a good idea generally.
I am not sure how this will work though. Let's see.

> >
> > Thanks
> >
> >
> > >
> > > Thanks.
> > >
> >