mbox series

[00/33] virtio-net: support AF_XDP zero copy

Message ID 20230202110058.130695-1-xuanzhuo@linux.alibaba.com (mailing list archive)
Headers show
Series virtio-net: support AF_XDP zero copy | expand

Message

Xuan Zhuo Feb. 2, 2023, 11 a.m. UTC
XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
copy feature of xsk (XDP socket) needs to be supported by the driver. The
performance of zero copy is very good. mlx5 and intel ixgbe already support
this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
feature.

Virtio-net did not support per-queue reset, so it was impossible to support XDP
Socket Zerocopy. At present, we have completed the work of Virtio Spec and
Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
the XDP Socket Zerocopy.

Virtio-net can not increase the queue at will, so xsk shares the queue with
kernel.

On the other hand, Virtio-Net does not support generate interrupt manually, so
when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
local CPU, then we wake up sofrirqd.

Please review.

Thanks.


Xuan Zhuo (33):
  virtio_ring: virtqueue_add() support premapped
  virtio_ring: split: virtqueue_add_split() support premapped
  virtio_ring: packed: virtqueue_add_packed() support premapped
  virtio_ring: introduce virtqueue_add_outbuf_premapped()
  virtio_ring: introduce virtqueue_add_inbuf_premapped()
  virtio_ring: introduce virtqueue_reset()
  virtio_ring: add api virtio_dma_map() for advance dma
  virtio_ring: introduce dma sync api for virtio
  xsk: xsk_buff_pool add callback for dma_sync
  xsk: support virtio DMA map
  virtio_net: rename free_old_xmit_skbs to free_old_xmit
  virtio_net: unify the code for recycling the xmit ptr
  virtio_net: virtnet_poll_tx support rescheduled
  virtio_net: independent directory
  virtio_net: move to virtio_net.h
  virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
    run xdp
  virtio_net: receive_small() use virtnet_xdp_handler()
  virtio_net: receive_merageable() use virtnet_xdp_handler()
  virtio_net: introduce virtnet_tx_reset()
  virtio_net: xsk: introduce virtnet_rq_bind_xsk_pool()
  virtio_net: xsk: introduce virtnet_xsk_pool_enable()
  virtio_net: xsk: introduce xsk disable
  virtio_net: xsk: support xsk setup
  virtio_net: xsk: stop disable tx napi
  virtio_net: xsk: __free_old_xmit distinguishes xsk buffer
  virtio_net: virtnet_sq_free_unused_buf() check xsk buffer
  virtio_net: virtnet_rq_free_unused_buf() check xsk buffer
  net: introduce napi_tx_raise()
  virtio_net: xsk: tx: support tx
  virtio_net: xsk: tx: support wakeup
  virtio_net: xsk: tx: auto wakeup when free old xmit
  virtio_net: xsk: rx: introduce add_recvbuf_xsk()
  virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer

 MAINTAINERS                                 |   2 +-
 drivers/net/Kconfig                         |   8 +-
 drivers/net/Makefile                        |   2 +-
 drivers/net/virtio/Kconfig                  |  11 +
 drivers/net/virtio/Makefile                 |   8 +
 drivers/net/{virtio_net.c => virtio/main.c} | 564 +++++++-------------
 drivers/net/virtio/virtio_net.h             | 317 +++++++++++
 drivers/net/virtio/xsk.c                    | 524 ++++++++++++++++++
 drivers/net/virtio/xsk.h                    |  33 ++
 drivers/virtio/virtio_ring.c                | 376 +++++++++++--
 include/linux/netdevice.h                   |   7 +
 include/linux/virtio.h                      |  29 +
 include/net/xsk_buff_pool.h                 |   6 +
 net/core/dev.c                              |  11 +
 net/xdp/xsk_buff_pool.c                     |  79 ++-
 15 files changed, 1541 insertions(+), 436 deletions(-)
 create mode 100644 drivers/net/virtio/Kconfig
 create mode 100644 drivers/net/virtio/Makefile
 rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
 create mode 100644 drivers/net/virtio/virtio_net.h
 create mode 100644 drivers/net/virtio/xsk.c
 create mode 100644 drivers/net/virtio/xsk.h

Comments

Michael S. Tsirkin Feb. 2, 2023, 11:08 a.m. UTC | #1
On Thu, Feb 02, 2023 at 07:00:25PM +0800, Xuan Zhuo wrote:
> XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> copy feature of xsk (XDP socket) needs to be supported by the driver. The
> performance of zero copy is very good.

Great! Any numbers to share?

> mlx5 and intel ixgbe already support
> this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> feature.
> 
> Virtio-net did not support per-queue reset, so it was impossible to support XDP
> Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> the XDP Socket Zerocopy.
> 
> Virtio-net can not increase the queue at will, so xsk shares the queue with
> kernel.
> 
> On the other hand, Virtio-Net does not support generate interrupt manually, so
> when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> local CPU, then we wake up sofrirqd.
> 
> Please review.
> 
> Thanks.
> 
> 
> Xuan Zhuo (33):
>   virtio_ring: virtqueue_add() support premapped
>   virtio_ring: split: virtqueue_add_split() support premapped
>   virtio_ring: packed: virtqueue_add_packed() support premapped
>   virtio_ring: introduce virtqueue_add_outbuf_premapped()
>   virtio_ring: introduce virtqueue_add_inbuf_premapped()
>   virtio_ring: introduce virtqueue_reset()
>   virtio_ring: add api virtio_dma_map() for advance dma
>   virtio_ring: introduce dma sync api for virtio
>   xsk: xsk_buff_pool add callback for dma_sync
>   xsk: support virtio DMA map
>   virtio_net: rename free_old_xmit_skbs to free_old_xmit
>   virtio_net: unify the code for recycling the xmit ptr
>   virtio_net: virtnet_poll_tx support rescheduled
>   virtio_net: independent directory
>   virtio_net: move to virtio_net.h
>   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
>     run xdp
>   virtio_net: receive_small() use virtnet_xdp_handler()
>   virtio_net: receive_merageable() use virtnet_xdp_handler()
>   virtio_net: introduce virtnet_tx_reset()
>   virtio_net: xsk: introduce virtnet_rq_bind_xsk_pool()
>   virtio_net: xsk: introduce virtnet_xsk_pool_enable()
>   virtio_net: xsk: introduce xsk disable
>   virtio_net: xsk: support xsk setup
>   virtio_net: xsk: stop disable tx napi
>   virtio_net: xsk: __free_old_xmit distinguishes xsk buffer
>   virtio_net: virtnet_sq_free_unused_buf() check xsk buffer
>   virtio_net: virtnet_rq_free_unused_buf() check xsk buffer
>   net: introduce napi_tx_raise()
>   virtio_net: xsk: tx: support tx
>   virtio_net: xsk: tx: support wakeup
>   virtio_net: xsk: tx: auto wakeup when free old xmit
>   virtio_net: xsk: rx: introduce add_recvbuf_xsk()
>   virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer
> 
>  MAINTAINERS                                 |   2 +-
>  drivers/net/Kconfig                         |   8 +-
>  drivers/net/Makefile                        |   2 +-
>  drivers/net/virtio/Kconfig                  |  11 +
>  drivers/net/virtio/Makefile                 |   8 +
>  drivers/net/{virtio_net.c => virtio/main.c} | 564 +++++++-------------
>  drivers/net/virtio/virtio_net.h             | 317 +++++++++++
>  drivers/net/virtio/xsk.c                    | 524 ++++++++++++++++++
>  drivers/net/virtio/xsk.h                    |  33 ++
>  drivers/virtio/virtio_ring.c                | 376 +++++++++++--
>  include/linux/netdevice.h                   |   7 +
>  include/linux/virtio.h                      |  29 +
>  include/net/xsk_buff_pool.h                 |   6 +
>  net/core/dev.c                              |  11 +
>  net/xdp/xsk_buff_pool.c                     |  79 ++-
>  15 files changed, 1541 insertions(+), 436 deletions(-)
>  create mode 100644 drivers/net/virtio/Kconfig
>  create mode 100644 drivers/net/virtio/Makefile
>  rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
>  create mode 100644 drivers/net/virtio/virtio_net.h
>  create mode 100644 drivers/net/virtio/xsk.c
>  create mode 100644 drivers/net/virtio/xsk.h
> 
> -- 
> 2.32.0.3.g01195cf9f
Xuan Zhuo Feb. 2, 2023, 11:44 a.m. UTC | #2
On Thu, 2 Feb 2023 06:08:30 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Feb 02, 2023 at 07:00:25PM +0800, Xuan Zhuo wrote:
> > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > performance of zero copy is very good.
>
> Great! Any numbers to share?

RESEND. Last mail has some email format error.

ENV: Qemu with vhost.

                   vhost cpu | Guest APP CPU |Guest Softirq CPU | PPS
-----------------------------|---------------|------------------|------------
xmit by sockperf:     90%    |   100%        |                  |  318967
xmit by xsk:          100%   |   30%         |   33%            | 1192064
recv by sockperf:     100%   |   68%         |   100%           |  692288
recv by xsk:          100%   |   33%         |   43%            |  771670

Thanks.


>
> > mlx5 and intel ixgbe already support
> > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > feature.
> >
> > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > the XDP Socket Zerocopy.
> >
> > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > kernel.
> >
> > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > local CPU, then we wake up sofrirqd.
> >
> > Please review.
> >
> > Thanks.
> >
> >
> > Xuan Zhuo (33):
> >   virtio_ring: virtqueue_add() support premapped
> >   virtio_ring: split: virtqueue_add_split() support premapped
> >   virtio_ring: packed: virtqueue_add_packed() support premapped
> >   virtio_ring: introduce virtqueue_add_outbuf_premapped()
> >   virtio_ring: introduce virtqueue_add_inbuf_premapped()
> >   virtio_ring: introduce virtqueue_reset()
> >   virtio_ring: add api virtio_dma_map() for advance dma
> >   virtio_ring: introduce dma sync api for virtio
> >   xsk: xsk_buff_pool add callback for dma_sync
> >   xsk: support virtio DMA map
> >   virtio_net: rename free_old_xmit_skbs to free_old_xmit
> >   virtio_net: unify the code for recycling the xmit ptr
> >   virtio_net: virtnet_poll_tx support rescheduled
> >   virtio_net: independent directory
> >   virtio_net: move to virtio_net.h
> >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> >     run xdp
> >   virtio_net: receive_small() use virtnet_xdp_handler()
> >   virtio_net: receive_merageable() use virtnet_xdp_handler()
> >   virtio_net: introduce virtnet_tx_reset()
> >   virtio_net: xsk: introduce virtnet_rq_bind_xsk_pool()
> >   virtio_net: xsk: introduce virtnet_xsk_pool_enable()
> >   virtio_net: xsk: introduce xsk disable
> >   virtio_net: xsk: support xsk setup
> >   virtio_net: xsk: stop disable tx napi
> >   virtio_net: xsk: __free_old_xmit distinguishes xsk buffer
> >   virtio_net: virtnet_sq_free_unused_buf() check xsk buffer
> >   virtio_net: virtnet_rq_free_unused_buf() check xsk buffer
> >   net: introduce napi_tx_raise()
> >   virtio_net: xsk: tx: support tx
> >   virtio_net: xsk: tx: support wakeup
> >   virtio_net: xsk: tx: auto wakeup when free old xmit
> >   virtio_net: xsk: rx: introduce add_recvbuf_xsk()
> >   virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer
> >
> >  MAINTAINERS                                 |   2 +-
> >  drivers/net/Kconfig                         |   8 +-
> >  drivers/net/Makefile                        |   2 +-
> >  drivers/net/virtio/Kconfig                  |  11 +
> >  drivers/net/virtio/Makefile                 |   8 +
> >  drivers/net/{virtio_net.c => virtio/main.c} | 564 +++++++-------------
> >  drivers/net/virtio/virtio_net.h             | 317 +++++++++++
> >  drivers/net/virtio/xsk.c                    | 524 ++++++++++++++++++
> >  drivers/net/virtio/xsk.h                    |  33 ++
> >  drivers/virtio/virtio_ring.c                | 376 +++++++++++--
> >  include/linux/netdevice.h                   |   7 +
> >  include/linux/virtio.h                      |  29 +
> >  include/net/xsk_buff_pool.h                 |   6 +
> >  net/core/dev.c                              |  11 +
> >  net/xdp/xsk_buff_pool.c                     |  79 ++-
> >  15 files changed, 1541 insertions(+), 436 deletions(-)
> >  create mode 100644 drivers/net/virtio/Kconfig
> >  create mode 100644 drivers/net/virtio/Makefile
> >  rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
> >  create mode 100644 drivers/net/virtio/virtio_net.h
> >  create mode 100644 drivers/net/virtio/xsk.c
> >  create mode 100644 drivers/net/virtio/xsk.h
> >
> > --
> > 2.32.0.3.g01195cf9f
>
Paolo Abeni Feb. 2, 2023, 2:41 p.m. UTC | #3
On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> copy feature of xsk (XDP socket) needs to be supported by the driver. The
> performance of zero copy is very good. mlx5 and intel ixgbe already support
> this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> feature.
> 
> Virtio-net did not support per-queue reset, so it was impossible to support XDP
> Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> the XDP Socket Zerocopy.
> 
> Virtio-net can not increase the queue at will, so xsk shares the queue with
> kernel.
> 
> On the other hand, Virtio-Net does not support generate interrupt manually, so
> when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> local CPU, then we wake up sofrirqd.

Thank you for the large effort.

Since this will likely need a few iterations, on next revision please
do split the work in multiple chunks to help the reviewer efforts -
from Documentation/process/maintainer-netdev.rst:

 - don't post large series (> 15 patches), break them up

In this case I guess you can split it in 1 (or even 2) pre-req series
and another one for the actual xsk zero copy support.

Thanks!

Paolo
Xuan Zhuo Feb. 3, 2023, 3:33 a.m. UTC | #4
On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > feature.
> >
> > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > the XDP Socket Zerocopy.
> >
> > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > kernel.
> >
> > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > local CPU, then we wake up sofrirqd.
>
> Thank you for the large effort.
>
> Since this will likely need a few iterations, on next revision please
> do split the work in multiple chunks to help the reviewer efforts -
> from Documentation/process/maintainer-netdev.rst:
>
>  - don't post large series (> 15 patches), break them up
>
> In this case I guess you can split it in 1 (or even 2) pre-req series
> and another one for the actual xsk zero copy support.


OK.

I can split patch into multiple parts such as

* virtio core
* xsk
* virtio-net prepare
* virtio-net support xsk zerocopy

However, there is a problem, the virtio core part should enter the VHOST branch
of Michael. Then, should I post follow-up patches to which branch vhost or
next-next?

Thanks.


>
> Thanks!
>
> Paolo
>
Michael S. Tsirkin Feb. 3, 2023, 8:37 a.m. UTC | #5
On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > feature.
> > >
> > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > the XDP Socket Zerocopy.
> > >
> > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > kernel.
> > >
> > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > local CPU, then we wake up sofrirqd.
> >
> > Thank you for the large effort.
> >
> > Since this will likely need a few iterations, on next revision please
> > do split the work in multiple chunks to help the reviewer efforts -
> > from Documentation/process/maintainer-netdev.rst:
> >
> >  - don't post large series (> 15 patches), break them up
> >
> > In this case I guess you can split it in 1 (or even 2) pre-req series
> > and another one for the actual xsk zero copy support.
> 
> 
> OK.
> 
> I can split patch into multiple parts such as
> 
> * virtio core
> * xsk
> * virtio-net prepare
> * virtio-net support xsk zerocopy
> 
> However, there is a problem, the virtio core part should enter the VHOST branch
> of Michael. Then, should I post follow-up patches to which branch vhost or
> next-next?
> 
> Thanks.

I personally think 33 patches is still manageable no need to split.
Do try to be careful and track acks and changes: if someone sends an ack
add it in the patch if you change the patch drop the acks,
and logs this fact in the changelog in the cover letter
so people know they need to re-review.


> 
> >
> > Thanks!
> >
> > Paolo
> >
Maciej Fijalkowski Feb. 3, 2023, 8:46 a.m. UTC | #6
On Fri, Feb 03, 2023 at 03:37:32AM -0500, Michael S. Tsirkin wrote:
> On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> > On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > feature.
> > > >
> > > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > > the XDP Socket Zerocopy.
> > > >
> > > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > > kernel.
> > > >
> > > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > > local CPU, then we wake up sofrirqd.
> > >
> > > Thank you for the large effort.
> > >
> > > Since this will likely need a few iterations, on next revision please
> > > do split the work in multiple chunks to help the reviewer efforts -
> > > from Documentation/process/maintainer-netdev.rst:
> > >
> > >  - don't post large series (> 15 patches), break them up
> > >
> > > In this case I guess you can split it in 1 (or even 2) pre-req series
> > > and another one for the actual xsk zero copy support.
> > 
> > 
> > OK.
> > 
> > I can split patch into multiple parts such as
> > 
> > * virtio core
> > * xsk
> > * virtio-net prepare
> > * virtio-net support xsk zerocopy
> > 
> > However, there is a problem, the virtio core part should enter the VHOST branch
> > of Michael. Then, should I post follow-up patches to which branch vhost or
> > next-next?
> > 
> > Thanks.
> 
> I personally think 33 patches is still manageable no need to split.
> Do try to be careful and track acks and changes: if someone sends an ack
> add it in the patch if you change the patch drop the acks,
> and logs this fact in the changelog in the cover letter
> so people know they need to re-review.

To me some of the patches are too granular but probably this is related to
personal taste. However, I would like to ask to check how this series
affects existing ZC enabled driver(s), since xsk core is touched.

> 
> 
> > 
> > >
> > > Thanks!
> > >
> > > Paolo
> > >
>
Michael S. Tsirkin Feb. 3, 2023, 9:08 a.m. UTC | #7
On Thu, Feb 02, 2023 at 07:44:07PM +0800, Xuan Zhuo wrote:
> On Thu, 2 Feb 2023 06:08:30 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Thu, Feb 02, 2023 at 07:00:25PM +0800, Xuan Zhuo wrote:
> > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > performance of zero copy is very good.
> >
> > Great! Any numbers to share?
> 
> RESEND. Last mail has some email format error.
> 
> ENV: Qemu with vhost.
> 
>                    vhost cpu | Guest APP CPU |Guest Softirq CPU | PPS
> -----------------------------|---------------|------------------|------------
> xmit by sockperf:     90%    |   100%        |                  |  318967
> xmit by xsk:          100%   |   30%         |   33%            | 1192064
> recv by sockperf:     100%   |   68%         |   100%           |  692288
> recv by xsk:          100%   |   33%         |   43%            |  771670
> 
> Thanks.

Impressive, thanks a lot for this work!
Pls remember to retest and include up to date numbers on
subsequent versions.

> 
> >
> > > mlx5 and intel ixgbe already support
> > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > feature.
> > >
> > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > the XDP Socket Zerocopy.
> > >
> > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > kernel.
> > >
> > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > local CPU, then we wake up sofrirqd.
> > >
> > > Please review.
> > >
> > > Thanks.
> > >
> > >
> > > Xuan Zhuo (33):
> > >   virtio_ring: virtqueue_add() support premapped
> > >   virtio_ring: split: virtqueue_add_split() support premapped
> > >   virtio_ring: packed: virtqueue_add_packed() support premapped
> > >   virtio_ring: introduce virtqueue_add_outbuf_premapped()
> > >   virtio_ring: introduce virtqueue_add_inbuf_premapped()
> > >   virtio_ring: introduce virtqueue_reset()
> > >   virtio_ring: add api virtio_dma_map() for advance dma
> > >   virtio_ring: introduce dma sync api for virtio
> > >   xsk: xsk_buff_pool add callback for dma_sync
> > >   xsk: support virtio DMA map
> > >   virtio_net: rename free_old_xmit_skbs to free_old_xmit
> > >   virtio_net: unify the code for recycling the xmit ptr
> > >   virtio_net: virtnet_poll_tx support rescheduled
> > >   virtio_net: independent directory
> > >   virtio_net: move to virtio_net.h
> > >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> > >     run xdp
> > >   virtio_net: receive_small() use virtnet_xdp_handler()
> > >   virtio_net: receive_merageable() use virtnet_xdp_handler()
> > >   virtio_net: introduce virtnet_tx_reset()
> > >   virtio_net: xsk: introduce virtnet_rq_bind_xsk_pool()
> > >   virtio_net: xsk: introduce virtnet_xsk_pool_enable()
> > >   virtio_net: xsk: introduce xsk disable
> > >   virtio_net: xsk: support xsk setup
> > >   virtio_net: xsk: stop disable tx napi
> > >   virtio_net: xsk: __free_old_xmit distinguishes xsk buffer
> > >   virtio_net: virtnet_sq_free_unused_buf() check xsk buffer
> > >   virtio_net: virtnet_rq_free_unused_buf() check xsk buffer
> > >   net: introduce napi_tx_raise()
> > >   virtio_net: xsk: tx: support tx
> > >   virtio_net: xsk: tx: support wakeup
> > >   virtio_net: xsk: tx: auto wakeup when free old xmit
> > >   virtio_net: xsk: rx: introduce add_recvbuf_xsk()
> > >   virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer
> > >
> > >  MAINTAINERS                                 |   2 +-
> > >  drivers/net/Kconfig                         |   8 +-
> > >  drivers/net/Makefile                        |   2 +-
> > >  drivers/net/virtio/Kconfig                  |  11 +
> > >  drivers/net/virtio/Makefile                 |   8 +
> > >  drivers/net/{virtio_net.c => virtio/main.c} | 564 +++++++-------------
> > >  drivers/net/virtio/virtio_net.h             | 317 +++++++++++
> > >  drivers/net/virtio/xsk.c                    | 524 ++++++++++++++++++
> > >  drivers/net/virtio/xsk.h                    |  33 ++
> > >  drivers/virtio/virtio_ring.c                | 376 +++++++++++--
> > >  include/linux/netdevice.h                   |   7 +
> > >  include/linux/virtio.h                      |  29 +
> > >  include/net/xsk_buff_pool.h                 |   6 +
> > >  net/core/dev.c                              |  11 +
> > >  net/xdp/xsk_buff_pool.c                     |  79 ++-
> > >  15 files changed, 1541 insertions(+), 436 deletions(-)
> > >  create mode 100644 drivers/net/virtio/Kconfig
> > >  create mode 100644 drivers/net/virtio/Makefile
> > >  rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
> > >  create mode 100644 drivers/net/virtio/virtio_net.h
> > >  create mode 100644 drivers/net/virtio/xsk.c
> > >  create mode 100644 drivers/net/virtio/xsk.h
> > >
> > > --
> > > 2.32.0.3.g01195cf9f
> >
Michael S. Tsirkin Feb. 3, 2023, 9:09 a.m. UTC | #8
On Fri, Feb 03, 2023 at 09:46:46AM +0100, Maciej Fijalkowski wrote:
> On Fri, Feb 03, 2023 at 03:37:32AM -0500, Michael S. Tsirkin wrote:
> > On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> > > On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > > > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > > feature.
> > > > >
> > > > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > > > the XDP Socket Zerocopy.
> > > > >
> > > > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > > > kernel.
> > > > >
> > > > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > > > local CPU, then we wake up sofrirqd.
> > > >
> > > > Thank you for the large effort.
> > > >
> > > > Since this will likely need a few iterations, on next revision please
> > > > do split the work in multiple chunks to help the reviewer efforts -
> > > > from Documentation/process/maintainer-netdev.rst:
> > > >
> > > >  - don't post large series (> 15 patches), break them up
> > > >
> > > > In this case I guess you can split it in 1 (or even 2) pre-req series
> > > > and another one for the actual xsk zero copy support.
> > > 
> > > 
> > > OK.
> > > 
> > > I can split patch into multiple parts such as
> > > 
> > > * virtio core
> > > * xsk
> > > * virtio-net prepare
> > > * virtio-net support xsk zerocopy
> > > 
> > > However, there is a problem, the virtio core part should enter the VHOST branch
> > > of Michael. Then, should I post follow-up patches to which branch vhost or
> > > next-next?
> > > 
> > > Thanks.
> > 
> > I personally think 33 patches is still manageable no need to split.
> > Do try to be careful and track acks and changes: if someone sends an ack
> > add it in the patch if you change the patch drop the acks,
> > and logs this fact in the changelog in the cover letter
> > so people know they need to re-review.
> 
> To me some of the patches are too granular but probably this is related to
> personal taste.

I agree here. Some unrelated refactoring can also be deferred.

> However, I would like to ask to check how this series
> affects existing ZC enabled driver(s), since xsk core is touched.
> 
> > 
> > 
> > > 
> > > >
> > > > Thanks!
> > > >
> > > > Paolo
> > > >
> >
Xuan Zhuo Feb. 3, 2023, 9:09 a.m. UTC | #9
On Fri, 3 Feb 2023 04:08:31 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Thu, Feb 02, 2023 at 07:44:07PM +0800, Xuan Zhuo wrote:
> > On Thu, 2 Feb 2023 06:08:30 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > On Thu, Feb 02, 2023 at 07:00:25PM +0800, Xuan Zhuo wrote:
> > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > performance of zero copy is very good.
> > >
> > > Great! Any numbers to share?
> >
> > RESEND. Last mail has some email format error.
> >
> > ENV: Qemu with vhost.
> >
> >                    vhost cpu | Guest APP CPU |Guest Softirq CPU | PPS
> > -----------------------------|---------------|------------------|------------
> > xmit by sockperf:     90%    |   100%        |                  |  318967
> > xmit by xsk:          100%   |   30%         |   33%            | 1192064
> > recv by sockperf:     100%   |   68%         |   100%           |  692288
> > recv by xsk:          100%   |   33%         |   43%            |  771670
> >
> > Thanks.
>
> Impressive, thanks a lot for this work!
> Pls remember to retest and include up to date numbers on
> subsequent versions.


OK.

Thanks.


>
> >
> > >
> > > > mlx5 and intel ixgbe already support
> > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > feature.
> > > >
> > > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > > the XDP Socket Zerocopy.
> > > >
> > > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > > kernel.
> > > >
> > > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > > local CPU, then we wake up sofrirqd.
> > > >
> > > > Please review.
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > Xuan Zhuo (33):
> > > >   virtio_ring: virtqueue_add() support premapped
> > > >   virtio_ring: split: virtqueue_add_split() support premapped
> > > >   virtio_ring: packed: virtqueue_add_packed() support premapped
> > > >   virtio_ring: introduce virtqueue_add_outbuf_premapped()
> > > >   virtio_ring: introduce virtqueue_add_inbuf_premapped()
> > > >   virtio_ring: introduce virtqueue_reset()
> > > >   virtio_ring: add api virtio_dma_map() for advance dma
> > > >   virtio_ring: introduce dma sync api for virtio
> > > >   xsk: xsk_buff_pool add callback for dma_sync
> > > >   xsk: support virtio DMA map
> > > >   virtio_net: rename free_old_xmit_skbs to free_old_xmit
> > > >   virtio_net: unify the code for recycling the xmit ptr
> > > >   virtio_net: virtnet_poll_tx support rescheduled
> > > >   virtio_net: independent directory
> > > >   virtio_net: move to virtio_net.h
> > > >   virtio_net: introduce virtnet_xdp_handler() to seprate the logic of
> > > >     run xdp
> > > >   virtio_net: receive_small() use virtnet_xdp_handler()
> > > >   virtio_net: receive_merageable() use virtnet_xdp_handler()
> > > >   virtio_net: introduce virtnet_tx_reset()
> > > >   virtio_net: xsk: introduce virtnet_rq_bind_xsk_pool()
> > > >   virtio_net: xsk: introduce virtnet_xsk_pool_enable()
> > > >   virtio_net: xsk: introduce xsk disable
> > > >   virtio_net: xsk: support xsk setup
> > > >   virtio_net: xsk: stop disable tx napi
> > > >   virtio_net: xsk: __free_old_xmit distinguishes xsk buffer
> > > >   virtio_net: virtnet_sq_free_unused_buf() check xsk buffer
> > > >   virtio_net: virtnet_rq_free_unused_buf() check xsk buffer
> > > >   net: introduce napi_tx_raise()
> > > >   virtio_net: xsk: tx: support tx
> > > >   virtio_net: xsk: tx: support wakeup
> > > >   virtio_net: xsk: tx: auto wakeup when free old xmit
> > > >   virtio_net: xsk: rx: introduce add_recvbuf_xsk()
> > > >   virtio_net: xsk: rx: introduce receive_xsk() to recv xsk buffer
> > > >
> > > >  MAINTAINERS                                 |   2 +-
> > > >  drivers/net/Kconfig                         |   8 +-
> > > >  drivers/net/Makefile                        |   2 +-
> > > >  drivers/net/virtio/Kconfig                  |  11 +
> > > >  drivers/net/virtio/Makefile                 |   8 +
> > > >  drivers/net/{virtio_net.c => virtio/main.c} | 564 +++++++-------------
> > > >  drivers/net/virtio/virtio_net.h             | 317 +++++++++++
> > > >  drivers/net/virtio/xsk.c                    | 524 ++++++++++++++++++
> > > >  drivers/net/virtio/xsk.h                    |  33 ++
> > > >  drivers/virtio/virtio_ring.c                | 376 +++++++++++--
> > > >  include/linux/netdevice.h                   |   7 +
> > > >  include/linux/virtio.h                      |  29 +
> > > >  include/net/xsk_buff_pool.h                 |   6 +
> > > >  net/core/dev.c                              |  11 +
> > > >  net/xdp/xsk_buff_pool.c                     |  79 ++-
> > > >  15 files changed, 1541 insertions(+), 436 deletions(-)
> > > >  create mode 100644 drivers/net/virtio/Kconfig
> > > >  create mode 100644 drivers/net/virtio/Makefile
> > > >  rename drivers/net/{virtio_net.c => virtio/main.c} (92%)
> > > >  create mode 100644 drivers/net/virtio/virtio_net.h
> > > >  create mode 100644 drivers/net/virtio/xsk.c
> > > >  create mode 100644 drivers/net/virtio/xsk.h
> > > >
> > > > --
> > > > 2.32.0.3.g01195cf9f
> > >
>
Michael S. Tsirkin Feb. 3, 2023, 9:17 a.m. UTC | #10
On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > feature.
> > >
> > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > the XDP Socket Zerocopy.
> > >
> > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > kernel.
> > >
> > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > local CPU, then we wake up sofrirqd.
> >
> > Thank you for the large effort.
> >
> > Since this will likely need a few iterations, on next revision please
> > do split the work in multiple chunks to help the reviewer efforts -
> > from Documentation/process/maintainer-netdev.rst:
> >
> >  - don't post large series (> 15 patches), break them up
> >
> > In this case I guess you can split it in 1 (or even 2) pre-req series
> > and another one for the actual xsk zero copy support.
> 
> 
> OK.
> 
> I can split patch into multiple parts such as
> 
> * virtio core
> * xsk
> * virtio-net prepare
> * virtio-net support xsk zerocopy
> 
> However, there is a problem, the virtio core part should enter the VHOST branch
> of Michael. Then, should I post follow-up patches to which branch vhost or
> next-next?
> 
> Thanks.
> 

Here are some ideas on how to make the patchset smaller
and easier to merge:
- keep everything in virtio_net.c for now. We can split
  things out later, but this way your patchset will not
  conflict with every since change merged meanwhile.
  Also, split up needs to be done carefully with sane
  APIs between components, let's maybe not waste time
  on that now, do the split-up later.
- you have patches that add APIs then other
  patches use them. as long as it's only virtio net just
  add and use in a single patch, review is actually easier this way.
- we can try merging pre-requisites earlier, then patchset
  size will shrink.


> >
> > Thanks!
> >
> > Paolo
> >
Xuan Zhuo Feb. 6, 2023, 2:41 a.m. UTC | #11
On Fri, 3 Feb 2023 04:17:59 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> > On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > feature.
> > > >
> > > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > > the XDP Socket Zerocopy.
> > > >
> > > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > > kernel.
> > > >
> > > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > > local CPU, then we wake up sofrirqd.
> > >
> > > Thank you for the large effort.
> > >
> > > Since this will likely need a few iterations, on next revision please
> > > do split the work in multiple chunks to help the reviewer efforts -
> > > from Documentation/process/maintainer-netdev.rst:
> > >
> > >  - don't post large series (> 15 patches), break them up
> > >
> > > In this case I guess you can split it in 1 (or even 2) pre-req series
> > > and another one for the actual xsk zero copy support.
> >
> >
> > OK.
> >
> > I can split patch into multiple parts such as
> >
> > * virtio core
> > * xsk
> > * virtio-net prepare
> > * virtio-net support xsk zerocopy
> >
> > However, there is a problem, the virtio core part should enter the VHOST branch
> > of Michael. Then, should I post follow-up patches to which branch vhost or
> > next-next?
> >
> > Thanks.
> >
>
> Here are some ideas on how to make the patchset smaller
> and easier to merge:
> - keep everything in virtio_net.c for now. We can split
>   things out later, but this way your patchset will not
>   conflict with every since change merged meanwhile.
>   Also, split up needs to be done carefully with sane
>   APIs between components, let's maybe not waste time
>   on that now, do the split-up later.
> - you have patches that add APIs then other
>   patches use them. as long as it's only virtio net just
>   add and use in a single patch, review is actually easier this way.

I will try to merge #16-#18 and #20-#23.


> - we can try merging pre-requisites earlier, then patchset
>   size will shrink.

Do you mean the patches of virtio core? Should we put these
patches to vhost branch?

Thanks.

>
>
> > >
> > > Thanks!
> > >
> > > Paolo
> > >
>
Michael S. Tsirkin Feb. 13, 2023, 12:14 p.m. UTC | #12
On Mon, Feb 06, 2023 at 10:41:16AM +0800, Xuan Zhuo wrote:
> On Fri, 3 Feb 2023 04:17:59 -0500, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > On Fri, Feb 03, 2023 at 11:33:31AM +0800, Xuan Zhuo wrote:
> > > On Thu, 02 Feb 2023 15:41:44 +0100, Paolo Abeni <pabeni@redhat.com> wrote:
> > > > On Thu, 2023-02-02 at 19:00 +0800, Xuan Zhuo wrote:
> > > > > XDP socket(AF_XDP) is an excellent bypass kernel network framework. The zero
> > > > > copy feature of xsk (XDP socket) needs to be supported by the driver. The
> > > > > performance of zero copy is very good. mlx5 and intel ixgbe already support
> > > > > this feature, This patch set allows virtio-net to support xsk's zerocopy xmit
> > > > > feature.
> > > > >
> > > > > Virtio-net did not support per-queue reset, so it was impossible to support XDP
> > > > > Socket Zerocopy. At present, we have completed the work of Virtio Spec and
> > > > > Kernel in Per-Queue Reset. It is time for Virtio-Net to complete the support for
> > > > > the XDP Socket Zerocopy.
> > > > >
> > > > > Virtio-net can not increase the queue at will, so xsk shares the queue with
> > > > > kernel.
> > > > >
> > > > > On the other hand, Virtio-Net does not support generate interrupt manually, so
> > > > > when we wakeup tx xmit, we used some tips. If the CPU run by TX NAPI last time
> > > > > is other CPUs, use IPI to wake up NAPI on the remote CPU. If it is also the
> > > > > local CPU, then we wake up sofrirqd.
> > > >
> > > > Thank you for the large effort.
> > > >
> > > > Since this will likely need a few iterations, on next revision please
> > > > do split the work in multiple chunks to help the reviewer efforts -
> > > > from Documentation/process/maintainer-netdev.rst:
> > > >
> > > >  - don't post large series (> 15 patches), break them up
> > > >
> > > > In this case I guess you can split it in 1 (or even 2) pre-req series
> > > > and another one for the actual xsk zero copy support.
> > >
> > >
> > > OK.
> > >
> > > I can split patch into multiple parts such as
> > >
> > > * virtio core
> > > * xsk
> > > * virtio-net prepare
> > > * virtio-net support xsk zerocopy
> > >
> > > However, there is a problem, the virtio core part should enter the VHOST branch
> > > of Michael. Then, should I post follow-up patches to which branch vhost or
> > > next-next?
> > >
> > > Thanks.
> > >
> >
> > Here are some ideas on how to make the patchset smaller
> > and easier to merge:
> > - keep everything in virtio_net.c for now. We can split
> >   things out later, but this way your patchset will not
> >   conflict with every since change merged meanwhile.
> >   Also, split up needs to be done carefully with sane
> >   APIs between components, let's maybe not waste time
> >   on that now, do the split-up later.
> > - you have patches that add APIs then other
> >   patches use them. as long as it's only virtio net just
> >   add and use in a single patch, review is actually easier this way.
> 
> I will try to merge #16-#18 and #20-#23.

don't do the code reorg thing for now either.

leave this for later.

> 
> > - we can try merging pre-requisites earlier, then patchset
> >   size will shrink.
> 
> Do you mean the patches of virtio core? Should we put these
> patches to vhost branch?
> 
> Thanks.

I can merge patches 1-8, yes.
This patchset probably missed the merge window anyway.


> >
> >
> > > >
> > > > Thanks!
> > > >
> > > > Paolo
> > > >
> >