Message ID | 20230803140441.53596-1-huangjie.albert@bytedance.com (mailing list archive) |
---|---|
Series | [RFC,Optimizing,veth,xsk,performance,01/10] veth: Implement ethtool's get_ringparam() callback |
On Thu, 2023-08-03 at 22:04 +0800, huangjie.albert wrote:
> AF_XDP is a kernel bypass technology that can greatly improve performance.
> However, for virtual devices like veth, even with the use of AF_XDP sockets,
> there are still many additional software paths that consume CPU resources.
> This patch series focuses on optimizing the performance of AF_XDP sockets
> for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> Patch 5 introduces a tx queue and tx napi for packet transmission, while
> patch 9 primarily implements zero-copy, and patch 10 adds support for
> batch sending of IPv4 UDP packets. These optimizations significantly reduce
> the software path and support checksum offload.
>
> I tested these features with the typical topology shown below:
>
> veth<-->veth-peer                      veth1-peer<--->veth1
>   1      |                                  |           7
>          |2                               6|
>          |                                  |
>       bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
>          3              4                          5
>    (machine1)                              (machine2)
>
> An AF_XDP socket is attached to veth and to veth1, and packets are sent
> to the physical NIC (eth0).
>
> veth:    172.17.0.2/24
> bridge:  172.17.0.1/24
> eth0:    192.168.156.66/24
>
> veth1:   172.17.0.2/24
> bridge1: 172.17.0.1/24
> eth1:    192.168.156.88/24
>
> After setting the default route, SNAT and DNAT, we can run tests to get
> the performance results.
>
> Packets are sent from veth to veth1.
>
> af_xdp test tool:
> link: https://github.com/cclinuxer/libxudp
> send (veth):
> ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> recv (veth1):
> ./objs/xudpperf recv --src 172.17.0.2:6002
>
> udp test tool: iperf3
> send (veth):
> iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u

This should be '-b 0', otherwise you will experience additional overhead.

And you would likely want to pin processes and IRQs to ensure that BH and
user space run on different cores of the same NUMA node.

Cheers,

Paolo
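Paolo's pinning suggestion could look like the sketch below. The core numbers, the `eth0` interface name, and the IRQ discovery via `/proc/interrupts` are assumptions for this particular two-machine setup, not something prescribed by the series:

```shell
# Hypothetical pinning for the sender (machine1): keep the user-space test
# tool and eth0's IRQs (and thus the softirq/BH work) on different cores of
# the same NUMA node. Core numbers below are assumptions.
APP_CPU=2     # core for the user-space sender
IRQ_CPU=3     # core for eth0's interrupts / bottom halves

# Steer every eth0 IRQ to IRQ_CPU (requires root).
for irq in $(awk -F: '/eth0/ {gsub(/ /,"",$1); print $1}' /proc/interrupts); do
    echo "$IRQ_CPU" > "/proc/irq/$irq/smp_affinity_list"
done

# Run the UDP sender pinned to APP_CPU; '-b 0' disables iperf3's pacing.
taskset -c "$APP_CPU" iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 0 -u
```

`lscpu` or `numactl --hardware` shows which cores share a NUMA node, so both CPUs can be chosen from the same node.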
On 03/08/2023 16.04, huangjie.albert wrote:
> AF_XDP is a kernel bypass technology that can greatly improve performance.
> However, for virtual devices like veth, even with the use of AF_XDP sockets,
> there are still many additional software paths that consume CPU resources.
> This patch series focuses on optimizing the performance of AF_XDP sockets
> for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> Patch 5 introduces a tx queue and tx napi for packet transmission, while
> patch 9 primarily implements zero-copy, and patch 10 adds support for
> batch sending of IPv4 UDP packets. These optimizations significantly reduce
> the software path and support checksum offload.
>
> I tested these features with the typical topology shown below:
>
> veth<-->veth-peer                      veth1-peer<--->veth1
>   1      |                                  |           7
>          |2                               6|
>          |                                  |
>       bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
>          3              4                          5
>    (machine1)                              (machine2)
>
> An AF_XDP socket is attached to veth and to veth1, and packets are sent
> to the physical NIC (eth0).
>
> veth:    172.17.0.2/24
> bridge:  172.17.0.1/24
> eth0:    192.168.156.66/24
>
> veth1:   172.17.0.2/24
> bridge1: 172.17.0.1/24
> eth1:    192.168.156.88/24
>
> After setting the default route, SNAT and DNAT, we can run tests to get
> the performance results.
>
> Packets are sent from veth to veth1.
>
> af_xdp test tool:
> link: https://github.com/cclinuxer/libxudp
> send (veth):
> ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> recv (veth1):
> ./objs/xudpperf recv --src 172.17.0.2:6002
>
> udp test tool: iperf3
> send (veth):
> iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
> recv (veth1):
> iperf3 -s -p 6002
>
> performance (test veth with the libxdp lib):
> UDP                              : 250 Kpps (with 100% cpu)
> AF_XDP, no zerocopy + no batch   : 480 Kpps (with ksoftirqd 100% cpu)
> AF_XDP, zerocopy + no batch      : 540 Kpps (with ksoftirqd 100% cpu)
> AF_XDP, batch + zerocopy         : 1.5 Mpps (with ksoftirqd 15% cpu)
>
> With af_xdp batch, the libxdp user-space program reaches a bottleneck.
Do you mean libxdp [1] or libxudp?

[1] https://github.com/xdp-project/xdp-tools/tree/master/lib/libxdp

> Therefore, the softirq did not reach the limit.
>
> This is just an RFC patch series, and some code details still need
> further consideration. Please review this proposal.

I find this performance work interesting, as we have customer requests (via
Maryam (cc)) to improve AF_XDP performance, both native and on veth.

Our benchmark is stored at:
https://github.com/maryamtahhan/veth-benchmark

Great to see other companies also interested in this area.

--Jesper

> thanks!
>
> huangjie.albert (10):
>   veth: Implement ethtool's get_ringparam() callback
>   xsk: add dma_check_skip for skipping dma check
>   veth: add support for send queue
>   xsk: add xsk_tx_completed_addr function
>   veth: use send queue tx napi to xmit xsk tx desc
>   veth: add ndo_xsk_wakeup callback for veth
>   sk_buff: add destructor_arg_xsk_pool for zero copy
>   xdp: add xdp_mem_type MEM_TYPE_XSK_BUFF_POOL_TX
>   veth: support zero copy for af xdp
>   veth: af_xdp tx batch support for ipv4 udp
>
>  drivers/net/veth.c          | 729 +++++++++++++++++++++++++++++++++++-
>  include/linux/skbuff.h      |   1 +
>  include/net/xdp.h           |   1 +
>  include/net/xdp_sock_drv.h  |   1 +
>  include/net/xsk_buff_pool.h |   1 +
>  net/xdp/xsk.c               |   6 +
>  net/xdp/xsk_buff_pool.c     |   3 +-
>  net/xdp/xsk_queue.h         |  11 +
>  8 files changed, 751 insertions(+), 2 deletions(-)
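As a side note, the machine1 half of the bridged-veth topology quoted in the cover letter can be approximated with iproute2. The link names, addresses, and the NAT step come from the cover letter; the exact wiring (veth-peer enslaved to the bridge, traffic routed and masqueraded out of eth0) is an assumption inferred from the diagram:

```shell
# Sketch of machine1's wiring (assumed; machine2 mirrors it with veth1/bridge1).
ip link add veth type veth peer name veth-peer
ip link add bridge type bridge
ip link set veth-peer master bridge    # veth-peer hangs off the bridge

ip addr add 172.17.0.2/24 dev veth     # side the AF_XDP socket attaches to
ip addr add 172.17.0.1/24 dev bridge   # gateway for the veth subnet
ip addr add 192.168.156.66/24 dev eth0 # physical mlx5 NIC

ip link set veth up
ip link set veth-peer up
ip link set bridge up

# The "default route, snat, dnat" step from the cover letter: forward and
# masquerade the veth subnet out of eth0.
sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A POSTROUTING -s 172.17.0.0/24 -o eth0 -j MASQUERADE
```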
Paolo Abeni <pabeni@redhat.com> wrote on Thu, 3 Aug 2023 at 22:20:
> On Thu, 2023-08-03 at 22:04 +0800, huangjie.albert wrote:
> > AF_XDP is a kernel bypass technology that can greatly improve performance.
> > However, for virtual devices like veth, even with the use of AF_XDP sockets,
> > there are still many additional software paths that consume CPU resources.
> > This patch series focuses on optimizing the performance of AF_XDP sockets
> > for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> > Patch 5 introduces a tx queue and tx napi for packet transmission, while
> > patch 9 primarily implements zero-copy, and patch 10 adds support for
> > batch sending of IPv4 UDP packets. These optimizations significantly reduce
> > the software path and support checksum offload.
> >
> > I tested these features with the typical topology shown below:
> >
> > veth<-->veth-peer                      veth1-peer<--->veth1
> >   1      |                                  |           7
> >          |2                               6|
> >          |                                  |
> >       bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
> >          3              4                          5
> >    (machine1)                              (machine2)
> >
> > An AF_XDP socket is attached to veth and to veth1, and packets are sent
> > to the physical NIC (eth0).
> >
> > veth:    172.17.0.2/24
> > bridge:  172.17.0.1/24
> > eth0:    192.168.156.66/24
> >
> > veth1:   172.17.0.2/24
> > bridge1: 172.17.0.1/24
> > eth1:    192.168.156.88/24
> >
> > After setting the default route, SNAT and DNAT, we can run tests to get
> > the performance results.
> >
> > Packets are sent from veth to veth1.
> >
> > af_xdp test tool:
> > link: https://github.com/cclinuxer/libxudp
> > send (veth):
> > ./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
> > recv (veth1):
> > ./objs/xudpperf recv --src 172.17.0.2:6002
> >
> > udp test tool: iperf3
> > send (veth):
> > iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
>
> This should be '-b 0', otherwise you will experience additional overhead.
With -b 0:

performance (test veth with the libxdp lib):
UDP                              : 320 Kpps (with 100% cpu)
AF_XDP, no zerocopy + no batch   : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP, zerocopy + no batch      : 540 Kpps (with ksoftirqd 100% cpu)
AF_XDP, batch + zerocopy         : 1.5 Mpps (with ksoftirqd 15% cpu)

thanks.

> And you would likely want to pin processes and IRQs to ensure that BH and
> user space run on different cores of the same NUMA node.
>
> Cheers,
>
> Paolo
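[Editor's note: back-of-the-envelope ratios implied by the '-b 0' numbers above, using plain shell integer arithmetic truncated to one decimal place; this is an illustration, not part of the thread.]

```shell
# Kpps figures from the '-b 0' run above.
udp=320; zc=540; batch=1500

# ratio A B -> A/B with one truncated decimal, using integer math only.
ratio() { echo "$(( $1 * 10 / $2 / 10 )).$(( $1 * 10 / $2 % 10 ))"; }

echo "batch+zerocopy vs UDP:              $(ratio $batch $udp)x"  # prints 4.6x
echo "batch+zerocopy vs zerocopy only:    $(ratio $batch $zc)x"   # prints 2.7x
```

So the batching patch, not zero-copy alone, accounts for most of the gain reported in the series.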