Message ID | 20221007211713.170714-1-jonathan.lemon@gmail.com (mailing list archive) |
---|---|
Series | zero-copy RX for io_uring |
On Fri, Oct 07, 2022 at 02:17:04PM -0700, Jonathan Lemon wrote:
>This series is a RFC for io_uring/zctap. This is an evolution of
>the earlier zctap work, re-targeted to use io_uring as the userspace
>API. The current code is intended to provide a zero-copy RX path for
>upper-level networking protocols (aka TCP and UDP). The current draft
>focuses on host-provided memory (not GPU memory).
>
>This RFC contains the upper-level core code required for operation,
>with the intent of soliciting feedback on the general API. This does
>not contain the network driver side changes required for complete
>operation. Also please note that as an RFC, there are some things
>which are incomplete or in need of rework.
>
>The intent is to use a network driver which provides header/data
>splitting, so the frame header (which is processed by the networking
>stack) does not reside in user memory.
>
>The code is roughly working (in that it has successfully received
>a TCP stream from a remote sender), but as an RFC, the intent is
>to solicit feedback on the API and overall design. The current code
>will also work with system pages, copying the data out to the
>application - this is intended as a fallback/testing path.
>
>High level description:
>
>The application allocates a frame backing store, and provides this
>to the kernel for use. An interface queue is requested from the
>networking device, and incoming frames are deposited into the provided
>memory region.
>
>Responsibility for correctly steering incoming frames to the queue
>is outside the scope of this work - it is assumed that the user
>has set steering rules up separately.
>
>Incoming frames are sent up the stack as skb's and eventually
>land in the application's socket receive queue. This differs
>from AF_XDP, which receives raw frames directly to userspace,
>without protocol processing.
>
>The RECV_ZC opcode then returns an iov[] style vector which points
>to the data in userspace memory. When the application has completed
>processing of the data, the buffer is returned back to the kernel
>through a fill ring for reuse.

Interesting work! Any userspace demo and performance data?

>
>Jonathan Lemon (9):
>  io_uring: add zctap ifq definition
>  netdevice: add SETUP_ZCTAP to the netdev_bpf structure
>  io_uring: add register ifq opcode
>  io_uring: add provide_ifq_region opcode
>  io_uring: Add io_uring zctap iov structure and helpers
>  io_uring: introduce reference tracking for user pages.
>  page_pool: add page allocation and free hooks.
>  io_uring: provide functions for the page_pool.
>  io_uring: add OP_RECV_ZC command.
>
> include/linux/io_uring.h       |  24 ++
> include/linux/io_uring_types.h |  10 +
> include/linux/netdevice.h      |   6 +
> include/net/page_pool.h        |   6 +
> include/uapi/linux/io_uring.h  |  26 ++
> io_uring/Makefile              |   3 +-
> io_uring/io_uring.c            |  10 +
> io_uring/kbuf.c                |  13 +
> io_uring/kbuf.h                |   2 +
> io_uring/net.c                 | 123 ++++++
> io_uring/opdef.c               |  23 +
> io_uring/zctap.c               | 749 +++++++++++++++++++++++++++++++++
> io_uring/zctap.h               |  20 +
> net/core/page_pool.c           |  41 +-
> 14 files changed, 1048 insertions(+), 8 deletions(-)
> create mode 100644 io_uring/zctap.c
> create mode 100644 io_uring/zctap.h
>
>--
>2.30.2
On 10/10/22 12:37 AM, dust.li wrote:
> On Fri, Oct 07, 2022 at 02:17:04PM -0700, Jonathan Lemon wrote:
>> This series is a RFC for io_uring/zctap. This is an evolution of
>> the earlier zctap work, re-targeted to use io_uring as the userspace
>> API. The current code is intended to provide a zero-copy RX path for
>> upper-level networking protocols (aka TCP and UDP). The current draft
>> focuses on host-provided memory (not GPU memory).
>>
>> This RFC contains the upper-level core code required for operation,
>> with the intent of soliciting feedback on the general API. This does
>> not contain the network driver side changes required for complete
>> operation. Also please note that as an RFC, there are some things
>> which are incomplete or in need of rework.
>>
>> The intent is to use a network driver which provides header/data
>> splitting, so the frame header (which is processed by the networking
>> stack) does not reside in user memory.
>>
>> The code is roughly working (in that it has successfully received
>> a TCP stream from a remote sender), but as an RFC, the intent is
>> to solicit feedback on the API and overall design. The current code
>> will also work with system pages, copying the data out to the
>> application - this is intended as a fallback/testing path.
>>
>> High level description:
>>
>> The application allocates a frame backing store, and provides this
>> to the kernel for use. An interface queue is requested from the
>> networking device, and incoming frames are deposited into the provided
>> memory region.
>>
>> Responsibility for correctly steering incoming frames to the queue
>> is outside the scope of this work - it is assumed that the user
>> has set steering rules up separately.
>>
>> Incoming frames are sent up the stack as skb's and eventually
>> land in the application's socket receive queue. This differs
>> from AF_XDP, which receives raw frames directly to userspace,
>> without protocol processing.
>>
>> The RECV_ZC opcode then returns an iov[] style vector which points
>> to the data in userspace memory. When the application has completed
>> processing of the data, the buffer is returned back to the kernel
>> through a fill ring for reuse.
>
> Interesting work! Any userspace demo and performance data?

Coming soon! I'm hoping to get feedback on the overall API though. Did
you have any thoughts here?
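
Since no userspace demo has been posted yet, the buffer lifecycle from the high-level description (the application provides a backing store, frames land in it, the application gets an iov-style vector, and the buffer goes back through a fill ring) can be sketched as a plain C simulation. This is not the zctap API: `struct zc_vec`, `struct fill_ring`, `fake_rx()`, the frame size, and the ring layout are all hypothetical names and values chosen for illustration, and no io_uring calls are made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_SIZE 4096u  /* hypothetical frame size, not from the series */
#define NR_FRAMES  8u     /* power of two so indices can be masked */

/* Hypothetical iov-style entry: where the payload sits in the region. */
struct zc_vec {
    uint32_t off;  /* byte offset into the backing store */
    uint32_t len;  /* payload length in bytes */
};

/* Hypothetical fill ring: the application produces free frame offsets,
 * the (simulated) kernel side consumes them. */
struct fill_ring {
    uint32_t head;              /* consumer index */
    uint32_t tail;              /* producer index */
    uint32_t slots[NR_FRAMES];
};

/* Return a frame to the ring; -1 if the ring is full. */
int fill_ring_push(struct fill_ring *r, uint32_t off)
{
    if (r->tail - r->head == NR_FRAMES)
        return -1;
    r->slots[r->tail & (NR_FRAMES - 1)] = off;
    r->tail++;
    return 0;
}

/* Take a free frame from the ring; -1 if the ring is empty. */
int fill_ring_pop(struct fill_ring *r, uint32_t *off)
{
    if (r->head == r->tail)
        return -1;
    *off = r->slots[r->head & (NR_FRAMES - 1)];
    r->head++;
    return 0;
}

/* Stand-in for the receive path: claim a free frame, write len bytes of
 * payload into it, and describe the result in an iov-style entry. */
int fake_rx(struct fill_ring *r, uint8_t *region, uint32_t len,
            struct zc_vec *out)
{
    uint32_t off;

    if (len > FRAME_SIZE || fill_ring_pop(r, &off) < 0)
        return -1;
    memset(region + off, 0xab, len);
    out->off = off;
    out->len = len;
    return 0;
}

/* Walk one frame through the whole cycle. Returns the number of frames
 * available in the fill ring afterwards, or -1 on error. */
int fill_ring_demo(void)
{
    static uint8_t region[FRAME_SIZE * NR_FRAMES];  /* backing store */
    struct fill_ring ring = {0};
    struct zc_vec v;
    uint32_t i;

    /* 1. Hand every frame of the backing store to the ring. */
    for (i = 0; i < NR_FRAMES; i++)
        if (fill_ring_push(&ring, i * FRAME_SIZE) < 0)
            return -1;

    /* 2. A frame arrives; the data is read in place, with no copy. */
    if (fake_rx(&ring, region, 1500, &v) < 0)
        return -1;
    if (region[v.off] != 0xab || region[v.off + v.len - 1] != 0xab)
        return -1;

    /* 3. Done with the data: return the frame for reuse. */
    if (fill_ring_push(&ring, v.off) < 0)
        return -1;

    return (int)(ring.tail - ring.head);
}
```

Calling `fill_ring_demo()` from a `main()` exercises the full cycle once. The free-running `head`/`tail` indices with a power-of-two mask are one common ring layout, picked here only to keep the sketch short; the series may well use a different one.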