[GIT,PULL] io_uring network zero-copy receive support

Message ID 12e0af8c-8417-41d5-9d47-408556b50322@kernel.dk (mailing list archive)
State Not Applicable

Pull-request

git://git.kernel.dk/linux.git for-6.15/io_uring-rx-zc-20250325

Checks

Context               | Check   | Description
netdev/tree_selection | success | Pull request for net
netdev/apply          | fail    | Pull to net-0 failed

Message

Jens Axboe March 27, 2025, 11:46 a.m. UTC
Hi Linus,

This pull request adds support for zero-copy receive with io_uring,
enabling fast bulk receive of data directly into application memory,
rather than needing to copy the data out of kernel memory. While this
version only supports host memory as that was the initial target, other
memory types are planned as well, with notably GPU memory coming next.

This work depends on some networking components which were queued up on
the networking side, but have now landed in your tree.

This is the work of Pavel Begunkov and David Wei. From the v14 posting:

"We configure a page pool that a driver uses to fill a hw rx queue to
 hand out user pages instead of kernel pages. Any data that ends up
 hitting this hw rx queue will thus be dma'd into userspace memory
 directly, without needing to be bounced through kernel memory. 'Reading'
 data out of a socket instead becomes a _notification_ mechanism, where
 the kernel tells userspace where the data is. The overall approach is
 similar to the devmem TCP proposal.

 This relies on hw header/data split, flow steering and RSS to ensure
 packet headers remain in kernel memory and only desired flows hit a hw
 rx queue configured for zero copy. Configuring this is outside of the
 scope of this patchset.

 We share netdev core infra with devmem TCP. The main difference is that
 io_uring is used for the uAPI and the lifetime of all objects is bound
 to an io_uring instance. Data is 'read' using a new io_uring request
 type. When done, data is returned via a new shared refill queue. A zero
 copy page pool refills a hw rx queue from this refill queue directly. Of
 course, the lifetime of these data buffers is managed by io_uring
 rather than the networking stack, with different refcounting rules.

 This patchset is the first step adding basic zero copy support. We will
 extend this iteratively with new features e.g. dynamically allocated
 zero copy areas, THP support, dmabuf support, improved copy fallback,
 general optimisations and more."
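
As a rough illustration of the flow described in the quote above, here is a
small userspace-only model in Python. It is a sketch under stated
assumptions, not the kernel implementation or the io_uring uAPI: the class
ZcrxModel, its methods, the Completion record and the 4096-byte buffer size
are all hypothetical names and values chosen for the example. It shows the
three ideas from the description: data lands directly in an
application-owned area, 'reading' only consumes a notification saying where
the data is, and buffers go back to the receive side through a refill queue.

```python
from collections import deque
from dataclasses import dataclass

BUF_SIZE = 4096  # assumed buffer granularity; chosen only for this sketch


@dataclass
class Completion:
    """Hypothetical notification: where in the area the received data landed."""
    off: int  # byte offset into the application-owned area
    len: int  # number of valid bytes starting at that offset


class ZcrxModel:
    """Toy model of the flow; every name here is made up for illustration."""

    def __init__(self, nr_bufs: int):
        self.area = bytearray(nr_bufs * BUF_SIZE)  # stands in for user memory
        self.free = deque(range(nr_bufs))          # buffers the rx side may fill
        self.refill = deque()                      # buffers handed back by the app
        self.completions = deque()                 # notifications for the app

    def rx_packet(self, payload: bytes) -> bool:
        """Receive side: write payload into the area and post a notification."""
        if len(payload) > BUF_SIZE:
            raise ValueError("payload larger than one buffer")
        if not self.free:
            # Refill step: take back whatever the app returned via the refill queue.
            self.free.extend(self.refill)
            self.refill.clear()
        if not self.free:
            return False  # the app still holds every buffer; this sketch just reports it
        off = self.free.popleft() * BUF_SIZE
        self.area[off:off + len(payload)] = payload
        self.completions.append(Completion(off, len(payload)))
        return True

    def read(self):
        """App side: consume one notification and get a view, not a copy."""
        if not self.completions:
            return None
        c = self.completions.popleft()
        return c, memoryview(self.area)[c.off:c.off + c.len]

    def recycle(self, c: Completion) -> None:
        """App side: return the buffer behind a completion via the refill queue."""
        self.refill.append(c.off // BUF_SIZE)
```

In this model a receiver that never calls recycle() eventually starves the
receive side of buffers, which is why handing buffers back promptly is part
of the contract in the sketch.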

In a local setup, I was able to saturate a 200G link with a single CPU
core, and at netdev conf 0x19 earlier this month, Jamal reported
188 Gbit/s of bandwidth using a single core (no HT, including soft-irq).
Safe to say the efficiency is there: bigger links would be needed to
find the per-core limit, and it's considerably more efficient and faster
than the existing devmem solution.

Please pull!


The following changes since commit 5c496ff11df179c32db960cf10af90a624a035eb:

  Merge commit '71f0dd5a3293d75d26d405ffbaedfdda4836af32' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc (2025-02-17 05:38:28 -0700)

are available in the Git repository at:

  git://git.kernel.dk/linux.git for-6.15/io_uring-rx-zc-20250325

for you to fetch changes up to 89baa22d75278b69d3a30f86c3f47ac3a3a659e9:

  io_uring/zcrx: add selftest case for recvzc with read limit (2025-02-24 12:56:13 -0700)

----------------------------------------------------------------
Bui Quang Minh (1):
      io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap

David Wei (8):
      io_uring/zcrx: add interface queue and refill queue
      io_uring/zcrx: add io_zcrx_area
      io_uring/zcrx: add io_recvzc request
      io_uring/zcrx: set pp memory provider for an rx queue
      net: add documentation for io_uring zcrx
      io_uring/zcrx: add selftest
      io_uring/zcrx: add a read limit to recvzc requests
      io_uring/zcrx: add selftest case for recvzc with read limit

Geert Uytterhoeven (1):
      io_uring: Rename KConfig to Kconfig

Jens Axboe (1):
      Merge commit '71f0dd5a3293d75d26d405ffbaedfdda4836af32' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next into for-6.15/io_uring-rx-zc

Pavel Begunkov (7):
      io_uring/zcrx: grab a net device
      io_uring/zcrx: implement zerocopy receive pp memory provider
      io_uring/zcrx: dma-map area for the device
      io_uring/zcrx: throttle receive requests
      io_uring/zcrx: add copy fallback
      io_uring/zcrx: recheck ifq on shutdown
      io_uring/zcrx: fix leaks on failed registration

 Documentation/networking/index.rst                 |   1 +
 Documentation/networking/iou-zcrx.rst              | 202 +++++
 Kconfig                                            |   2 +
 include/linux/io_uring_types.h                     |   6 +
 include/uapi/linux/io_uring.h                      |  54 +-
 io_uring/Kconfig                                   |  10 +
 io_uring/Makefile                                  |   1 +
 io_uring/io_uring.c                                |   7 +
 io_uring/io_uring.h                                |  10 +
 io_uring/memmap.c                                  |   2 +
 io_uring/memmap.h                                  |   1 +
 io_uring/net.c                                     |  84 ++
 io_uring/opdef.c                                   |  16 +
 io_uring/register.c                                |   7 +
 io_uring/rsrc.c                                    |   2 +-
 io_uring/rsrc.h                                    |   1 +
 io_uring/zcrx.c                                    | 960 +++++++++++++++++++++
 io_uring/zcrx.h                                    |  73 ++
 tools/testing/selftests/drivers/net/hw/.gitignore  |   2 +
 tools/testing/selftests/drivers/net/hw/Makefile    |   5 +
 tools/testing/selftests/drivers/net/hw/iou-zcrx.c  | 457 ++++++++++
 tools/testing/selftests/drivers/net/hw/iou-zcrx.py |  87 ++
 22 files changed, 1988 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/networking/iou-zcrx.rst
 create mode 100644 io_uring/Kconfig
 create mode 100644 io_uring/zcrx.c
 create mode 100644 io_uring/zcrx.h
 create mode 100644 tools/testing/selftests/drivers/net/hw/iou-zcrx.c
 create mode 100755 tools/testing/selftests/drivers/net/hw/iou-zcrx.py

Comments

pr-tracker-bot@kernel.org March 28, 2025, 10:11 p.m. UTC | #1
The pull request you sent on Thu, 27 Mar 2025 05:46:21 -0600:

> git://git.kernel.dk/linux.git for-6.15/io_uring-rx-zc-20250325

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/78b6f6e9bf3960c5ee3368415a11babb754b9a19

Thank you!