[RFC,V2,0/9] io_uring: support sqe group and provide group kbuf

Message ID 20240506162251.3853781-1-ming.lei@redhat.com

Message

Ming Lei May 6, 2024, 4:22 p.m. UTC
Hello,

The first 4 patches are cleanups that prepare for adding sqe group.

The 5th patch adds generic sqe group support. A group is like a link chain,
but the sqes in a group can be issued in parallel, so N:M dependencies can
be expressed by combining sqe groups with io links.
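
To make the "like link chain" comparison concrete, here is a toy userspace
sketch (not kernel code) of how a group could be delimited in a run of sqe
flags. It assumes the group flag is set on every sqe of the group except the
last one, the same way IOSQE_IO_LINK delimits a link chain. The cover letter
does not state this rule. TOY_IOSQE_SQE_GROUP and toy_group_len() are
hypothetical names, and the bit value is illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: illustrative bit value. The cover letter only says that
 * IOSQE_SQE_GROUP takes the last free bit of the sqe flags.
 */
#define TOY_IOSQE_SQE_GROUP (1u << 7)

/*
 * Return the number of sqes in the group starting at index 'start'.
 *
 * Assumption: like a link chain, the flag is set on every sqe of the
 * group except the last one. An sqe without the flag is a group of one.
 * A group that runs off the end of the array is truncated there.
 */
static size_t toy_group_len(const uint8_t *flags, size_t n, size_t start)
{
	size_t i = start;

	assert(start < n);
	while (i + 1 < n && (flags[i] & TOY_IOSQE_SQE_GROUP))
		i++;
	return i - start + 1;
}
```

With flags { GROUP, GROUP, 0, 0 }, toy_group_len() reports a 3-sqe group at
index 0 (one leader plus two members) and a standalone sqe at index 3.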

The 6th patch supports one variant of sqe group in which members depend on
the group leader. This aligns kernel resource lifetime with the group leader
or the group, so any kernel resource can be shared within the sqe group,
which can be used for generic device zero copy.
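
The lifetime rule above can be pictured with a small toy model (again not
kernel code, and not taken from the patches): a resource owned by the leader
stays live until the leader is done and every member has completed. All
struct and function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one sqe group sharing a leader-owned resource. */
struct toy_group {
	int  members_inflight;	/* members that have not completed yet */
	bool leader_done;	/* leader has finished its own work */
	bool resource_live;	/* shared resource, e.g. a buffer */
};

static void toy_group_start(struct toy_group *g, int nr_members)
{
	assert(nr_members >= 0);
	g->members_inflight = nr_members;
	g->leader_done = false;
	g->resource_live = true;
}

/* Release the resource only when nobody in the group can still use it. */
static void toy_group_try_release(struct toy_group *g)
{
	if (g->leader_done && g->members_inflight == 0)
		g->resource_live = false;
}

static void toy_leader_complete(struct toy_group *g)
{
	g->leader_done = true;
	toy_group_try_release(g);
}

static void toy_member_complete(struct toy_group *g)
{
	assert(g->members_inflight > 0);
	g->members_inflight--;
	toy_group_try_release(g);
}
```

For a group with two members, the resource is still live after the leader
completes and after the first member completes, and is released only when
the second member completes.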

The 7th & 8th patches support providing sqe group buffer via the sqe
group variant.

The 9th patch supports ublk zero copy based on io_uring providing sqe
group buffer.

Tests:

1) pass liburing test
- make runtests

2) wrote two sqe group test cases, both pass:

https://github.com/axboe/liburing/compare/master...ming1:liburing:sqe_group_v2

They cover the related sqe flag combinations and linking of groups, using
both nop and one multi-destination file copy.

3) ublksrv zero copy:

ublksrv userspace implements zero copy via sqe group & provided group
kbuf:

	git clone https://github.com/ublk-org/ublksrv.git -b group-provide-buf_v2
	make test T=loop/009:nbd/061:nbd/062	#ublk zc tests

When running the 64KB block size test on ublk-loop ('ublk add -t loop --buffered_io -f $backing'),
it is observed that performance can be doubled.

Any comments are welcome!

V2:
	- add generic sqe group, suggested by Kevin Wolf
	- add REQ_F_SQE_GROUP_DEP which is based on IOSQE_SQE_GROUP, for sharing
	  kernel resources group-wide, suggested by Kevin Wolf
	- remove sqe ext flag, and use the last bit for IOSQE_SQE_GROUP (Pavel);
	  in future we can still extend sqe flags with one uring context flag
	- initialize group requests via submit state pattern, suggested by Pavel
	- all kinds of cleanup & bug fixes

Ming Lei (9):
  io_uring: add io_link_req() helper
  io_uring: add io_submit_fail_link() helper
  io_uring: add helper of io_req_commit_cqe()
  io_uring: move marking REQ_F_CQE_SKIP out of io_free_req()
  io_uring: support SQE group
  io_uring: support sqe group with members depending on leader
  io_uring: support providing sqe group buffer
  io_uring/uring_cmd: support provide group kernel buffer
  ublk: support provide io buffer

 drivers/block/ublk_drv.c       | 158 ++++++++++++++-
 include/linux/io_uring/cmd.h   |   7 +
 include/linux/io_uring_types.h |  48 +++++
 include/uapi/linux/io_uring.h  |  10 +-
 include/uapi/linux/ublk_cmd.h  |   7 +-
 io_uring/io_uring.c            | 356 +++++++++++++++++++++++++++++----
 io_uring/io_uring.h            |  29 ++-
 io_uring/kbuf.c                |  60 ++++++
 io_uring/kbuf.h                |  13 ++
 io_uring/net.c                 |  31 ++-
 io_uring/opdef.c               |   5 +
 io_uring/opdef.h               |   2 +
 io_uring/rw.c                  |  20 +-
 io_uring/timeout.c             |   3 +
 io_uring/uring_cmd.c           |  28 +++
 15 files changed, 724 insertions(+), 53 deletions(-)

Comments

Ming Lei May 10, 2024, 2:02 p.m. UTC | #1
Please ignore V2. I will send V3 with simplification & cleanup, and
many fixes to the error handling code path.


Thanks,
Ming