[V5,5/8] io_uring: support sqe group with members depending on leader

Message ID 20240808162503.345913-6-ming.lei@redhat.com (mailing list archive)
State New
Series io_uring: support sqe group and provide group kbuf

Commit Message

Ming Lei Aug. 8, 2024, 4:24 p.m. UTC
IOSQE_SQE_GROUP currently starts queueing members only after the leader
has completed. This is done purely to simplify the implementation; the
behavior is not part of the UAPI, and it may be relaxed in the future so
that members can be queued concurrently with the leader.

However, some resources, such as a kernel buffer, can't cross OPs;
otherwise the buffer may easily be leaked if any OP fails or the
application panics.

Add the flag REQ_F_SQE_GROUP_DEP to allow members to depend on the group
leader explicitly, so that group members won't be queued until the leader
request is completed, while the leader's CQE is still committed only after
all members' CQEs are posted. This way, the kernel resource lifetime can
be aligned with the group leader or the group. One typical use case is
supporting zero copy for a device's internal buffer.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
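Illustration only, not kernel code: a minimal userspace sketch of the
ordering described in the commit message, assuming a group whose members
depend on the leader. All names here (toy_log, toy_run_group, EV_QUEUE,
EV_CQE, the "pending" counter) are hypothetical and exist only for this
model; the real logic lives in io_complete_group_leader() and
io_queue_group_members().

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16

/* Hypothetical toy types; none of these exist in the kernel. */
enum ev_kind { EV_QUEUE, EV_CQE };

struct toy_event {
	enum ev_kind kind;
	int id;			/* 0 is the leader, 1..N are members */
};

struct toy_log {
	struct toy_event ev[MAX_EVENTS];
	size_t n;
};

static void toy_log_add(struct toy_log *log, enum ev_kind kind, int id)
{
	assert(log->n < MAX_EVENTS);
	log->ev[log->n].kind = kind;
	log->ev[log->n].id = id;
	log->n++;
}

/*
 * Model one group with nr_members members that depend on the leader.
 * Records the order of queue and CQE events and returns their count.
 */
static size_t toy_run_group(struct toy_log *log, int nr_members)
{
	int pending = nr_members + 1;	/* leader plus members */
	int i;

	log->n = 0;

	/* The leader is issued first; its CQE is held back on completion. */
	toy_log_add(log, EV_QUEUE, 0);
	pending--;

	/* Members are queued only after the leader has completed. */
	for (i = 1; i <= nr_members; i++)
		toy_log_add(log, EV_QUEUE, i);

	/* Each member posts its own CQE. */
	for (i = 1; i <= nr_members; i++) {
		toy_log_add(log, EV_CQE, i);
		pending--;
	}

	/* The leader's CQE is committed last, once nothing is pending. */
	assert(pending == 0);
	toy_log_add(log, EV_CQE, 0);

	return log->n;
}
```

Calling toy_run_group(&log, 2) on a struct toy_log records the leader being
queued before either member and the leader's CQE after both member CQEs,
which is the window in which a leader-owned resource stays valid.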
 include/linux/io_uring_types.h | 3 +++
 io_uring/io_uring.c            | 8 +++++++-
 2 files changed, 10 insertions(+), 1 deletion(-)
Patch

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index c5250e585289..d0972e2a098f 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -469,6 +469,7 @@  enum {
 	REQ_F_BL_NO_RECYCLE_BIT,
 	REQ_F_BUFFERS_COMMIT_BIT,
 	REQ_F_SQE_GROUP_LEADER_BIT,
+	REQ_F_SQE_GROUP_DEP_BIT,
 
 	/* not a real bit, just to check we're not overflowing the space */
 	__REQ_F_LAST_BIT,
@@ -551,6 +552,8 @@  enum {
 	REQ_F_BUFFERS_COMMIT	= IO_REQ_FLAG(REQ_F_BUFFERS_COMMIT_BIT),
 	/* sqe group lead */
 	REQ_F_SQE_GROUP_LEADER	= IO_REQ_FLAG(REQ_F_SQE_GROUP_LEADER_BIT),
+	/* sqe group with members depending on leader */
+	REQ_F_SQE_GROUP_DEP	= IO_REQ_FLAG(REQ_F_SQE_GROUP_DEP_BIT),
 };
 
 typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 45a292445b18..b4f5dac85fa4 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -982,7 +982,13 @@  static void io_complete_group_leader(struct io_kiocb *req)
 	req->grp_refs -= 1;
 	WARN_ON_ONCE(req->grp_refs == 0);
 
-	/* TODO: queue members with leader in parallel */
+	/*
+	 * TODO: queue members with leader in parallel
+	 *
+	 * So far, REQ_F_SQE_GROUP_DEP relies on members being queued
+	 * after the leader is completed. If that changes in the future,
+	 * REQ_F_SQE_GROUP_DEP has to be respected in another way.
+	 */
 	if (req->grp_link)
 		io_queue_group_members(req);
 }