
[14/17] io_uring: add polling support for uring-cmd

Message ID 20220308152105.309618-15-joshi.k@samsung.com (mailing list archive)
State New, archived
Series io_uring passthru over nvme

Commit Message

Kanchan Joshi March 8, 2022, 3:21 p.m. UTC
From: Pankaj Raghav <p.raghav@samsung.com>

Enable the polling infrastructure for uring-cmd.
Collect the bio during submission, and use it to implement polling on
completion.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
---
 fs/io_uring.c            | 51 ++++++++++++++++++++++++++++++++++------
 include/linux/io_uring.h |  9 +++++--
 2 files changed, 51 insertions(+), 9 deletions(-)

Comments

Christoph Hellwig March 11, 2022, 6:50 a.m. UTC | #1
On Tue, Mar 08, 2022 at 08:51:02PM +0530, Kanchan Joshi wrote:
> +		if (req->opcode == IORING_OP_URING_CMD ||
> +		    req->opcode == IORING_OP_URING_CMD_FIXED) {
> +			/* uring_cmd structure does not contain kiocb struct */
> +			struct kiocb kiocb_uring_cmd;
> +
> +			kiocb_uring_cmd.private = req->uring_cmd.bio;
> +			kiocb_uring_cmd.ki_filp = req->uring_cmd.file;
> +			ret = req->uring_cmd.file->f_op->iopoll(&kiocb_uring_cmd,
> +			      &iob, poll_flags);
> +		} else {
> +			ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob,
> +							   poll_flags);
> +		}

This is just completely broken.  You absolutely do need the iocb
in struct uring_cmd for ->iopoll to work.
Kanchan Joshi March 14, 2022, 10:16 a.m. UTC | #2
On Fri, Mar 11, 2022 at 12:20 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Mar 08, 2022 at 08:51:02PM +0530, Kanchan Joshi wrote:
> > +             if (req->opcode == IORING_OP_URING_CMD ||
> > +                 req->opcode == IORING_OP_URING_CMD_FIXED) {
> > +                     /* uring_cmd structure does not contain kiocb struct */
> > +                     struct kiocb kiocb_uring_cmd;
> > +
> > +                     kiocb_uring_cmd.private = req->uring_cmd.bio;
> > +                     kiocb_uring_cmd.ki_filp = req->uring_cmd.file;
> > +                     ret = req->uring_cmd.file->f_op->iopoll(&kiocb_uring_cmd,
> > +                           &iob, poll_flags);
> > +             } else {
> > +                     ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob,
> > +                                                        poll_flags);
> > +             }
>
> This is just completely broken.  You absolutely do need the iocb
> in struct uring_cmd for ->iopoll to work.

But after you did bio-based polling, we need just the bio to poll.
The iocb is a big structure (48 bytes), and if we try to place it in
struct io_uring_cmd, we will just blow up the cacheline in io_uring
(the first one in io_kiocb).
So we just store that bio pointer in io_uring_cmd on submission
(please see patch 15).
And here on completion we pick that bio and put it into this local
iocb, simply because ->iopoll needs it.
Do you see anything I am still missing here?
Christoph Hellwig March 15, 2022, 8:57 a.m. UTC | #3
On Mon, Mar 14, 2022 at 03:46:08PM +0530, Kanchan Joshi wrote:
> But after you did bio-based polling, we need just the bio to poll.
> The iocb is a big structure (48 bytes), and if we try to place it in
> struct io_uring_cmd, we will just blow up the cacheline in io_uring
> (the first one in io_kiocb).
> So we just store that bio pointer in io_uring_cmd on submission
> (please see patch 15).
> And here on completion we pick that bio and put it into this local
> iocb, simply because ->iopoll needs it.
> Do you see anything I am still missing here?

Yes.  The VFS layer interface for polling is the kiocb.  Don't break
it.  The bio is just an implementation detail.
Kanchan Joshi March 16, 2022, 5:09 a.m. UTC | #4
On Tue, Mar 15, 2022 at 09:57:45AM +0100, Christoph Hellwig wrote:
>On Mon, Mar 14, 2022 at 03:46:08PM +0530, Kanchan Joshi wrote:
>> But after you did bio-based polling, we need just the bio to poll.
>> The iocb is a big structure (48 bytes), and if we try to place it in
>> struct io_uring_cmd, we will just blow up the cacheline in io_uring
>> (the first one in io_kiocb).
>> So we just store that bio pointer in io_uring_cmd on submission
>> (please see patch 15).
>> And here on completion we pick that bio and put it into this local
>> iocb, simply because ->iopoll needs it.
>> Do you see anything I am still missing here?
>
>Yes.  The VFS layer interface for polling is the kiocb.  Don't break
>it.  The bio is just an implementation detail.

So how about adding ->async_cmd_poll to file_operations (since this
corresponds to ->async_cmd)?
It would take a struct io_uring_cmd pointer as parameter.
->iopoll and ->async_cmd_poll would differ in what they accept (kiocb
vs io_uring_cmd). The provider may use bio_poll, or whatever else the
implementation requires.

For read/write, the submission interface took a kiocb, and the
completion interface (->iopoll) operated on the same.
For uring/async-cmd, the submission interface took an io_uring_cmd, but
completion used the kiocb-based ->iopoll. The new ->async_cmd_poll
would settle this.
Christoph Hellwig March 24, 2022, 6:30 a.m. UTC | #5
On Wed, Mar 16, 2022 at 10:39:05AM +0530, Kanchan Joshi wrote:
> So how about adding ->async_cmd_poll to file_operations (since this
> corresponds to ->async_cmd)?
> It would take a struct io_uring_cmd pointer as parameter.
> ->iopoll and ->async_cmd_poll would differ in what they accept (kiocb
> vs io_uring_cmd). The provider may use bio_poll, or whatever else the
> implementation requires.

That does sound way better than the current version at least.

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f04bb497bd88..8bd9401f9964 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2664,7 +2664,20 @@  static int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 		if (READ_ONCE(req->iopoll_completed))
 			break;
 
-		ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob, poll_flags);
+		if (req->opcode == IORING_OP_URING_CMD ||
+		    req->opcode == IORING_OP_URING_CMD_FIXED) {
+			/* uring_cmd structure does not contain kiocb struct */
+			struct kiocb kiocb_uring_cmd;
+
+			kiocb_uring_cmd.private = req->uring_cmd.bio;
+			kiocb_uring_cmd.ki_filp = req->uring_cmd.file;
+			ret = req->uring_cmd.file->f_op->iopoll(&kiocb_uring_cmd,
+			      &iob, poll_flags);
+		} else {
+			ret = kiocb->ki_filp->f_op->iopoll(kiocb, &iob,
+							   poll_flags);
+		}
+
 		if (unlikely(ret < 0))
 			return ret;
 		else if (ret)
@@ -2777,6 +2790,15 @@  static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
 			    wq_list_empty(&ctx->iopoll_list))
 				break;
 		}
+
+		/*
+		 * In some scenarios, completion callback has been queued up to be
+		 * completed in-task context but polling happens in the same task
+		 * not giving a chance for the completion callback to complete.
+		 */
+		if (current->task_works)
+			io_run_task_work();
+
 		ret = io_do_iopoll(ctx, !min);
 		if (ret < 0)
 			break;
@@ -4136,6 +4158,14 @@  static int io_linkat(struct io_kiocb *req, unsigned int issue_flags)
 	return 0;
 }
 
+static void io_complete_uring_cmd_iopoll(struct io_kiocb *req, long res)
+{
+	WRITE_ONCE(req->result, res);
+	/* order with io_iopoll_complete() checking ->result */
+	smp_wmb();
+	WRITE_ONCE(req->iopoll_completed, 1);
+}
+
 /*
  * Called by consumers of io_uring_cmd, if they originally returned
  * -EIOCBQUEUED upon receiving the command.
@@ -4146,7 +4176,11 @@  void io_uring_cmd_done(struct io_uring_cmd *ioucmd, ssize_t ret)
 
 	if (ret < 0)
 		req_set_fail(req);
-	io_req_complete(req, ret);
+
+	if (req->uring_cmd.flags & IO_URING_F_UCMD_POLLED)
+		io_complete_uring_cmd_iopoll(req, ret);
+	else
+		io_req_complete(req, ret);
 }
 EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 
@@ -4158,15 +4192,18 @@  static int io_uring_cmd_prep(struct io_kiocb *req,
 
 	if (!req->file->f_op->async_cmd || !(req->ctx->flags & IORING_SETUP_SQE128))
 		return -EOPNOTSUPP;
-	if (req->ctx->flags & IORING_SETUP_IOPOLL)
-		return -EOPNOTSUPP;
+	if (req->ctx->flags & IORING_SETUP_IOPOLL) {
+		ioucmd->flags = IO_URING_F_UCMD_POLLED;
+		ioucmd->bio = NULL;
+		req->iopoll_completed = 0;
+	} else {
+		ioucmd->flags = 0;
+	}
 	if (req->opcode == IORING_OP_URING_CMD_FIXED) {
 		req->imu = NULL;
 		io_req_set_rsrc_node(req, ctx);
 		req->buf_index = READ_ONCE(sqe->buf_index);
-		ioucmd->flags = IO_URING_F_UCMD_FIXEDBUFS;
-	} else {
-		ioucmd->flags = 0;
+		ioucmd->flags |= IO_URING_F_UCMD_FIXEDBUFS;
 	}
 
 	ioucmd->cmd = (void *) &sqe->cmd;
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index abad6175739e..65db83d703b7 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -9,6 +9,7 @@  enum io_uring_cmd_flags {
 	IO_URING_F_COMPLETE_DEFER	= 1,
 	IO_URING_F_UNLOCKED		= 2,
 	IO_URING_F_UCMD_FIXEDBUFS	= 4,
+	IO_URING_F_UCMD_POLLED		= 8,
 	/* int's last bit, sign checks are usually faster than a bit test */
 	IO_URING_F_NONBLOCK		= INT_MIN,
 };
@@ -16,8 +17,12 @@  enum io_uring_cmd_flags {
 struct io_uring_cmd {
 	struct file     *file;
 	void            *cmd;
-	/* for irq-completion - if driver requires doing stuff in task-context*/
-	void (*driver_cb)(struct io_uring_cmd *cmd);
+	union {
+		void *bio; /* used for polled completion */
+
+		/* for irq-completion - if driver requires doing stuff in task-context*/
+		void (*driver_cb)(struct io_uring_cmd *cmd);
+	};
 	u32             flags;
 	u32             cmd_op;
 	u16		cmd_len;