Message ID | 20230419102930.2979231-2-leitao@debian.org (mailing list archive) |
---|---|
State | New |
Series | io_uring: Pass whole sqe to commands |
On Wed, Apr 19, 2023 at 03:29:29AM -0700, Breno Leitao wrote:
>  	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> -	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
> +	const struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)ioucmd->sqe->cmd;

Please don't add the pointless cast. And in general avoid the overly
long lines.

I suspect most other users should also just define their structures
const instead of adding all these casts, which are a sign of problems,
but that's a pre-existing issue.

>  	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> -	size_t cmd_size;
> +	size_t size = sizeof(struct io_uring_sqe);
>
>  	BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
>  	BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
>
> -	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
> +	if (req->ctx->flags & IORING_SETUP_SQE128)
> +		size <<= 1;

Why does this stop using uring_cmd_pdu_size()?

Hello Christoph,

On Thu, Apr 20, 2023 at 06:57:12AM +0200, Christoph Hellwig wrote:
> On Wed, Apr 19, 2023 at 03:29:29AM -0700, Breno Leitao wrote:
> >  	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> > -	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
> > +	const struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)ioucmd->sqe->cmd;
>
> Please don't add the pointless cast. And in general avoid the overly
> long lines.

Ack!

> I suspect most other users should also just define their structures
> const instead of adding all these casts, which are a sign of problems,
> but that's a pre-existing issue.
>
> >  	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
> > -	size_t cmd_size;
> > +	size_t size = sizeof(struct io_uring_sqe);
> >
> >  	BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
> >  	BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
> >
> > -	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
> > +	if (req->ctx->flags & IORING_SETUP_SQE128)
> > +		size <<= 1;
>
> Why does this stop using uring_cmd_pdu_size()?

Before, only the cmd payload (sqe->cmd) was being copied to the async
structure. We are copying over the whole sqe now, since we can use SQE
fields inside the ioctl callbacks (instead of only cmd fields). So, the
copy now is 64 bytes for a single SQE or 128 bytes for a double SQE.

This has two major advantages:

* It is not necessary to create a cmd structure for every command
  operation (which will be mapped in sqe->cmd) to pass arguments. The
  arguments can be passed as fields in the SQE.

* sqe->cmd is 16 bytes on single SQEs. Passing the whole SQE to cmd
  reduces the need to use double SQEs for operations that require large
  fields, such as {g,s}etsockopt().

There is also some discussion about this at
https://lkml.org/lkml/2023/4/6/786

Thanks for the review!

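To make the change in copy semantics concrete, here is a minimal userspace sketch in C. The `mock_` types and `mock_prep_async()` are stand-ins invented for illustration and are not the kernel definitions; only the 64-byte SQE size, the 16-byte cmd area, and the copy-then-repoint pattern are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mock of a 64-byte SQE: 48 bytes of fixed fields, then a 16-byte cmd area. */
struct mock_sqe {
	uint64_t fields[6];
	uint8_t cmd[16];
};

/* Mock of struct io_uring_cmd after the patch: it points at the whole SQE. */
struct mock_uring_cmd {
	const struct mock_sqe *sqe;
};

/*
 * Copy the whole SQE (one slot, or two when sqe128 is set) into stable
 * storage and repoint the command at the copy. The caller must provide
 * async_data with room for 2 * sizeof(struct mock_sqe) bytes.
 */
static void mock_prep_async(struct mock_uring_cmd *ioucmd, void *async_data,
			    bool sqe128)
{
	size_t size = sizeof(struct mock_sqe);

	if (sqe128)
		size *= 2;

	memcpy(async_data, ioucmd->sqe, size);
	ioucmd->sqe = async_data;
}
```

After the copy, a callback reading through `ioucmd->sqe` sees every SQE field, not just the cmd bytes, which is the point of the series.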
On Thu, Apr 20, 2023 at 05:29:18AM -0700, Breno Leitao wrote:
> > > -	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
> > > +	if (req->ctx->flags & IORING_SETUP_SQE128)
> > > +		size <<= 1;
> >
> > Why does this stop using uring_cmd_pdu_size()?
>
> Before, only the cmd payload (sqe->cmd) was being copied to the async
> structure. We are copying over the whole sqe now, since we can use SQE
> fields inside the ioctl callbacks (instead of only cmd fields). So, the
> copy now is 64 bytes for single SQE or 128 for double SQEs.

That's the point of this series and I get it. But why do we remove
the nice and self-documenting helper that returns once or twice
the sizeof of the SQE structure and instead add a magic open coded
left shift?

On Thu, Apr 20, 2023 at 02:31:39PM +0200, Christoph Hellwig wrote:
> On Thu, Apr 20, 2023 at 05:29:18AM -0700, Breno Leitao wrote:
> > > > -	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
> > > > +	if (req->ctx->flags & IORING_SETUP_SQE128)
> > > > +		size <<= 1;
> > >
> > > Why does this stop using uring_cmd_pdu_size()?
> >
> > Before, only the cmd payload (sqe->cmd) was being copied to the async
> > structure. We are copying over the whole sqe now, since we can use SQE
> > fields inside the ioctl callbacks (instead of only cmd fields). So, the
> > copy now is 64 bytes for single SQE or 128 for double SQEs.
>
> That's the point of this series and I get it. But why do we remove
> the nice and self-documenting helper that returns once or twice
> the sizeof of the SQE structure and instead add a magic open coded
> left shift?

uring_cmd_pdu_size() returns the size of the payload, not the size of
the SQE structure. Basically it returns 16 bytes for a single SQE or 80
bytes for a double SQE.

Since we are not copying only the payload anymore, it is no longer
needed here. Now we are copying 64 bytes for a single SQE or 128 bytes
for a double SQE.

Do you prefer I create a helper that returns the SQE size, instead of
doing the left shift?

Thank you!

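The difference between the two sizes can be sketched in plain C. Everything prefixed `mock_` below is a hypothetical stand-in, not a kernel symbol: `mock_pdu_size()` reproduces the 16/80 values asserted by the BUILD_BUG_ON() checks in the patch, and `mock_sqe_size()` is one possible shape for the SQE-size helper being proposed, under the assumption that the cmd area starts at byte 48 of a 64-byte SQE.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of a 64-byte SQE: 48 bytes of fixed fields, then a 16-byte cmd area. */
struct mock_sqe {
	uint64_t fields[6];
	uint8_t cmd[16];
};

/* Payload only: the bytes from the start of cmd to the end of the slot(s). */
static inline size_t mock_pdu_size(bool sqe128)
{
	return (sqe128 ? 2 : 1) * sizeof(struct mock_sqe) -
	       offsetof(struct mock_sqe, cmd);
}

/* Whole SQE: one slot, or two slots when the ring uses 128-byte SQEs. */
static inline size_t mock_sqe_size(bool sqe128)
{
	return (sqe128 ? 2 : 1) * sizeof(struct mock_sqe);
}
```

Spelling the doubling as a multiple of `sizeof()` keeps the intent visible at the call site, which is the self-documenting property the review asks for.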
On Thu, Apr 20, 2023 at 05:38:02AM -0700, Breno Leitao wrote:
> Since we are not copying only the payload anymore, it is no longer
> needed here. Now we are copying 64 bytes for a single SQE or 128 bytes
> for a double SQE.
>
> Do you prefer I create a helper that returns the SQE size, instead of
> doing the left shift?

I think a helper would be nice. And adding another sizeof(sqe) seems
more self-documenting than the shift, but if you really prefer the
shift at least write a good comment explaining it.

On Thu, Apr 20, 2023 at 06:57:12AM +0200, Christoph Hellwig wrote:
> On Wed, Apr 19, 2023 at 03:29:29AM -0700, Breno Leitao wrote:
> >  	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> > -	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
> > +	const struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)ioucmd->sqe->cmd;
>
> Please don't add the pointless cast. And in general avoid the overly
> long lines.

If I don't add this cast, the compiler complains with the following
error:

drivers/nvme/host/ioctl.c: In function ‘nvme_uring_cmd_io’:
drivers/nvme/host/ioctl.c:555:37: error: initialization of ‘const struct nvme_uring_cmd *’ from incompatible pointer type ‘const __u8 *’ {aka ‘const unsigned char *’} [-Werror=incompatible-pointer-types]
  const struct nvme_uring_cmd *cmd = ioucmd->sqe->cmd;

On Fri, Apr 21, 2023 at 08:11:31AM -0700, Breno Leitao wrote:
> On Thu, Apr 20, 2023 at 06:57:12AM +0200, Christoph Hellwig wrote:
> > On Wed, Apr 19, 2023 at 03:29:29AM -0700, Breno Leitao wrote:
> > >  	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
> > > -	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
> > > +	const struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)ioucmd->sqe->cmd;
> >
> > Please don't add the pointless cast. And in general avoid the overly
> > long lines.
>
> If I don't add this cast, the compiler complains with the following
> error:
>
> drivers/nvme/host/ioctl.c: In function ‘nvme_uring_cmd_io’:
> drivers/nvme/host/ioctl.c:555:37: error: initialization of ‘const struct nvme_uring_cmd *’ from incompatible pointer type ‘const __u8 *’ {aka ‘const unsigned char *’} [-Werror=incompatible-pointer-types]
>   const struct nvme_uring_cmd *cmd = ioucmd->sqe->cmd;

Oh. I think then we need a helper to get the private data from the
io_uring_cmd, as an interface that requires casts in all callers is one
asking for bugs.

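One way such a helper could look is sketched below in userspace C. The names `mock_cmd_payload()`, `mock_get_nvme_cmd()`, and all `mock_` types are hypothetical and chosen only for this illustration. The idea is that the accessor returns `const void *`, which C converts implicitly to any pointer to a const object type, so callers can drop the cast while the compiler still rejects an attempt to discard `const`.

```c
#include <stdint.h>

/* Mock of a 64-byte SQE: 48 bytes of fixed fields, then a 16-byte cmd area. */
struct mock_sqe {
	uint64_t fields[6];
	uint8_t cmd[16];
};

/* Mock of struct io_uring_cmd after the patch. */
struct mock_uring_cmd {
	const struct mock_sqe *sqe;
};

/* Mock of a driver-private command layout that fits in the cmd area. */
struct mock_nvme_cmd {
	uint32_t opcode;
	uint32_t flags;
	uint64_t addr;
};

/* Hypothetical accessor: hands out the payload as an untyped const pointer. */
static inline const void *mock_cmd_payload(const struct mock_uring_cmd *ioucmd)
{
	return ioucmd->sqe->cmd;
}

/* A caller assigns the result directly; no cast appears at the call site. */
static const struct mock_nvme_cmd *
mock_get_nvme_cmd(const struct mock_uring_cmd *ioucmd)
{
	const struct mock_nvme_cmd *cmd = mock_cmd_payload(ioucmd);

	return cmd;
}
```

With a single accessor, the conversion from the byte array to the driver type lives in one place instead of being repeated in every handler.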
On Thu, Apr 20, 2023 at 02:46:15PM +0200, Christoph Hellwig wrote:
> On Thu, Apr 20, 2023 at 05:38:02AM -0700, Breno Leitao wrote:
> > Since we are not copying only the payload anymore, it is no longer
> > needed here. Now we are copying 64 bytes for a single SQE or 128 bytes
> > for a double SQE.
> >
> > Do you prefer I create a helper that returns the SQE size, instead of
> > doing the left shift?
>
> I think a helper would be nice. And adding another sizeof(sqe) seems
> more self-documenting than the shift, but if you really prefer the
> shift at least write a good comment explaining it.

Agreed, this is a good idea. I've fixed it in the nvme code by creating
a helper function.

The same problem happens in ublk_drv, and I've also fixed it there in a
new patch:

https://lkml.org/lkml/2023/4/30/94

Thanks for the review.

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index c73cc57ec547..ec23a3c9fac8 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -1263,7 +1263,7 @@ static void ublk_handle_need_get_data(struct ublk_device *ub, int q_id,
 
 static int ublk_ch_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
 {
-	struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->cmd;
+	struct ublksrv_io_cmd *ub_cmd = (struct ublksrv_io_cmd *)cmd->sqe->cmd;
 	struct ublk_device *ub = cmd->file->private_data;
 	struct ublk_queue *ubq;
 	struct ublk_io *io;
@@ -1567,7 +1567,7 @@ static struct ublk_device *ublk_get_device_from_id(int idx)
 
 static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	int ublksrv_pid = (int)header->data[0];
 	struct gendisk *disk;
 	int ret = -EINVAL;
@@ -1630,7 +1630,7 @@ static int ublk_ctrl_start_dev(struct ublk_device *ub, struct io_uring_cmd *cmd)
 static int ublk_ctrl_get_queue_affinity(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 	cpumask_var_t cpumask;
 	unsigned long queue;
@@ -1681,7 +1681,7 @@ static inline void ublk_dump_dev_info(struct ublksrv_ctrl_dev_info *info)
 
 static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 	struct ublksrv_ctrl_dev_info info;
 	struct ublk_device *ub;
@@ -1844,7 +1844,7 @@ static int ublk_ctrl_del_dev(struct ublk_device **p_ub)
 
 static inline void ublk_ctrl_cmd_dump(struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 
 	pr_devel("%s: cmd_op %x, dev id %d qid %d data %llx buf %llx len %u\n",
 			__func__, cmd->cmd_op, header->dev_id, header->queue_id,
@@ -1863,7 +1863,7 @@ static int ublk_ctrl_stop_dev(struct ublk_device *ub)
 static int ublk_ctrl_get_dev_info(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 
 	if (header->len < sizeof(struct ublksrv_ctrl_dev_info) || !header->addr)
@@ -1894,7 +1894,7 @@ static void ublk_ctrl_fill_params_devt(struct ublk_device *ub)
 static int ublk_ctrl_get_params(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 	struct ublk_params_header ph;
 	int ret;
@@ -1925,7 +1925,7 @@ static int ublk_ctrl_get_params(struct ublk_device *ub,
 static int ublk_ctrl_set_params(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 	struct ublk_params_header ph;
 	int ret = -EFAULT;
@@ -1983,7 +1983,7 @@ static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
 static int ublk_ctrl_start_recovery(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	int ret = -EINVAL;
 	int i;
 
@@ -2025,7 +2025,7 @@ static int ublk_ctrl_start_recovery(struct ublk_device *ub,
 static int ublk_ctrl_end_recovery(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	int ublksrv_pid = (int)header->data[0];
 	int ret = -EINVAL;
 
@@ -2092,7 +2092,7 @@ static int ublk_char_dev_permission(struct ublk_device *ub,
 static int ublk_ctrl_uring_cmd_permission(struct ublk_device *ub,
 		struct io_uring_cmd *cmd)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	bool unprivileged = ub->dev_info.flags & UBLK_F_UNPRIVILEGED_DEV;
 	void __user *argp = (void __user *)(unsigned long)header->addr;
 	char *dev_path = NULL;
@@ -2171,7 +2171,7 @@ static int ublk_ctrl_uring_cmd_permission(struct ublk_device *ub,
 static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
 		unsigned int issue_flags)
 {
-	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->sqe->cmd;
 	struct ublk_device *ub = NULL;
 	int ret = -EINVAL;
 
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d24ea2e05156..351dff872fa0 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -552,7 +552,7 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns,
 		struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec)
 {
 	struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd);
-	const struct nvme_uring_cmd *cmd = ioucmd->cmd;
+	const struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)ioucmd->sqe->cmd;
 	struct request_queue *q = ns ? ns->queue : ctrl->admin_q;
 	struct nvme_uring_data d;
 	struct nvme_command c;
diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h
index 35b9328ca335..2dfc81dd6d1a 100644
--- a/include/linux/io_uring.h
+++ b/include/linux/io_uring.h
@@ -24,7 +24,7 @@ enum io_uring_cmd_flags {
 
 struct io_uring_cmd {
 	struct file	*file;
-	const void	*cmd;
+	const struct io_uring_sqe *sqe;
 	union {
 		/* callback to defer completions to task context */
 		void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index cca7c5b55208..3b9c6489b8b6 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -627,7 +627,7 @@ const struct io_cold_def io_cold_defs[] = {
 	},
 	[IORING_OP_URING_CMD] = {
 		.name			= "URING_CMD",
-		.async_size		= uring_cmd_pdu_size(1),
+		.async_size		= 2 * sizeof(struct io_uring_sqe),
 		.prep_async		= io_uring_cmd_prep_async,
 	},
 	[IORING_OP_SEND_ZC] = {
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 5113c9a48583..5cb2e39e99f9 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -69,15 +69,16 @@ EXPORT_SYMBOL_GPL(io_uring_cmd_done);
 int io_uring_cmd_prep_async(struct io_kiocb *req)
 {
 	struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
-	size_t cmd_size;
+	size_t size = sizeof(struct io_uring_sqe);
 
 	BUILD_BUG_ON(uring_cmd_pdu_size(0) != 16);
 	BUILD_BUG_ON(uring_cmd_pdu_size(1) != 80);
 
-	cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
+	if (req->ctx->flags & IORING_SETUP_SQE128)
+		size <<= 1;
 
-	memcpy(req->async_data, ioucmd->cmd, cmd_size);
-	ioucmd->cmd = req->async_data;
+	memcpy(req->async_data, ioucmd->sqe, size);
+	ioucmd->sqe = req->async_data;
 	return 0;
 }
 
@@ -103,7 +104,7 @@ int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req->imu = ctx->user_bufs[index];
 		io_req_set_rsrc_node(req, ctx, 0);
 	}
-	ioucmd->cmd = sqe->cmd;
+	ioucmd->sqe = sqe;
 	ioucmd->cmd_op = READ_ONCE(sqe->cmd_op);
 	return 0;
 }