Message ID | ecdfacd0967a22d88b7779e2efd09e040825d0f8.1684154817.git.asml.silence@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Enable IOU_F_TWQ_LAZY_WAKE for passthrough | expand |
On Mon, May 15, 2023 at 01:54:43PM +0100, Pavel Begunkov wrote: > Use IOU_F_TWQ_LAZY_WAKE via iou_cmd_exec_in_task_lazy() for passthrough > commands completion. It further delays the execution of task_work for > DEFER_TASKRUN until there are enough of task_work items queued to meet > the waiting criteria, which reduces the number of wake ups we issue. Why wouldn't you just do that unconditionally for io_uring_cmd_complete_in_task?
On 5/17/23 08:23, Christoph Hellwig wrote: > On Mon, May 15, 2023 at 01:54:43PM +0100, Pavel Begunkov wrote: >> Use IOU_F_TWQ_LAZY_WAKE via iou_cmd_exec_in_task_lazy() for passthrough >> commands completion. It further delays the execution of task_work for >> DEFER_TASKRUN until there are enough of task_work items queued to meet >> the waiting criteria, which reduces the number of wake ups we issue. > > Why wouldn't you just do that unconditionally for > io_uring_cmd_complete_in_task? 1) ublk does secondary batching and so may produce multiple cqes, that's not supported. I believe Ming sent patches removing it, but I'd rather not deal with conflicts for now. 2) Some users may have dependencies b/w requests, i.e. a request will only complete when another request's task_work is executed. 3) There might be use cases when you don't wont it to be delayed, IO retries would be a good example. I wouldn't also use it for control paths like ublk_ctrl_uring_cmd. Let's better keep io_uring_cmd_complete_in_task() as the default option for the interface.
On Wed, May 17, 2023 at 01:32:53PM +0100, Pavel Begunkov wrote: > 1) ublk does secondary batching and so may produce multiple cqes, > that's not supported. I believe Ming sent patches removing it, > but I'd rather not deal with conflicts for now. > > 2) Some users may have dependencies b/w requests, i.e. a request > will only complete when another request's task_work is executed. > > 3) There might be use cases when you don't wont it to be delayed, > IO retries would be a good example. I wouldn't also use it for > control paths like ublk_ctrl_uring_cmd. You speak a lot of some users and some cases when the only users are ublk and nvme, both of which would obviously benefit. If you don't want conflicts wait for Ming to finish his work and then we can do this cleanly and without leaving dead code around.
On 5/17/23 13:39, Christoph Hellwig wrote: > On Wed, May 17, 2023 at 01:32:53PM +0100, Pavel Begunkov wrote: >> 1) ublk does secondary batching and so may produce multiple cqes, >> that's not supported. I believe Ming sent patches removing it, >> but I'd rather not deal with conflicts for now. >> >> 2) Some users may have dependencies b/w requests, i.e. a request >> will only complete when another request's task_work is executed. >> >> 3) There might be use cases when you don't wont it to be delayed, >> IO retries would be a good example. I wouldn't also use it for >> control paths like ublk_ctrl_uring_cmd. > > You speak a lot of some users and some cases when the only users > are ublk and nvme, both of which would obviously benefit. > > If you don't want conflicts wait for Ming to finish his work > and then we can do this cleanly and without leaving dead code > around. Aside that you decided to ignore the third point, that's a generic interface, not nvme specific, there are patches for net cmds, someone even tried to use it for drm. How do you think new users are supposed to appear if the only helper doing the job can hang the userspace for their use case? Well, then maybe it'll remain nvme/ublk specific with such an approach. It is clean, and it's not dead code, and we should not remove the simpler and more straightforward helper.
On Wed, May 17, 2023 at 02:30:47PM +0100, Pavel Begunkov wrote: > Aside that you decided to ignore the third point, that's a > generic interface, not nvme specific, there are patches for > net cmds, someone even tried to use it for drm. How do you > think new users are supposed to appear if the only helper > doing the job can hang the userspace for their use case? > Well, then maybe it'll remain nvme/ublk specific with such > an approach. New users can add new code when it's actualy needed. We don't bloat the kernel for maybe in the future crap as a policy.
On 5/17/23 6:32 AM, Pavel Begunkov wrote: > On 5/17/23 08:23, Christoph Hellwig wrote: >> On Mon, May 15, 2023 at 01:54:43PM +0100, Pavel Begunkov wrote: >>> Use IOU_F_TWQ_LAZY_WAKE via iou_cmd_exec_in_task_lazy() for passthrough >>> commands completion. It further delays the execution of task_work for >>> DEFER_TASKRUN until there are enough of task_work items queued to meet >>> the waiting criteria, which reduces the number of wake ups we issue. >> >> Why wouldn't you just do that unconditionally for >> io_uring_cmd_complete_in_task? > > 1) ublk does secondary batching and so may produce multiple cqes, > that's not supported. I believe Ming sent patches removing it, > but I'd rather not deal with conflicts for now. Ming, what's the status of those patches? Looks like we'll end up with a dependency regardless of the ordering of these. Since these patches are here now, sanest approach seems to move forward with this series and defer the conflict resolving to the ublk side.
On 5/17/23 14:53, Christoph Hellwig wrote: > On Wed, May 17, 2023 at 02:30:47PM +0100, Pavel Begunkov wrote: >> Aside that you decided to ignore the third point, that's a >> generic interface, not nvme specific, there are patches for >> net cmds, someone even tried to use it for drm. How do you >> think new users are supposed to appear if the only helper >> doing the job can hang the userspace for their use case? >> Well, then maybe it'll remain nvme/ublk specific with such >> an approach. > > New users can add new code when it's actualy needed. We don't > bloat the kernel for maybe in the future crap as a policy. Let me put it for you this way, it's an absolutely horrendous idea to leave the old innocently looking name, i.e. io_uring_cmd_complete_in_task(), and add there a bunch of restrictions no new user would care about, that's called shooting yourself in the leg. So, we need to rename the function, which, again, for absolutely no reason adds dependency on ublk. Why doing that instead of waiting until ublk is converted? That's a big mystery.
On Wed, May 17, 2023 at 01:31:00PM -0600, Jens Axboe wrote: > On 5/17/23 6:32 AM, Pavel Begunkov wrote: > > On 5/17/23 08:23, Christoph Hellwig wrote: > >> On Mon, May 15, 2023 at 01:54:43PM +0100, Pavel Begunkov wrote: > >>> Use IOU_F_TWQ_LAZY_WAKE via iou_cmd_exec_in_task_lazy() for passthrough > >>> commands completion. It further delays the execution of task_work for > >>> DEFER_TASKRUN until there are enough of task_work items queued to meet > >>> the waiting criteria, which reduces the number of wake ups we issue. > >> > >> Why wouldn't you just do that unconditionally for > >> io_uring_cmd_complete_in_task? > > > > 1) ublk does secondary batching and so may produce multiple cqes, > > that's not supported. I believe Ming sent patches removing it, > > but I'd rather not deal with conflicts for now. > > Ming, what's the status of those patches? Looks like we'll end up > with a dependency regardless of the ordering of these. Since these > patches are here now, sanest approach seems to move forward with > this series and defer the conflict resolving to the ublk side. I didn't send patch to remove the batch in ublk, such as, the following line code: ublk_queue_cmd(): ... if (!llist_add(&data->node, &ubq->io_cmds)) return; ... But I did want to kill it given __io_req_task_work_add() can do batching process, but we have to re-order request in this list, so can't remove it now simply, see commit: 7d4a93176e01 ("ublk_drv: don't forward io commands in reserve order") Pavel must have misunderstood the following one as the batch removal: https://lore.kernel.org/linux-block/20230427124414.319945-2-ming.lei@redhat.com/ thanks, Ming
diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 81c5c9e38477..52ed1094ccbb 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -521,7 +521,7 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, if (cookie != NULL && blk_rq_is_poll(req)) nvme_uring_task_cb(ioucmd, IO_URING_F_UNLOCKED); else - io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_cb); + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb); return RQ_END_IO_FREE; } @@ -543,7 +543,7 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, if (cookie != NULL && blk_rq_is_poll(req)) nvme_uring_task_meta_cb(ioucmd, IO_URING_F_UNLOCKED); else - io_uring_cmd_complete_in_task(ioucmd, nvme_uring_task_meta_cb); + io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_meta_cb); return RQ_END_IO_NONE; }
Use IOU_F_TWQ_LAZY_WAKE via iou_cmd_exec_in_task_lazy() for passthrough commands completion. It further delays the execution of task_work for DEFER_TASKRUN until there are enough of task_work items queued to meet the waiting criteria, which reduces the number of wake ups we issue. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> --- drivers/nvme/host/ioctl.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)