Message ID | 20240711082430.609597-1-xue01.he@samsung.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [v2] io_uring: Avoid polling configuration errors | expand |
On Thu, Jul 11, 2024 at 04:24:30PM +0800, hexue wrote: > + if (!ctx->check_poll_queue) { > + struct block_device *bdev; > + struct request_queue *q; > + struct inode *inode = req->file->f_inode; > + > + if (inode->i_rdev) { > + bdev = blkdev_get_no_open(inode->i_rdev); > + q = bdev->bd_queue; > + if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags)) > + pr_warn("the device does't configured with poll queues\n"); > + } > + ctx->check_poll_queue = true; > + } This is wrong for multiple reasons. One is that we can't simply poke into block device internals like this in a higher layer like io_uring. Second blkdev_get_no_open is in no way available for use outside the block layer. The fact that the even exist as separate helpers that aren't entirely hidden is a decade old layering violation in blk-cgroup. If you want to advertize this properly we'll need a flag in struct file or something similar.
>This is wrong for multiple reasons. One is that we can't simply poke >into block device internals like this in a higher layer like io_uring. >Second blkdev_get_no_open is in no way available for use outside the >block layer. The fact that the even exist as separate helpers that >aren't entirely hidden is a decade old layering violation in blk-cgroup. Got it, thanks. >If you want to advertize this properly we'll need a flag in struct >file or something similar. Thanks, I will try to do this. -- hexue
On 7/12/24 12:57 AM, hexue wrote: >> This is wrong for multiple reasons. One is that we can't simply poke >> into block device internals like this in a higher layer like io_uring. >> Second blkdev_get_no_open is in no way available for use outside the >> block layer. The fact that the even exist as separate helpers that >> aren't entirely hidden is a decade old layering violation in blk-cgroup. > > Got it, thanks. > >> If you want to advertize this properly we'll need a flag in struct >> file or something similar. > > Thanks, I will try to do this. My stance is still the same - why add all of this junk just to detect a misuse of polled IO? It doesn't make sense to me, it's the very definition of "doctor it hurts when I do this" - don't do it. So unless this has _zero_ overhead or extra code, which obviously isn't possible, or extraordinary arguments exists for why this should be added, I don't see this going anywhere.
>My stance is still the same - why add all of this junk just to detect a >misuse of polled IO? It doesn't make sense to me, it's the very >definition of "doctor it hurts when I do this" - don't do it. >So unless this has _zero_ overhead or extra code, which obviously isn't >possible, or extraordinary arguments exists for why this should be >added, I don't see this going anywhere. Actually, I just want users to know why they got wrong data, just a warning of an error, like doctor tell you why you do this will hurt. I think it's helpful for users to use tools accurately. and yes, this should be as simple as possible, I'll working on it. I'm not sure if I made myself clear and make sense to you? -- hexue
On 7/14/24 8:39 PM, hexue wrote: >> My stance is still the same - why add all of this junk just to detect a >> misuse of polled IO? It doesn't make sense to me, it's the very >> definition of "doctor it hurts when I do this" - don't do it. > >> So unless this has _zero_ overhead or extra code, which obviously isn't >> possible, or extraordinary arguments exists for why this should be >> added, I don't see this going anywhere. > > Actually, I just want users to know why they got wrong data, just a > warning of an error, like doctor tell you why you do this will hurt. I > think it's helpful for users to use tools accurately. and yes, this > should be as simple as possible, I'll working on it. I'm not sure if I > made myself clear and make sense to you? Certainly agree that that is an issue and a much more worthy reason for the addition. It's the main reason why -EOPNOTSUPP return would be more useful, and I'd probably argue the better way then to do it. It may indeed break existing use cases, but probably only because they are misconfigured. That then means that it'd be saner to do this on the block layer side, imho, as that's when the queue is resolved anyway, rather than attempt to hack around this on the issuing side.
On 7/15/24 10:59 AM, Jens Axboe wrote: >On 7/14/24 8:39 PM, hexue wrote: >>> My stance is still the same - why add all of this junk just to detect a >>> misuse of polled IO? It doesn't make sense to me, it's the very >>> definition of "doctor it hurts when I do this" - don't do it. >>> >>> So unless this has _zero_ overhead or extra code, which obviously isn't >>> possible, or extraordinary arguments exists for why this should be >>> added, I don't see this going anywhere. >> >> Actually, I just want users to know why they got wrong data, just a >> warning of an error, like doctor tell you why you do this will hurt. I >> think it's helpful for users to use tools accurately. and yes, this >> should be as simple as possible, I'll working on it. I'm not sure if I >> made myself clear and make sense to you? > >Certainly agree that that is an issue and a much more worthy reason for >the addition. It's the main reason why -EOPNOTSUPP return would be more >useful, and I'd probably argue the better way then to do it. It may >indeed break existing use cases, but probably only because they are >misconfigured. > >That then means that it'd be saner to do this on the block layer side, >imho, as that's when the queue is resolved anyway, rather than attempt >to hack around this on the issuing side. Implementing it at the block layer is indeed more reasonable, thanks for your affirmation and suggestion, I will look for an appropriate place in the path to perform the check. Thanks.
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index 91224bbcfa73..270c3edbbf21 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -428,6 +428,7 @@ struct io_ring_ctx { unsigned short n_sqe_pages; struct page **ring_pages; struct page **sqe_pages; + bool check_poll_queue; }; struct io_tw_state { diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 816e93e7f949..1b45b4c52ae0 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3463,8 +3463,10 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p, !(ctx->flags & IORING_SETUP_SQPOLL)) ctx->task_complete = true; - if (ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL)) + if (ctx->task_complete || (ctx->flags & IORING_SETUP_IOPOLL)) { ctx->lockless_cq = true; + ctx->check_poll_queue = false; + } /* * lazy poll_wq activation relies on ->task_complete for synchronisation diff --git a/io_uring/rw.c b/io_uring/rw.c index 1a2128459cb4..20f417152a17 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -772,6 +772,23 @@ static bool need_complete_io(struct io_kiocb *req) S_ISBLK(file_inode(req->file)->i_mode); } +static void check_poll_queue_state(struct io_ring_ctx *ctx, struct io_kiocb *req) +{ + if (!ctx->check_poll_queue) { + struct block_device *bdev; + struct request_queue *q; + struct inode *inode = req->file->f_inode; + + if (inode->i_rdev) { + bdev = blkdev_get_no_open(inode->i_rdev); + q = bdev->bd_queue; + if (!test_bit(QUEUE_FLAG_POLL, &q->queue_flags)) + pr_warn("the device does't configured with poll queues\n"); + } + ctx->check_poll_queue = true; + } +} + static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) { struct io_rw *rw = io_kiocb_to_cmd(req, struct io_rw); @@ -804,6 +821,7 @@ static int io_rw_init_file(struct io_kiocb *req, fmode_t mode) if (ctx->flags & IORING_SETUP_IOPOLL) { if (!(kiocb->ki_flags & IOCB_DIRECT) || !file->f_op->iopoll) return -EOPNOTSUPP; + check_poll_queue_state(ctx, req); kiocb->private = NULL; kiocb->ki_flags |= IOCB_HIPRI;
If user doesn't configured poll queue but do the polled IO, it will get a low performeance than regular poll, but 100% CPU uage. And there's no prompts or warnings. This patch aims to help users more easily verify their configurations correctly, avoiding time and performance losses. -- changes from v1: - without disrupting the original I/O process. - move judgement from block to io_uring. Signed-off-by: hexue <xue01.he@samsung.com> --- include/linux/io_uring_types.h | 1 + io_uring/io_uring.c | 4 +++- io_uring/rw.c | 18 ++++++++++++++++++ 3 files changed, 22 insertions(+), 1 deletion(-)