Message ID | 20250409024955.3626275-1-csander@purestorage.com (mailing list archive)
---|---
State | New
Series | ublk: skip blk_mq_tag_to_rq() bounds check
On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> The ublk driver calls blk_mq_tag_to_rq() in several places.
> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> against the number of tags and returning NULL if it is out of bounds.
> But all the calls from the ublk driver have already verified the tag
> against the ublk queue's queue depth. In ublk_commit_completion(),
> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> tag has already been checked in __ublk_ch_uring_cmd(). In
> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> __ublk_check_and_get_req(), the tag has already been checked in
> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> ublk_check_and_get_req().
>
> So just index the tagset's rqs array directly in the ublk driver.
> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.

Poking directly into block layer internals feels like a really bad
idea. If this is important enough we'll need a non-checking helper
in the core code, but as with all these kinds of micro-optimizations
it better have a really good justification.
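For context, the helper being bypassed is small; blk_mq_tag_to_rq() in
block/blk-mq-tag.c reads roughly as follows (paraphrased from the kernel
source around this time; details may differ between versions):

```c
/* Roughly the mainline implementation: bounds-check the tag against the
 * tagset size, prefetch the request, and return it (or NULL if the tag
 * is out of range). */
struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
{
	if (tag < tags->nr_tags) {
		prefetch(tags->rqs[tag]);
		return tags->rqs[tag];
	}

	return NULL;
}
```

The patch replaces each such call with a direct `tags->rqs[tag]` load,
relying on the bounds checks the driver has already performed.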
On 4/10/25 3:24 AM, Christoph Hellwig wrote:
> On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
>> The ublk driver calls blk_mq_tag_to_rq() in several places.
>> [...]
>
> Poking directly into block layer internals feels like a really bad
> idea. If this is important enough we'll need a non-checking helper
> in the core code, but as with all these kinds of micro-optimizations
> it better have a really good justification.

FWIW, I agree, and I also have a hard time imagining this making much of
a measurable difference. Caleb, was this based on "well, this seems
pointless", or was it something you noticed in profiling/testing?
On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> The ublk driver calls blk_mq_tag_to_rq() in several places.
> [...]

If blk_mq_tag_to_rq() turns out to be not efficient enough, we can kill
it in the fast path by storing the request in ublk_io, sharing space
with 'struct io_uring_cmd *', since the two lifetimes basically don't
overlap.

Thanks,
Ming
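A minimal sketch of this space-sharing idea, assuming an anonymous union
in struct ublk_io (the layout and comments here are illustrative, not
the actual driver code):

```c
/* Hypothetical layout for Ming's suggestion: the pending fetch command
 * and the inflight request are (mostly) not live at the same time, so
 * they could share storage instead of re-deriving the request from the
 * tag on every completion. */
struct ublk_io {
	__u64 addr;
	unsigned int flags;
	int res;

	union {
		/* valid while a FETCH_REQ uring_cmd waits in the kernel
		 * for the next I/O to dispatch */
		struct io_uring_cmd *cmd;
		/* valid while UBLK_IO_FLAG_OWNED_BY_SRV is set and the
		 * ublk server is handling the request */
		struct request *req;
	};
};
```

As the rest of the thread shows, UBLK_U_IO_NEED_GET_DATA complicates
this overlap, since it stores a new uring_cmd while the request is still
outstanding.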
On Thu, Apr 10, 2025 at 6:13 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 4/10/25 3:24 AM, Christoph Hellwig wrote:
> > On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> >> The ublk driver calls blk_mq_tag_to_rq() in several places.
> >> [...]
> >
> > Poking directly into block layer internals feels like a really bad
> > idea. If this is important enough we'll need a non-checking helper
> > in the core code, but as with all these kinds of micro-optimizations
> > it better have a really good justification.
>
> FWIW, I agree, and I also have a hard time imagining this making much of
> a measurable difference. Caleb, was this based on "well, this seems
> pointless", or was it something you noticed in profiling/testing?

That's true, the nr_tags check doesn't show up super prominently in a
CPU profile. The atomic reference counting in __ublk_check_and_get_req()
or ublk_commit_completion() is significantly more expensive. Still, it
seems like unnecessary work. nr_tags is in a different cache line from
rqs, so there is the potential for a cache miss. And the prefetch() is
another unnecessary cache miss in the cases where ublk doesn't access
any of struct request's fields.

I am happy to add a "blk_mq_tag_to_rq_unchecked()" helper to avoid
accessing the blk-mq internals.

Best,
Caleb
On 4/11/25 12:36 PM, Caleb Sander Mateos wrote:
> On Thu, Apr 10, 2025 at 6:13 AM Jens Axboe <axboe@kernel.dk> wrote:
>> FWIW, I agree, and I also have a hard time imagining this making much of
>> a measurable difference. Caleb, was this based on "well, this seems
>> pointless", or was it something you noticed in profiling/testing?
>
> That's true, the nr_tags check doesn't show up super prominently in a
> CPU profile. The atomic reference counting in __ublk_check_and_get_req()
> or ublk_commit_completion() is significantly more expensive. Still, it
> seems like unnecessary work.

Matching atomics on either side is always going to be miserable, and I'd
wager a much bigger issue than the minor thing that this patch is trying
to address...

> nr_tags is in a different cache line from rqs, so there is the
> potential for a cache miss. And the prefetch() is another unnecessary
> cache miss in the cases where ublk doesn't access any of struct
> request's fields.
>
> I am happy to add a "blk_mq_tag_to_rq_unchecked()" helper to avoid
> accessing the blk-mq internals.

Or maybe go the route that Ming suggested? But if you go the other
route, I'd just add a __blk_mq_tag_to_rq() and have blk_mq_tag_to_rq()
call that, with the validation happening before.
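The second option might look something like this sketch (the
__blk_mq_tag_to_rq() name is the one proposed above; neither helper
exists in this form in mainline at the time of the thread):

```c
/* Sketch: split the raw lookup from the validation. Callers of the
 * double-underscore variant must guarantee tag < tags->nr_tags
 * themselves. */
static inline struct request *__blk_mq_tag_to_rq(struct blk_mq_tags *tags,
						 unsigned int tag)
{
	return tags->rqs[tag];
}

struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
{
	if (tag >= tags->nr_tags)
		return NULL;

	return __blk_mq_tag_to_rq(tags, tag);
}
```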
On Fri, Apr 11, 2025 at 12:56 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> If blk_mq_tag_to_rq() turns out to be not efficient enough, we can kill
> it in the fast path by storing the request in ublk_io, sharing space
> with 'struct io_uring_cmd *', since the two lifetimes basically don't
> overlap.

I agree it would be nice to just store a pointer in struct ublk_io to
its current struct request. I guess we would set it in
ubq_complete_io_cmd() and clear it in ublk_commit_completion() (matching
when UBLK_IO_FLAG_OWNED_BY_SRV is set), as well as in ublk_timeout() for
UBLK_F_UNPRIVILEGED_DEV?

I'm not sure it is possible to overlap the fields, though. When using
UBLK_U_IO_NEED_GET_DATA, the cmd field is overwritten with a pointer to
the UBLK_U_IO_NEED_GET_DATA command, but the req would need to be
recorded earlier, upon completion of the UBLK_U_IO_(COMMIT_AND_)FETCH_REQ
command. Would you be okay with 2 separate fields?

Best,
Caleb
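The two-field variant being asked about could look like this
(hypothetical layout, field names invented for illustration):

```c
/* Hypothetical two-field layout: cmd and req are tracked independently,
 * so a NEED_GET_DATA uring_cmd can overwrite cmd while the request
 * recorded at FETCH_REQ completion time is preserved. */
struct ublk_io {
	__u64 addr;
	unsigned int flags;
	int res;

	struct io_uring_cmd *cmd;	/* pending FETCH_REQ or NEED_GET_DATA */
	struct request *req;		/* set while OWNED_BY_SRV */
};
```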
On Fri, Apr 11, 2025 at 12:51:10PM -0700, Caleb Sander Mateos wrote:
> On Fri, Apr 11, 2025 at 12:56 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > If blk_mq_tag_to_rq() turns out to be not efficient enough, we can kill
> > it in the fast path by storing the request in ublk_io, sharing space
> > with 'struct io_uring_cmd *', since the two lifetimes basically don't
> > overlap.
>
> I agree it would be nice to just store a pointer in struct ublk_io to
> its current struct request. I guess we would set it in
> ubq_complete_io_cmd() and clear it in ublk_commit_completion() (matching
> when UBLK_IO_FLAG_OWNED_BY_SRV is set), as well as in ublk_timeout() for
> UBLK_F_UNPRIVILEGED_DEV?
>
> I'm not sure it is possible to overlap the fields, though. When using
> UBLK_U_IO_NEED_GET_DATA, the cmd field is overwritten with a pointer to
> the UBLK_U_IO_NEED_GET_DATA command, but the req would need

UBLK_U_IO_NEED_GET_DATA and UBLK_IO_COMMIT_AND_FETCH_REQ actually use
the uring_cmd and request in the same way. In particular, for
UBLK_U_IO_NEED_GET_DATA, the uring_cmd pointer needn't be stored in
ublk_io. Or simply keep using blk_mq_tag_to_rq() for that case only.

> to be recorded earlier, upon completion of the
> UBLK_U_IO_(COMMIT_AND_)FETCH_REQ command.

Each one can be moved into a local variable first, then stored. If we go
this way, helpers can be added to set/get the cmd/req from ublk_io,
keeping the implementation reliable and readable.

> Would you be okay with 2 separate fields?

Yeah, I think it is fine to do it first.

Thanks,
Ming
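With two separate fields, the accessors suggested here might look
roughly like this (names and the WARN_ON_ONCE policy are invented for
illustration):

```c
/* Hypothetical set/get helpers for the cmd/req fields of ublk_io. */
static inline void ublk_io_set_cmd(struct ublk_io *io,
				   struct io_uring_cmd *cmd)
{
	io->cmd = cmd;
}

static inline void ublk_io_set_req(struct ublk_io *io, struct request *req)
{
	/* recorded when the request is handed off to the ublk server */
	io->req = req;
}

static inline struct request *ublk_io_get_req(struct ublk_io *io)
{
	/* only meaningful while the server owns the I/O */
	WARN_ON_ONCE(!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV));
	return io->req;
}
```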
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2fd05c1bd30b..5b07329f5197 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -210,11 +210,11 @@ struct ublk_params_header {
 };

 static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq);

 static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
-		struct ublk_queue *ubq, int tag, size_t offset);
+		struct ublk_queue *ubq, unsigned tag, size_t offset);
 static inline unsigned int ublk_req_build_flags(struct request *req);
 static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
 						   int tag);
 static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
 {
@@ -1515,11 +1515,11 @@ static void ublk_commit_completion(struct ublk_device *ub,
 	/* now this cmd slot is owned by nbd driver */
 	io->flags &= ~UBLK_IO_FLAG_OWNED_BY_SRV;
 	io->res = ub_cmd->result;

 	/* find the io request and complete */
-	req = blk_mq_tag_to_rq(ub->tag_set.tags[qid], tag);
+	req = ub->tag_set.tags[qid]->rqs[tag];
 	if (WARN_ON_ONCE(unlikely(!req)))
 		return;

 	if (req_op(req) == REQ_OP_ZONE_APPEND)
 		req->__sector = ub_cmd->zone_append_lba;
@@ -1533,11 +1533,11 @@ static void ublk_commit_completion(struct ublk_device *ub,
  * blk-mq queue, so we are called exclusively with blk-mq and ubq_daemon
  * context, so everything is serialized.
  */
 static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
-	int i;
+	unsigned i;

 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];

 		if (!(io->flags & UBLK_IO_FLAG_ACTIVE)) {
@@ -1545,11 +1545,11 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)

 			/*
 			 * Either we fail the request or ublk_rq_task_work_cb
 			 * will do it
 			 */
-			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
+			rq = ub->tag_set.tags[ubq->q_id]->rqs[i];
 			if (rq && blk_mq_request_started(rq)) {
 				io->flags |= UBLK_IO_FLAG_ABORTED;
 				__ublk_fail_req(ubq, io, rq);
 			}
 		}
@@ -1824,14 +1824,14 @@ static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
 		complete_all(&ub->completion);
 	mutex_unlock(&ub->mutex);
 }

 static void ublk_handle_need_get_data(struct ublk_device *ub, int q_id,
-		int tag)
+		unsigned tag)
 {
 	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
-	struct request *req = blk_mq_tag_to_rq(ub->tag_set.tags[q_id], tag);
+	struct request *req = ub->tag_set.tags[q_id]->rqs[tag];

 	ublk_queue_cmd(ubq, req);
 }

 static inline int ublk_check_cmd_op(u32 cmd_op)
@@ -1989,11 +1989,11 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,

 		ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
 		ublk_mark_io_ready(ub, ubq);
 		break;
 	case UBLK_IO_COMMIT_AND_FETCH_REQ:
-		req = blk_mq_tag_to_rq(ub->tag_set.tags[ub_cmd->q_id], tag);
+		req = ub->tag_set.tags[ub_cmd->q_id]->rqs[tag];

 		if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
 			goto out;

 		if (ublk_need_map_io(ubq)) {
@@ -2033,18 +2033,18 @@ static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 			__func__, cmd_op, tag, ret, io->flags);
 	return ret;
 }

 static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
-		struct ublk_queue *ubq, int tag, size_t offset)
+		struct ublk_queue *ubq, unsigned tag, size_t offset)
 {
 	struct request *req;

 	if (!ublk_need_req_ref(ubq))
 		return NULL;

-	req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
+	req = ub->tag_set.tags[ubq->q_id]->rqs[tag];
 	if (!req)
 		return NULL;

 	if (!ublk_get_req_ref(ubq, req))
 		return NULL;
The ublk driver calls blk_mq_tag_to_rq() in several places.
blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
against the number of tags and returning NULL if it is out of bounds.
But all the calls from the ublk driver have already verified the tag
against the ublk queue's queue depth. In ublk_commit_completion(),
ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
tag has already been checked in __ublk_ch_uring_cmd(). In
ublk_abort_queue(), the loop bounds the tag by the queue depth. In
__ublk_check_and_get_req(), the tag has already been checked in
__ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
ublk_check_and_get_req().

So just index the tagset's rqs array directly in the ublk driver.
Convert the tags to unsigned, as blk_mq_tag_to_rq() does.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
 drivers/block/ublk_drv.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)