
ublk: skip blk_mq_tag_to_rq() bounds check

Message ID 20250409024955.3626275-1-csander@purestorage.com (mailing list archive)
State New
Series ublk: skip blk_mq_tag_to_rq() bounds check

Commit Message

Caleb Sander Mateos April 9, 2025, 2:49 a.m. UTC
The ublk driver calls blk_mq_tag_to_rq() in several places.
blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
against the number of tags and returning NULL if it is out of bounds.
But all the calls from the ublk driver have already verified the tag
against the ublk queue's queue depth. In ublk_commit_completion(),
ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
tag has already been checked in __ublk_ch_uring_cmd(). In
ublk_abort_queue(), the loop bounds the tag by the queue depth. In
__ublk_check_and_get_req(), the tag has already been checked in
__ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
ublk_check_and_get_req().

So just index the tagset's rqs array directly in the ublk driver.
Convert the tags to unsigned, as blk_mq_tag_to_rq() does.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
---
 drivers/block/ublk_drv.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)
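
For context, the checked helper the patch bypasses looks roughly like this in
block/blk-mq.c (a sketch; the exact upstream body may differ slightly):

	struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
	{
		if (tag < tags->nr_tags) {
			prefetch(tags->rqs[tag]);	/* pulls in the request's cache line */
			return tags->rqs[tag];
		}
		return NULL;
	}

The patch replaces each of these calls in ublk_drv.c with a direct
ub->tag_set.tags[qid]->rqs[tag] index, relying on the tag having already been
validated against the queue depth.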

Comments

Christoph Hellwig April 10, 2025, 9:24 a.m. UTC | #1
On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> The ublk driver calls blk_mq_tag_to_rq() in several places.
> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> against the number of tags and returning NULL if it is out of bounds.
> But all the calls from the ublk driver have already verified the tag
> against the ublk queue's queue depth. In ublk_commit_completion(),
> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> tag has already been checked in __ublk_ch_uring_cmd(). In
> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> __ublk_check_and_get_req(), the tag has already been checked in
> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> ublk_check_and_get_req().
> 
> So just index the tagset's rqs array directly in the ublk driver.
> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.

Poking directly into block layer internals feels like a really bad
idea.  If this is important enough we'll need a non-checking helper
in the core code, but as with all these kinds of micro-optimizations
it better have a really good justification.
Jens Axboe April 10, 2025, 1:13 p.m. UTC | #2
On 4/10/25 3:24 AM, Christoph Hellwig wrote:
> On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
>> The ublk driver calls blk_mq_tag_to_rq() in several places.
>> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
>> against the number of tags and returning NULL if it is out of bounds.
>> But all the calls from the ublk driver have already verified the tag
>> against the ublk queue's queue depth. In ublk_commit_completion(),
>> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
>> tag has already been checked in __ublk_ch_uring_cmd(). In
>> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
>> __ublk_check_and_get_req(), the tag has already been checked in
>> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
>> ublk_check_and_get_req().
>>
>> So just index the tagset's rqs array directly in the ublk driver.
>> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.
> 
> Poking directly into block layer internals feels like a really bad
> idea.  If this is important enough we'll need a non-checking helper
> in the core code, but as with all these kinds of micro-optimizations
> it better have a really good justification.

FWIW, I agree, and I also have a hard time imagining this making much of
a measurable difference. Caleb, was this based on "well this seems
pointless", or was it something you noticed in profiling/testing?
Ming Lei April 11, 2025, 7:56 a.m. UTC | #3
On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> The ublk driver calls blk_mq_tag_to_rq() in several places.
> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> against the number of tags and returning NULL if it is out of bounds.
> But all the calls from the ublk driver have already verified the tag
> against the ublk queue's queue depth. In ublk_commit_completion(),
> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> tag has already been checked in __ublk_ch_uring_cmd(). In
> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> __ublk_check_and_get_req(), the tag has already been checked in
> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> ublk_check_and_get_req().
> 
> So just index the tagset's rqs array directly in the ublk driver.
> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.

If blk_mq_tag_to_rq() turns out not to be efficient enough, we can kill it in
the fast path by storing the request in ublk_io, sharing space with 'struct
io_uring_cmd *', since the two lifetimes basically don't overlap.



Thanks,
Ming
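
A minimal sketch of that idea, assuming the current layout of struct ublk_io
(other fields elided; this is not from an actual patch):

	struct ublk_io {
		/* userspace buffer address from io cmd */
		__u64 addr;
		unsigned int flags;
		int res;

		union {
			struct io_uring_cmd *cmd;	/* pending fetch command waiting for a request */
			struct request *req;		/* request currently handed to the ublk server */
		};
	};

With this, the fast-path lookups could read io->req instead of going through
the tagset's rqs[] array at all.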
Caleb Sander Mateos April 11, 2025, 6:36 p.m. UTC | #4
On Thu, Apr 10, 2025 at 6:13 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 4/10/25 3:24 AM, Christoph Hellwig wrote:
> > On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> >> The ublk driver calls blk_mq_tag_to_rq() in several places.
> >> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> >> against the number of tags and returning NULL if it is out of bounds.
> >> But all the calls from the ublk driver have already verified the tag
> >> against the ublk queue's queue depth. In ublk_commit_completion(),
> >> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> >> tag has already been checked in __ublk_ch_uring_cmd(). In
> >> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> >> __ublk_check_and_get_req(), the tag has already been checked in
> >> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> >> ublk_check_and_get_req().
> >>
> >> So just index the tagset's rqs array directly in the ublk driver.
> >> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.
> >
> > Poking directly into block layer internals feels like a really bad
> > idea.  If this is important enough we'll need a non-checking helper
> > in the core code, but as with all these kinds of micro-optimizations
> > it better have a really good justification.
>
> FWIW, I agree, and I also have a hard time imagining this making much of
> a measurable difference. Caleb, was this based "well this seems
> pointless" or was it something you noticed in profiling/testing?

That's true, the nr_tags check doesn't show up super prominently in a
CPU profile. The atomic reference counting in
__ublk_check_and_get_req() or ublk_commit_completion() is
significantly more expensive. Still, it seems like unnecessary work.
nr_tags is in a different cache line from rqs, so there is the
potential for a cache miss. And the prefetch() is another unnecessary
cache miss in the cases where ublk doesn't access any of struct
request's fields.
I am happy to add a "blk_mq_tag_to_rq_unchecked()" helper to avoid
accessing the blk-mq internals.

Best,
Caleb
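
The helper Caleb offers here would presumably be a one-line wrapper along
these lines (a sketch; no blk_mq_tag_to_rq_unchecked() exists upstream):

	static inline struct request *blk_mq_tag_to_rq_unchecked(struct blk_mq_tags *tags,
								  unsigned int tag)
	{
		/* caller guarantees tag < tags->nr_tags; no bounds check, no prefetch */
		return tags->rqs[tag];
	}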
Jens Axboe April 11, 2025, 6:40 p.m. UTC | #5
On 4/11/25 12:36 PM, Caleb Sander Mateos wrote:
> On Thu, Apr 10, 2025 at 6:13 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 4/10/25 3:24 AM, Christoph Hellwig wrote:
>>> On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
>>>> The ublk driver calls blk_mq_tag_to_rq() in several places.
>>>> blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
>>>> against the number of tags and returning NULL if it is out of bounds.
>>>> But all the calls from the ublk driver have already verified the tag
>>>> against the ublk queue's queue depth. In ublk_commit_completion(),
>>>> ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
>>>> tag has already been checked in __ublk_ch_uring_cmd(). In
>>>> ublk_abort_queue(), the loop bounds the tag by the queue depth. In
>>>> __ublk_check_and_get_req(), the tag has already been checked in
>>>> __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
>>>> ublk_check_and_get_req().
>>>>
>>>> So just index the tagset's rqs array directly in the ublk driver.
>>>> Convert the tags to unsigned, as blk_mq_tag_to_rq() does.
>>>
>>> Poking directly into block layer internals feels like a really bad
>>> idea.  If this is important enough we'll need a non-checking helper
>>> in the core code, but as with all these kinds of micro-optimizations
>>> it better have a really good justification.
>>
>> FWIW, I agree, and I also have a hard time imagining this making much of
>> a measurable difference. Caleb, was this based "well this seems
>> pointless" or was it something you noticed in profiling/testing?
> 
> That's true, the nr_tags check doesn't show up super prominently in a
> CPU profile. The atomic reference counting in
> __ublk_check_and_get_req() or ublk_commit_completion() is
> significantly more expensive. Still, it seems like unnecessary work.

Matching atomics on either side is always going to be miserable, and I'd
wager a much bigger issue than the minor thing that this patch is trying
to address...

> nr_tags is in a different cache line from rqs, so there is the
> potential for a cache miss. And the prefetch() is another unnecessary
> cache miss in the cases where ublk doesn't access any of struct
> request's fields.
> I am happy to add a "blk_mq_tag_to_rq_unchecked()" helper to avoid
> accessing the blk-mq internals.

Or maybe go the route that Ming suggested? But if you go the other
route, I'd just add a __blk_mq_tag_to_rq() and have blk_mq_tag_to_rq()
call that with the validation happening before.
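
Jens's variant would keep the checked helper as a thin wrapper around the
unchecked one, roughly (a sketch under that naming; not an actual upstream
interface):

	static inline struct request *__blk_mq_tag_to_rq(struct blk_mq_tags *tags,
							  unsigned int tag)
	{
		return tags->rqs[tag];
	}

	struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
	{
		/* validate the tag here, then defer to the unchecked lookup */
		if (tag < tags->nr_tags) {
			prefetch(tags->rqs[tag]);
			return __blk_mq_tag_to_rq(tags, tag);
		}
		return NULL;
	}

Callers that have already bounds-checked the tag, like ublk, would use
__blk_mq_tag_to_rq() directly.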
Caleb Sander Mateos April 11, 2025, 7:51 p.m. UTC | #6
On Fri, Apr 11, 2025 at 12:56 AM Ming Lei <ming.lei@redhat.com> wrote:
>
> On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> > The ublk driver calls blk_mq_tag_to_rq() in several places.
> > blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> > against the number of tags and returning NULL if it is out of bounds.
> > But all the calls from the ublk driver have already verified the tag
> > against the ublk queue's queue depth. In ublk_commit_completion(),
> > ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> > tag has already been checked in __ublk_ch_uring_cmd(). In
> > ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> > __ublk_check_and_get_req(), the tag has already been checked in
> > __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> > ublk_check_and_get_req().
> >
> > So just index the tagset's rqs array directly in the ublk driver.
> > Convert the tags to unsigned, as blk_mq_tag_to_rq() does.
>
> If blk_mq_tag_to_rq() turns out to be not efficient enough, we can kill it
> in fast path by storing it in ublk_io and sharing space with 'struct io_uring_cmd *',
> since the two's lifetime isn't overlapped basically.

I agree it would be nice to just store a pointer in struct
ublk_io to its current struct request. I guess we would set it in
ubq_complete_io_cmd() and clear it in ublk_commit_completion()
(matching when UBLK_IO_FLAG_OWNED_BY_SRV is set), as well as in
ublk_timeout() for UBLK_F_UNPRIVILEGED_DEV?

I'm not sure it is possible to overlap the fields, though. When using
UBLK_U_IO_NEED_GET_DATA, the cmd field is overwritten with a pointer to
the UBLK_U_IO_NEED_GET_DATA command, but the req would need
to be recorded earlier upon completion of the
UBLK_U_IO_(COMMIT_AND_)FETCH_REQ command. Would you be okay with 2
separate fields?

Best,
Caleb
Ming Lei April 12, 2025, 12:27 a.m. UTC | #7
On Fri, Apr 11, 2025 at 12:51:10PM -0700, Caleb Sander Mateos wrote:
> On Fri, Apr 11, 2025 at 12:56 AM Ming Lei <ming.lei@redhat.com> wrote:
> >
> > On Tue, Apr 08, 2025 at 08:49:54PM -0600, Caleb Sander Mateos wrote:
> > > The ublk driver calls blk_mq_tag_to_rq() in several places.
> > > blk_mq_tag_to_rq() tolerates an invalid tag for the tagset, checking it
> > > against the number of tags and returning NULL if it is out of bounds.
> > > But all the calls from the ublk driver have already verified the tag
> > > against the ublk queue's queue depth. In ublk_commit_completion(),
> > > ublk_handle_need_get_data(), and case UBLK_IO_COMMIT_AND_FETCH_REQ, the
> > > tag has already been checked in __ublk_ch_uring_cmd(). In
> > > ublk_abort_queue(), the loop bounds the tag by the queue depth. In
> > > __ublk_check_and_get_req(), the tag has already been checked in
> > > __ublk_ch_uring_cmd(), in the case of ublk_register_io_buf(), or in
> > > ublk_check_and_get_req().
> > >
> > > So just index the tagset's rqs array directly in the ublk driver.
> > > Convert the tags to unsigned, as blk_mq_tag_to_rq() does.
> >
> > If blk_mq_tag_to_rq() turns out to be not efficient enough, we can kill it
> > in fast path by storing it in ublk_io and sharing space with 'struct io_uring_cmd *',
> > since the two's lifetime isn't overlapped basically.
> 
> I agree it would be nice to just store a pointer from in struct
> ublk_io to its current struct request. I guess we would set it in
> ubq_complete_io_cmd() and clear it in ublk_commit_completion()
> (matching when UBLK_IO_FLAG_OWNED_BY_SRV is set), as well as in
> ublk_timeout() for UBLK_F_UNPRIVILEGED_DEV?
> 
> I'm not sure it is possible to overlap the fields, though. When using
> UBLK_U_IO_NEED_GET_DATA, the cmd field is overwritten with the a
> pointer to the UBLK_U_IO_NEED_GET_DATA command, but the req would need

Both UBLK_U_IO_NEED_GET_DATA & UBLK_IO_COMMIT_AND_FETCH_REQ actually share the
same usage of uring_cmd/request.

Especially for UBLK_U_IO_NEED_GET_DATA, the uring cmd pointer needn't be
stored in ublk_io. Or we could just keep using blk_mq_tag_to_rq() for that
case only.

> to be recorded earlier upon completion of the
> UBLK_U_IO_(COMMIT_AND_)FETCH_REQ command.

Each one can be moved into a local variable first, then stored.

If we do it this way, helpers can be added to set/get the cmd/req in ublk_io,
so the implementation stays reliable & readable.

> Would you be okay with 2 separate fields?

Yeah, I think it is fine to do it first.


Thanks,
Ming
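
A sketch of the two-separate-fields direction agreed on here, with the
set/get helpers Ming mentions (field and helper names are illustrative, not
from an actual patch):

	struct ublk_io {
		/* userspace buffer address from io cmd */
		__u64 addr;
		unsigned int flags;
		int res;

		struct io_uring_cmd *cmd;	/* pending fetch command, if any */
		struct request *req;		/* request handed to the ublk server, if any */
	};

	static inline void ublk_io_set_req(struct ublk_io *io, struct request *req)
	{
		/* e.g. when the request is dispatched to the server and
		 * UBLK_IO_FLAG_OWNED_BY_SRV is set */
		io->req = req;
	}

	static inline struct request *ublk_io_get_req(struct ublk_io *io)
	{
		/* replaces the tagset rqs[] lookups in the fast path */
		return io->req;
	}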

Patch

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 2fd05c1bd30b..5b07329f5197 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -210,11 +210,11 @@  struct ublk_params_header {
 };
 
 static bool ublk_abort_requests(struct ublk_device *ub, struct ublk_queue *ubq);
 
 static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
-		struct ublk_queue *ubq, int tag, size_t offset);
+		struct ublk_queue *ubq, unsigned tag, size_t offset);
 static inline unsigned int ublk_req_build_flags(struct request *req);
 static inline struct ublksrv_io_desc *ublk_get_iod(struct ublk_queue *ubq,
 						   int tag);
 static inline bool ublk_dev_is_user_copy(const struct ublk_device *ub)
 {
@@ -1515,11 +1515,11 @@  static void ublk_commit_completion(struct ublk_device *ub,
 	/* now this cmd slot is owned by nbd driver */
 	io->flags &= ~UBLK_IO_FLAG_OWNED_BY_SRV;
 	io->res = ub_cmd->result;
 
 	/* find the io request and complete */
-	req = blk_mq_tag_to_rq(ub->tag_set.tags[qid], tag);
+	req = ub->tag_set.tags[qid]->rqs[tag];
 	if (WARN_ON_ONCE(unlikely(!req)))
 		return;
 
 	if (req_op(req) == REQ_OP_ZONE_APPEND)
 		req->__sector = ub_cmd->zone_append_lba;
@@ -1533,11 +1533,11 @@  static void ublk_commit_completion(struct ublk_device *ub,
  * blk-mq queue, so we are called exclusively with blk-mq and ubq_daemon
  * context, so everything is serialized.
  */
 static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 {
-	int i;
+	unsigned i;
 
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
 		if (!(io->flags & UBLK_IO_FLAG_ACTIVE)) {
@@ -1545,11 +1545,11 @@  static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 
 			/*
 			 * Either we fail the request or ublk_rq_task_work_cb
 			 * will do it
 			 */
-			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
+			rq = ub->tag_set.tags[ubq->q_id]->rqs[i];
 			if (rq && blk_mq_request_started(rq)) {
 				io->flags |= UBLK_IO_FLAG_ABORTED;
 				__ublk_fail_req(ubq, io, rq);
 			}
 		}
@@ -1824,14 +1824,14 @@  static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
 		complete_all(&ub->completion);
 	mutex_unlock(&ub->mutex);
 }
 
 static void ublk_handle_need_get_data(struct ublk_device *ub, int q_id,
-		int tag)
+		unsigned tag)
 {
 	struct ublk_queue *ubq = ublk_get_queue(ub, q_id);
-	struct request *req = blk_mq_tag_to_rq(ub->tag_set.tags[q_id], tag);
+	struct request *req = ub->tag_set.tags[q_id]->rqs[tag];
 
 	ublk_queue_cmd(ubq, req);
 }
 
 static inline int ublk_check_cmd_op(u32 cmd_op)
@@ -1989,11 +1989,11 @@  static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 
 		ublk_fill_io_cmd(io, cmd, ub_cmd->addr);
 		ublk_mark_io_ready(ub, ubq);
 		break;
 	case UBLK_IO_COMMIT_AND_FETCH_REQ:
-		req = blk_mq_tag_to_rq(ub->tag_set.tags[ub_cmd->q_id], tag);
+		req = ub->tag_set.tags[ub_cmd->q_id]->rqs[tag];
 
 		if (!(io->flags & UBLK_IO_FLAG_OWNED_BY_SRV))
 			goto out;
 
 		if (ublk_need_map_io(ubq)) {
@@ -2033,18 +2033,18 @@  static int __ublk_ch_uring_cmd(struct io_uring_cmd *cmd,
 			__func__, cmd_op, tag, ret, io->flags);
 	return ret;
 }
 
 static inline struct request *__ublk_check_and_get_req(struct ublk_device *ub,
-		struct ublk_queue *ubq, int tag, size_t offset)
+		struct ublk_queue *ubq, unsigned tag, size_t offset)
 {
 	struct request *req;
 
 	if (!ublk_need_req_ref(ubq))
 		return NULL;
 
-	req = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], tag);
+	req = ub->tag_set.tags[ubq->q_id]->rqs[tag];
 	if (!req)
 		return NULL;
 
 	if (!ublk_get_req_ref(ubq, req))
 		return NULL;