diff mbox series

[2/3] block: blk_mq_rq_ctx_init cache ctx/q/hctx

Message ID 06a05d0b16a6504461502140c3d1e9c8cd91b862.1634589262.git.asml.silence@gmail.com (mailing list archive)
State New, archived
Headers show
Series blk_mq_rq_ctx_init() optimisations | expand

Commit Message

Pavel Begunkov Oct. 18, 2021, 8:37 p.m. UTC
We should have enough of registers in blk_mq_rq_ctx_init(), store them
in local vars, so we don't keep reloading them.

note: keeping q->elevator may look unnecessary, but it's also used
inside inlined blk_mq_tags_from_data().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 block/blk-mq.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig Oct. 19, 2021, 6 a.m. UTC | #1
On Mon, Oct 18, 2021 at 09:37:28PM +0100, Pavel Begunkov wrote:
> We should have enough of registers in blk_mq_rq_ctx_init(), store them
> in local vars, so we don't keep reloading them.
> 
> note: keeping q->elevator may look unnecessary, but it's also used
> inside inlined blk_mq_tags_from_data().

Is this really making a difference?  I'd expect todays hyper-optimizing
compilers to not be tricked into specific register allocations just by
adding a local variable.
Pavel Begunkov Oct. 19, 2021, 10:38 a.m. UTC | #2
On 10/19/21 07:00, Christoph Hellwig wrote:
> On Mon, Oct 18, 2021 at 09:37:28PM +0100, Pavel Begunkov wrote:
>> We should have enough of registers in blk_mq_rq_ctx_init(), store them
>> in local vars, so we don't keep reloading them.
>>
>> note: keeping q->elevator may look unnecessary, but it's also used
>> inside inlined blk_mq_tags_from_data().
> 
> Is this really making a difference?  I'd expect todays hyper-optimizing
> compilers to not be tricked into specific register allocations just by
> adding a local variable.

Looking again, there are only reads before the use site, so indeed
shouldn't matter. Was left from first versions of the patch where
it wasn't the case.
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1d2e2fd4043e..fa4de25c3bcb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -312,10 +312,14 @@  static inline bool blk_mq_need_time_stamp(struct request *rq)
 static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 		unsigned int tag, u64 alloc_time_ns)
 {
+	struct blk_mq_ctx *ctx = data->ctx;
+	struct blk_mq_hw_ctx *hctx = data->hctx;
+	struct request_queue *q = data->q;
+	struct elevator_queue *e = q->elevator;
 	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
 	struct request *rq = tags->static_rqs[tag];
 
-	if (data->q->elevator) {
+	if (e) {
 		rq->rq_flags = RQF_ELV;
 		rq->tag = BLK_MQ_NO_TAG;
 		rq->internal_tag = tag;
@@ -330,13 +334,13 @@  static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	else
 		rq->start_time_ns = 0;
 	/* csd/requeue_work/fifo_time is initialized before use */
-	rq->q = data->q;
-	rq->mq_ctx = data->ctx;
-	rq->mq_hctx = data->hctx;
+	rq->q = q;
+	rq->mq_ctx = ctx;
+	rq->mq_hctx = hctx;
 	rq->cmd_flags = data->cmd_flags;
 	if (data->flags & BLK_MQ_REQ_PM)
 		rq->rq_flags |= RQF_PM;
-	if (blk_queue_io_stat(data->q))
+	if (blk_queue_io_stat(q))
 		rq->rq_flags |= RQF_IO_STAT;
 	rq->rq_disk = NULL;
 	rq->part = NULL;