
[v2,3/3] blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq

Message ID 20230626050405.781253-4-chengming.zhou@linux.dev (mailing list archive)
State New, archived
Series blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq

Commit Message

Chengming Zhou June 26, 2023, 5:04 a.m. UTC
From: Chengming Zhou <zhouchengming@bytedance.com>

iocost relies on rq start_time_ns and alloc_time_ns to tell the saturation
state of the block device. Most of the time a request is allocated after
rq_qos_throttle(), so its alloc_time_ns and start_time_ns are not affected.

But for the plug batched allocation introduced by commit 47c122e35d7e
("block: pre-allocate requests if plug is started and is a batch"), we can
call rq_qos_throttle() after the request has been allocated. This is what
blk_mq_get_cached_request() does.

In this case, the cached request's alloc_time_ns and start_time_ns are much
earlier than the time it is actually used if we block in any qos ->throttle().

This patch fixes it by setting alloc_time_ns and start_time_ns to now
when the pre-allocated rq is actually used.

Note that we don't skip setting alloc_time_ns and start_time_ns for the
pre-allocated rqs, since the first returned rq still needs them to be set.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 block/blk-mq.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
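
The skew described in the commit message can be reproduced in a small userspace model. This is only a sketch under stated assumptions: `fake_now_ns`, `struct fake_rq`, `fake_batch_alloc()`, `fake_throttle()`, `fake_get_cached()` and `fake_stamp_lag_ns()` are hypothetical stand-ins and not kernel APIs, and the clock is a plain counter rather than `ktime_get_ns()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_get_ns(): a counter advanced by hand. */
static uint64_t fake_now_ns;

/* Hypothetical, heavily reduced model of struct request. */
struct fake_rq {
	uint64_t alloc_time_ns;
	uint64_t start_time_ns;
};

/* Model of batched pre-allocation: every rq in the batch gets the same stamp. */
static void fake_batch_alloc(struct fake_rq *rqs, size_t nr)
{
	assert(rqs != NULL);
	for (size_t i = 0; i < nr; i++) {
		rqs[i].alloc_time_ns = fake_now_ns;
		rqs[i].start_time_ns = fake_now_ns;
	}
}

/* Model of blocking in a qos ->throttle() callback for wait_ns. */
static void fake_throttle(uint64_t wait_ns)
{
	fake_now_ns += wait_ns;
}

/*
 * Model of taking a cached rq after throttling. With restamp == false the
 * stamps from batch allocation are kept (behaviour before the patch); with
 * restamp == true they are reset to "now", as the patch does.
 */
static struct fake_rq *fake_get_cached(struct fake_rq *rq, uint64_t wait_ns,
				       bool restamp)
{
	assert(rq != NULL);
	fake_throttle(wait_ns);
	if (restamp) {
		rq->alloc_time_ns = fake_now_ns;
		rq->start_time_ns = fake_now_ns;
	}
	return rq;
}

/* How far behind "now" the rq's start stamp is at the moment it is used. */
static uint64_t fake_stamp_lag_ns(const struct fake_rq *rq)
{
	assert(rq != NULL);
	return fake_now_ns - rq->start_time_ns;
}
```

In this model, a cached rq taken with `restamp == false` after a 5000 ns wait carries a start stamp that is 5000 ns behind the moment it is used, while `restamp == true` leaves no lag.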

Comments

Tejun Heo June 26, 2023, 8:46 p.m. UTC | #1
Hello,

I only glanced at the blk-mq core part but in general this looks a lot better
than the previous one.

On Mon, Jun 26, 2023 at 01:04:05PM +0800, chengming.zhou@linux.dev wrote:
> Note we don't skip setting alloc_time_ns and start_time_ns for all
> pre-allocated rq, since the first returned rq still need to be set.

This part is a bit curious to me tho. Why do we need to set it at batch
allocation time and then at actual dispensing from the batch later? Who uses
the alloc time stamp in between?

Thanks.
Chengming Zhou June 27, 2023, 11:32 a.m. UTC | #2
On 2023/6/27 04:46, Tejun Heo wrote:
> Hello,
> 
> I only glanced the blk-mq core part but in general this looks a lot better
> than the previous one.

Thanks for your review!

> 
> On Mon, Jun 26, 2023 at 01:04:05PM +0800, chengming.zhou@linux.dev wrote:
>> Note we don't skip setting alloc_time_ns and start_time_ns for all
>> pre-allocated rq, since the first returned rq still need to be set.
> 
> This part is a bit curious for me tho. Why do we need to set it at batch
> allocation time and then at actual dispensing from the batch later? Who uses
> the alloc time stamp inbetween?
> 

Yes, this part needs more explanation; my explanation was not clear.

Now the batched pre-allocation code looks like this:

```
if (!rq_list_empty(plug->cached_rq)) {
	/* get a pre-allocated rq from the plug cache */
	/* this patch sets alloc and start time here */
	return rq;
} else {
	rq = __blk_mq_alloc_requests_batch();	/* do batched allocation (1) */
	/* (2) */
	return rq;
}
```

In (1) we allocate some requests, push them onto the plug list, and pop one request
to return to the caller. So this popped request needs its time set at batch allocation time.

Yes, we can also set this popped request's time in (2), just before returning it to the caller.

Since all requests in a batch allocation use the same alloc and start time, this patch
just leaves that as it is and resets the time when the request is actually used.

I think both ways are OK. Do you think it's better to set the time only for the popped
request and leave the other requests' time at 0? If so, I can update the patch to do that.

Thanks.
Tejun Heo June 27, 2023, 6:47 p.m. UTC | #3
Hello,

On Tue, Jun 27, 2023 at 07:32:42PM +0800, Chengming Zhou wrote:
> Since all requests in batch allocation use the same alloc and start time, so this patch
> just leave it as it is, and reset it at actual used time.
> 
> I think both way is ok, do you think it's better to just set the popped one request, leave
> other requests time to 0 ? If so, I can update to do it.

I think it'd be clearer if the rule is that the alloc time is set once when
the request is actually dispensed for use in all cases, so yeah, let's just
set it once when it actually starts getting used.

Thanks.
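
The rule agreed on above (stamp a request once, at the point where it is dispensed for use) can be sketched in userspace as follows. This is not the code of the updated patchset; `sketch_now_ns`, `struct sketch_rq`, `struct sketch_plug`, `sketch_batch_fill()` and `sketch_dispense()` are hypothetical names used for illustration, and the plug cache is modelled as a plain singly linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ktime_get_ns(). */
static uint64_t sketch_now_ns;

/* Hypothetical, reduced model of a request sitting in a plug cache. */
struct sketch_rq {
	uint64_t alloc_time_ns;
	uint64_t start_time_ns;
	struct sketch_rq *next;
};

struct sketch_plug {
	struct sketch_rq *cached_rq;
};

/* Batch fill: push nr requests onto the cache and leave their times at 0. */
static void sketch_batch_fill(struct sketch_plug *plug,
			      struct sketch_rq *pool, size_t nr)
{
	assert(plug != NULL && pool != NULL);
	for (size_t i = 0; i < nr; i++) {
		pool[i].alloc_time_ns = 0;
		pool[i].start_time_ns = 0;
		pool[i].next = plug->cached_rq;
		plug->cached_rq = &pool[i];
	}
}

/*
 * Dispense: the only place where times are set. Both the "cache hit" path
 * and the "just filled the cache" path would go through here, so the first
 * rq of a batch needs no special case.
 */
static struct sketch_rq *sketch_dispense(struct sketch_plug *plug)
{
	struct sketch_rq *rq;

	assert(plug != NULL);
	rq = plug->cached_rq;
	if (rq == NULL)
		return NULL;

	plug->cached_rq = rq->next;
	rq->next = NULL;
	rq->alloc_time_ns = sketch_now_ns;
	rq->start_time_ns = sketch_now_ns;
	return rq;
}
```

With this shape, requests that are still in the cache keep a zero timestamp, and nothing reads the stamp between batch allocation and dispensing.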
Chengming Zhou June 28, 2023, 1:16 a.m. UTC | #4
On 2023/6/28 02:47, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jun 27, 2023 at 07:32:42PM +0800, Chengming Zhou wrote:
>> Since all requests in batch allocation use the same alloc and start time, so this patch
>> just leave it as it is, and reset it at actual used time.
>>
>> I think both way is ok, do you think it's better to just set the popped one request, leave
>> other requests time to 0 ? If so, I can update to do it.
> 
> I think it'd be clearer if the rule is that the alloc time is set once when
> the request is actually dispensed for use in all cases, so yeah, let's just
> set it once when it actually starts getting used.
> 

Good, I will update the patchset today.

Thanks.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8b981d0a868e..6a3f1b8aaad8 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -337,6 +337,24 @@  void blk_rq_init(struct request_queue *q, struct request *rq)
 }
 EXPORT_SYMBOL(blk_rq_init);
 
+/* Set rq alloc and start time when pre-allocated rq is actually used */
+static inline void blk_mq_rq_time_init(struct request_queue *q, struct request *rq)
+{
+	if (blk_mq_need_time_stamp(rq->rq_flags)) {
+		u64 now = ktime_get_ns();
+
+#ifdef CONFIG_BLK_RQ_ALLOC_TIME
+		/*
+		 * alloc time is only used by iocost for now,
+		 * only possible when blk_mq_need_time_stamp().
+		 */
+		if (blk_queue_rq_alloc_time(q))
+			rq->alloc_time_ns = now;
+#endif
+		rq->start_time_ns = now;
+	}
+}
+
 static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 		struct blk_mq_tags *tags, unsigned int tag,
 		u64 alloc_time_ns, u64 start_time_ns)
@@ -575,6 +593,7 @@  static struct request *blk_mq_alloc_cached_request(struct request_queue *q,
 			return NULL;
 
 		plug->cached_rq = rq_list_next(rq);
+		blk_mq_rq_time_init(q, rq);
 	}
 
 	rq->cmd_flags = opf;
@@ -2896,6 +2915,7 @@  static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 	plug->cached_rq = rq_list_next(rq);
 	rq_qos_throttle(q, *bio);
 
+	blk_mq_rq_time_init(q, rq);
 	rq->cmd_flags = (*bio)->bi_opf;
 	INIT_LIST_HEAD(&rq->queuelist);
 	return rq;
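
The gating in blk_mq_rq_time_init() can be exercised outside the kernel with a reduced model. This is a sketch under assumptions, not the kernel code: `MODEL_RQ_ALLOC_TIME` stands in for CONFIG_BLK_RQ_ALLOC_TIME, the `rq_alloc_time` field stands in for blk_queue_rq_alloc_time(q), the `need_time_stamp` field stands in for blk_mq_need_time_stamp(rq->rq_flags), and the current time is passed in as a parameter instead of being read from ktime_get_ns().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_BLK_RQ_ALLOC_TIME. */
#define MODEL_RQ_ALLOC_TIME 1

/* Hypothetical model: the flag mirrors what blk_queue_rq_alloc_time(q) reports. */
struct model_queue {
	bool rq_alloc_time;
};

/* Hypothetical model: the flag mirrors blk_mq_need_time_stamp(rq->rq_flags). */
struct model_rq {
	bool need_time_stamp;
	uint64_t alloc_time_ns;
	uint64_t start_time_ns;
};

/*
 * Same decision structure as the helper in the patch: do nothing unless the
 * rq needs a time stamp; set alloc_time_ns only if the queue asks for it;
 * always set start_time_ns when stamping.
 */
static void model_rq_time_init(const struct model_queue *q,
			       struct model_rq *rq, uint64_t now)
{
	assert(q != NULL && rq != NULL);

	if (!rq->need_time_stamp)
		return;

#if MODEL_RQ_ALLOC_TIME
	if (q->rq_alloc_time)
		rq->alloc_time_ns = now;
#endif
	rq->start_time_ns = now;
}
```

The three cases of interest are an rq that needs no stamp (both fields untouched), an rq on a queue without alloc-time tracking (only start_time_ns updated), and an rq on a queue with it (both updated).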