diff mbox series

[3/9] blk-mq: don't predicate last flag in blk_mq_dispatch_rq_list

Message ID 20200513095443.2038859-4-ming.lei@redhat.com (mailing list archive)
State New, archived
Headers show
Series blk-mq: support batching dispatch from scheduler | expand

Commit Message

Ming Lei May 13, 2020, 9:54 a.m. UTC
.commit_rqs() is supposed to handle partial dispatch when driver may not
see .last of flag passed to .queue_rq().

We have added .commit_rqs() in case of partial dispatch and all consumers
of bd->last have implemented .commit_rqs() callback, so it is perfect to
pass real .last flag of the request list to .queue_rq() instead of faking
it by trying to allocate driver tag for next request in the batching list.

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Baolin Wang <baolin.wang7@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c | 24 ++----------------------
 1 file changed, 2 insertions(+), 22 deletions(-)

Comments

Christoph Hellwig May 13, 2020, 12:27 p.m. UTC | #1
On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote:
> .commit_rqs() is supposed to handle partial dispatch when driver may not
> see .last of flag passed to .queue_rq().
> 
> We have added .commit_rqs() in case of partial dispatch and all consumers
> of bd->last have implemented .commit_rqs() callback, so it is perfect to
> pass real .last flag of the request list to .queue_rq() instead of faking
> it by trying to allocate driver tag for next request in the batching list.

The current case still seems like a nice optimization to avoid an extra
indirect function call.  So if you want to get rid of it I think it at
least needs a better rationale.
Ming Lei May 14, 2020, 12:50 a.m. UTC | #2
On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote:
> On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote:
> > .commit_rqs() is supposed to handle partial dispatch when driver may not
> > see .last of flag passed to .queue_rq().
> > 
> > We have added .commit_rqs() in case of partial dispatch and all consumers
> > of bd->last have implemented .commit_rqs() callback, so it is perfect to
> > pass real .last flag of the request list to .queue_rq() instead of faking
> > it by trying to allocate driver tag for next request in the batching list.
> 
> The current case still seems like a nice optimization to avoid an extra
> indirect function call.  So if you want to get rid of it I think it at
> least needs a better rationale.

You mean marking .last by trying to allocate for next request can
replace .commit_rqs()? No, it can't because .commit_rqs() can be
called no matter .last is set or not, both two are independent.

Removing it can avoid to pre-allocate one extra driver tag, and
improve driver tag's utilization.

Thanks,
Ming
Ming Lei May 14, 2020, 2:09 a.m. UTC | #3
On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote:
> On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote:
> > .commit_rqs() is supposed to handle partial dispatch when driver may not
> > see .last of flag passed to .queue_rq().
> > 
> > We have added .commit_rqs() in case of partial dispatch and all consumers
> > of bd->last have implemented .commit_rqs() callback, so it is perfect to
> > pass real .last flag of the request list to .queue_rq() instead of faking
> > it by trying to allocate driver tag for next request in the batching list.
> 
> The current case still seems like a nice optimization to avoid an extra
> indirect function call.  So if you want to get rid of it I think it at
> least needs a better rationale.

Forget to mention, trying to predicate the last request via allocating
tag for next request can't avoid extra .commit_rqs() because this
indirect call is always called when the rq list isn't done.

Also no matter .last is set or not, every implementation of .commit_rqs
always grabs one lock, so looks this patch can get real win without any
performance loss.

On the other side, .commit_rqs() can be avoided iff the last queued(successful)
rq is marked as .last, and the cost is to keep current estimate on .last.
However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching
submission before non-STS_OK is returned? And the inline handling can be quite
efficient because one more spin lock acquire can be avoided usually. Then
.commit_rqs() can be killed.


Thanks,
Ming
Ming Lei May 14, 2020, 2:19 a.m. UTC | #4
On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote:
> On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote:
> > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote:
> > > .commit_rqs() is supposed to handle partial dispatch when driver may not
> > > see .last of flag passed to .queue_rq().
> > > 
> > > We have added .commit_rqs() in case of partial dispatch and all consumers
> > > of bd->last have implemented .commit_rqs() callback, so it is perfect to
> > > pass real .last flag of the request list to .queue_rq() instead of faking
> > > it by trying to allocate driver tag for next request in the batching list.
> > 
> > The current case still seems like a nice optimization to avoid an extra
> > indirect function call.  So if you want to get rid of it I think it at
> > least needs a better rationale.
> 
> Forget to mention, trying to predicate the last request via allocating
> tag for next request can't avoid extra .commit_rqs() because this
> indirect call is always called when the rq list isn't done.
> 
> Also no matter .last is set or not, every implementation of .commit_rqs
> always grabs one lock, so looks this patch can get real win without any
> performance loss.
> 
> On the other side, .commit_rqs() can be avoided iff the last queued(successful)
> rq is marked as .last, and the cost is to keep current estimate on .last.
> However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching
> submission before non-STS_OK is returned? And the inline handling can be quite
> efficient because one more spin lock acquire can be avoided usually. Then
> .commit_rqs() can be killed.

The only chance we need .commit_rqs() should be:

- requests are queued successfully, and the last queued rq isn't marked
  as last
- running out of budget or driver tag before queueing one new request

I think we need to document the interfaces(.commit_rqs & .queue_rq) clearly.

thanks,
Ming
Keith Busch May 14, 2020, 3:21 a.m. UTC | #5
On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote:
> However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching
> submission before non-STS_OK is returned?

Wouldn't the driver need to know that the request is !first in that case
so that it doesn't commit nothing if the first request fails?
Christoph Hellwig May 14, 2020, 5:50 a.m. UTC | #6
On Thu, May 14, 2020 at 08:50:43AM +0800, Ming Lei wrote:
> On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote:
> > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote:
> > > .commit_rqs() is supposed to handle partial dispatch when driver may not
> > > see .last of flag passed to .queue_rq().
> > > 
> > > We have added .commit_rqs() in case of partial dispatch and all consumers
> > > of bd->last have implemented .commit_rqs() callback, so it is perfect to
> > > pass real .last flag of the request list to .queue_rq() instead of faking
> > > it by trying to allocate driver tag for next request in the batching list.
> > 
> > The current case still seems like a nice optimization to avoid an extra
> > indirect function call.  So if you want to get rid of it I think it at
> > least needs a better rationale.
> 
> You mean marking .last by trying to allocate for next request can
> replace .commit_rqs()? No, it can't because .commit_rqs() can be
> called no matter .last is set or not, both two are independent.
> 
> Removing it can avoid to pre-allocate one extra driver tag, and
> improve driver tag's utilization.

What I said is that the current scheme works, and the new one will
need an additional indirect function call in various scenarios.  The
commit log doesn't really "sell" that change very well.  Your new
explanation is much better, as would be saying it helps you with
the hanges in this series.
Ming Lei May 14, 2020, 8:28 a.m. UTC | #7
On Thu, May 14, 2020 at 12:21:43PM +0900, Keith Busch wrote:
> On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote:
> > However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching
> > submission before non-STS_OK is returned?
> 
> Wouldn't the driver need to know that the request is !first in that case
> so that it doesn't commit nothing if the first request fails?

Yeah, I have replied on this question, :-)

In short, I plan to make the interface explicit and more efficient:

1) .commit_rqs() is used for notifying driver that one batching requests
are done, which is usually caused by lacking of resource for queuing more
requests to driver

2) one request with .last flag marks end of one batching submission, so
.commit_rq() will not be called if no new request with !.last is queued

3) it is driver's responsibility for handling batching submission if non
STS_OK is returned from .queue_rq() to block layer.

Then we can minimize .commit_rqs() uses.

Thanks,
Ming
diff mbox series

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f064e7923ea5..f8c4b59022d7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1169,16 +1169,6 @@  static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
 static void blk_mq_handle_dev_resource(struct request *rq,
 				       struct list_head *list)
 {
-	struct request *next =
-		list_first_entry_or_null(list, struct request, queuelist);
-
-	/*
-	 * If an I/O scheduler has been configured and we got a driver tag for
-	 * the next request already, free it.
-	 */
-	if (next)
-		blk_mq_put_driver_tag(next);
-
 	list_add(&rq->queuelist, list);
 	__blk_mq_requeue_request(rq);
 }
@@ -1203,7 +1193,7 @@  bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 			     bool got_budget)
 {
 	struct request_queue *q = hctx->queue;
-	struct request *rq, *nxt;
+	struct request *rq;
 	bool no_tag = false;
 	int errors, queued;
 	blk_status_t ret = BLK_STS_OK;
@@ -1254,17 +1244,7 @@  bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list,
 		list_del_init(&rq->queuelist);
 
 		bd.rq = rq;
-
-		/*
-		 * Flag last if we have no more requests, or if we have more
-		 * but can't assign a driver tag to it.
-		 */
-		if (list_empty(list))
-			bd.last = true;
-		else {
-			nxt = list_first_entry(list, struct request, queuelist);
-			bd.last = !blk_mq_get_driver_tag(nxt);
-		}
+		bd.last = !!list_empty(list);
 
 		ret = q->mq_ops->queue_rq(hctx, &bd);
 		if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) {