Message ID | 20200513095443.2038859-4-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blk-mq: support batching dispatch from scheduler | expand |
On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote: > .commit_rqs() is supposed to handle partial dispatch when driver may not > see .last of flag passed to .queue_rq(). > > We have added .commit_rqs() in case of partial dispatch and all consumers > of bd->last have implemented .commit_rqs() callback, so it is perfect to > pass real .last flag of the request list to .queue_rq() instead of faking > it by trying to allocate driver tag for next request in the batching list. The current case still seems like a nice optimization to avoid an extra indirect function call. So if you want to get rid of it I think it at least needs a better rationale.
On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote: > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote: > > .commit_rqs() is supposed to handle partial dispatch when driver may not > > see .last of flag passed to .queue_rq(). > > > > We have added .commit_rqs() in case of partial dispatch and all consumers > > of bd->last have implemented .commit_rqs() callback, so it is perfect to > > pass real .last flag of the request list to .queue_rq() instead of faking > > it by trying to allocate driver tag for next request in the batching list. > > The current case still seems like a nice optimization to avoid an extra > indirect function call. So if you want to get rid of it I think it at > least needs a better rationale. You mean marking .last by trying to allocate for next request can replace .commit_rqs()? No, it can't because .commit_rqs() can be called no matter .last is set or not, both two are independent. Removing it can avoid to pre-allocate one extra driver tag, and improve driver tag's utilization. Thanks, Ming
On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote: > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote: > > .commit_rqs() is supposed to handle partial dispatch when driver may not > > see .last of flag passed to .queue_rq(). > > > > We have added .commit_rqs() in case of partial dispatch and all consumers > > of bd->last have implemented .commit_rqs() callback, so it is perfect to > > pass real .last flag of the request list to .queue_rq() instead of faking > > it by trying to allocate driver tag for next request in the batching list. > > The current case still seems like a nice optimization to avoid an extra > indirect function call. So if you want to get rid of it I think it at > least needs a better rationale. Forget to mention, trying to predicate the last request via allocating tag for next request can't avoid extra .commit_rqs() because this indirect call is always called when the rq list isn't done. Also no matter .last is set or not, every implementation of .commit_rqs always grabs one lock, so looks this patch can get real win without any performance loss. On the other side, .commit_rqs() can be avoided iff the last queued(successful) rq is marked as .last, and the cost is to keep current estimate on .last. However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching submission before non-STS_OK is returned? And the inline handling can be quite efficient because one more spin lock acquire can be avoided usually. Then .commit_rqs() can be killed. Thanks, Ming
On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote: > On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote: > > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote: > > > .commit_rqs() is supposed to handle partial dispatch when driver may not > > > see .last of flag passed to .queue_rq(). > > > > > > We have added .commit_rqs() in case of partial dispatch and all consumers > > > of bd->last have implemented .commit_rqs() callback, so it is perfect to > > > pass real .last flag of the request list to .queue_rq() instead of faking > > > it by trying to allocate driver tag for next request in the batching list. > > > > The current case still seems like a nice optimization to avoid an extra > > indirect function call. So if you want to get rid of it I think it at > > least needs a better rationale. > > Forget to mention, trying to predicate the last request via allocating > tag for next request can't avoid extra .commit_rqs() because this > indirect call is always called when the rq list isn't done. > > Also no matter .last is set or not, every implementation of .commit_rqs > always grabs one lock, so looks this patch can get real win without any > performance loss. > > On the other side, .commit_rqs() can be avoided iff the last queued(successful) > rq is marked as .last, and the cost is to keep current estimate on .last. > However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching > submission before non-STS_OK is returned? And the inline handling can be quite > efficient because one more spin lock acquire can be avoided usually. Then > .commit_rqs() can be killed. The only chance we need .commit_rqs() should be: - requests are queued successfully, and the last queued rq isn't marked as last - running out of budget or driver tag before queueing one new request I think we need to document the interfaces(.commit_rqs & .queue_rq) clearly. thanks, Ming
On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote: > However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching > submission before non-STS_OK is returned? Wouldn't the driver need to know that the request is !first in that case so that it doesn't commit nothing if the first request fails?
On Thu, May 14, 2020 at 08:50:43AM +0800, Ming Lei wrote: > On Wed, May 13, 2020 at 05:27:53AM -0700, Christoph Hellwig wrote: > > On Wed, May 13, 2020 at 05:54:37PM +0800, Ming Lei wrote: > > > .commit_rqs() is supposed to handle partial dispatch when driver may not > > > see .last of flag passed to .queue_rq(). > > > > > > We have added .commit_rqs() in case of partial dispatch and all consumers > > > of bd->last have implemented .commit_rqs() callback, so it is perfect to > > > pass real .last flag of the request list to .queue_rq() instead of faking > > > it by trying to allocate driver tag for next request in the batching list. > > > > The current case still seems like a nice optimization to avoid an extra > > indirect function call. So if you want to get rid of it I think it at > > least needs a better rationale. > > You mean marking .last by trying to allocate for next request can > replace .commit_rqs()? No, it can't because .commit_rqs() can be > called no matter .last is set or not, both two are independent. > > Removing it can avoid to pre-allocate one extra driver tag, and > improve driver tag's utilization. What I said is that the current scheme works, and the new one will need an additional indirect function call in various scenarios. The commit log doesn't really "sell" that change very well. Your new explanation is much better, as would be saying it helps you with the hanges in this series.
On Thu, May 14, 2020 at 12:21:43PM +0900, Keith Busch wrote: > On Thu, May 14, 2020 at 10:09:55AM +0800, Ming Lei wrote: > > However, why is .commit_rqs() required? Why doesn't .queue_rq() handle the batching > > submission before non-STS_OK is returned? > > Wouldn't the driver need to know that the request is !first in that case > so that it doesn't commit nothing if the first request fails? Yeah, I have replied on this question, :-) In short, I plan to make the interface explicit and more efficient: 1) .commit_rqs() is used for notifying driver that one batching requests are done, which is usually caused by lacking of resource for queuing more requests to driver 2) one request with .last flag marks end of one batching submission, so .commit_rq() will not be called if no new request with !.last is queued 3) it is driver's responsibility for handling batching submission if non STS_OK is returned from .queue_rq() to block layer. Then we can minimize .commit_rqs() uses. Thanks, Ming
diff --git a/block/blk-mq.c b/block/blk-mq.c index f064e7923ea5..f8c4b59022d7 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1169,16 +1169,6 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy) static void blk_mq_handle_dev_resource(struct request *rq, struct list_head *list) { - struct request *next = - list_first_entry_or_null(list, struct request, queuelist); - - /* - * If an I/O scheduler has been configured and we got a driver tag for - * the next request already, free it. - */ - if (next) - blk_mq_put_driver_tag(next); - list_add(&rq->queuelist, list); __blk_mq_requeue_request(rq); } @@ -1203,7 +1193,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, bool got_budget) { struct request_queue *q = hctx->queue; - struct request *rq, *nxt; + struct request *rq; bool no_tag = false; int errors, queued; blk_status_t ret = BLK_STS_OK; @@ -1254,17 +1244,7 @@ bool blk_mq_dispatch_rq_list(struct blk_mq_hw_ctx *hctx, struct list_head *list, list_del_init(&rq->queuelist); bd.rq = rq; - - /* - * Flag last if we have no more requests, or if we have more - * but can't assign a driver tag to it. - */ - if (list_empty(list)) - bd.last = true; - else { - nxt = list_first_entry(list, struct request, queuelist); - bd.last = !blk_mq_get_driver_tag(nxt); - } + bd.last = !!list_empty(list); ret = q->mq_ops->queue_rq(hctx, &bd); if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE) {
.commit_rqs() is supposed to handle partial dispatch when driver may not see .last of flag passed to .queue_rq(). We have added .commit_rqs() in case of partial dispatch and all consumers of bd->last have implemented .commit_rqs() callback, so it is perfect to pass real .last flag of the request list to .queue_rq() instead of faking it by trying to allocate driver tag for next request in the batching list. Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Baolin Wang <baolin.wang7@gmail.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Ming Lei <ming.lei@redhat.com> --- block/blk-mq.c | 24 ++---------------------- 1 file changed, 2 insertions(+), 22 deletions(-)