Message ID | 20231021154806.4019417-1-yukuai1@huaweicloud.com (mailing list archive) |
---|---|
Series | blk-mq: improve tag fair sharing |
Hello Yu Kuai,

On Sat, Oct 21, 2023 at 11:47:58PM +0800, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> Current implementation:
>  - a counter active_queues records how many queues/hctxs are sharing
>    tags; it is updated when new IO is issued and cleared in
>    blk_mq_timeout_work().
>  - if active_queues is more than 1, tags are fairly shared among the
>    nodes;

Can you explain a bit what the problem is in current tag sharing?
And what is your basic approach for this problem?

Just mentioning the implementation is not too helpful for initial
review, because the problem and approach (correctness) need to be
understood first.

Thanks,
Ming

Hi,

On 2023/10/23 12:38, Ming Lei wrote:
> Hello Yu Kuai,
>
> On Sat, Oct 21, 2023 at 11:47:58PM +0800, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@huawei.com>
>>
>> Current implementation:
>>  - a counter active_queues records how many queues/hctxs are sharing
>>    tags; it is updated when new IO is issued and cleared in
>>    blk_mq_timeout_work().
>>  - if active_queues is more than 1, tags are fairly shared among the
>>    nodes;
>
> Can you explain a bit what the problem is in current tag sharing?
> And what is your basic approach for this problem?
>
> Just mentioning the implementation is not too helpful for initial
> review, because the problem and approach (correctness) need to be
> understood first.

Of course, I'll add the following if there is a v3:

Current problems:

If there are multiple active_queues, tags are fairly shared among the
queues. If one queue is not busy (for example, it only issues one IO
once in a while), the tags set aside for this queue are wasted and
can't be used by other queues. Depending on the hardware, this might
cause performance problems in some use cases. For example, as reported
by [1], UFS devices have multiple logical units. One of these logical
units (WLUN) is used to submit control commands, e.g. START STOP UNIT.
If any request is submitted to the WLUN, the queue depth is reduced
from 31 to 15 or lower for data LUNs.

This patchset first delays tag sharing from the point where IO is
issued to the point where getting a driver tag fails; then it adds a
counter that records how many times a shared queue failed to get a
driver tag, to indicate whether the queue is busy; finally, it allows
a busy queue to borrow more tags from idle queues.

Thanks,
Kuai

>
> Thanks,
> Ming

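To make the fair-sharing arithmetic described above concrete, here is a minimal user-space sketch. It is modelled on the round-up division with a floor of 4 that hctx_may_queue() is understood to apply in mainline; the helper name fair_share_depth() and the MIN_SHARED_DEPTH macro are hypothetical and not taken from the patches. It uses the 32-tag figure from the test in the cover letter.

```c
#include <stdio.h>

/* Assumed floor: a sharing queue is never limited below this many tags. */
#define MIN_SHARED_DEPTH 4U

/*
 * Hypothetical helper: per-queue tag limit when `users` active queues
 * share a tag set of `total_tags`. Not the kernel's actual function.
 */
static unsigned int fair_share_depth(unsigned int total_tags,
				     unsigned int users)
{
	unsigned int depth;

	/* Zero or one active queue: no sharing, every tag may be used. */
	if (users <= 1)
		return total_tags;

	/* Divide the tags evenly, rounding up. */
	depth = (total_tags + users - 1) / users;

	return depth > MIN_SHARED_DEPTH ? depth : MIN_SHARED_DEPTH;
}
```

With 32 tags, fair_share_depth(32, 1) is 32 and fair_share_depth(32, 2) is 16. In this model the limit depends only on how many queues are active, not on how much IO each one issues, which is why a mostly idle queue still halves the depth available to a busy one.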
From: Yu Kuai <yukuai3@huawei.com>

Current implementation:
 - a counter active_queues records how many queues/hctxs are sharing
   tags; it is updated when new IO is issued and cleared in
   blk_mq_timeout_work().
 - if active_queues is more than 1, tags are fairly shared among the
   nodes;

New implementation:
 - a new field 'available_tags' is added to each node; it is calculated
   in the slow path, hence the fast path won't be affected, patch 5;
 - a new counter 'busy_queues' is added to blk_mq_tags; it is updated
   when getting a driver tag fails, and it is also cleared in
   blk_mq_timeout_work(); tag sharing will be based on 'busy_queues'
   instead of 'active_queues', patch 6,7;
 - a new counter 'busy_count' is added to each node to record how many
   times a node failed to get a driver tag; it is used to judge whether
   a node is busy and needs more tags, patch 8;
 - a new timer is added to blk_mq_tags; it starts if any node fails to
   get a driver tag, and the timer function is used to borrow tags and
   to return borrowed tags, patch 8;

A simple test, 32 tags with two shared nodes:

[global]
ioengine=libaio
iodepth=2
bs=4k
direct=1
rw=randrw
group_reporting

[sda]
numjobs=32
filename=/dev/sda

[sdb]
numjobs=1
filename=/dev/sdb

Test result (monitor new debugfs entry):

time    active          available
        sda     sdb     sda     sdb
0       0       0       32      32
1       16      2       16      16      -> start fair sharing
2       19      2       20      16
3       24      2       24      16
4       26      2       28      16      -> borrow 32/8=4 tags each round
5       28      2       28      16      -> save at least 4 tags for sdb

Yu Kuai (8):
  blk-mq: factor out a structure from blk_mq_tags
  blk-mq: factor out a structure to store information for tag sharing
  blk-mq: add a helper to initialize shared_tag_info
  blk-mq: support to track active queues from blk_mq_tags
  blk-mq: precalculate available tags for hctx_may_queue()
  blk-mq: add new helpers blk_mq_driver_tag_busy/idle()
  blk-mq-tag: delay tag sharing until fail to get driver tag
  blk-mq-tag: allow shared queue/hctx to get more driver tags

 block/blk-core.c       |   2 -
 block/blk-mq-debugfs.c |  30 +++++-
 block/blk-mq-tag.c     | 226 +++++++++++++++++++++++++++++++++++++++--
 block/blk-mq.c         |  12 ++-
 block/blk-mq.h         |  64 +++++++-----
 include/linux/blk-mq.h |  36 +++++--
 include/linux/blkdev.h |  11 +-
 7 files changed, 328 insertions(+), 53 deletions(-)
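
The 'available' column for sda in the test result can be reproduced with a small user-space model of one borrow round. This is only a sketch inferred from the annotations in the table ("borrow 32/8=4 tags each round", "save at least 4 tags for sdb"); it is not code from patch 8, and borrow_round(), BORROW_DIVISOR and MIN_RESERVED are hypothetical names.

```c
#include <stdio.h>

/* Assumption from the table: a busy queue gains total/8 tags per round. */
#define BORROW_DIVISOR 8U
/* Assumption from the table: at least 4 tags stay reserved for idle queues. */
#define MIN_RESERVED   4U

/*
 * Hypothetical model of one timer round for a busy queue: return its new
 * 'available' limit given the current one. Not the patch's actual code.
 */
static unsigned int borrow_round(unsigned int total_tags,
				 unsigned int available)
{
	unsigned int step = total_tags / BORROW_DIVISOR;
	unsigned int cap;

	/* Tag set too small to reserve anything: leave the limit alone. */
	if (total_tags <= MIN_RESERVED)
		return available;

	cap = total_tags - MIN_RESERVED;

	/* Already at or above the cap: nothing more to borrow. */
	if (available >= cap)
		return available;

	return available + step > cap ? cap : available + step;
}
```

Starting from the fair share of 16 with 32 tags, repeated calls give 20, 24, 28 and then stay at 28, matching rows 2 to 5 of the table. How the real timer decides which queue is busy, and when borrowed tags are returned, is outside this model.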