Message ID | 20220329094048.2107094-2-yukuai3@huawei.com (mailing list archive)
---|---
State | New, archived
Series | improve large random io for HDD
On 3/29/22 3:40 AM, Yu Kuai wrote:
> Tag preemption is the default behaviour: blk_mq_get_tag() will try to
> get a tag unconditionally, which means a new io can preempt a tag even
> if there are lots of ios waiting for tags.
>
> This patch introduces a new flag in preparation for disabling such
> behaviour, in order to optimize io performance for large random io on
> HDD.

Not sure why we need a flag for this behavior. Does it ever make sense to allow preempting waiters, jumping the queue?
On 2022/03/29 20:44, Jens Axboe wrote:
> On 3/29/22 3:40 AM, Yu Kuai wrote:
>> Tag preemption is the default behaviour: blk_mq_get_tag() will try to
>> get a tag unconditionally, which means a new io can preempt a tag even
>> if there are lots of ios waiting for tags.
>>
>> This patch introduces a new flag in preparation for disabling such
>> behaviour, in order to optimize io performance for large random io on
>> HDD.
>
> Not sure why we need a flag for this behavior. Does it ever make sense
> to allow preempting waiters, jumping the queue?

Hi,

I was thinking of using the flag to control the new behavior, in order to reduce the impact on the general path.

If the wake-up path is handled properly, I think it's ok to disable preempting tags.

Thanks,
Kuai
On 3/29/22 7:18 PM, yukuai (C) wrote:
> On 2022/03/29 20:44, Jens Axboe wrote:
>> On 3/29/22 3:40 AM, Yu Kuai wrote:
>>> Tag preemption is the default behaviour: blk_mq_get_tag() will try to
>>> get a tag unconditionally, which means a new io can preempt a tag even
>>> if there are lots of ios waiting for tags.
>>>
>>> This patch introduces a new flag in preparation for disabling such
>>> behaviour, in order to optimize io performance for large random io on
>>> HDD.
>>
>> Not sure why we need a flag for this behavior. Does it ever make sense
>> to allow preempting waiters, jumping the queue?
>
> Hi,
>
> I was thinking of using the flag to control the new behavior, in order
> to reduce the impact on the general path.
>
> If the wake-up path is handled properly, I think it's ok to disable
> preempting tags.

If we hit tag starvation, we are by definition out of the fast path. That doesn't mean that scalability should drop to the floor, something that often happened before blk-mq and without the rolling wakeups. But it does mean that we can throw a bit more smarts at it, if it improves fairness/performance in that situation.
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index aa0349e9f083..f4228532ee3d 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -226,6 +226,7 @@ static const char *const hctx_flag_name[] = {
 	HCTX_FLAG_NAME(NO_SCHED),
 	HCTX_FLAG_NAME(STACKING),
 	HCTX_FLAG_NAME(TAG_HCTX_SHARED),
+	HCTX_FLAG_NAME(NO_TAG_PREEMPTION),
 };
 
 #undef HCTX_FLAG_NAME
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 2615bd58bad3..1a084b3b6097 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -168,6 +168,11 @@ static inline bool blk_mq_is_shared_tags(unsigned int flags)
 	return flags & BLK_MQ_F_TAG_HCTX_SHARED;
 }
 
+static inline bool blk_mq_is_tag_preemptive(unsigned int flags)
+{
+	return !(flags & BLK_MQ_F_NO_TAG_PREEMPTION);
+}
+
 static inline struct blk_mq_tags *blk_mq_tags_from_data(struct blk_mq_alloc_data *data)
 {
 	if (!(data->rq_flags & RQF_ELV))
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 7aa5c54901a9..c9434162acc5 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -656,7 +656,12 @@ enum {
 	 * or shared hwqs instead of 'mq-deadline'.
 	 */
 	BLK_MQ_F_NO_SCHED_BY_DEFAULT	= 1 << 7,
-	BLK_MQ_F_ALLOC_POLICY_START_BIT = 8,
+	/*
+	 * If the disk is under high io pressure, new io will wait directly
+	 * without trying to preempt a tag.
+	 */
+	BLK_MQ_F_NO_TAG_PREEMPTION	= 1 << 8,
+	BLK_MQ_F_ALLOC_POLICY_START_BIT = 9,
 	BLK_MQ_F_ALLOC_POLICY_BITS = 1,
 
 	BLK_MQ_S_STOPPED = 0,
Tag preemption is the default behaviour: blk_mq_get_tag() will try to get a tag unconditionally, which means a new io can preempt a tag even if there are lots of ios waiting for tags.

This patch introduces a new flag in preparation for disabling such behaviour, in order to optimize io performance for large random io on HDD.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 block/blk-mq-debugfs.c | 1 +
 block/blk-mq.h         | 5 +++++
 include/linux/blk-mq.h | 7 ++++++-
 3 files changed, 12 insertions(+), 1 deletion(-)