From patchwork Sat Oct 21 15:47:59 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431432
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 1/8] blk-mq: factor out a structure from blk_mq_tags
Date: Sat, 21 Oct 2023 23:47:59 +0800
Message-Id: <20231021154806.4019417-2-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Tags are currently shared fairly among all active queues. Factor out a new
structure, tag_sharing_ctl, from blk_mq_tags; for now it contains only one
field, active_queues. There are no functional changes; this patch prepares
for refactoring tag sharing.
Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  2 +-
 block/blk-mq-tag.c     |  8 ++++----
 block/blk-mq.h         |  2 +-
 include/linux/blk-mq.h | 10 ++++++++--
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 5cbeb9344f2f..f6b77289bc1f 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -400,7 +400,7 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
 	seq_printf(m, "nr_tags=%u\n", tags->nr_tags);
 	seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
 	seq_printf(m, "active_queues=%d\n",
-		   READ_ONCE(tags->active_queues));
+		   READ_ONCE(tags->ctl.active_queues));
 
 	seq_puts(m, "\nbitmap_tags:\n");
 	sbitmap_queue_show(&tags->bitmap_tags, m);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index cc57e2dd9a0b..fe41a0d34fc0 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -57,8 +57,8 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->active_queues + 1;
-	WRITE_ONCE(tags->active_queues, users);
+	users = tags->ctl.active_queues + 1;
+	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
 	spin_unlock_irq(&tags->lock);
 }
@@ -94,8 +94,8 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->active_queues - 1;
-	WRITE_ONCE(tags->active_queues, users);
+	users = tags->ctl.active_queues - 1;
+	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
 	spin_unlock_irq(&tags->lock);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index f75a9ecfebde..363c8f6e13dd 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -435,7 +435,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		return true;
 	}
 
-	users = READ_ONCE(hctx->tags->active_queues);
+	users = READ_ONCE(hctx->tags->ctl.active_queues);
 	if (!users)
 		return true;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1ab3081c82ed..5a08527c53b9 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -727,13 +727,16 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		blk_opf_t opf, blk_mq_req_flags_t flags,
 		unsigned int hctx_idx);
 
+struct tag_sharing_ctl {
+	unsigned int active_queues;
+};
+
 /*
  * Tag address space map.
  */
 struct blk_mq_tags {
 	unsigned int nr_tags;
 	unsigned int nr_reserved_tags;
-	unsigned int active_queues;
 
 	struct sbitmap_queue bitmap_tags;
 	struct sbitmap_queue breserved_tags;
@@ -744,9 +747,12 @@ struct blk_mq_tags {
 
 	/*
 	 * used to clear request reference in rqs[] before freeing one
-	 * request pool
+	 * request pool, and to protect tag_sharing_ctl.
	 */
 	spinlock_t lock;
+
+	/* used when tags is shared for multiple request_queue or hctx. */
+	struct tag_sharing_ctl ctl;
 };
 
 static inline struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
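
[Editorial illustration, not part of the patch: the refactor keeps the
existing locking discipline, i.e. ctl.active_queues is updated under
tags->lock with WRITE_ONCE() and sampled locklessly with READ_ONCE(). A
minimal userspace sketch of that pattern follows; READ_ONCE/WRITE_ONCE are
approximated with C11 relaxed atomics, and all names here are illustrative.]

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    struct tag_sharing_ctl {
            atomic_uint active_queues;      /* plain unsigned int in the kernel */
    };

    struct tags {
            unsigned int nr_tags;
            pthread_mutex_t lock;           /* kernel: spinlock_t tags->lock */
            struct tag_sharing_ctl ctl;     /* sharing state grouped together */
    };

    static void tag_busy(struct tags *t)
    {
            pthread_mutex_lock(&t->lock);
            /* mirrors WRITE_ONCE(tags->ctl.active_queues, users) under the lock */
            atomic_store_explicit(&t->ctl.active_queues,
                                  atomic_load_explicit(&t->ctl.active_queues,
                                                       memory_order_relaxed) + 1,
                                  memory_order_relaxed);
            pthread_mutex_unlock(&t->lock);
    }

    static unsigned int active_queues(struct tags *t)
    {
            /* lockless reader, like hctx_may_queue(): READ_ONCE(...) */
            return atomic_load_explicit(&t->ctl.active_queues,
                                        memory_order_relaxed);
    }

    int main(void)
    {
            struct tags t = { .nr_tags = 64,
                              .lock = PTHREAD_MUTEX_INITIALIZER };

            tag_busy(&t);
            printf("active_queues=%u\n", active_queues(&t));   /* 1 */
            return 0;
    }
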

From patchwork Sat Oct 21 15:48:00 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431434
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 2/8] blk-mq: factor out a structure to store information for tag sharing
Date: Sat, 21 Oct 2023 23:48:00 +0800
Message-Id: <20231021154806.4019417-3-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Tags can be shared by multiple request_queues or hctxs, and both have a
counter to record how many tags are in use:
'nr_active_requests_shared_tags' for the request_queue and 'nr_active' for
the hctx. Factor out a new structure, shared_tag_info, to store this
information, and add a new debugfs entry 'shared_tag_info' for both queue
and hctx. This patch prepares for refactoring tag sharing.

Signed-off-by: Yu Kuai
---
 block/blk-core.c       |  2 +-
 block/blk-mq-debugfs.c | 24 ++++++++++++++++++++++++
 block/blk-mq.c         |  2 +-
 block/blk-mq.h         | 23 ++++++++++++-----------
 include/linux/blk-mq.h | 11 +++++------
 include/linux/blkdev.h |  7 +++++--
 6 files changed, 48 insertions(+), 21 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 9d51e9894ece..c028b047f5d5 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -414,7 +414,7 @@ struct request_queue *blk_alloc_queue(int node_id)
 
 	q->node = node_id;
 
-	atomic_set(&q->nr_active_requests_shared_tags, 0);
+	atomic_set(&q->shared_tag_info.active_tags, 0);
 
 	timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
 	INIT_WORK(&q->timeout_work, blk_timeout_work);
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index f6b77289bc1f..d6ebd8d9d3bb 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -155,12 +155,27 @@ static ssize_t queue_state_write(void *data, const char __user *buf,
 	return count;
 }
 
+static void shared_tag_info_show(struct shared_tag_info *info,
+				 struct seq_file *m)
+{
+	seq_printf(m, "%d\n", atomic_read(&info->active_tags));
+}
+
+static int queue_shared_tag_info_show(void *data, struct seq_file *m)
+{
+	struct request_queue *q = data;
+
+	shared_tag_info_show(&q->shared_tag_info, m);
+	return 0;
+}
+
 static const struct blk_mq_debugfs_attr blk_mq_debugfs_queue_attrs[] = {
 	{ "poll_stat", 0400, queue_poll_stat_show },
 	{ "requeue_list", 0400, .seq_ops = &queue_requeue_list_seq_ops },
 	{ "pm_only", 0600, queue_pm_only_show, NULL },
 	{ "state", 0600, queue_state_show, queue_state_write },
 	{ "zone_wlock", 0400, queue_zone_wlock_show, NULL },
+	{ "shared_tag_info", 0400, queue_shared_tag_info_show, NULL },
 	{ },
 };
 
@@ -512,6 +527,14 @@ static int hctx_dispatch_busy_show(void *data, struct seq_file *m)
 	return 0;
 }
 
+static int hctx_shared_tag_info_show(void *data, struct seq_file *m)
+{
+	struct blk_mq_hw_ctx *hctx = data;
+
+	shared_tag_info_show(&hctx->shared_tag_info, m);
+	return 0;
+}
+
 #define CTX_RQ_SEQ_OPS(name, type)					\
 static void *ctx_##name##_rq_list_start(struct seq_file *m, loff_t *pos) \
 	__acquires(&ctx->lock)						\
@@ -628,6 +651,7 @@ static const struct blk_mq_debugfs_attr blk_mq_debugfs_hctx_attrs[] = {
 	{"active", 0400, hctx_active_show},
 	{"dispatch_busy", 0400, hctx_dispatch_busy_show},
 	{"type", 0400, hctx_type_show},
+	{"shared_tag_info", 0400, hctx_shared_tag_info_show},
 	{},
 };
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index e2d11183f62e..d85b9ad816d0 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3679,7 +3679,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 	if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node))
 		goto free_hctx;
 
-	atomic_set(&hctx->nr_active, 0);
+	atomic_set(&hctx->shared_tag_info.active_tags, 0);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 	hctx->numa_node = node;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 363c8f6e13dd..6f332dc122ff 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -274,10 +274,10 @@ static inline int blk_mq_get_rq_budget_token(struct request *rq)
 static inline void __blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx,
 						int val)
 {
-	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_add(val, &hctx->queue->nr_active_requests_shared_tags);
-	else
-		atomic_add(val, &hctx->nr_active);
+	struct shared_tag_info *info = blk_mq_is_shared_tags(hctx->flags) ?
+		&hctx->queue->shared_tag_info : &hctx->shared_tag_info;
+
+	atomic_add(val, &info->active_tags);
 }
 
 static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
@@ -288,10 +288,10 @@ static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
 						int val)
 {
-	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_sub(val, &hctx->queue->nr_active_requests_shared_tags);
-	else
-		atomic_sub(val, &hctx->nr_active);
+	struct shared_tag_info *info = blk_mq_is_shared_tags(hctx->flags) ?
+		&hctx->queue->shared_tag_info : &hctx->shared_tag_info;
+
+	atomic_sub(val, &info->active_tags);
 }
 
 static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
@@ -327,9 +327,10 @@ static inline void blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
 static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
 {
-	if (blk_mq_is_shared_tags(hctx->flags))
-		return atomic_read(&hctx->queue->nr_active_requests_shared_tags);
-	return atomic_read(&hctx->nr_active);
+	struct shared_tag_info *info = blk_mq_is_shared_tags(hctx->flags) ?
+		&hctx->queue->shared_tag_info : &hctx->shared_tag_info;
+
+	return atomic_read(&info->active_tags);
 }
 
 static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
 					   struct request *rq)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 5a08527c53b9..4301226f311b 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -384,6 +384,11 @@ struct blk_mq_hw_ctx {
 	 * assigned when a request is dispatched from a hardware queue.
	 */
 	struct blk_mq_tags *tags;
+	/**
+	 * @shared_tag_info: Only used when a tag set is shared across request
+	 * queues.
+	 */
+	struct shared_tag_info shared_tag_info;
 	/**
 	 * @sched_tags: Tags owned by I/O scheduler. If there is an I/O
 	 * scheduler associated with a request queue, a tag is assigned when
@@ -399,12 +404,6 @@ struct blk_mq_hw_ctx {
 	/** @queue_num: Index of this hardware queue. */
 	unsigned int queue_num;
 
-	/**
-	 * @nr_active: Number of active requests. Only used when a tag set is
-	 * shared across request queues.
-	 */
-	atomic_t nr_active;
-
 	/** @cpuhp_online: List to store request if CPU is going to die */
 	struct hlist_node cpuhp_online;
 	/** @cpuhp_dead: List to store request if some CPU die. */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 51fa7ffdee83..645a8e245add 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -375,6 +375,10 @@ struct blk_independent_access_ranges {
 	struct blk_independent_access_range ia_range[];
 };
 
+struct shared_tag_info {
+	atomic_t active_tags;
+};
+
 struct request_queue {
 	struct request *last_merge;
 	struct elevator_queue *elevator;
@@ -455,8 +459,6 @@ struct request_queue {
 	struct timer_list timeout;
 	struct work_struct timeout_work;
 
-	atomic_t nr_active_requests_shared_tags;
-
 	struct blk_mq_tags *sched_shared_tags;
 
 	struct list_head icq_list;
@@ -513,6 +515,7 @@ struct request_queue {
 	struct blk_mq_tag_set *tag_set;
 	struct list_head tag_set_list;
 
+	struct shared_tag_info shared_tag_info;
 	struct dentry *debugfs_dir;
 	struct dentry *sched_debugfs_dir;
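
[Editorial illustration, not part of the patch: after this change every
accounting helper follows one pattern, pick the per-queue or per-hctx
shared_tag_info, then operate on its active_tags. A userspace sketch with
illustrative names:]

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct shared_tag_info { atomic_int active_tags; };

    struct queue { struct shared_tag_info shared_tag_info; };
    struct hctx {
            bool shared_tags;       /* blk_mq_is_shared_tags(hctx->flags) */
            struct queue *q;
            struct shared_tag_info shared_tag_info;
    };

    static struct shared_tag_info *pick_info(struct hctx *h)
    {
            /* shared tag set: account on the queue; else on the hctx */
            return h->shared_tags ? &h->q->shared_tag_info
                                  : &h->shared_tag_info;
    }

    static void add_active_requests(struct hctx *h, int val)
    {
            atomic_fetch_add(&pick_info(h)->active_tags, val);
    }

    static int active_requests(struct hctx *h)
    {
            return atomic_load(&pick_info(h)->active_tags);
    }

    int main(void)
    {
            struct queue q = { 0 };
            struct hctx h = { .shared_tags = true, .q = &q };

            add_active_requests(&h, 2);
            printf("active=%d\n", active_requests(&h));  /* 2, counted on q */
            return 0;
    }
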

From patchwork Sat Oct 21 15:48:01 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431431
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 3/8] blk-mq: add a helper to initialize shared_tag_info
Date: Sat, 21 Oct 2023 23:48:01 +0800
Message-Id: <20231021154806.4019417-4-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

shared_tag_info is used for both request_queue and hctx, and follow-up
patches will add more fields to the structure; add a helper to avoid
redundant code.

Also move the initialization for request_queue from blk_alloc_queue() to
blk_mq_init_allocated_queue(), because shared_tag_info won't be used for
bio-based devices, and move the initialization for hctx from
blk_mq_alloc_hctx() to blk_mq_init_hctx(), because a hctx can be reused.
Signed-off-by: Yu Kuai
---
 block/blk-core.c   | 2 --
 block/blk-mq-tag.c | 5 +++++
 block/blk-mq.c     | 3 ++-
 block/blk-mq.h     | 1 +
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index c028b047f5d5..756ca1109f6c 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -414,8 +414,6 @@ struct request_queue *blk_alloc_queue(int node_id)
 
 	q->node = node_id;
 
-	atomic_set(&q->shared_tag_info.active_tags, 0);
-
 	timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
 	INIT_WORK(&q->timeout_work, blk_timeout_work);
 	INIT_LIST_HEAD(&q->icq_list);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index fe41a0d34fc0..2f91a7605d7a 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -29,6 +29,11 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 			users);
 }
 
+void blk_mq_init_shared_tag_info(struct shared_tag_info *info)
+{
+	atomic_set(&info->active_tags, 0);
+}
+
 /*
  * If a previously inactive queue goes active, bump the active user count.
  * We need to do this before try to allocate driver tag, then even if fail
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d85b9ad816d0..de5859dd9f52 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3652,6 +3652,7 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	if (xa_insert(&q->hctx_table, hctx_idx, hctx, GFP_KERNEL))
 		goto exit_flush_rq;
 
+	blk_mq_init_shared_tag_info(&hctx->shared_tag_info);
 	return 0;
 
 exit_flush_rq:
@@ -3679,7 +3680,6 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 	if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node))
 		goto free_hctx;
 
-	atomic_set(&hctx->shared_tag_info.active_tags, 0);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 	hctx->numa_node = node;
@@ -4227,6 +4227,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	if (blk_mq_alloc_ctxs(q))
 		goto err_exit;
 
+	blk_mq_init_shared_tag_info(&q->shared_tag_info);
 	/* init q->mq_kobj and sw queues' kobjects */
 	blk_mq_sysfs_init(q);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 6f332dc122ff..ac58f2e22f20 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -63,6 +63,7 @@ struct blk_mq_tags *blk_mq_alloc_map_and_rqs(struct blk_mq_tag_set *set,
 void blk_mq_free_map_and_rqs(struct blk_mq_tag_set *set,
 			     struct blk_mq_tags *tags,
 			     unsigned int hctx_idx);
+void blk_mq_init_shared_tag_info(struct shared_tag_info *info);
 
 /*
  * CPU -> queue mappings
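
[Editorial illustration, not part of the patch: the reason the init call
moves is that state owned by a reusable object must be reset in its
(re)initialization path, not only where the memory is first allocated. A
tiny sketch with hypothetical names:]

    #include <stdatomic.h>
    #include <stdio.h>

    struct shared_tag_info { atomic_int active_tags; };
    struct hctx { struct shared_tag_info shared_tag_info; };

    static void init_shared_tag_info(struct shared_tag_info *info)
    {
            atomic_init(&info->active_tags, 0);
    }

    /* runs on every (re)initialization, including when the hctx is reused */
    static void hctx_init(struct hctx *h)
    {
            init_shared_tag_info(&h->shared_tag_info);
    }

    int main(void)
    {
            struct hctx h;

            hctx_init(&h);                          /* first use */
            atomic_fetch_add(&h.shared_tag_info.active_tags, 3);
            hctx_init(&h);                          /* reuse: counter reset */
            printf("%d\n", atomic_load(&h.shared_tag_info.active_tags));
            return 0;
    }
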

From patchwork Sat Oct 21 15:48:02 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431435
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 4/8] blk-mq: support to track active queues from blk_mq_tags
Date: Sat, 21 Oct 2023 23:48:02 +0800
Message-Id: <20231021154806.4019417-5-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

In order to refactor how tags are shared, it's necessary to track some
information for each sharing queue/hctx, so that more tags can be assigned
to the one under higher pressure. Prepare to refactor tag sharing.
Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c     | 38 ++++++++++++++++++++++++++++++++------
 include/linux/blk-mq.h |  5 +++++
 include/linux/blkdev.h |  3 ++-
 3 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2f91a7605d7a..07d9b513990b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -32,6 +32,7 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 void blk_mq_init_shared_tag_info(struct shared_tag_info *info)
 {
 	atomic_set(&info->active_tags, 0);
+	INIT_LIST_HEAD(&info->node);
 }
 
 /*
@@ -44,6 +45,7 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
 	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
+	struct shared_tag_info *info;
 
 	/*
 	 * calling test_bit() prior to test_and_set_bit() is intentional,
@@ -55,13 +57,18 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 		if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return;
+
+		info = &q->shared_tag_info;
 	} else {
 		if (test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
+
+		info = &hctx->shared_tag_info;
 	}
 
 	spin_lock_irq(&tags->lock);
+	list_add(&info->node, &tags->ctl.head);
 	users = tags->ctl.active_queues + 1;
 	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
@@ -84,26 +91,44 @@ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
  */
 void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 {
-	struct blk_mq_tags *tags = hctx->tags;
 	unsigned int users;
+	struct blk_mq_tags *tags = hctx->tags;
+	struct shared_tag_info *info;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
 
-		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_ACTIVE,
-					&q->queue_flags))
+		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return;
+
+		spin_lock_irq(&tags->lock);
+		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags)) {
+			spin_unlock_irq(&tags->lock);
+			return;
+		}
+
+		info = &q->shared_tag_info;
 	} else {
-		if (!test_and_clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
+		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
+
+		spin_lock_irq(&tags->lock);
+		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state)) {
+			spin_unlock_irq(&tags->lock);
+			return;
+		}
+
+		info = &hctx->shared_tag_info;
 	}
 
-	spin_lock_irq(&tags->lock);
+	list_del_init(&info->node);
 	users = tags->ctl.active_queues - 1;
 	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
-	spin_unlock_irq(&tags->lock);
 
+	if (blk_mq_is_shared_tags(hctx->flags))
+		clear_bit(QUEUE_FLAG_HCTX_ACTIVE, &hctx->queue->queue_flags);
+	else
+		clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state);
+
+	spin_unlock_irq(&tags->lock);
 	blk_mq_tag_wakeup_all(tags, false);
 }
 
@@ -586,6 +611,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
 	spin_lock_init(&tags->lock);
+	INIT_LIST_HEAD(&tags->ctl.head);
 
 	if (blk_mq_init_bitmaps(&tags->bitmap_tags, &tags->breserved_tags,
 				total_tags, reserved_tags, node,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 4301226f311b..c93955f5f28f 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -728,6 +728,11 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 
 struct tag_sharing_ctl {
 	unsigned int active_queues;
+	/*
+	 * If driver tags is shared for multiple queue/hctx, this is the head of
+	 * a list with request_queue/hctx->shared_tag_info.node entries.
	 */
+	struct list_head head;
 };
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 645a8e245add..f97bc2c7acc9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -376,7 +376,8 @@ struct blk_independent_access_ranges {
 };
 
 struct shared_tag_info {
-	atomic_t active_tags;
+	atomic_t active_tags;
+	struct list_head node;
 };
 
 struct request_queue {
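
[Editorial illustration, not part of the patch: the idle path above changes
from a single test_and_clear_bit() to a test, lock, re-test sequence, so
that the flag is cleared under tags->lock together with the list removal.
A minimal userspace sketch of that double-checked pattern, with a pthread
mutex standing in for the spinlock and illustrative names throughout:]

    #include <pthread.h>
    #include <stdbool.h>
    #include <stdio.h>

    struct list_node { struct list_node *prev, *next; };

    static void list_del_init(struct list_node *n)
    {
            n->prev->next = n->next;
            n->next->prev = n->prev;
            n->prev = n->next = n;
    }

    struct tags {
            pthread_mutex_t lock;
            unsigned int active_queues;
    };

    static bool tag_active = true;  /* stands in for QUEUE_FLAG_HCTX_ACTIVE */

    static void tag_idle(struct tags *t, struct list_node *info)
    {
            if (!tag_active)                /* cheap lockless check */
                    return;

            pthread_mutex_lock(&t->lock);
            if (!tag_active) {              /* re-check: another caller won */
                    pthread_mutex_unlock(&t->lock);
                    return;
            }
            list_del_init(info);            /* remove from the sharers list */
            t->active_queues--;
            tag_active = false;             /* cleared under the lock */
            pthread_mutex_unlock(&t->lock);
    }

    int main(void)
    {
            struct tags t = { .lock = PTHREAD_MUTEX_INITIALIZER,
                              .active_queues = 1 };
            struct list_node head = { &head, &head };
            struct list_node info = { &head, &head };

            head.next = head.prev = &info;
            tag_idle(&t, &info);
            tag_idle(&t, &info);            /* second call is a no-op */
            printf("active_queues=%u\n", t.active_queues);  /* 0 */
            return 0;
    }
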

From patchwork Sat Oct 21 15:48:03 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431433
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 5/8] blk-mq: precalculate available tags for hctx_may_queue()
Date: Sat, 21 Oct 2023 23:48:03 +0800
Message-Id: <20231021154806.4019417-6-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently, hctx_may_queue() only needs to know how many queues are sharing
the tags, and then calculates each queue's share by fair division. Add a
new field 'available_tags' to struct shared_tag_info so that the number of
available tags is computed in the slow path and stored directly, and
hctx_may_queue() no longer needs to do the calculation.

Tags are still shared fairly for now, but with the calculation moved to the
slow path it becomes feasible to refactor tag sharing with a more
complicated algorithm, which is implemented in the following patches.
Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  3 ++-
 block/blk-mq-tag.c     | 35 ++++++++++++++++++++++++++++++++++-
 block/blk-mq.c         |  4 ++--
 block/blk-mq.h         | 19 ++++++++-----------
 include/linux/blkdev.h |  1 +
 5 files changed, 47 insertions(+), 15 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index d6ebd8d9d3bb..1d460119f5b3 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -158,7 +158,8 @@ static ssize_t queue_state_write(void *data, const char __user *buf,
 static void shared_tag_info_show(struct shared_tag_info *info,
 				 struct seq_file *m)
 {
-	seq_printf(m, "%d\n", atomic_read(&info->active_tags));
+	seq_printf(m, "active tags %d\n", atomic_read(&info->active_tags));
+	seq_printf(m, "available tags %u\n", READ_ONCE(info->available_tags));
 }
 
 static int queue_shared_tag_info_show(void *data, struct seq_file *m)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 07d9b513990b..261769251282 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -14,6 +14,8 @@
 #include "blk-mq.h"
 #include "blk-mq-sched.h"
 
+#define shared_tags(tags, users) max((tags->nr_tags + users - 1) / users, 4U)
+
 /*
  * Recalculate wakeup batch when tag is shared by hctx.
 */
@@ -29,10 +31,39 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 			users);
 }
 
-void blk_mq_init_shared_tag_info(struct shared_tag_info *info)
+void blk_mq_init_shared_tag_info(struct shared_tag_info *info,
+				 unsigned int nr_tags)
 {
 	atomic_set(&info->active_tags, 0);
 	INIT_LIST_HEAD(&info->node);
+	info->available_tags = nr_tags;
+}
+
+static void blk_mq_update_available_driver_tags(struct blk_mq_tags *tags,
+						struct shared_tag_info *info,
+						unsigned int users)
+{
+	unsigned int old = tags->ctl.active_queues;
+	int nr_tags;
+	struct shared_tag_info *iter;
+
+	if (!old || !users)
+		return;
+
+	nr_tags = (int)shared_tags(tags, users);
+	if (old < users)
+		WRITE_ONCE(info->available_tags, nr_tags);
+	else
+		WRITE_ONCE(info->available_tags, tags->nr_tags);
+
+	nr_tags -= (int)shared_tags(tags, old);
+	list_for_each_entry(iter, &tags->ctl.head, node) {
+		if (iter == info)
+			continue;
+
+		WRITE_ONCE(iter->available_tags,
+			   (unsigned int)((int)iter->available_tags + nr_tags));
+	}
 }
 
 /*
@@ -70,6 +101,7 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 	spin_lock_irq(&tags->lock);
 	list_add(&info->node, &tags->ctl.head);
 	users = tags->ctl.active_queues + 1;
+	blk_mq_update_available_driver_tags(tags, info, users);
 	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
 	spin_unlock_irq(&tags->lock);
@@ -121,6 +153,7 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 
 	list_del_init(&info->node);
 	users = tags->ctl.active_queues - 1;
+	blk_mq_update_available_driver_tags(tags, info, users);
 	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index de5859dd9f52..8775616bc85c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3652,7 +3652,7 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	if (xa_insert(&q->hctx_table, hctx_idx, hctx, GFP_KERNEL))
 		goto exit_flush_rq;
 
-	blk_mq_init_shared_tag_info(&hctx->shared_tag_info);
+	blk_mq_init_shared_tag_info(&hctx->shared_tag_info, set->queue_depth);
 	return 0;
 
 exit_flush_rq:
@@ -4227,7 +4227,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	if (blk_mq_alloc_ctxs(q))
 		goto err_exit;
 
-	blk_mq_init_shared_tag_info(&q->shared_tag_info);
+	blk_mq_init_shared_tag_info(&q->shared_tag_info, set->queue_depth);
 	/* init q->mq_kobj and sw queues' kobjects */
 	blk_mq_sysfs_init(q);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index ac58f2e22f20..5c0d19562848 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -63,7 +63,8 @@ struct blk_mq_tags *blk_mq_alloc_map_and_rqs(struct blk_mq_tag_set *set,
 void blk_mq_free_map_and_rqs(struct blk_mq_tag_set *set,
 			     struct blk_mq_tags *tags,
 			     unsigned int hctx_idx);
-void blk_mq_init_shared_tag_info(struct shared_tag_info *info);
+void blk_mq_init_shared_tag_info(struct shared_tag_info *info,
+				 unsigned int nr_tags);
 
 /*
  * CPU -> queue mappings
@@ -416,7 +417,7 @@ static inline void blk_mq_free_requests(struct list_head *list)
 static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 				  struct sbitmap_queue *bt)
 {
-	unsigned int depth, users;
+	struct shared_tag_info *info;
 
 	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
 		return true;
@@ -432,20 +433,16 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 
 		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
+
+		info = &q->shared_tag_info;
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return true;
-	}
 
-	users = READ_ONCE(hctx->tags->ctl.active_queues);
-	if (!users)
-		return true;
+		info = &hctx->shared_tag_info;
+	}
 
-	/*
-	 * Allow at least some tags
-	 */
-	depth = max((bt->sb.depth + users - 1) / users, 4U);
-	return __blk_mq_active_requests(hctx) < depth;
+	return atomic_read(&info->active_tags) < READ_ONCE(info->available_tags);
 }
 
 /* run the code block in @dispatch_ops with rcu/srcu read lock held */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f97bc2c7acc9..b364d65fe4e5 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -377,6 +377,7 @@ struct blk_independent_access_ranges {
 
 struct shared_tag_info {
 	atomic_t active_tags;
+	unsigned int available_tags;
 	struct list_head node;
 };
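
[Editorial illustration, not part of the patch: the shared_tags() macro
above is the existing fair-share rule, ceil(nr_tags / users) with a floor
of four tags, and available_tags simply caches its result so that
hctx_may_queue() reduces to a single comparison. A standalone sketch of
the arithmetic:]

    #include <stdio.h>

    /* each of 'users' sharers gets at least nr_tags/users, rounded up,
     * but never fewer than 4 tags */
    static unsigned int shared_tags(unsigned int nr_tags, unsigned int users)
    {
            unsigned int fair = (nr_tags + users - 1) / users;  /* ceil */

            return fair > 4 ? fair : 4;                         /* max(fair, 4U) */
    }

    int main(void)
    {
            /* e.g. 64 driver tags shared by 1, 8 and 15 queues */
            for (unsigned int users = 1; users <= 20; users += 7)
                    printf("users=%2u -> %u tags each\n",
                           users, shared_tags(64, users));
            /* users= 1 -> 64, users= 8 -> 8, users=15 -> 5 tags each */
            return 0;
    }
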
&hctx->shared_tag_info; + } - /* - * Allow at least some tags - */ - depth = max((bt->sb.depth + users - 1) / users, 4U); - return __blk_mq_active_requests(hctx) < depth; + return atomic_read(&info->active_tags) < READ_ONCE(info->available_tags); } /* run the code block in @dispatch_ops with rcu/srcu read lock held */ diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f97bc2c7acc9..b364d65fe4e5 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -377,6 +377,7 @@ struct blk_independent_access_ranges { struct shared_tag_info { atomic_t active_tags; + unsigned int available_tags; struct list_head node; }; From patchwork Sat Oct 21 15:48:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yu Kuai X-Patchwork-Id: 13431436 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C693CC004C0 for ; Sat, 21 Oct 2023 07:52:33 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230479AbjJUHwd (ORCPT ); Sat, 21 Oct 2023 03:52:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33174 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229980AbjJUHw3 (ORCPT ); Sat, 21 Oct 2023 03:52:29 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C4AD4D6E; Sat, 21 Oct 2023 00:52:26 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SCDFL0Fstz4f3kph; Sat, 21 Oct 2023 15:52:22 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAXrt0ygzNl84cpDg--.7754S10; Sat, 21 Oct 2023 15:52:23 +0800 (CST) From: Yu Kuai To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH RFC v2 6/8] blk-mq: add new helpers blk_mq_driver_tag_busy/idle() Date: Sat, 21 Oct 2023 23:48:04 +0800 Message-Id: <20231021154806.4019417-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com> References: <20231021154806.4019417-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 X-CM-TRANSID: gCh0CgAXrt0ygzNl84cpDg--.7754S10 X-Coremail-Antispam: 1UD129KBjvJXoWxuw1xWr1fJw1UGr17ZrW5Wrg_yoWfJr4UpF W3Ga1Yk3yFqrsruFZrta17Z3WSkrs7Kr17Jrnaqw1FvF1j9r4xW3W8GryUZrW8ArWkAr47 ZrW5try5Cr1DWrDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmv14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0 rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_ Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2 IErcIFxwACI402YVCY1x02628vn2kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE 7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI 
Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  2 ++
 block/blk-mq-tag.c     | 53 +++++++++++++++++++++++++++++++++++++++++-
 block/blk-mq.c         |  9 +++++--
 block/blk-mq.h         | 25 ++++++++++++++++----
 include/linux/blk-mq.h |  7 ++++--
 include/linux/blkdev.h |  1 +
 6 files changed, 88 insertions(+), 9 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 1d460119f5b3..170bc2236e81 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -417,6 +417,8 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
 	seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
 	seq_printf(m, "active_queues=%d\n",
 		   READ_ONCE(tags->ctl.active_queues));
+	seq_printf(m, "busy_queues=%d\n",
+		   READ_ONCE(tags->ctl.busy_queues));
 
 	seq_puts(m, "\nbitmap_tags:\n");
 	sbitmap_queue_show(&tags->bitmap_tags, m);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 261769251282..cd13d8e512f7 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -165,6 +165,51 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	blk_mq_tag_wakeup_all(tags, false);
 }
 
+void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
+{
+	unsigned int users;
+	struct blk_mq_tags *tags = hctx->tags;
+
+	if (blk_mq_is_shared_tags(hctx->flags)) {
+		struct request_queue *q = hctx->queue;
+
+		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
+		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
+			return;
+	} else {
+		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
+		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
+			return;
+	}
+
+	spin_lock_irq(&tags->lock);
+	users = tags->ctl.busy_queues + 1;
+	WRITE_ONCE(tags->ctl.busy_queues, users);
+	spin_unlock_irq(&tags->lock);
+}
+
+void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
+{
+	unsigned int users;
+	struct blk_mq_tags *tags = hctx->tags;
+
+	if (blk_mq_is_shared_tags(hctx->flags)) {
+		struct request_queue *q = hctx->queue;
+
+		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_BUSY,
+					&q->queue_flags))
+			return;
+	} else {
+		if (!test_and_clear_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
+			return;
+	}
+
+	spin_lock_irq(&tags->lock);
+	users = tags->ctl.busy_queues - 1;
+	WRITE_ONCE(tags->ctl.busy_queues, users);
+	spin_unlock_irq(&tags->lock);
+}
+
 static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
 			    struct sbitmap_queue *bt)
 {
@@ -218,8 +263,11 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 	if (tag != BLK_MQ_NO_TAG)
 		goto found_tag;
 
-	if (data->flags & BLK_MQ_REQ_NOWAIT)
+	if (data->flags & BLK_MQ_REQ_NOWAIT) {
+		if (!(data->rq_flags & RQF_SCHED_TAGS))
+			blk_mq_driver_tag_busy(data->hctx);
 		return BLK_MQ_NO_TAG;
+	}
 
 	ws = bt_wait_ptr(bt, data->hctx);
 	do {
@@ -246,6 +294,9 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 		if (tag != BLK_MQ_NO_TAG)
 			break;
 
+		if (!(data->rq_flags & RQF_SCHED_TAGS))
+			blk_mq_driver_tag_busy(data->hctx);
+
 		bt_prev = bt;
 		io_schedule();
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 8775616bc85c..a106533f063f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1668,8 +1668,10 @@ static void blk_mq_timeout_work(struct work_struct *work)
		 */
 		queue_for_each_hw_ctx(q, hctx, i) {
 			/* the hctx may be unmapped, so check it here */
-			if (blk_mq_hw_queue_mapped(hctx))
+			if (blk_mq_hw_queue_mapped(hctx)) {
 				blk_mq_tag_idle(hctx);
+				blk_mq_driver_tag_idle(hctx);
+			}
 		}
 	}
 	blk_queue_exit(q);
@@ -3594,8 +3596,10 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 {
 	struct request *flush_rq = hctx->fq->flush_rq;
 
-	if (blk_mq_hw_queue_mapped(hctx))
+	if (blk_mq_hw_queue_mapped(hctx)) {
 		blk_mq_tag_idle(hctx);
+		blk_mq_driver_tag_idle(hctx);
+	}
 
 	if (blk_queue_init_done(q))
 		blk_mq_clear_flush_rq_mapping(set->tags[hctx_idx],
@@ -3931,6 +3935,7 @@ static void queue_set_hctx_shared(struct request_queue *q, bool shared)
 			hctx->flags |= BLK_MQ_F_TAG_QUEUE_SHARED;
 		} else {
 			blk_mq_tag_idle(hctx);
+			blk_mq_driver_tag_idle(hctx);
 			hctx->flags &= ~BLK_MQ_F_TAG_QUEUE_SHARED;
 		}
 	}
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 5c0d19562848..3e555af1de49 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -195,8 +195,10 @@ static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
 	return sbq_wait_ptr(bt, &hctx->wait_index);
 }
 
-void __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
-void __blk_mq_tag_idle(struct blk_mq_hw_ctx *);
+void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx);
+void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx);
+void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx);
+void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx);
 
 static inline void blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
@@ -210,6 +212,18 @@ static inline void blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 		__blk_mq_tag_idle(hctx);
 }
 
+static inline void blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_driver_tag_busy(hctx);
+}
+
+static inline void blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_driver_tag_idle(hctx);
+}
+
 static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
 					  unsigned int tag)
 {
@@ -293,7 +307,8 @@ static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
 	struct shared_tag_info *info = blk_mq_is_shared_tags(hctx->flags) ?
 		&hctx->queue->shared_tag_info : &hctx->shared_tag_info;
 
-	atomic_sub(val, &info->active_tags);
+	if (!atomic_sub_return(val, &info->active_tags))
+		blk_mq_driver_tag_idle(hctx);
 }
 
 static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
@@ -354,8 +369,10 @@ bool __blk_mq_alloc_driver_tag(struct request *rq);
 
 static inline bool blk_mq_get_driver_tag(struct request *rq)
 {
-	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
+	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq)) {
+		blk_mq_driver_tag_busy(rq->mq_hctx);
 		return false;
+	}
 
 	return true;
 }
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index c93955f5f28f..9182ceca8c7a 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -666,10 +666,11 @@ enum {
 	BLK_MQ_S_STOPPED	= 0,
 	BLK_MQ_S_TAG_ACTIVE	= 1,
-	BLK_MQ_S_SCHED_RESTART	= 2,
+	BLK_MQ_S_DTAG_BUSY	= 2,
+	BLK_MQ_S_SCHED_RESTART	= 3,
 
 	/* hw queue is inactive after all its CPUs become offline */
-	BLK_MQ_S_INACTIVE	= 3,
+	BLK_MQ_S_INACTIVE	= 4,
 
 	BLK_MQ_MAX_DEPTH	= 10240,
 
@@ -728,6 +729,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 
 struct tag_sharing_ctl {
 	unsigned int active_queues;
+	/* The number of shared queues/hctxs with exhausted driver tags. */
+	unsigned int busy_queues;
 	/*
	 * If driver tags is shared for multiple queue/hctx, this is the head of
	 * a list with request_queue/hctx->shared_tag_info.node entries.
	 */
	struct list_head head;
 };
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b364d65fe4e5..8fd6a0a92233 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -552,6 +552,7 @@ struct request_queue {
 #define QUEUE_FLAG_DAX		19	/* device supports DAX */
 #define QUEUE_FLAG_STATS	20	/* track IO start and completion times */
 #define QUEUE_FLAG_REGISTERED	22	/* queue has been registered to a disk */
+#define QUEUE_FLAG_HCTX_BUSY	23	/* driver tag is exhausted for at least one blk-mq hctx */
 #define QUEUE_FLAG_QUIESCED	24	/* queue has been quiesced */
 #define QUEUE_FLAG_PCI_P2PDMA	25	/* device supports PCI p2p requests */
 #define QUEUE_FLAG_ZONE_RESETALL 26	/* supports Zone Reset All */
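
[Editorial illustration, not part of the patch: a minimal userspace sketch
of the bookkeeping these helpers add, using plain ints instead of atomic
bitops and illustrative names. The first driver-tag allocation failure
marks a queue busy and bumps busy_queues; the counter drops again once the
queue's active tags drain to zero:]

    #include <stdbool.h>
    #include <stdio.h>

    struct hctx {
            bool dtag_busy;         /* BLK_MQ_S_DTAG_BUSY / QUEUE_FLAG_HCTX_BUSY */
            int active_tags;
            unsigned int *busy_queues;      /* tags->ctl.busy_queues */
    };

    static void driver_tag_busy(struct hctx *h)
    {
            if (!h->dtag_busy) {            /* only the first failure counts */
                    h->dtag_busy = true;
                    (*h->busy_queues)++;
            }
    }

    static void driver_tag_idle(struct hctx *h)
    {
            if (h->dtag_busy) {
                    h->dtag_busy = false;
                    (*h->busy_queues)--;
            }
    }

    static bool get_driver_tag(struct hctx *h, bool success)
    {
            if (!success)                   /* mirrors blk_mq_get_driver_tag() */
                    driver_tag_busy(h);
            return success;
    }

    static void sub_active_requests(struct hctx *h, int val)
    {
            h->active_tags -= val;
            if (!h->active_tags)            /* tags no longer exhausted */
                    driver_tag_idle(h);
    }

    int main(void)
    {
            unsigned int busy_queues = 0;
            struct hctx h = { .busy_queues = &busy_queues, .active_tags = 1 };

            get_driver_tag(&h, false);
            printf("busy_queues=%u\n", busy_queues);        /* 1 */
            sub_active_requests(&h, 1);
            printf("busy_queues=%u\n", busy_queues);        /* 0 */
            return 0;
    }
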

From patchwork Sat Oct 21 15:48:05 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431438
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH RFC v2 7/8] blk-mq-tag: delay tag sharing until fail to get driver tag
Date: Sat, 21 Oct 2023 23:48:05 +0800
Message-Id: <20231021154806.4019417-8-yukuai1@huaweicloud.com>
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Before this patch, tags are shared as soon as a sharing node starts to
handle IO. This wastes tags if a node doesn't need its full fair share,
because those tags can't be used by other nodes even when they want more
than their fair share. Prevent this problem by delaying tag sharing from
IO issue until a driver tag allocation fails.

Note that the problem still exists once all tags are exhausted; the next
patch implements an algorithm that allows a busy node to borrow tags from
idle nodes.
Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c | 67 ++++++++++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index cd13d8e512f7..a98b25c8d594 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -43,7 +43,7 @@ static void blk_mq_update_available_driver_tags(struct blk_mq_tags *tags,
 						struct shared_tag_info *info,
 						unsigned int users)
 {
-	unsigned int old = tags->ctl.active_queues;
+	unsigned int old = tags->ctl.busy_queues;
 	int nr_tags;
 	struct shared_tag_info *iter;
 
@@ -74,9 +74,7 @@ static void blk_mq_update_available_driver_tags(struct blk_mq_tags *tags,
  */
 void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
-	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
-	struct shared_tag_info *info;
 
 	/*
 	 * calling test_bit() prior to test_and_set_bit() is intentional,
@@ -88,22 +86,14 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 		if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return;
-
-		info = &q->shared_tag_info;
 	} else {
 		if (test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
-
-		info = &hctx->shared_tag_info;
 	}
 
 	spin_lock_irq(&tags->lock);
-	list_add(&info->node, &tags->ctl.head);
-	users = tags->ctl.active_queues + 1;
-	blk_mq_update_available_driver_tags(tags, info, users);
-	WRITE_ONCE(tags->ctl.active_queues, users);
-	blk_mq_update_wake_batch(tags, users);
+	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues + 1);
 	spin_unlock_irq(&tags->lock);
 }
 
@@ -123,9 +113,7 @@ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
  */
 void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 {
-	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
-	struct shared_tag_info *info;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
@@ -137,8 +125,6 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 			spin_unlock_irq(&tags->lock);
 			return;
 		}
-
-		info = &q->shared_tag_info;
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
@@ -147,28 +133,21 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 			spin_unlock_irq(&tags->lock);
 			return;
 		}
-
-		info = &hctx->shared_tag_info;
 	}
 
-	list_del_init(&info->node);
-	users = tags->ctl.active_queues - 1;
-	blk_mq_update_available_driver_tags(tags, info, users);
-	WRITE_ONCE(tags->ctl.active_queues, users);
-	blk_mq_update_wake_batch(tags, users);
-
+	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues - 1);
 	if (blk_mq_is_shared_tags(hctx->flags))
 		clear_bit(QUEUE_FLAG_HCTX_ACTIVE, &hctx->queue->queue_flags);
 	else
 		clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state);
 
 	spin_unlock_irq(&tags->lock);
-	blk_mq_tag_wakeup_all(tags, false);
 }
 
 void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
 	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
+	struct shared_tag_info *info;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
@@ -176,14 +155,21 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
 			return;
+
+		info = &q->shared_tag_info;
 	} else {
 		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
 			return;
+
+		info = &hctx->shared_tag_info;
 	}
 
 	spin_lock_irq(&tags->lock);
+	list_add(&info->node, &tags->ctl.head);
 	users = tags->ctl.busy_queues + 1;
+	blk_mq_update_available_driver_tags(tags, info, users);
+	blk_mq_update_wake_batch(tags, users);
 	WRITE_ONCE(tags->ctl.busy_queues, users);
 	spin_unlock_irq(&tags->lock);
 }
@@ -192,22 +178,45 @@ void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
 {
 	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
+	struct shared_tag_info *info;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
 
-		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_BUSY,
-					&q->queue_flags))
+		if (!test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
 			return;
+
+		spin_lock_irq(&tags->lock);
+		if (!test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags)) {
+			spin_unlock_irq(&tags->lock);
+			return;
+		}
+		info = &q->shared_tag_info;
 	} else {
-		if (!test_and_clear_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
+		if (!test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
 			return;
+
+		spin_lock_irq(&tags->lock);
+		if (!test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state)) {
+			spin_unlock_irq(&tags->lock);
+			return;
+		}
+		info = &hctx->shared_tag_info;
 	}
 
-	spin_lock_irq(&tags->lock);
+	list_del_init(&info->node);
 	users = tags->ctl.busy_queues - 1;
+	blk_mq_update_available_driver_tags(tags, info, users);
+	blk_mq_update_wake_batch(tags, users);
 	WRITE_ONCE(tags->ctl.busy_queues, users);
+
+	if (blk_mq_is_shared_tags(hctx->flags))
+		clear_bit(QUEUE_FLAG_HCTX_BUSY, &hctx->queue->queue_flags);
+	else
+		clear_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state);
+
 	spin_unlock_irq(&tags->lock);
+	blk_mq_tag_wakeup_all(tags, false);
 }
 
 static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
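
[Editorial illustration, not part of the patch: a worked sketch of the
redistribution this enables, under the assumptions of 64 driver tags, an
array in place of the linked list, and only the "queue turns busy" path of
blk_mq_update_available_driver_tags() being modeled. When a second queue
turns busy it is capped at its fair share (32) and the same delta is taken
back from the first queue's available_tags:]

    #include <stdio.h>

    #define NR_TAGS 64U

    static unsigned int shared_tags(unsigned int users)
    {
            unsigned int fair = (NR_TAGS + users - 1) / users;

            return fair > 4 ? fair : 4;
    }

    /* available_tags of queues currently on the busy list; 0 = not busy */
    static unsigned int avail[8];

    /* queue 'self' turns busy: old busy count is users - 1 */
    static void update(unsigned int self, unsigned int old, unsigned int users)
    {
            int delta;

            if (!old || !users)     /* first busy queue keeps everything */
                    return;

            avail[self] = shared_tags(users);
            delta = (int)shared_tags(users) - (int)shared_tags(old);
            for (unsigned int i = 0; i < 8; i++)
                    if (i != self && avail[i])
                            avail[i] = (unsigned int)((int)avail[i] + delta);
    }

    int main(void)
    {
            avail[0] = NR_TAGS;     /* first busy queue: full depth */
            update(1, 1, 2);        /* second queue turns busy */
            printf("q0=%u q1=%u\n", avail[0], avail[1]);    /* 32 32 */
            return 0;
    }
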
From patchwork Sat Oct 21 15:48:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13431437
From: Yu Kuai
To: bvanassche@acm.org, hch@lst.de, kbusch@kernel.org, ming.lei@redhat.com,
 axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
 yangerkun@huawei.com
Subject: [PATCH RFC v2 8/8] blk-mq-tag: allow shared queue/hctx to get more driver tags
Date: Sat, 21 Oct 2023 23:48:06 +0800
Message-Id: <20231021154806.4019417-9-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
References: <20231021154806.4019417-1-yukuai1@huaweicloud.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Yu Kuai

Currently tags are fairly shared among nodes, and if a node issues only
a little IO, its share of tags is wasted: those tags cannot be used by
other nodes, even nodes under high IO pressure. This patch implements a
way for one node to borrow tags from another:

1) when multiple nodes issue IO and all tags are exhausted, start
   sharing tags by fair share;
2) while tags are shared, detect whether any node needs more tags; if
   so, start a timer;
3) in the timer function:
   - if a node holds borrowed tags but is no longer busy, return the
     borrowed tags;
   - find the busy node under the highest pressure, calculate the free
     tags available from nodes that are not busy, and lend it tags.

A simple test, 32 tags shared between two nodes:

[global]
ioengine=libaio
iodepth=2
bs=4k
direct=1
rw=randrw
group_reporting

[sda]
numjobs=32
filename=/dev/sda

[sdb]
numjobs=1
filename=/dev/sdb

Test result (monitoring the new debugfs entries):

time	active		available
	sda	sdb	sda	sdb
0	0	0	32	32
1	16	2	16	16	-> start fair sharing
2	19	2	20	16
3	24	2	24	16
4	26	2	28	16	-> borrow 32/8=4 tags each round
5	28	2	28	16	-> save at least 4 tags for sdb
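The per-round borrow step in the table follows from the clamp in the
timer function below. Assuming SBQ_WAIT_QUEUES == 8 and
SBQ_WAKE_BATCH == 8 (their values in include/linux/sbitmap.h at the
time of writing), a quick standalone check:

#include <stdio.h>

/* clamp(), spelled out; the kernel diff uses the clamp_t() macro */
static unsigned int clamp_uint(unsigned int val, unsigned int lo,
			       unsigned int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}

int main(void)
{
	unsigned int nr_tags = 32;	/* total driver tags in the test */
	unsigned int step;

	/* step = clamp(nr_tags / SBQ_WAIT_QUEUES, 1, SBQ_WAKE_BATCH) */
	step = clamp_uint(nr_tags / 8, 1, 8);
	printf("per-round borrow step = %u\n", step);	/* prints 4 */
	return 0;
}

This matches the "borrow 32/8=4 tags each round" annotation: with 32
tags, each timer tick moves at most 4 tags to the busiest node.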
Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  1 +
 block/blk-mq-tag.c     | 92 +++++++++++++++++++++++++++++++++++++++---
 include/linux/blk-mq.h |  3 ++
 include/linux/blkdev.h |  1 +
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 170bc2236e81..a987f4228b67 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -160,6 +160,7 @@ static void shared_tag_info_show(struct shared_tag_info *info,
 {
 	seq_printf(m, "active tags %d\n", atomic_read(&info->active_tags));
 	seq_printf(m, "available tags %u\n", READ_ONCE(info->available_tags));
+	seq_printf(m, "busy count %u\n", atomic_read(&info->busy_count));
 }
 
 static int queue_shared_tag_info_show(void *data, struct seq_file *m)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index a98b25c8d594..798b90d3f11a 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -35,6 +35,7 @@ void blk_mq_init_shared_tag_info(struct shared_tag_info *info,
 			      unsigned int nr_tags)
 {
 	atomic_set(&info->active_tags, 0);
+	atomic_set(&info->busy_count, 0);
 	INIT_LIST_HEAD(&info->node);
 	info->available_tags = nr_tags;
 }
@@ -143,26 +144,93 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	spin_unlock_irq(&tags->lock);
 }
 
+static void tag_sharing_ctl_timer_fn(struct timer_list *t)
+{
+	struct tag_sharing_ctl *ctl = from_timer(ctl, t, timer);
+	struct blk_mq_tags *tags = container_of(ctl, struct blk_mq_tags, ctl);
+	struct shared_tag_info *busy = NULL;
+	struct shared_tag_info *info;
+	unsigned int nr_tags;
+	unsigned int step;
+	unsigned int free_tags = 0;
+	unsigned int borrowed_tags = 0;
+	unsigned int max_busy_count = 0;
+
+	spin_lock_irq(&tags->lock);
+
+	if (tags->ctl.busy_queues <= 1)
+		goto out;
+
+	/* First round, decrease busy_count by half. */
+	list_for_each_entry(info, &tags->ctl.head, node) {
+		int count = atomic_read(&info->busy_count);
+
+		if (count > tags->nr_tags)
+			count = count >> 1;
+		atomic_sub(count, &info->busy_count);
+	}
+
+	/* Second round, find borrowed tags that can be returned. */
+	nr_tags = shared_tags(tags, tags->ctl.busy_queues);
+	step = clamp_t(unsigned int, tags->nr_tags / SBQ_WAIT_QUEUES, 1,
+		       SBQ_WAKE_BATCH);
+	list_for_each_entry(info, &tags->ctl.head, node) {
+		if (info->available_tags > nr_tags &&
+		    atomic_read(&info->active_tags) <= nr_tags &&
+		    atomic_read(&info->busy_count) <= tags->nr_tags)
+			info->available_tags = nr_tags;
+	}
+
+	/* Last round, find available free tags, and which nodes need more tags. */
+	list_for_each_entry(info, &tags->ctl.head, node) {
+		unsigned int busy_count;
+
+		if (info->available_tags > nr_tags)
+			borrowed_tags += info->available_tags - nr_tags;
+		else
+			free_tags += info->available_tags - max(step,
+				(unsigned int)atomic_read(&info->active_tags));
+
+		busy_count = atomic_read(&info->busy_count);
+		if (busy_count > tags->nr_tags && busy_count > max_busy_count) {
+			busy = info;
+			max_busy_count = busy_count;
+		}
+	}
+
+	/* Borrow tags. */
+	if (busy && borrowed_tags < free_tags)
+		busy->available_tags += min(free_tags - borrowed_tags, step);
+
+out:
+	if (!busy) {
+		ctl->timer_running = false;
+	} else {
+		ctl->timer.expires = jiffies + HZ;
+		add_timer(&ctl->timer);
+	}
+	spin_unlock_irq(&tags->lock);
+}
+
 void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
 	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
 	struct shared_tag_info *info;
+	bool timer_running = false;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
 
+		info = &q->shared_tag_info;
 		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
-			return;
-
-		info = &q->shared_tag_info;
+			goto inc_busy;
 	} else {
+		info = &hctx->shared_tag_info;
 		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
-			return;
-
-		info = &hctx->shared_tag_info;
+			goto inc_busy;
 	}
 
 	spin_lock_irq(&tags->lock);
@@ -172,6 +240,15 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 	blk_mq_update_wake_batch(tags, users);
 	WRITE_ONCE(tags->ctl.busy_queues, users);
 	spin_unlock_irq(&tags->lock);
+	return;
+
+inc_busy:
+	atomic_inc(&info->busy_count);
+	if (!tags->ctl.timer_running &&
+	    try_cmpxchg_relaxed(&tags->ctl.timer_running, &timer_running, true)) {
+		tags->ctl.timer.expires = jiffies + HZ;
+		add_timer(&tags->ctl.timer);
+	}
 }
 
 void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
@@ -204,6 +281,7 @@ void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
 		info = &hctx->shared_tag_info;
 	}
 
+	atomic_set(&info->busy_count, 0);
 	list_del_init(&info->node);
 	users = tags->ctl.busy_queues - 1;
 	blk_mq_update_available_driver_tags(tags, info, users);
@@ -705,6 +783,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_reserved_tags = reserved_tags;
 	spin_lock_init(&tags->lock);
 	INIT_LIST_HEAD(&tags->ctl.head);
+	timer_setup(&tags->ctl.timer, tag_sharing_ctl_timer_fn, 0);
 
 	if (blk_mq_init_bitmaps(&tags->bitmap_tags, &tags->breserved_tags,
 				total_tags, reserved_tags, node,
@@ -717,6 +796,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 
 void blk_mq_free_tags(struct blk_mq_tags *tags)
 {
+	del_timer_sync(&tags->ctl.timer);
 	sbitmap_queue_free(&tags->bitmap_tags);
 	sbitmap_queue_free(&tags->breserved_tags);
 	kfree(tags);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 9182ceca8c7a..1e4aa733e1ab 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -736,6 +736,9 @@ struct tag_sharing_ctl {
 	 * a list with request_queue/hctx->shared_tag_info.node entries.
 	 */
 	struct list_head head;
+	bool timer_running;
+	/* Started when any queue/hctx fails to get a driver tag. */
+	struct timer_list timer;
 };
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8fd6a0a92233..8bfae877dd27 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -378,6 +378,7 @@ struct blk_independent_access_ranges {
 struct shared_tag_info {
 	atomic_t active_tags;
 	unsigned int available_tags;
+	atomic_t busy_count;
 	struct list_head node;
 };
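Taken together, the timer callback makes three passes over the sharing
nodes: decay the pressure counters, reclaim tags from borrowers that
went quiet, then lend spare tags to the busiest node. The following
standalone model reproduces one round and checks it against the fio
scenario from the commit message. The struct and variable names are
hypothetical, and underflow guards were added for clarity, so this is a
simplified sketch rather than the exact kernel logic:

#include <stddef.h>
#include <stdio.h>

struct node {
	unsigned int active;     /* tags currently in use */
	unsigned int available;  /* tags this node is allowed to use */
	unsigned int busy_count; /* decayed count of driver-tag failures */
};

/*
 * One round of the sharing timer: fair is the per-node fair share,
 * step the maximum tags moved per round, total the size of the tag set.
 */
static void timer_round(struct node *nodes, size_t nr, unsigned int fair,
			unsigned int step, unsigned int total)
{
	unsigned int borrowed = 0, free_tags = 0, max_busy = 0;
	struct node *busiest = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		struct node *n = &nodes[i];

		/* pass 1: decay pressure so old bursts stop counting */
		if (n->busy_count > total)
			n->busy_count -= n->busy_count / 2;
		else
			n->busy_count = 0;

		/* pass 2: a borrower that went quiet returns its extra tags */
		if (n->available > fair && n->active <= fair &&
		    n->busy_count <= total)
			n->available = fair;
	}

	for (i = 0; i < nr; i++) {
		struct node *n = &nodes[i];
		unsigned int reserved = n->active > step ? n->active : step;

		/* pass 3: tally lent-out and spare tags, find the busiest */
		if (n->available > fair)
			borrowed += n->available - fair;
		else if (n->available > reserved)
			free_tags += n->available - reserved;

		if (n->busy_count > total && n->busy_count > max_busy) {
			max_busy = n->busy_count;
			busiest = n;
		}
	}

	/* lend at most 'step' tags per round, never more than are spare */
	if (busiest && borrowed < free_tags) {
		unsigned int spare = free_tags - borrowed;

		busiest->available += spare < step ? spare : step;
	}
}

int main(void)
{
	/* the fio scenario above: total = 32, fair = 16, step = 4 */
	struct node nodes[] = {
		{ .active = 26, .available = 16, .busy_count = 200 }, /* sda */
		{ .active = 2,  .available = 16, .busy_count = 0 },   /* sdb */
	};

	timer_round(nodes, 2, 16, 4, 32);
	printf("sda available = %u\n", nodes[0].available); /* 16 -> 20 */
	return 0;
}

Applied repeatedly, sda's available tags grow by at most 4 per round
while sdb's reserved portion, max(step, active), is never lent out,
which is the "save at least 4 tags for sdb" behavior in the table.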