From patchwork Sun Jun 18 16:07:32 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283775
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 1/7] blk-mq: factor out a structure from blk_mq_tags to control tag sharing
Date: Mon, 19 Jun 2023 00:07:32 +0800
Message-Id: <20230618160738.54385-2-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently tags are fairly shared, and the new structure contains only one
field, active_queues. There are no functional changes; this just prepares
for refactoring tag sharing.

Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  2 +-
 block/blk-mq-tag.c     |  8 ++++----
 block/blk-mq.h         |  2 +-
 include/linux/blk-mq.h | 10 ++++++++--
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index c3b5930106b2..431aaa3eb181 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -401,7 +401,7 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
 	seq_printf(m, "nr_tags=%u\n", tags->nr_tags);
 	seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
 	seq_printf(m, "active_queues=%d\n",
-		   READ_ONCE(tags->active_queues));
+		   READ_ONCE(tags->ctl.active_queues));
 
 	seq_puts(m, "\nbitmap_tags:\n");
 	sbitmap_queue_show(&tags->bitmap_tags, m);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index cc57e2dd9a0b..fe41a0d34fc0 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -57,8 +57,8 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->active_queues + 1;
-	WRITE_ONCE(tags->active_queues, users);
+	users = tags->ctl.active_queues + 1;
+	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
 	spin_unlock_irq(&tags->lock);
 }
@@ -94,8 +94,8 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->active_queues - 1;
-	WRITE_ONCE(tags->active_queues, users);
+	users = tags->ctl.active_queues - 1;
+	WRITE_ONCE(tags->ctl.active_queues, users);
 	blk_mq_update_wake_batch(tags, users);
 	spin_unlock_irq(&tags->lock);
 
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 1743857e0b01..ca1c13127868 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -412,7 +412,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		return true;
 	}
 
-	users = READ_ONCE(hctx->tags->active_queues);
+	users = READ_ONCE(hctx->tags->ctl.active_queues);
 	if (!users)
 		return true;
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index f401067ac03a..8d2cd6b9d305 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -733,13 +733,16 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		blk_opf_t opf, blk_mq_req_flags_t flags,
 		unsigned int hctx_idx);
 
+struct tag_sharing_ctl {
+	unsigned int active_queues;
+};
+
 /*
  * Tag address space map.
  */
 struct blk_mq_tags {
 	unsigned int nr_tags;
 	unsigned int nr_reserved_tags;
-	unsigned int active_queues;
 
 	struct sbitmap_queue bitmap_tags;
 	struct sbitmap_queue breserved_tags;
@@ -750,9 +753,12 @@ struct blk_mq_tags {
 
 	/*
 	 * used to clear request reference in rqs[] before freeing one
-	 * request pool
+	 * request pool, and to protect tag_sharing_ctl.
 	 */
 	spinlock_t lock;
+
+	/* used when tags is shared for multiple request_queue or hctx. */
+	struct tag_sharing_ctl ctl;
 };
 
 static inline struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
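
The locking rule this patch establishes is worth spelling out: tags->lock now
also protects the new tag_sharing_ctl, while the fast path reads it without
the lock. A minimal sketch of that convention (illustrative helpers, not part
of the series):

	/* writer side: ctl fields are only modified under tags->lock */
	static void example_ctl_inc(struct blk_mq_tags *tags)
	{
		spin_lock_irq(&tags->lock);
		WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues + 1);
		spin_unlock_irq(&tags->lock);
	}

	/* reader side: hctx_may_queue() and debugfs read locklessly */
	static unsigned int example_ctl_read(struct blk_mq_tags *tags)
	{
		return READ_ONCE(tags->ctl.active_queues);
	}

The READ_ONCE()/WRITE_ONCE() pairing keeps the lockless readers tear-free
without adding any cost to the hot path.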

From patchwork Sun Jun 18 16:07:33 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283779
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 2/7] blk-mq: delay tag fair sharing until fail to get driver tag
Date: Mon, 19 Jun 2023 00:07:33 +0800
Message-Id: <20230618160738.54385-3-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Starting tag fair sharing as soon as a device starts to issue IO wastes
resources: the same number of tags is assigned to each disk/hctx, and those
tags can't be used by other disks/hctxs, which means a disk/hctx can't use
more than its assigned tags even if lots of the tags assigned to other disks
are unused.

Add a new API, blk_mq_driver_tag_busy(), which is called when getting a
driver tag fails, and move tag sharing from blk_mq_tag_busy() to
blk_mq_driver_tag_busy(). This approach works well as long as the total tags
are not exhausted; follow-up patches will refactor how tags are shared to
handle that case.

Signed-off-by: Yu Kuai
---
 block/blk-mq-debugfs.c |  4 ++-
 block/blk-mq-tag.c     | 60 ++++++++++++++++++++++++++++++++++--------
 block/blk-mq.c         |  4 ++-
 block/blk-mq.h         | 13 ++++++---
 include/linux/blk-mq.h |  6 +++--
 include/linux/blkdev.h |  1 +
 6 files changed, 70 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index 431aaa3eb181..de5a911b07c2 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -400,8 +400,10 @@ static void blk_mq_debugfs_tags_show(struct seq_file *m,
 {
 	seq_printf(m, "nr_tags=%u\n", tags->nr_tags);
 	seq_printf(m, "nr_reserved_tags=%u\n", tags->nr_reserved_tags);
-	seq_printf(m, "active_queues=%d\n",
+	seq_printf(m, "active_queues=%u\n",
 		   READ_ONCE(tags->ctl.active_queues));
+	seq_printf(m, "share_queues=%u\n",
+		   READ_ONCE(tags->ctl.share_queues));
 
 	seq_puts(m, "\nbitmap_tags:\n");
 	sbitmap_queue_show(&tags->bitmap_tags, m);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index fe41a0d34fc0..1c2bde917195 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -29,6 +29,32 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 			users);
 }
 
+void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
+{
+	struct blk_mq_tags *tags = hctx->tags;
+
+	/*
+	 * calling test_bit() prior to test_and_set_bit() is intentional,
+	 * it avoids dirtying the cacheline if the queue is already active.
+	 */
+	if (blk_mq_is_shared_tags(hctx->flags)) {
+		struct request_queue *q = hctx->queue;
+
+		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
+		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
+			return;
+	} else {
+		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
+		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
+			return;
+	}
+
+	spin_lock_irq(&tags->lock);
+	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
+	blk_mq_update_wake_batch(tags, tags->ctl.share_queues);
+	spin_unlock_irq(&tags->lock);
+}
+
 /*
  * If a previously inactive queue goes active, bump the active user count.
  * We need to do this before try to allocate driver tag, then even if fail
@@ -37,7 +63,6 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
  */
 void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
-	unsigned int users;
 	struct blk_mq_tags *tags = hctx->tags;
 
 	/*
@@ -57,9 +82,7 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->ctl.active_queues + 1;
-	WRITE_ONCE(tags->ctl.active_queues, users);
-	blk_mq_update_wake_batch(tags, users);
+	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues + 1);
 	spin_unlock_irq(&tags->lock);
 }
 
@@ -73,6 +96,14 @@ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
 		sbitmap_queue_wake_all(&tags->breserved_tags);
 }
 
+static void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
+{
+	if (blk_mq_is_shared_tags(hctx->flags))
+		clear_bit(QUEUE_FLAG_HCTX_BUSY, &hctx->queue->queue_flags);
+	else
+		clear_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state);
+}
+
 /*
  * If a previously busy queue goes inactive, potential waiters could now
  * be allowed to queue. Wake them up and check.
@@ -80,7 +111,6 @@ void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool include_reserve)
 void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 {
 	struct blk_mq_tags *tags = hctx->tags;
-	unsigned int users;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
@@ -94,9 +124,10 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
-	users = tags->ctl.active_queues - 1;
-	WRITE_ONCE(tags->ctl.active_queues, users);
-	blk_mq_update_wake_batch(tags, users);
+	__blk_mq_driver_tag_idle(hctx);
+	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues - 1);
+	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
+	blk_mq_update_wake_batch(tags, tags->ctl.share_queues);
 	spin_unlock_irq(&tags->lock);
 
 	blk_mq_tag_wakeup_all(tags, false);
@@ -105,14 +136,21 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
 			    struct sbitmap_queue *bt)
 {
+	int ret = BLK_MQ_NO_TAG;
+
 	if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) &&
 	    !hctx_may_queue(data->hctx, bt))
-		return BLK_MQ_NO_TAG;
+		goto out;
 
+	/* shallow_depth is only used for elevator */
 	if (data->shallow_depth)
 		return sbitmap_queue_get_shallow(bt, data->shallow_depth);
-	else
-		return __sbitmap_queue_get(bt);
+
+	ret = __sbitmap_queue_get(bt);
+out:
+	if (ret == BLK_MQ_NO_TAG && !(data->rq_flags & RQF_SCHED_TAGS))
+		blk_mq_driver_tag_busy(data->hctx);
+	return ret;
 }
 
 unsigned long blk_mq_get_tags(struct blk_mq_alloc_data *data, int nr_tags,
diff --git a/block/blk-mq.c b/block/blk-mq.c
index da650a2c4ca1..171ee4ac97ef 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1753,8 +1753,10 @@ static bool __blk_mq_alloc_driver_tag(struct request *rq)
 
 bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq)
 {
-	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq))
+	if (rq->tag == BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq)) {
+		blk_mq_driver_tag_busy(hctx);
 		return false;
+	}
 
 	if ((hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) &&
 	    !(rq->rq_flags & RQF_MQ_INFLIGHT)) {
diff --git a/block/blk-mq.h b/block/blk-mq.h
index ca1c13127868..01441a5e9910 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -193,8 +193,9 @@ static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
 	return sbq_wait_ptr(bt, &hctx->wait_index);
 }
 
-void __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
-void __blk_mq_tag_idle(struct blk_mq_hw_ctx *);
+void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx);
+void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx);
+void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx);
 
 static inline void blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
@@ -208,6 +209,12 @@ static inline void blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 		__blk_mq_tag_idle(hctx);
 }
 
+static inline void blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
+{
+	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
+		__blk_mq_driver_tag_busy(hctx);
+}
+
 static inline bool blk_mq_tag_is_reserved(struct blk_mq_tags *tags,
 					  unsigned int tag)
 {
@@ -412,7 +419,7 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		return true;
 	}
 
-	users = READ_ONCE(hctx->tags->ctl.active_queues);
+	users = READ_ONCE(hctx->tags->ctl.share_queues);
 	if (!users)
 		return true;
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8d2cd6b9d305..bc3ac22edb07 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -675,10 +675,11 @@ enum {
 
 	BLK_MQ_S_STOPPED	= 0,
 	BLK_MQ_S_TAG_ACTIVE	= 1,
-	BLK_MQ_S_SCHED_RESTART	= 2,
+	BLK_MQ_S_DTAG_BUSY	= 2,
+	BLK_MQ_S_SCHED_RESTART	= 3,
 
 	/* hw queue is inactive after all its CPUs become offline */
-	BLK_MQ_S_INACTIVE	= 3,
+	BLK_MQ_S_INACTIVE	= 4,
 
 	BLK_MQ_MAX_DEPTH	= 10240,
 
@@ -735,6 +736,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 
 struct tag_sharing_ctl {
 	unsigned int active_queues;
+	unsigned int share_queues;
 };
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index ed44a997f629..0994707f6a68 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -546,6 +546,7 @@ struct request_queue {
 #define QUEUE_FLAG_DAX		19	/* device supports DAX */
 #define QUEUE_FLAG_STATS	20	/* track IO start and completion times */
 #define QUEUE_FLAG_REGISTERED	22	/* queue has been registered to a disk */
+#define QUEUE_FLAG_HCTX_BUSY	23	/* at least one blk-mq hctx failed to get driver tag */
 #define QUEUE_FLAG_QUIESCED	24	/* queue has been quiesced */
 #define QUEUE_FLAG_PCI_P2PDMA	25	/* device supports PCI p2p requests */
 #define QUEUE_FLAG_ZONE_RESETALL 26	/* supports Zone Reset All */
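
The double test in __blk_mq_driver_tag_busy() follows a common kernel
pattern; a standalone sketch (hypothetical flag name, not part of the series)
of why the extra test_bit() is there:

	/*
	 * The plain test_bit() read does not take the cacheline exclusive,
	 * so once the bit is set -- the common case under sustained load --
	 * repeated callers do not dirty it.  Only the first caller pays for
	 * the locked test_and_set_bit() and does the 0 -> 1 transition.
	 */
	static bool example_mark_busy_once(unsigned long *flags, int bit)
	{
		if (test_bit(bit, flags) || test_and_set_bit(bit, flags))
			return false;	/* already marked busy */
		return true;		/* this caller marked it */
	}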

From patchwork Sun Jun 18 16:07:34 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283780
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 3/7] blk-mq: support to track active queues from blk_mq_tags
Date: Mon, 19 Jun 2023 00:07:34 +0800
Message-Id: <20230618160738.54385-4-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

In order to refactor how tags are shared, it's necessary to track some
information for each disk/hctx so that more tags can be assigned to the ones
under higher pressure. This is preparation for refactoring tag sharing.

Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c     | 13 +++++++++++++
 include/linux/blk-mq.h |  2 ++
 include/linux/blkdev.h |  5 +++++
 3 files changed, 20 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 1c2bde917195..8c527e68d4e4 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -64,6 +64,7 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
 	struct blk_mq_tags *tags = hctx->tags;
+	struct tag_sharing *tag_sharing;
 
 	/*
 	 * calling test_bit() prior to test_and_set_bit() is intentional,
@@ -75,13 +76,18 @@ void __blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
 		if (test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return;
+
+		tag_sharing = &q->tag_sharing;
 	} else {
 		if (test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
+
+		tag_sharing = &hctx->tag_sharing;
 	}
 
 	spin_lock_irq(&tags->lock);
+	list_add(&tag_sharing->node, &tags->ctl.head);
 	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues + 1);
 	spin_unlock_irq(&tags->lock);
 }
@@ -111,6 +117,7 @@ static void __blk_mq_driver_tag_idle(struct blk_mq_hw_ctx *hctx)
 void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 {
 	struct blk_mq_tags *tags = hctx->tags;
+	struct tag_sharing *tag_sharing;
 
 	if (blk_mq_is_shared_tags(hctx->flags)) {
 		struct request_queue *q = hctx->queue;
@@ -118,12 +125,17 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 		if (!test_and_clear_bit(QUEUE_FLAG_HCTX_ACTIVE,
 					&q->queue_flags))
 			return;
+
+		tag_sharing = &q->tag_sharing;
 	} else {
 		if (!test_and_clear_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return;
+
+		tag_sharing = &hctx->tag_sharing;
 	}
 
 	spin_lock_irq(&tags->lock);
+	list_del_init(&tag_sharing->node);
 	__blk_mq_driver_tag_idle(hctx);
 	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues - 1);
 	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
@@ -619,6 +631,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
 	tags->nr_tags = total_tags;
 	tags->nr_reserved_tags = reserved_tags;
 	spin_lock_init(&tags->lock);
+	INIT_LIST_HEAD(&tags->ctl.head);
 
 	if (blk_mq_init_bitmaps(&tags->bitmap_tags, &tags->breserved_tags,
 				total_tags, reserved_tags, node,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index bc3ac22edb07..639d618e6ca8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -390,6 +390,7 @@ struct blk_mq_hw_ctx {
 	 * assigned when a request is dispatched from a hardware queue.
 	 */
 	struct blk_mq_tags *tags;
+	struct tag_sharing tag_sharing;
 	/**
 	 * @sched_tags: Tags owned by I/O scheduler. If there is an I/O
 	 * scheduler associated with a request queue, a tag is assigned when
@@ -737,6 +738,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 struct tag_sharing_ctl {
 	unsigned int active_queues;
 	unsigned int share_queues;
+	struct list_head head;
 };
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0994707f6a68..62f8fcc20c30 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -375,6 +375,10 @@ struct blk_independent_access_ranges {
 	struct blk_independent_access_range	ia_range[];
 };
 
+struct tag_sharing {
+	struct list_head node;
+};
+
 struct request_queue {
 	struct request		*last_merge;
 	struct elevator_queue	*elevator;
@@ -513,6 +517,7 @@ struct request_queue {
 
 	struct blk_mq_tag_set	*tag_set;
 	struct list_head	tag_set_list;
+	struct tag_sharing	tag_sharing;
 
 	struct dentry		*debugfs_dir;
 	struct dentry		*sched_debugfs_dir;
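
With every active queue/hctx now linked into tags->ctl.head, later patches
can walk all sharers under tags->lock. A sketch of that access pattern
(hypothetical debug helper, not part of the series):

	static unsigned int example_count_sharers(struct blk_mq_tags *tags)
	{
		struct tag_sharing *ts;
		unsigned int n = 0;

		spin_lock_irq(&tags->lock);
		list_for_each_entry(ts, &tags->ctl.head, node)
			n++;
		spin_unlock_irq(&tags->lock);

		return n;
	}

Patch 7 relies on exactly this kind of walk to inspect each sharer's state
when deciding whether tags can be borrowed.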

From patchwork Sun Jun 18 16:07:35 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283777
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 4/7] blk-mq: precalculate available tags for hctx_may_queue()
Date: Mon, 19 Jun 2023 00:07:35 +0800
Message-Id: <20230618160738.54385-5-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Currently, hctx_may_queue() only needs to know how many queues are sharing
the tags, and then calculates how many tags are available for each queue by
fair sharing. In order to refactor how tags are shared, the calculation will
become more complicated; however, hctx_may_queue() is a fast path, hence
precalculate the available tags, preparing to refactor tag sharing.

Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c     | 19 +++++++++++++++++++
 block/blk-mq.c         |  3 +++
 block/blk-mq.h         | 14 +++++---------
 include/linux/blkdev.h |  3 ++-
 4 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 8c527e68d4e4..e0137206c02b 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -14,6 +14,22 @@
 #include "blk-mq.h"
 #include "blk-mq-sched.h"
 
+static void blk_mq_update_available_driver_tags(struct blk_mq_hw_ctx *hctx)
+{
+	struct blk_mq_tags *tags = hctx->tags;
+	unsigned int nr_tags;
+	struct tag_sharing *tag_sharing;
+
+	if (tags->ctl.share_queues <= 1)
+		nr_tags = tags->nr_tags;
+	else
+		nr_tags = max((tags->nr_tags + tags->ctl.share_queues - 1) /
+				tags->ctl.share_queues, 4U);
+
+	list_for_each_entry(tag_sharing, &tags->ctl.head, node)
+		tag_sharing->available_tags = nr_tags;
+}
+
 /*
  * Recalculate wakeup batch when tag is shared by hctx.
  */
@@ -51,6 +67,7 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 
 	spin_lock_irq(&tags->lock);
 	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
+	blk_mq_update_available_driver_tags(hctx);
 	blk_mq_update_wake_batch(tags, tags->ctl.share_queues);
 	spin_unlock_irq(&tags->lock);
 }
@@ -136,9 +153,11 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 
 	spin_lock_irq(&tags->lock);
 	list_del_init(&tag_sharing->node);
+	tag_sharing->available_tags = tags->nr_tags;
 	__blk_mq_driver_tag_idle(hctx);
 	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues - 1);
 	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
+	blk_mq_update_available_driver_tags(hctx);
 	blk_mq_update_wake_batch(tags, tags->ctl.share_queues);
 	spin_unlock_irq(&tags->lock);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 171ee4ac97ef..771802ff1d45 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3621,6 +3621,7 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	cpuhp_state_add_instance_nocalls(CPUHP_BLK_MQ_DEAD, &hctx->cpuhp_dead);
 
 	hctx->tags = set->tags[hctx_idx];
+	hctx->tag_sharing.available_tags = hctx->tags->nr_tags;
 
 	if (set->ops->init_hctx &&
 	    set->ops->init_hctx(hctx, set->driver_data, hctx_idx))
@@ -3881,6 +3882,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 		}
 
 		hctx->tags = set->tags[i];
+		hctx->tag_sharing.available_tags = hctx->tags->nr_tags;
 		WARN_ON(!hctx->tags);
 
 		/*
@@ -4234,6 +4236,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 	spin_lock_init(&q->requeue_lock);
 
 	q->nr_requests = set->queue_depth;
+	q->tag_sharing.available_tags = set->queue_depth;
 
 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
 	blk_mq_add_queue_tag_set(set, q);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 01441a5e9910..fcfb040efbbd 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -398,7 +398,7 @@ static inline void blk_mq_free_requests(struct list_head *list)
 static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 				  struct sbitmap_queue *bt)
 {
-	unsigned int depth, users;
+	unsigned int depth;
 
 	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
 		return true;
@@ -414,19 +414,15 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 
 		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
+
+		depth = READ_ONCE(q->tag_sharing.available_tags);
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return true;
-	}
 
-	users = READ_ONCE(hctx->tags->ctl.share_queues);
-	if (!users)
-		return true;
+		depth = READ_ONCE(hctx->tag_sharing.available_tags);
+	}
 
-	/*
-	 * Allow at least some tags
-	 */
-	depth = max((bt->sb.depth + users - 1) / users, 4U);
 	return __blk_mq_active_requests(hctx) < depth;
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 62f8fcc20c30..e5111bedfd8d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -376,7 +376,8 @@ struct blk_independent_access_ranges {
 };
 
 struct tag_sharing {
-	struct list_head node;
+	struct list_head	node;
+	unsigned int		available_tags;
 };
 
 struct request_queue {
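
The precalculated value is the same fair share that hctx_may_queue() used to
compute inline; a standalone restatement of the formula (illustrative helper
name) with worked numbers:

	/* same formula as blk_mq_update_available_driver_tags() above */
	static unsigned int example_fair_share(unsigned int nr_tags,
					       unsigned int share_queues)
	{
		if (share_queues <= 1)
			return nr_tags;
		/* ceiling division, floored at 4 tags per sharer */
		return max((nr_tags + share_queues - 1) / share_queues, 4U);
	}

For example, 256 tags shared by 3 queues gives each queue
DIV_ROUND_UP(256, 3) = 86 tags, while 256 tags shared by 128 queues would
give only 2, so the 4U floor keeps every queue able to hold at least a few
requests in flight.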

From patchwork Sun Jun 18 16:07:36 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283781
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 5/7] blk-mq: record the number of times fail to get driver tag while sharing tags
Date: Mon, 19 Jun 2023 00:07:36 +0800
Message-Id: <20230618160738.54385-6-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Add an atomic counter to record how many times a queue fails to get a driver
tag; this counter will be used to adjust the number of tags assigned to
active queues. The counter also decays every second so that it only reflects
recent IO pressure.

Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c     | 22 ++++++++++++++++++++--
 include/linux/blkdev.h |  2 ++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index e0137206c02b..5e5742c7277a 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -45,6 +45,17 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 			users);
 }
 
+static void update_tag_sharing_busy(struct tag_sharing *tag_sharing)
+{
+	unsigned int count = atomic_inc_return(&tag_sharing->fail_count);
+	unsigned long last_period = READ_ONCE(tag_sharing->period);
+
+	if (time_after(jiffies, last_period + HZ) &&
+	    cmpxchg_relaxed(&tag_sharing->period, last_period, jiffies) ==
+	    last_period)
+		atomic_sub(count / 2, &tag_sharing->fail_count);
+}
+
 void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 {
 	struct blk_mq_tags *tags = hctx->tags;
@@ -57,12 +68,16 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 		struct request_queue *q = hctx->queue;
 
 		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
-		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags))
+		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags)) {
+			update_tag_sharing_busy(&q->tag_sharing);
 			return;
+		}
 	} else {
 		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
-		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state))
+		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state)) {
+			update_tag_sharing_busy(&hctx->tag_sharing);
 			return;
+		}
 	}
 
 	spin_lock_irq(&tags->lock);
@@ -152,8 +167,11 @@ void __blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
 	}
 
 	spin_lock_irq(&tags->lock);
+
 	list_del_init(&tag_sharing->node);
 	tag_sharing->available_tags = tags->nr_tags;
+	atomic_set(&tag_sharing->fail_count, 0);
+
 	__blk_mq_driver_tag_idle(hctx);
 	WRITE_ONCE(tags->ctl.active_queues, tags->ctl.active_queues - 1);
 	WRITE_ONCE(tags->ctl.share_queues, tags->ctl.active_queues);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e5111bedfd8d..f3faaf5f6504 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -378,6 +378,8 @@ struct blk_independent_access_ranges {
 struct tag_sharing {
 	struct list_head	node;
 	unsigned int		available_tags;
+	atomic_t		fail_count;
+	unsigned long		period;
 };
 
 struct request_queue {
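
The decay is easiest to see as a recurrence. Because
update_tag_sharing_busy() only runs on an allocation failure, the halving is
applied lazily, on the first failure of a new one-second window, so
approximately:

	count_new = (count_old + failures_this_window) / 2

Under a sustained rate of r failures per second the counter settles near r,
while the count left over from a burst that has ended halves with each
subsequent window it sees a failure in (64 -> 32 -> 16 -> ...). The
cmpxchg_relaxed() on ->period guarantees that only one CPU applies the
halving for a given window.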

From patchwork Sun Jun 18 16:07:37 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283782
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 6/7] blk-mq: move active request counter to struct tag_sharing
Date: Mon, 19 Jun 2023 00:07:37 +0800
Message-Id: <20230618160738.54385-7-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

Now that there is a separate structure to control tag sharing, it makes
sense to move the active request counters used for tag sharing into this
structure. There are no functional changes.

Signed-off-by: Yu Kuai
---
 block/blk-core.c       |  2 --
 block/blk-mq.c         |  3 ++-
 block/blk-mq.h         | 22 +++++++++++-----------
 include/linux/blk-mq.h |  6 ------
 include/linux/blkdev.h |  3 +--
 5 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 99d8b9812b18..f2077ee32a99 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -413,8 +413,6 @@ struct request_queue *blk_alloc_queue(int node_id)
 
 	q->node = node_id;
 
-	atomic_set(&q->nr_active_requests_shared_tags, 0);
-
 	timer_setup(&q->timeout, blk_rq_timed_out_timer, 0);
 	INIT_WORK(&q->timeout_work, blk_timeout_work);
 	INIT_LIST_HEAD(&q->icq_list);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 771802ff1d45..91020cd2f6bf 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -3661,7 +3661,7 @@ blk_mq_alloc_hctx(struct request_queue *q, struct blk_mq_tag_set *set,
 	if (!zalloc_cpumask_var_node(&hctx->cpumask, gfp, node))
 		goto free_hctx;
 
-	atomic_set(&hctx->nr_active, 0);
+	atomic_set(&hctx->tag_sharing.active_tags, 0);
 	if (node == NUMA_NO_NODE)
 		node = set->numa_node;
 	hctx->numa_node = node;
@@ -4237,6 +4237,7 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 
 	q->nr_requests = set->queue_depth;
 	q->tag_sharing.available_tags = set->queue_depth;
+	atomic_set(&q->tag_sharing.active_tags, 0);
 
 	blk_mq_init_cpu_queues(q, set->nr_hw_queues);
 	blk_mq_add_queue_tag_set(set, q);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index fcfb040efbbd..c8923a8565b5 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -281,18 +281,18 @@ static inline int blk_mq_get_rq_budget_token(struct request *rq)
 static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_inc(&hctx->queue->nr_active_requests_shared_tags);
+		atomic_inc(&hctx->queue->tag_sharing.active_tags);
 	else
-		atomic_inc(&hctx->nr_active);
+		atomic_inc(&hctx->tag_sharing.active_tags);
 }
 
 static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx,
 		int val)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
-		atomic_sub(val, &hctx->queue->nr_active_requests_shared_tags);
+		atomic_sub(val, &hctx->queue->tag_sharing.active_tags);
 	else
-		atomic_sub(val, &hctx->nr_active);
+		atomic_sub(val, &hctx->tag_sharing.active_tags);
 }
 
 static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
@@ -303,8 +303,8 @@ static inline void __blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx)
 static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx)
 {
 	if (blk_mq_is_shared_tags(hctx->flags))
-		return atomic_read(&hctx->queue->nr_active_requests_shared_tags);
-	return atomic_read(&hctx->nr_active);
+		return atomic_read(&hctx->queue->tag_sharing.active_tags);
+	return atomic_read(&hctx->tag_sharing.active_tags);
 }
 
 static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx,
@@ -398,7 +398,7 @@ static inline void blk_mq_free_requests(struct list_head *list)
 static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 				  struct sbitmap_queue *bt)
 {
-	unsigned int depth;
+	struct tag_sharing *tag_sharing;
 
 	if (!hctx || !(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED))
 		return true;
@@ -415,15 +415,15 @@ static inline bool hctx_may_queue(struct blk_mq_hw_ctx *hctx,
 		if (!test_bit(QUEUE_FLAG_HCTX_ACTIVE, &q->queue_flags))
 			return true;
 
-		depth = READ_ONCE(q->tag_sharing.available_tags);
+		tag_sharing = &q->tag_sharing;
 	} else {
 		if (!test_bit(BLK_MQ_S_TAG_ACTIVE, &hctx->state))
 			return true;
-
-		depth = READ_ONCE(hctx->tag_sharing.available_tags);
+		tag_sharing = &hctx->tag_sharing;
 	}
 
-	return __blk_mq_active_requests(hctx) < depth;
+	return atomic_read(&tag_sharing->active_tags) <
+	       READ_ONCE(tag_sharing->available_tags);
 }
 
 /* run the code block in @dispatch_ops with rcu/srcu read lock held */
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 639d618e6ca8..fdfa63b76136 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -408,12 +408,6 @@ struct blk_mq_hw_ctx {
 	/** @queue_num: Index of this hardware queue. */
 	unsigned int		queue_num;
 
-	/**
-	 * @nr_active: Number of active requests. Only used when a tag set is
-	 * shared across request queues.
-	 */
-	atomic_t		nr_active;
-
 	/** @cpuhp_online: List to store request if CPU is going to die */
 	struct hlist_node	cpuhp_online;
 	/** @cpuhp_dead: List to store request if some CPU die. */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f3faaf5f6504..0d25e7d2a94c 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -378,6 +378,7 @@ struct blk_independent_access_ranges {
 struct tag_sharing {
 	struct list_head	node;
 	unsigned int		available_tags;
+	atomic_t		active_tags;
 	atomic_t		fail_count;
 	unsigned long		period;
 };
@@ -462,8 +463,6 @@ struct request_queue {
 	struct timer_list	timeout;
 	struct work_struct	timeout_work;
 
-	atomic_t		nr_active_requests_shared_tags;
-
 	struct blk_mq_tags	*sched_shared_tags;
 
 	struct list_head	icq_list;
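
After this patch the fast-path throttling test reduces to comparing two
members of one tag_sharing instance; a sketch of the resulting check
(illustrative helper, not part of the series):

	static bool example_may_queue(struct tag_sharing *ts)
	{
		return atomic_read(&ts->active_tags) <
		       READ_ONCE(ts->available_tags);
	}

Keeping the counter next to the limit also lets the next patch read both
tmp->active_tags and tmp->available_tags for every sharer while walking
tags->ctl.head.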

From patchwork Sun Jun 18 16:07:38 2023
X-Patchwork-Submitter: Yu Kuai
X-Patchwork-Id: 13283778
From: Yu Kuai
To: bvanassche@acm.org, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
    yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com,
    yangerkun@huawei.com
Subject: [PATCH RFC 7/7] blk-mq: allow shared queue to get more driver tags
Date: Mon, 19 Jun 2023 00:07:38 +0800
Message-Id: <20230618160738.54385-8-yukuai1@huaweicloud.com>
In-Reply-To: <20230618160738.54385-1-yukuai1@huaweicloud.com>
References: <20230618160738.54385-1-yukuai1@huaweicloud.com>

From: Yu Kuai

If a queue fails to get driver tags frequently while other queues don't, try
to borrow some shared tags from the other queues. Currently, borrowed tags
will not be given back until the queue becomes idle.

Signed-off-by: Yu Kuai
---
 block/blk-mq-tag.c     | 52 ++++++++++++++++++++++++++++++++++++++----
 include/linux/blkdev.h |  1 +
 2 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 5e5742c7277a..aafcc131e3e6 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -45,7 +45,44 @@ static void blk_mq_update_wake_batch(struct blk_mq_tags *tags,
 			users);
 }
 
-static void update_tag_sharing_busy(struct tag_sharing *tag_sharing)
+static void try_to_increase_available_tags(struct blk_mq_tags *tags,
+					   struct tag_sharing *tag_sharing)
+{
+	unsigned int users = tags->ctl.share_queues;
+	unsigned int free_tags = 0;
+	unsigned int borrowed_tags = 0;
+	unsigned int nr_tags;
+	struct tag_sharing *tmp;
+
+	if (users <= 1)
+		return;
+
+	nr_tags = max((tags->nr_tags + tags->ctl.share_queues - 1) /
+			tags->ctl.share_queues, 4U);
+
+	list_for_each_entry(tmp, &tags->ctl.head, node) {
+		if (tmp == tag_sharing)
+			continue;
+
+		if (tmp->available_tags > nr_tags)
+			borrowed_tags += tmp->available_tags - nr_tags;
+		else if (atomic_read(&tmp->fail_count) <= nr_tags / 2)
+			free_tags += tmp->available_tags -
+				     atomic_read(&tmp->active_tags);
+	}
+
+	/* can't borrow more tags */
+	if (free_tags <= borrowed_tags) {
+		WRITE_ONCE(tag_sharing->suspend, jiffies + HZ);
+		return;
+	}
+
+	/* try to borrow half of free tags */
+	tag_sharing->available_tags += (free_tags - borrowed_tags) / 2;
+}
+
+static void update_tag_sharing_busy(struct blk_mq_tags *tags,
+				    struct tag_sharing *tag_sharing)
 {
 	unsigned int count = atomic_inc_return(&tag_sharing->fail_count);
 	unsigned long last_period = READ_ONCE(tag_sharing->period);
@@ -53,7 +90,14 @@ static void update_tag_sharing_busy(struct tag_sharing *tag_sharing)
 	if (time_after(jiffies, last_period + HZ) &&
 	    cmpxchg_relaxed(&tag_sharing->period, last_period, jiffies) ==
 	    last_period)
-		atomic_sub(count / 2, &tag_sharing->fail_count);
+		count = atomic_sub_return(count / 2, &tag_sharing->fail_count);
+
+	if (count >= tags->nr_tags &&
+	    time_after(jiffies, READ_ONCE(tag_sharing->suspend))) {
+		spin_lock_irq(&tags->lock);
+		try_to_increase_available_tags(tags, tag_sharing);
+		spin_unlock_irq(&tags->lock);
+	}
 }
 
 void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
@@ -69,13 +113,13 @@ void __blk_mq_driver_tag_busy(struct blk_mq_hw_ctx *hctx)
 
 		if (test_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags) ||
 		    test_and_set_bit(QUEUE_FLAG_HCTX_BUSY, &q->queue_flags)) {
-			update_tag_sharing_busy(&q->tag_sharing);
+			update_tag_sharing_busy(tags, &q->tag_sharing);
 			return;
 		}
 	} else {
 		if (test_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state) ||
 		    test_and_set_bit(BLK_MQ_S_DTAG_BUSY, &hctx->state)) {
-			update_tag_sharing_busy(&hctx->tag_sharing);
+			update_tag_sharing_busy(tags, &hctx->tag_sharing);
 			return;
 		}
 	}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0d25e7d2a94c..3528bdc96a17 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -381,6 +381,7 @@ struct tag_sharing {
 	atomic_t		active_tags;
 	atomic_t		fail_count;
 	unsigned long		period;
+	unsigned long		suspend;
 };
 
 struct request_queue {
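
A worked example of try_to_increase_available_tags(), assuming nr_tags = 256
shared by 4 queues (fair share nr_tags = 64). Suppose queue A has been
failing persistently (fail_count >= 256, the trigger in
update_tag_sharing_busy()), and the walk over the other sharers finds:

	queue B: available = 64, active = 10, fail_count <= 32 -> free_tags += 54
	queue C: available = 64, active = 20, fail_count <= 32 -> free_tags += 44
	queue D: fail_count > 32 (busy itself)                 -> contributes nothing

No sharer holds more than the fair share, so borrowed_tags = 0 and A's
available_tags grows by (98 - 0) / 2 = 49, from 64 to 113. Had borrowed_tags
already reached free_tags, A's ->suspend stamp would be set instead, stopping
further borrow attempts for one second.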